
Planning methods have been the source of much discussion over the past few decades. Practitioners and researchers have examined what methods planning schools teach and how these methods are used in practice. The suite of traditional methods courses taught in planning programs (inferential statistics, economic cost-benefit analysis, sampling, and research design for policy evaluation) remains largely stagnant, despite the rapidly changing reality in which planners are expected to work. Although the focus of this paper is on the impact of big data on planning methods, other variables have also contributed to the need for additional methods to tackle planning problems. The rise of ubiquitous computing and a hyper-connected communication network, as well as new private investment in data collection, has created an environment in which greater amounts of data exist than ever before. The ability of the planner to analyze and use this data is no longer limited by computing power or the cost of data collection, but by the knowledge that planners possess to employ data analytics and visualization techniques.

Educating planners with skills that are useful for practice has been a key tenet of many planning programs over the years. Several studies have been conducted to understand how well planning programs are succeeding at this goal. Surprisingly, the most recent comprehensive investigation of planning education and the skills demanded by practitioners was conducted in 1986. In this survey, four important conclusions were identified as relevant to how planners were being educated and the professional skills they would be required to use (Contant and Forkenbrock 1986). The authors found that the methods taught in planning programs remained highly relevant to the methods needed by practicing planners, and they concluded from their survey results that planning educators were adequately preparing their students to solve planning problems in practice. They cited communication skills (writing and speaking) and analysis and research design as critical components of planning education and practice, but noted that educators needed to remain vigilant in seeking relevance (Contant and Forkenbrock 1986). The article also identified several changes occurring throughout the 1980s that affected the planning profession: the rise of micro-computing and the expansion of methods being offered by planning schools. Contant and Forkenbrock (1986) wrote “…there is little to suggest that planning schools are overemphasizing analytic methods, nor do they appear to be failing to any real extent in meeting the demands of practitioners interviewed. While more techniques are required than these practitioners feel that all planners should understand, it certainly is arguable that this situation is not at all bad.” That survey of methods is now nearly 30 years old, and new realities exist that require educators to revise and expand the scope of methods taught in planning schools (Sawicki and Craig 1996; Goodspeed 2012).

Despite wide acknowledgement of the changing data landscape, planning curricula still resemble their traditional form. Kaufman and Simons completed a follow-up to this investigation that surveyed planning programs specifically on methods and research design. The more limited focus of this 1995 study “revealed a rather surprising lack of responsiveness among planning programs over time to practitioner demand for [quantitative research methods]” and that “planning programs do not seem to teach what practitioners practice, and not even what practitioners should practice” (Kaufman and Simons 1995). In a 2002 study focused on the use of technology within planning programs, Urey claims that the haphazard approach with which planning programs have introduced the use of technology to serve larger goals (research, analysis, modeling) might be problematic as increased microcomputing power becomes more widespread. While manual techniques serve learning objectives within planning methods courses, the use of technology is now required (Urey 2002). This leaves planning educators today with two questions relevant to big data and methods: what new methods must we now include in our curriculum, and what technology must students understand to employ these methods in an ethical, accurate, and precise way? Given these questions, we reviewed current methods requirements at planning schools to assess whether planning programs have begun to respond to them and adapt to the changing data landscape.

In a non-scientific review of methods taught at the top ten planning schools (as listed by Planetizen in 2014 [http://www.planetizen.com/education/planning]), we discovered that almost all programs require that planners be trained in statistics, economic cost-benefit analysis, and research design. Of the programs reviewed, including MIT, Cornell, Rutgers, UC Berkeley, University of Illinois at Urbana-Champaign, UNC Chapel Hill, University of Southern California, Georgia Institute of Technology, UCLA, and University of Pennsylvania, none required students to seek additional data analysis courses outside of the planning department. Although the review of these programs was not scientific and was limited to information published online for prospective students, it does suggest that planning education has yet to see value in teaching planners methods widely adopted in the fields of computer science and engineering. We argue, as Contant and Forkenbrock did 30 years ago, that maintaining the relevance of planning education to planning practice is important. Contant and Forkenbrock reminded educators to be vigilant in their understanding of skills that are in demand among practitioners, yet we have failed to do this with regard to our methods curricula.

The one big exception to the static nature of planning methods offerings is geographic information systems (GIS). Almost all of the top programs include a required course on GIS or include a significant section on GIS as a portion of a required methods course. This technology, once the province of a subset of computing nerds, has spilled out of the methods sequence and permeated the curriculum. It is now common to see planning students using GIS as a part of land use, housing, transportation and economic development courses. The adoption and use of GIS has been the most sweeping change in planning methods curriculum over the past 30 years. For a discussion of this history and how this technology is evolving, see Drummond and French (2008).

Big data, although currently a popular topic, is not new. The concept dates back to 2001, when industry analyst Doug Laney articulated the definition of big data as any data set characterized by the three Vs: Volume, Velocity and Variety (Laney 2001). Big data sets contain a large number of observations, arrive as fast-moving streams, and require real-time analytics. They are also usually of mixed format, combining structured and unstructured data joined by a common field such as time or location. In sum, any data set that is too large and complex to process using conventional data processing applications can be defined as big data.

Several pioneers in the industry have already started to process and analyze big data (Lohr 2012; Cuzzocrea et al. 2011). For instance, UPS now tracks 16.3 million packages per day for 8.8 million customers, with an average of 39.5 million tracking requests from customers per day. The company stores more than 16 petabytes of data. By analyzing those data sets, UPS is able to identify real-time on-road traffic conditions and daily package distribution patterns. Together with the latest real-time GIS mapping technology, this allows the company to optimize the daily routes for freight. With all the information from big data, UPS achieved savings in 2011 of more than 8.4 million gallons of fuel by cutting 85 million miles off of daily routes (Davenport and Dyché 2013). IBM teamed up with researchers from the health care field to use big data to predict outbreaks of dengue fever and malaria (Schneider 2013). It seems that big data, together with advanced analysis and visualization tools, can help people from a wide variety of industries explore large, complex data sets and reveal patterns that were once very difficult to discover. Given the increasing use of big data across fields that share interests with the field of city planning, planners should more deliberately explore and develop methods for using big data to develop insights about cities, transportation patterns and the basic patterns of urban metabolism.

Data analytics, as a powerful tool to investigate big data, is becoming an interdisciplinary field. There are new programs at universities across the United States that aim to teach students how to grapple with big data and analyze it using various analytic tools. For this paper, we collected and reviewed some common tools and skills that are taught in data analytics courses. We gathered course information from Johns Hopkins, Massachusetts Institute of Technology, University of Washington, and Georgia Institute of Technology. We noted that machine learning/data mining and data visualization are the tools most frequently taught in these programs to prepare students to handle big data, and that some of these tools are quite new to urban planners.

Machine learning is a core subarea of artificial intelligence. Machine learning uses computer algorithms to create explanatory models. There are different types of learning approaches, including supervised learning, unsupervised learning, and reinforcement learning. Although some of the terminology may be completely new to planners, the actual methods turn out to be quite familiar. For example, the regression model is one of the methods frequently used in the supervised learning process. Planners who work with remote sensing images often apply supervised classification methods to reclassify the images into land cover images based on the various color bands in the image. However, planners may not be familiar with other machine learning methodologies or algorithms, such as unsupervised learning and reinforcement learning. Unsupervised learning tries to identify regularities (or clusters or groupings) in the input data sets without correct output values provided by a supervisor. Reinforcement learning is primarily used in applications where the output of the system is a sequence of actions (e.g. playing chess). In this case, what is important is not a single action, but a sequence of actions that will achieve the ultimate goal. When machine learning methods are applied to large databases, such as big data, the practice is often called data mining. Data mining tries to identify and construct a simple model with high predictive accuracy, based on the large volume of data. The model is then applied to predict future values. This is the kind of projection that planners have been doing for years with less sophisticated methods.
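To make the distinction between supervised and unsupervised learning concrete, the following is a minimal sketch in Python using only the standard library. It is an illustration, not a method drawn from the programs we reviewed: the function names (`fit_line`, `k_means`), the household and trip figures, and the point coordinates are all invented for the example. The first function fits a regression line from labelled input-output pairs; the second groups unlabelled points into clusters with a basic k-means loop.

```python
from statistics import mean


def fit_line(xs, ys):
    """Supervised learning: ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


def k_means(points, centers, rounds=10):
    """Unsupervised learning: group 2-D points around the nearest center."""
    clusters = []
    for _ in range(rounds):
        # Assignment step: each point joins the cluster of its closest center.
        clusters = [[] for _ in centers]
        for px, py in points:
            distances = [(px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centers]
            clusters[distances.index(min(distances))].append((px, py))
        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        centers = [
            (mean(p[0] for p in members), mean(p[1] for p in members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return centers, clusters


# Invented training data: households in a zone (input) and observed daily trips (output).
households = [100, 200, 300, 400]
trips = [250, 450, 650, 850]
intercept, slope = fit_line(households, trips)
predicted_trips = intercept + slope * 500  # projection for a zone with 500 households

# Invented unlabelled points; no "correct" grouping is supplied to the algorithm.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = k_means(points, centers=[(0, 0), (10, 10)])
```

The regression requires known outputs (`trips`) to learn from, whereas the clustering routine receives only the points themselves and a starting guess for the centers. In practice, a statistical package or machine learning library would replace both hand-written functions.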

Most of the programs we reviewed also include data visualization components to help identify patterns in the data and communicate the results of data analysis. Some data visualization techniques, such as multivariate data representations and table and graph designs, are quite conventional. However, those techniques may also be applied in innovative ways to help convey the information behind data more clearly. One example is information graphics, or infographics, which improve human cognition by using graphics to strengthen the visual system’s ability to extract patterns and trends (Smiciklas 2012; Few 2009). The latest trend in data visualization is to take advantage of the web to present data in an interactive way. To effectively present big data interactively, the designer needs to understand how human beings interact with computers, and how different interaction types (e.g. filtering, zooming, linking, and brushing) affect human cognition. In the example below, viewers can interact with data generated from Foursquare check-ins across Manhattan (Williams 2015). These interactive visualizations can be used on both big and small data, but interaction allows more data to be presented to viewers (Fig. 1).

Fig. 1 Example of interactive data visualization from Here Now

In addition to the core courses, these new interdisciplinary programs require students to master at least one programming or query language. SQL is a popular requisite; in a survey on tools for data scientists, over 71% of respondents used SQL (King and Magoulas 2013). Some programs also require students to understand and use open-source statistical software, such as R and RStudio.
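For readers unfamiliar with SQL, the following is a small, hedged illustration of the kind of query such courses teach, run here through Python’s built-in `sqlite3` module. The table name `checkins`, its columns, and every value in it are hypothetical and were invented for this sketch; they do not come from any data set discussed in this paper.

```python
import sqlite3

# In-memory database holding a tiny, invented table of hourly visitor counts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkins (neighborhood TEXT, hour INTEGER, visitors INTEGER)"
)
conn.executemany(
    "INSERT INTO checkins VALUES (?, ?, ?)",
    [
        ("Midtown", 9, 120),
        ("Midtown", 18, 300),
        ("Harlem", 9, 40),
        ("Harlem", 18, 90),
    ],
)

# Summarize the table: total visitors per neighborhood, busiest first.
busiest = conn.execute(
    """
    SELECT neighborhood, SUM(visitors) AS total
    FROM checkins
    GROUP BY neighborhood
    ORDER BY total DESC
    """
).fetchall()
conn.close()
```

The same `SELECT ... GROUP BY` pattern applies whether the table holds four rows or many millions, which is part of why a query language is a common requirement in these programs.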

While these methods for analyzing data may seem somewhat out of place within a planning methods framework, they actively seek to create ways in which researchers can describe, explore, and explain data. These categories of data analysis are described in depth in Earl Babbie’s Survey Research Methods (1990). This text serves as one of many fundamental introductions to methods for planners, and by grouping the new suite of tools available to planners and data scientists within these categories, planners can see how these tools might be useful to them. For example, data visualization is one of the key ways in which data scientists are exploring big data sets (Few 2009). Data visualization acknowledges that our typical methods of data exploration (descriptive statistics, graphing, and the like) are ill-equipped to handle larger data sets, and even less equipped to communicate information derived from those data sets to the public and to decision makers. By introducing planners to the growing field of data visualization, we can expand their ability not only to use larger data sets but also to communicate the information garnered from those data sets. As the basis for research, exploration of data sets will allow planners to ask additional questions. These additional questions will require explanatory analysis, and within this group of methods, tools such as machine learning and data mining can help planners generate predictive models from larger data sets.

Many of the data sets that planners will deal with in the future will be big data. Credit card data or web browsing histories may help planners to predict the focus of emerging public concerns. In fact, MIT’s big data courses include a case study on how to use Google search records to estimate trends within the real estate industry (MIT 2014). Social media, such as Twitter and Facebook, have already become powerful information sources regarding almost every aspect of social life. Analysis of Twitter feeds can help to identify the extent and intensity of hazard events. There are already studies on how to use information extracted from Facebook friend lists to forecast air travel. GPS or real-time transportation information can help planners to calibrate and develop more accurate activity-based travel demand models to forecast future travel patterns. Moreover, real-time information about utility flows such as water, sewer, and electricity may equip planners with critical information to design more energy-efficient and sustainable cities and to make the built environment more resilient to natural hazards and climate change. Planning is characterized by its special affinity for place-based issues, and this focus on place will be one of the critical ways in which typical data sets can become “big data.” Location is the ultimate relational field, and our ability to link data sets through location will create big data sets that are especially useful to planners. If location is the ultimate relational connector, then planning data sets will only continue to increase in size, speed, and complexity in the future. The importance of teaching planners how to effectively and accurately examine and explore this data cannot be overstated, yet our work to prepare this paper leads us to believe that planning programs have not yet taken the steps required to introduce these methods to planning students.
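The idea of location as a relational field can be sketched in a few lines of Python. In this hedged example, two unrelated, invented data sets (social media check-ins and traffic sensor counts) are linked by snapping each record to a grid cell and an hour. The grid size, the function names `cell_key` and `link_by_location`, and all coordinates and counts are assumptions made for illustration; a real workflow would more likely rely on the spatial join tools of a GIS or spatial database.

```python
import math
from collections import defaultdict

CELL_DEGREES = 0.01  # hypothetical grid size, roughly 1 km of latitude


def cell_key(lat, lon, hour):
    """Snap a coordinate to a grid cell and pair it with the hour observed."""
    return (math.floor(lat / CELL_DEGREES), math.floor(lon / CELL_DEGREES), hour)


def link_by_location(checkins, sensor_readings):
    """Combine two data sets that share no field except place and time."""
    linked = defaultdict(lambda: {"checkins": 0, "vehicles": 0})
    for lat, lon, hour in checkins:
        linked[cell_key(lat, lon, hour)]["checkins"] += 1
    for lat, lon, hour, vehicles in sensor_readings:
        linked[cell_key(lat, lon, hour)]["vehicles"] += vehicles
    return dict(linked)


# Invented records: (latitude, longitude, hour) and (latitude, longitude, hour, vehicles).
checkins = [
    (40.7512, -73.9871, 18),
    (40.7518, -73.9879, 18),
    (40.7801, -73.9654, 9),
]
sensor_readings = [
    (40.7515, -73.9875, 18, 640),
    (40.7803, -73.9651, 9, 210),
]

linked = link_by_location(checkins, sensor_readings)
```

Each key in `linked` is a (grid row, grid column, hour) tuple, so records that were collected by different organizations in different formats end up side by side simply because they describe the same place at the same time.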

Big data analysis tools, such as machine learning and data visualization, can help planners to make better use of big data sets. The Memphis Police Department has used machine learning and data mining approaches to predict potential crime based on past crime events. As a result, the serious crime rate was reduced by approximately 30%. The city of Portland, Oregon optimized its traffic signals based on big traffic data, and was able to cut CO2 emissions by more than 157,000 metric tons in 6 years (Hinssen 2012). In sum, machine learning techniques can help planners to analyze the future development of urban areas more accurately, to solve current problems, and to eliminate or at least ease some of the impacts of new development. The explanatory power of machine learning will be critical for planners seeking to use big data to solve long-term challenges in cities and communities.

Data visualization has always been considered useful in the planning process, primarily as a communication method. However, it is now a critical tool for exploring large, complex data sets. Data visualization can help planners better understand how people live, work and behave within an urban context. When paired with more explanatory tools such as machine learning, data visualization becomes a critical tool in the planning process. Visualization can also continue to be used as a way for planners to convey their planning concepts to stakeholders during the public participation process. In this way, visualization is used as an interpretation toolkit to help people digest the complex analysis results from big data. Planners continue to be more comfortable using traditional graphs, tables, and animated images to visualize their results. However, some planners are now using more advanced web-based tools to display information in interactive ways to encourage public participation. This trend has been on the rise for some time, and the demand for practitioners with visualization skills continues to increase (Few 2009; Sawicki and Craig 1996; Goodspeed 2012).

We argue in this paper that planners would benefit greatly from the introduction of more advanced methods of descriptive, exploratory, and explanatory data analysis in order to more effectively use an ever-increasing amount of available data. When considering adding new methods to the planning curriculum, there is always the question of what will be displaced from the existing curriculum. We would urge planning educators to review their current methods carefully to see if the current offerings are suitable as we move from a data-poor environment to one of data abundance. At the very least, planning programs should strive to make all students aware of big data and give them some introduction to the means and methods of analyzing this data. This basic overview may be sufficient for the generalist planner, with more in-depth training in big data available to those who want it. This is similar to the model that was initially followed with respect to GIS: all planning students were given some basic GIS skills and vocabulary so they could communicate with spatial analysis specialists. All planning students should get some exposure to big data and its analytical techniques, but some should be able to develop more depth and the ability to collaborate with data scientists.

Two key issues for additional research emerged as we prepared this paper. First, the field of planning is inherently place-based, and it therefore has the potential to take many types of data and transform them into big data by linking mixed-format information into databases based on location. This suggests that planning can draw upon all types of location-based data, including cell phone locations, license plate readers, infrastructure sensors, drone videos, and building performance data. The challenge will be how to build a theoretical framework that will allow planners to use this wealth of information. Second, the field of planning is predominantly concerned with the long term. To date, most big data applications have been used to provide insights into short-term challenges. As planners, we need to be asking a larger question that relates not just to what methods can be used to analyze this data, but to how this data can be employed in our search for long-term solutions. How can minute-by-minute Twitter text analysis related to planning issues allow us to reframe planning issues for years to come? How does real-time transportation data help us understand how to shape transportation systems for the next generation? We did not set out to answer these questions in this paper, but we do believe that posing them will help frame the discussion of planning methods for the next generation of planning students and practitioners.

Big data represents an exciting new asset for planners, who have always struggled to explore and explain patterns and trends based on limited observations of discrete data. We should make the best use of this data by giving planners the tools with which to analyze it, understand it and communicate it. Like others who have written on the topic of big data in cities, we do caution that data should not be used for data’s sake. Planners face a more complex task than our data science colleagues: we must find ways in which to use the data to make existing communities better and to provide better solutions than were previously available (Sawicki and Craig 1996; Mattern 2013). In order to help planners achieve these goals, we must revamp the methods offerings in our planning programs to take full advantage of the new world of large, fast-moving, ubiquitous data.