Methodology – A new Chinatown: Demographics, business landscapes evolve in Chicago’s 11th Ward

By Grace Xue and Tianshu Hu
Medill Reports

The datasets used in this report were downloaded from the U.S. Census Bureau, the Chicago Metropolitan Agency for Planning (CMAP) and the Chicago Data Portal. These datasets are all publicly available. 

We obtained three types of data from these agencies. The U.S. Census Bureau provides detailed demographic data on race and ethnicity; we specifically examined the population of Chinese ethnicity in ZIP codes 60608, 60609, 60616 and 60632. CMAP compiled its data from the 2017-2021 American Community Survey five-year estimates, a survey conducted by the U.S. Census Bureau. This survey reports the languages spoken in households, which can serve as a proxy for the Chinese population since most people who speak Chinese at home are of Chinese ethnicity. The percentage of residents who speak a language other than English can also reflect the diversity of a neighborhood. The Chicago Data Portal contains a dataset named “Chicago Chinatown Chamber of Commerce New Businesses Ward 11,” which includes all current and active business licenses issued by the Department of Business Affairs and Consumer Protection to businesses that are members of the Chicago Chinatown Chamber of Commerce.

To streamline the analysis, we standardized the formats and removed unneeded columns. For example, when cleaning the business license data from the Chicago Chinatown Chamber of Commerce, we deleted rows with registration IDs and kept only the registration date among the available dates to simplify the dataset. For the Census Bureau’s population data, we combined the separately downloaded CSV files into one Google Sheets file and added the ZIP codes as columns to consolidate the data. We also used OpenRefine to remove duplicate business names, eliminate stray spaces and ensure the numbers were in the same format. We used the sorting and PivotTable functions in Google Sheets for further analysis and used Flourish to create visualizations.
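
The OpenRefine steps described above (trimming spaces and removing duplicate business names) can be approximated in a few lines of Python. This is only a minimal sketch, not the workflow used for this report: the function names and the sample rows are hypothetical, the case-insensitive comparison is an assumption about how duplicates were matched, and the column name “Doing Business as Name” is taken from the data diary below.

```python
import re


def normalize_name(name: str) -> str:
    """Collapse runs of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", name).strip()


def dedupe_by_name(rows, key="Doing Business as Name"):
    """Keep the first row for each business name.

    Names are compared case-insensitively after whitespace cleanup
    (an assumption; OpenRefine offers several matching methods).
    """
    seen = set()
    unique = []
    for row in rows:
        cleaned = dict(row)  # copy so the input rows are not modified
        cleaned[key] = normalize_name(row[key])
        fingerprint = cleaned[key].casefold()
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(cleaned)
    return unique


# Hypothetical sample rows, for illustration only.
sample = [
    {"Doing Business as Name": "  Golden  Dragon Bakery "},
    {"Doing Business as Name": "GOLDEN DRAGON BAKERY"},
    {"Doing Business as Name": "Lucky Tea House"},
]
cleaned_rows = dedupe_by_name(sample)
```

Keeping the first occurrence mirrors a simple "remove duplicates" pass; a real cleanup would still need a manual check that rows sharing a name are truly the same business.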

There are no major ethical concerns with our data since none of it involves personal information about individuals or households. All three datasets come from credible government agencies: the U.S. Census Bureau, the Chicago Metropolitan Agency for Planning and the Department of Business Affairs and Consumer Protection of the City of Chicago. The census and CMAP data reflect only large-scale statistics, such as the number of people, median household income and languages spoken. Although the Ward 11 business dataset contains detailed license numbers and dates, this information is published for consumer transparency, and we did not use the license numbers in our analysis. We analyzed the business license dataset only to observe trends and changes over time in the number and types of businesses.

One limitation of the Ward 11 business dataset is that it contains only the 387 businesses that have joined the Chinatown Chamber of Commerce, about 31% of the 1,260 businesses in Ward 11. We chose the dataset limited to chamber members because many chain businesses, such as CVS, Starbucks and Chipotle, are not owned by and do not cater to Chinese immigrants, and including them could dilute the results when analyzing local business trends in Chinatown. In addition, our interview sources, executives from the Chinatown Chamber of Commerce, indicated they could speak only about the insights and patterns of businesses that are chamber members. However, it is important to be aware that other businesses in the greater Chinatown area are not included in our analysis; future stories can investigate them further.


Data Diary

  1. Data Collection
  • All raw data were downloaded in CSV or PDF format and imported into Google Sheets.
  • U.S. Census Bureau
    • Go to the filter menu and filter for ZIP codes 60608, 60609, 60616 and 60632, which represent the neighborhoods of Armour Square, Bridgeport, Brighton Park, McKinley Park and Archer Heights.
    • Look for the DP05 sheet (ACS Demographic and Housing Estimates) to find the Chinese population figures.
    • Select years 2017, 2018, 2019, 2020, 2021 and 2022 to download the corresponding data.
  • CMAP
    • Go to CMAP Community Data Snapshots.
    • Download the PDFs for Armour Square, Bridgeport, Brighton Park, McKinley Park and Archer Heights.
    • For each neighborhood PDF, use Tabula to extract the tables “Language Spoken at Home and Ability to Speak English, 2017-2021” on page 4, “Household Income, 2017-2021” on page 5, and “Language Spoken at Home and Ability to Speak English, Over Time” on page 12.
  • Chicago Data Portal
    • Download Chicago Chinatown Chamber of Commerce New Businesses Ward 11 in CSV format.
  2. Data Cleaning

We cleaned the datasets by deleting columns that were unnecessary for the story, which made sorting and filtering easier. Below are the steps we took for cleaning and the data elements we kept.

  • Chinese Population from Census Bureau
    • Only keep RACE section.
    • Highlight Asian and Chinese rows.
    • Copy-paste Chinese data of each year to a separate sheet to see change over time.
  • CMAP
    • Only keep data of “English Only”, “Spanish”, “Chinese”, “Total Non-English” and “Speak English Less than ‘Very Well’”.
  • Ward 11 Business License
    • Only keep “Site Number”, “Legal Name”, “Doing Business as Name”, “Address”, “Zip Code”, “License Description”, “Date Issued”, “Latitude” and “Longitude”.
    • Use “YEAR(serial_number)” to convert each “Date Issued” value to a year, creating a “Year Issued” column.
    • Create pivot table 1 with the count of licenses issued each year.
      • Set rows to “Year Issued” (order ascending, sort by “Year Issued”) and values to “Year Issued” (summarize by COUNTA, show as default).
    • Create pivot table 2 with the total count of each type of license issued.
      • Set rows to “License Description” (order ascending, sort by “License Description”) and values to “License Description” (summarize by COUNTA, show as default).
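
For readers who prefer code to spreadsheets, the two pivot tables above can be approximated with Python's standard library. This is a minimal sketch under stated assumptions, not the process used for this report: the sample rows are invented, the helper names `year_issued` and `count_by` are hypothetical, and the `MM/DD/YYYY` date format is an assumption about how the export stores “Date Issued”. It also assumes no blank cells, whereas COUNTA skips them.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Invented sample in the shape of the cleaned sheet; values are not real data.
SAMPLE_CSV = """Doing Business as Name,License Description,Date Issued
Example Cafe,Retail Food Establishment,03/15/2019
Example Salon,Limited Business License,07/01/2019
Example Market,Retail Food Establishment,11/20/2021
"""


def year_issued(date_text, fmt="%m/%d/%Y"):
    """Equivalent of the spreadsheet YEAR() step (date format is assumed)."""
    return datetime.strptime(date_text, fmt).year


def count_by(rows, key_func):
    """Count rows per key and return the counts sorted by key, ascending."""
    counts = Counter(key_func(row) for row in rows)
    return dict(sorted(counts.items()))


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Pivot table 1: number of licenses issued each year.
licenses_per_year = count_by(rows, lambda r: year_issued(r["Date Issued"]))

# Pivot table 2: number of licenses by license description.
licenses_per_type = count_by(rows, lambda r: r["License Description"])
```

To run this against the real export, replace `io.StringIO(SAMPLE_CSV)` with an opened CSV file and adjust `fmt` if the dates are stored differently.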


Data Dictionary

We have different data dictionaries for different datasets; please see the first sheet for each dataset.


CMAP Data_language spoken

Chamber_Ward11 Business