NLP & AI FOR CIVIC TECH

Block Party: A Platform to Explore NYC Community Board Meetings

Making local policy information more accessible and byte-sized

Sarah June Sachs
5 min read · Mar 1, 2021
Explore how civic technology can bring together Community Board meetings in New York City.

Block Party seeks to bridge the information gap between Community Boards and the people they represent. Community Boards are key to hyperlocal democracy: what is discussed, prioritized, and passed at their meetings can affect your day-to-day life. Topics include education, transportation, sanitation, public safety, economic development, housing, parks, and how your neighborhood is coping with Covid-19.

But have you ever attended a Community Board meeting?

Maybe you know which Community District you live in but don’t have the time (or appetite) to attend several meetings that run 1–3 hours each. There should be a way to explore the topics and issues at the forefront of these neighborhood meetings across the city. This is why we built Block Party: our goal is to make local policy information accessible and byte-sized.

The purpose of Block Party is to make local policy information accessible and byte-sized.

Community Boards have vast responsibilities: they advocate for new initiatives, advise on permits, approve land-use and zoning policy, and allocate budgets. Since the Covid-19 lockdown, their meetings have also become accessible in new ways.

Due to the city-wide shutdown, civic engagement is more virtual. Community Boards have adapted and turned to Zoom, Webex, Facebook, and YouTube to host their meetings live and publish the recordings online. Aligned with the principles of the New York Open Meetings Law, these government bodies provide public access to what was said at their meetings.

Out of the city’s 59 Community Boards, we found that 31 host their meetings on YouTube. Beyond the increase in attendance levels, we saw a new opportunity: processing each video’s closed-caption text into a full transcript in order to share the meeting conversation with a wider audience.

53% of NYC Community Boards (31 of 59) host their meetings on YouTube

Over the past few months, we developed a pipeline to transform the raw text from the YouTube recordings into a full transcript, meeting highlights, and topic classification. With open-source tools in Python, we leverage Natural Language Processing (NLP) and Artificial Intelligence (AI) to create a structured dataset of meeting information that can help explore the conversations throughout New York City at a local level.

To date, we have more than 1,300 meeting transcripts available in our public archive. They have been collected since the start of the pandemic from every district with a YouTube channel and represent all five boroughs. We process and add about 20–30 meetings each week.

We automatically tag each meeting with a topic category. Our taxonomy framework was inspired by the priorities listed in the NYC Department of City Planning’s Community District Profiles.

A Community Board meeting can be tagged with the following topics:

Human Services, Employment, Youth, Education, Health, Safety, Zoning, Landmarks, Housing, Commercial Development, Land Use, Quality of Life, Transportation, Infrastructure, Parks, Waterfront, Budget, Equity, Arts and Culture, Technology, Police, Utilities, Elections, Libraries
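One simple way to picture automatic tagging is keyword matching. The sketch below is an illustration under that assumption, not necessarily the classifier used in the pipeline; the keyword lists, the `min_hits` threshold, and the `tag_topics` name are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical keyword lists for three of the topics above; illustrative only.
TOPIC_KEYWORDS = {
    "Transportation": ["bus", "subway", "bike lane", "traffic"],
    "Housing": ["housing", "rent", "tenant", "affordable"],
    "Parks": ["park", "playground", "garden"],
}

def tag_topics(transcript, min_hits=2):
    """Return topics whose keywords appear at least min_hits times, most frequent first."""
    text = transcript.lower()
    hits = Counter()
    for topic, keywords in TOPIC_KEYWORDS.items():
        for kw in keywords:
            # \b keeps "park" from matching inside "parking"; "s?" allows simple plurals.
            pattern = r"\b" + re.escape(kw) + r"s?\b"
            hits[topic] += len(re.findall(pattern, text))
    return [topic for topic, n in hits.most_common() if n >= min_hits]
```

A meeting can receive several tags this way, since every topic that clears the threshold is returned.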

If your Community Board shares its meetings on YouTube and has closed captions enabled, we can share the meeting highlights and full transcript. Filter by date, location, or topic to view the meeting conversation. You can also subscribe to your Community Board so it’s easier to stay in the loop. Each week, we send an auto-generated email with quotes from the most recent meeting.

Because people might not know which Community Board district they live in, we visualize the GIS Community Districts from NYC Open Data to provide a map of each Community Board. Our web application prompts a user to click on a specific Community District or search by address to select their district.
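Once an address has been turned into a longitude and latitude, finding its district comes down to a point-in-polygon test. The sketch below uses the classic ray-casting method on simple polygons. It is an illustration only: the rectangles are made-up stand-ins for real district boundaries, the function names are hypothetical, and a production app would more likely rely on a geocoder and a GIS library.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) is inside polygon, a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def find_district(lon, lat, districts):
    """Return the name of the first district containing the point, or None."""
    for name, polygon in districts.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None

# Made-up rectangles standing in for real district boundaries.
districts = {
    "District A": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "District B": [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)],
}
```

Calling `find_district(1.0, 1.0, districts)` picks the first rectangle, and a point outside both returns `None`.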

We continue to improve our process and update our database with each week of meetings. Our collection of transcript data can also be analyzed to find signals across NYC Community Board meetings. At Open Data Week, a partnership led by BetaNYC and the Mayor’s Office of Data Analytics, we will present a case study on the transportation trends we found.

We are looking to connect with data enthusiasts, policymakers, researchers, and the civic-tech community. We are open to sharing insights found in Community Board meetings and to brainstorming how this data can be used further for civic engagement, local democracy, and community building.

In the rest of this post, we will provide more information about how we built the tool.

Transcript Generation Process

Get Transcript

First, we gather YouTube’s speech-to-text transcription for every available meeting recording on each NYC Community Board’s channel.

Format Text

The raw text from YouTube is just a list of phrases and a duration timestamp. It lacks functional grammar, such as sentence structure, punctuation, and capitalization of proper nouns. In order to read like a transcript, we transform the text into sentences.
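To make the shape of this step concrete, here is a minimal sketch that turns caption fragments into rough sentences. It assumes each fragment is a dict with `text`, `start`, and `duration` keys, and it uses a pause between fragments as a stand-in for a sentence boundary. Both the data shape and the pause heuristic are assumptions for illustration; restoring real punctuation generally takes a trained model.

```python
def fragments_to_sentences(fragments, pause=1.0):
    """Join caption fragments, starting a new sentence after a silence longer than `pause` seconds."""
    groups, current, prev_end = [], [], None
    for frag in fragments:
        text = frag["text"].strip()
        if not text:
            continue
        # A long gap since the previous fragment ended is treated as a sentence boundary.
        if current and prev_end is not None and frag["start"] - prev_end > pause:
            groups.append(current)
            current = []
        current.append(text)
        prev_end = frag["start"] + frag["duration"]
    if current:
        groups.append(current)
    # Capitalize the first letter of each group and end it with a period.
    sentences = []
    for words in groups:
        joined = " ".join(words)
        sentences.append(joined[0].upper() + joined[1:] + ".")
    return sentences

# Made-up fragments in the assumed shape.
captions = [
    {"text": "good evening everyone", "start": 0.0, "duration": 2.0},
    {"text": "welcome to the meeting", "start": 2.1, "duration": 2.0},
    {"text": "our first item is the budget", "start": 6.0, "duration": 3.0},
]
```

With the sample above, the first two fragments merge into one sentence and the third, which follows a longer pause, becomes its own.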

We do not alter the recorded dialog in any way, outside of fixing spelling errors and removing the words ‘um’ or ‘uh’ to improve the overall flow and readability.

We show how many times we identified and removed the words “um” and “uh” from the full transcript.
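Removing and counting filler words can be done in one pass with a regular expression. This is a minimal sketch of the idea; the `remove_fillers` name and the choice to also drop a trailing comma are assumptions.

```python
import re

# Matches "um" or "uh" as whole words, plus an optional trailing comma and any following whitespace.
FILLER = re.compile(r"\b(?:um|uh)\b,?\s*", flags=re.IGNORECASE)

def remove_fillers(text):
    """Return (cleaned_text, number_of_fillers_removed)."""
    cleaned, count = FILLER.subn("", text)
    return cleaned.strip(), count
```

The word boundaries matter here: without them, words such as “umbrella” or “summit” would be damaged.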

Interestingly enough, YouTube is not yet familiar with the word “Covid-19” and misspells it almost every time it is transcribed. Some notable spelling errors we have added to our pipeline include “cova da 19”, “kobit”, “cobid”, and “coca-19”. We search and replace these mishaps with the string “Covid-19”.
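A search-and-replace like this can be sketched with a single compiled pattern built from the list of known misspellings. The variants below are the four quoted above; the function name and the handling of a stray trailing “19” are assumptions added for the example.

```python
import re

# Misspellings quoted in this post; a real list would grow over time.
COVID_VARIANTS = ["cova da 19", "kobit", "cobid", "coca-19"]

# Longest variants first. An optional trailing "19" is absorbed (an assumption of this
# sketch) so that "cobid 19" becomes "Covid-19" rather than "Covid-19 19".
_alternatives = "|".join(
    re.escape(v) for v in sorted(COVID_VARIANTS, key=len, reverse=True)
)
COVID_PATTERN = re.compile(
    r"\b(?:" + _alternatives + r")\b(?:[\s-]?19\b)?", flags=re.IGNORECASE
)

def normalize_covid(text):
    """Replace known caption misspellings with the string "Covid-19"."""
    return COVID_PATTERN.sub("Covid-19", text)
```

Keeping the variants in a plain list makes it easy to add a new misspelling when one shows up in a transcript.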

Generate Summary

Next, we generate a high-level summary by extracting key sentences from the full transcript.
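Extractive summarization keeps original sentences rather than writing new ones. A minimal sketch, assuming a simple word-frequency score, is shown below. The scoring rule, the stopword list, and the `summarize` name are illustrative assumptions and are not necessarily what the pipeline uses.

```python
import re
from collections import Counter

# A tiny stopword list for the example; real lists are much longer.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "on", "is",
             "are", "we", "for", "that", "this", "it", "be", "will", "with"}

def summarize(sentences, k=2):
    """Pick the k highest-scoring sentences and return them in their original order."""
    tokenized = [
        [w for w in re.findall(r"[a-z0-9'-]+", s.lower()) if w not in STOPWORDS]
        for s in sentences
    ]
    freq = Counter(w for words in tokenized for w in words)
    # Score = average frequency of a sentence's content words, so long sentences
    # are not favored automatically.
    scores = [
        sum(freq[w] for w in words) / len(words) if words else 0.0
        for words in tokenized
    ]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Sentences that repeat the meeting's most common content words rise to the top, while short pleasantries score low.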

Share Meeting

Lastly, we process the summary, full transcript, and meeting metadata into a structured database that feeds our front-end web application and weekly email delivery. We host all of the transcript data on our website.
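As a rough picture of what one structured record might look like, here is a sketch using a dataclass serialized to JSON. Every field name, the sample values, and the use of JSON are hypothetical; the actual database schema is not described in this post.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class MeetingRecord:
    # All field names are hypothetical.
    district: str
    meeting_date: str          # ISO date, e.g. "2021-02-23"
    video_url: str
    topics: list = field(default_factory=list)
    summary: list = field(default_factory=list)   # key sentences
    transcript: str = ""

def to_json(record):
    """Serialize one meeting record for storage or for the front end."""
    return json.dumps(asdict(record), indent=2)

# Placeholder values only.
example = MeetingRecord(
    district="Example District 1",
    meeting_date="2021-02-23",
    video_url="https://example.com/video",
    topics=["Transportation"],
    summary=["The bike lane proposal passed."],
)
```

A single record like this could feed both a web page and a weekly email, since each needs the same summary and metadata.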

We hope you found this post helpful — get in touch if you’d like to continue the conversation.

Please subscribe to your Community Board or follow us on Twitter, where we highlight and share timely quotes from meeting transcripts.
