Hacking Data to Serve a Community

What in the world is a data hack-a-thon? A traditional “hack-a-thon” involves several programmers coming together and writing code to solve a problem. A data hack-a-thon, fundamentally, entails a group of data professionals coming together (with a lot of pizza!) to solve a problem for a local non-profit organization, using data and analysis rather than coding.

In anticipation of the event, we needed to identify a beneficiary, understand their problem, and gather the data sets needed. SEI-Cincinnati partnered with St. Aloysius, an organization and school offering psychiatric services to community youth, to tackle its student “no-show” rate, which ran almost as high as 30%. Every missed appointment was a slot that could not be offered to someone else, so this no-show rate limited how much of the community they could serve.

During the hack-a-thon, we broke up into 5 groups made up of SEI and St. Aloysius employees and community data professionals. Each group asked “Why?” and, to answer the question, used Tableau, a common BI reporting tool, to analyze private and public data prepared in advance. We found some data quality issues and had to learn quickly about how St. Aloysius operates. Data quality issues are common in data analysis and can sometimes take up most (90%+) of an analyst’s time to resolve. One example we faced was correct categorization: making sure that each visit was tagged with the appropriate category, and determining how to handle visits from 3+ years ago, before the category was always captured. A second example was age vs. birthday: a student record sometimes lists both, but the stored age is only accurate as of the date the record was created or updated. We decided that, for any analysis involving student age, it is better to derive the age at the time of each visit from the birthday.
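To make the age-versus-birthday point concrete, here is a minimal sketch of deriving a student's age at the time of a visit from the birthday. It is only an illustration, not the calculation we built in Tableau; the function name age_at_visit and the sample dates are hypothetical.

```python
from datetime import date


def age_at_visit(birthday: date, visit_date: date) -> int:
    """Return the student's age in whole years on the day of the visit.

    Hypothetical helper for illustration; not taken from the actual analysis.
    """
    years = visit_date.year - birthday.year
    # If the birthday has not yet occurred in the visit year, the student
    # is still one year younger than the simple year difference suggests.
    if (visit_date.month, visit_date.day) < (birthday.month, birthday.day):
        years -= 1
    return years


# Made-up dates: the same student is 12 the day before the birthday
# and 13 on the birthday itself.
print(age_at_visit(date(2005, 6, 15), date(2018, 6, 14)))
print(age_at_visit(date(2005, 6, 15), date(2018, 6, 15)))
```

Computing the age per visit this way means a student who has been seen over several years is counted at the correct age for each visit, rather than at whatever age happened to be stored on the record.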

As the evening progressed, the teams covered a variety of analysis topics, including age and gender, different program types, geographic areas of the students, and teacher impact. Some educational programs have a vastly different no-show rate than others, mainly between group and one-on-one sessions (the former having a worse no-show rate). In addition, students who live in the neighborhoods directly around St. Aloysius tended to be more likely to no-show than students from other neighborhoods, which was surprising. Everyone agreed on how meaningful and enriching the experience was. The director at St. Aloysius became tearful when talking about how much our efforts meant to her and the youth they serve. Over twenty St. Aloysius employees attended the evening event, showing just how much they cared about and valued the group’s efforts.
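The comparisons above all come down to the same calculation: group the visits by some attribute and divide the missed visits by the total in each group. The sketch below shows that idea in plain Python. The field names (session_type, no_show) and the sample records are invented for illustration and are not St. Aloysius data; the actual analysis was done in Tableau.

```python
from collections import defaultdict


def no_show_rate_by(visits, key):
    """Return {group value: share of visits that were no-shows}."""
    totals = defaultdict(int)
    misses = defaultdict(int)
    for visit in visits:
        group = visit[key]
        totals[group] += 1
        if visit["no_show"]:
            misses[group] += 1
    return {group: misses[group] / totals[group] for group in totals}


# Invented sample records, for illustration only.
sample_visits = [
    {"session_type": "group", "no_show": True},
    {"session_type": "group", "no_show": False},
    {"session_type": "group", "no_show": True},
    {"session_type": "group", "no_show": False},
    {"session_type": "one_on_one", "no_show": False},
    {"session_type": "one_on_one", "no_show": True},
    {"session_type": "one_on_one", "no_show": False},
    {"session_type": "one_on_one", "no_show": False},
]

rates = no_show_rate_by(sample_visits, "session_type")
for session_type, rate in sorted(rates.items()):
    print(f"{session_type}: {rate:.0%}")
```

Swapping the key for a neighborhood or age-band field would give the other breakdowns the teams explored, which is why consistent categorization of each visit mattered so much.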

After the event, we consolidated the analysis findings into a report for the St. Aloysius staff and board of directors. The St. Aloysius employees could use this report to brainstorm measures to improve the no-show rate. The main takeaway for the St. Aloysius team was their newfound ability to be more self-sufficient with their data and analysis, both in understanding the importance of capturing quality data and in knowing how to analyze it effectively. The employees were also empowered and excited to brainstorm and implement improvements based on data, with the ability to report on the quantifiable results. SEI helped set them up with a good reporting environment and tips to maintain data quality for accurate reporting. We also used this study and data to expand our practical knowledge of machine learning by creating a featured data set to feed into a neural network and develop a prediction model for clients at risk of no-shows.

As organizations, both public and private, for-profit and non-profit, realize the value in their data to help drive their business, SEI professionals are excited and passionate about diving in to help, either professionally or in a volunteer capacity. Tools and technology are becoming more generally available and affordable, and there are rapid innovations in machine learning and analytics to support near-real-time decision-making. The foundational data concepts are still important though – data accessibility and quality lead to better analysis and prediction.

Data hack-a-thons are growing in popularity city-wide, as other Cincinnati-based corporations and organizations work to sponsor some of their own, with case studies such as crime rates and public transportation routes for schools. Hopefully, this is a concept that continues to advance and help us serve our community!

For more information, check us out on the local news!

St. Aloysius partners with local firm to determine why they have so many no-shows – WCPO Cincinnati, OH

Lauren McDonald
