There are many accomplishments one can be proud of: earning a good grade, scoring a goal in a sport, or landing a job one wanted. My own list is extensive, but two achievements stand out as my proudest: creating this non-profit, “Let’s Do Our Part”, which focuses on raising global awareness of our social responsibility and making an impact around it, and writing two research papers.
We usually think of research papers as something that comes after college, or as the senior thesis written just before heading into the workforce. Getting the opportunity to write one through a summer course was stressful, but it also made me really proud to have done something so large as a rising junior in high school. My research papers focused on two topics: water potability and air quality. In the course, a group of five to six students, including me, worked with Prof. Goldsztein of Georgia Tech to learn machine learning and how to apply machine learning and data science to a broader topic of our interest. I was also able to work with Ms. Davida Kollmar from New York University, who assisted me with programming the models.
When we were asked what we were going to apply machine learning to for our final project, i.e. the research paper, I was surprised to hear that many students already knew what they wanted to do. Some said medical topics, one said fantasy football, and one said AI, but I was still not sure. What I did notice was that most of my interests centered on the environment: how people deal with change within it and how we use it for ourselves. So I started looking into datasets on Kaggle, a platform that hosts a broad variety of datasets that anyone can easily use. There I found a water potability dataset focused on the impact of pollutants on potability, or the suitability of a water source for use by people. Because such data captures many water samples and the complex relationships between the chemical and physical parameters that affect potability, machine learning techniques hold great potential for making the analysis more efficient and accurate. Using Python, specifically the pandas library, I treated the task as a binary classification problem. Using artificial neural networks, I was able to determine the significant parameters that affected the potability of water sources. Out of the nine features, or parameters, it came down to Hardness, Chloramines, and Organic Carbon.
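To illustrate the kind of analysis described above, here is a minimal sketch of training a small neural network on a binary classification problem and then ranking its input features. It is not the code from the paper: the data is synthetic, the feature names are hypothetical, and permutation importance (shuffling one feature and measuring the drop in accuracy) is only one common way to rank features, assumed here for illustration.

```python
import math
import random

FEATURES = ["feature_a", "feature_b", "feature_c"]  # hypothetical names

def make_synthetic(n, rng):
    """Synthetic stand-in data: only the first two features affect the label."""
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in FEATURES]
        X.append(x)
        y.append(1 if x[0] + 0.5 * x[1] > 0 else 0)
    return X, y

def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

class TinyNet:
    """One hidden tanh layer and a sigmoid output, trained with plain SGD."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, p

    def train(self, X, y, epochs=30, lr=0.1):
        for _ in range(epochs):
            for x, t in zip(X, y):
                h, p = self.forward(x)
                d_out = p - t  # cross-entropy gradient at the output unit
                for j in range(len(self.w2)):
                    # Backpropagate through tanh before updating w2[j].
                    d_h = d_out * self.w2[j] * (1 - h[j] ** 2)
                    self.w2[j] -= lr * d_out * h[j]
                    self.b1[j] -= lr * d_h
                    for i in range(len(x)):
                        self.w1[j][i] -= lr * d_h * x[i]
                self.b2 -= lr * d_out

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0

def accuracy(net, X, y):
    return sum(net.predict(x) == t for x, t in zip(X, y)) / len(y)

def permutation_importance(net, X, y, rng):
    """Accuracy drop when each feature column is shuffled on its own."""
    base = accuracy(net, X, y)
    drops = []
    for i in range(len(X[0])):
        col = [x[i] for x in X]
        rng.shuffle(col)
        shuffled = [x[:i] + [v] + x[i + 1:] for x, v in zip(X, col)]
        drops.append(base - accuracy(net, shuffled, y))
    return base, drops

rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_synthetic(400, rng)
net = TinyNet(len(FEATURES), 4, rng)
net.train(X, y)
base_acc, drops = permutation_importance(net, X, y, rng)

print(f"accuracy: {base_acc:.2f}")
for name, drop in zip(FEATURES, drops):
    print(f"{name}: accuracy drop {drop:.2f}")
```

In this toy setup, a feature whose shuffling barely changes accuracy is one the model does not rely on. With a real dataset, the rows would typically be loaded with pandas and the accuracy measured on held-out data rather than on the training set.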
The next summer, before my last year of high school, I was excited to get another opportunity to work with Prof. Goldsztein, on another paper focusing on machine learning. This time I learned a little more machine learning and data science, which helped me realize how it could be applied to other datasets, such as an air quality dataset. The research project focused on applying machine learning techniques to assess air quality. I analyzed a dataset from Kaggle on air pollution collected from urban cities in India. I implemented Python-based analytics in Google Colaboratory and was able to determine the specific factors that impact air quality, as well as the prediction accuracy of the models.
I am amazed by how much machine learning and data science could help in the future, and I am excited to see how far my efforts can take me.
