Community-Engaged Data Science

Spring 2024

College of the Atlantic

Real-world applications of data science can serve the public good and help students develop transferable skills. In this project-based course students will work collaboratively with community partners to collect, visualize, analyze, and communicate data for a term-long data science project. The projects identified by community partners may have long histories or be in their infancy and each will have different data needs and goals.

This course emphasizes putting knowledge into practice, including going beyond individual fields of study to solve real world problems and understand community partner needs. Students will build skills in project management, using agile methodologies and frequent meetings with community partners designed to foster co-development and iterative and incremental project delivery. Students will also develop and improve their communication of data analysis projects and build skills in reproducible analysis and collaboration using modern programming tools and techniques. In addition to developing their statistical and programming skills, students will build qualities valued by employers, such as teamwork, reproducible analysis, effective communication, independent thinking, and problem solving.

This course is intended to appeal to a wide range of students and create an opportunity for students to do collaborative and advanced project work. Through the course, students will be exposed to a range of scientific ways of knowing and doing, helping students to cultivate an interdisciplinary perspective on what data science can do. Evaluation will be based on contributions to the process and final product of their team’s term-long community project, participation in skills workshops, and progress on their personal development plan.

Timetable

Afternoons: Project Work. Morning: Optional Tutorials.

Monday 1-4

Sprint planning and project work

Wednesday

Help Session Drop In

Thursday 1-4

Project work, workshop and/or meetings with community partners

Term Schedule

Class Activities

Sessions are listed by week with date, time, and activity; bulleted items are the readings and assignments for the session above them.

Week One
Monday 4/1 1-4: Welcome to Community-Engaged Data Science
Wednesday 4/3 4-6: Drop-in help session with Shea
Thursday 4/4 1-4: Complexity and Project Planning Workshop - Kevin Callahan
  • Watch the pre-workshop video

Week Two
Wednesday 4/10 1-2:30: Intro to Agile and Project Management
Wednesday 4/10 4-6: Drop-in help session with Shea
Thursday 4/11 1-4: Project Time (Snow date for Complexity Workshop)
  • PDP plan due
  • Team Contract due

Week Three
Monday 4/15 1-4: Sprint Planning and Project Time
Wednesday 4/17 4-6: Drop-in help session with Shea
Thursday 4/18 1-4: Partner Check-in, Sprint Retrospective and Planning

Week Four
Monday 4/22 1-4: Survey Analysis Workshop with Sahir Advani
  • Submit the first iteration of your project on Google Classroom
  • Submit your Sprint Retrospective on Google Classroom
Wednesday 4/24 4-6: Drop-in help session with Shea
Thursday 4/25 1-4: Reproducible Reporting Using Quarto with Jadey Ryan

Week Five
Monday 4/29 1-4: Project Time
Wednesday 5/1 4-6: Drop-in help session with Shea
Thursday 5/2 1-4: Partner Check-in, Sprint Retrospective and Planning

Week Six
Monday 5/6 1-4: Project Time
  • Submit the second iteration of your project on Google Classroom
  • Submit your Sprint Retrospective on Google Classroom
Wednesday 5/8 4-6: Project Time
Thursday 5/9 1-4: Project Time and Code Review

Week Seven
Monday 5/13 1-4: Project Time
Wednesday 5/15 4-6: Drop-in help session with Shea
Thursday 5/16 1-4: Partner Check-in, Sprint Retrospective and Planning
Saturday 5/18 All Day: Field Trip to Penobscot River

Week Eight
Monday 5/20 1-4: Sprint Planning and Project Time
  • Submit the third iteration of your project on Google Classroom
  • Submit your Sprint Retrospective on Google Classroom
Wednesday 5/22 4-6: Project Time
Thursday 5/23 1-4: Creating interactive applications with RShiny with Johnathan Evanilla

Week Nine
Monday 5/27 1-4: Sprint Planning and Project Time
Wednesday 5/29 4-6: Drop-in help session with Shea
Thursday 5/30 1-4: Final Presentations

Week Ten
Monday 6/3 1-4: Project Time
Wednesday 6/5 4-6: Drop-in help session with Shea
Thursday 6/6 1-4: Developing your data science cover letter and CV with Jeffry Neuhouser

Welcome to Community-Engaged Data Science!

Projects

Developing a Comprehensive Monitoring Program for Community-wide Energy Efficiency Upgrades

Maine is the most heating-oil dependent state in the country, with 58% of homes using fuel oil as the primary source of heat (EIA). …

Maine River Herring Network

River herring and shad (RH/S) are monitored by different entities coast wide, and it is challenging to get a big-picture perspective on …

Maritime life in Frenchman Bay in the late 19th century

Who was Capt. Freeland Bunker and what changes did he experience in Frenchman Bay from the 1870s to early 1900s?

Monitoring Water Quality in the Penobscot River

The Pαnawαhpsekewi (people of where the rocks widen), also known as the Penobscot Nation, are deeply connected to their relative, the …

Syllabus

Course Description

The course content is organized in four units:

  • Week 1 to 2 - Project Planning and Scoping: This unit is an introduction to the projects and toolkit of the course. Students will define and identify key project milestones, brainstorm ideas and approaches to solving the problem, become familiar with project management techniques and plan the project tasks.
  • Week 3 to 8 - Exploring and analyzing the data: This unit focuses on exploring the data and getting iterative feedback from partners.
  • Week 9 - Final Project Presentations: After completing the analysis phase, students present their findings and next steps to partners.
  • Week 10 - Project Hand Over: Students write up the project in a final report and organize code and repositories to hand off the project.

Learning Objectives

This course is designed as a community learning journey. Together, we will:

  • Use research skills to integrate and apply multiple forms of knowledge to an issue of interest to a community partner.
  • Demonstrate community leadership skills as a collaborator who shares strengths, works on weaknesses, and contributes to a broader shared understanding.
  • Gain experience in data collection, wrangling, visualization, exploratory data analysis, predictive modeling and effective oral and written communication of results to audiences beyond the College of the Atlantic.
  • Build skills valued by employers including teamwork, reproducible analysis, effective communication, independent thinking, and problem solving.
  • Identify areas for personal growth and develop and implement a personal development plan to work towards individual and collaborative goals related to the project.

It is also my hope that in this course you:

  • Cultivate an interdisciplinary understanding of what it means to “do research”, including embracing multiple ways of knowing and doing beyond those of your discipline and the academic environment.
  • Develop an appreciation for reproducibility, transparency, accessibility and inclusivity in data collection, analysis, and communication.
  • Build knowledge and skills in data science to tackle questions that are important to you.
  • Engage and reflect on contemporary issues in environmental and social justice related to your digital world, community and positionality.

Course Model

The central goal of the course is to work with community partners on a jointly-defined research project of benefit to the wider Maine community. While there are specific deliverables for each project (bi-weekly oral presentations, a final written report, and anything else agreed upon with the partner), this course also requires deep listening, respectful and thoughtful feedback, and collaborative engagement within the classroom. You will be evaluated both on the final products you produce and on the process by which you get there: your individual contributions to the project, to group dynamics, and to the collaborative environment of mutual support within the classroom.

Grades

A growing body of research indicates that traditional approaches to grading fail to produce the sorts of meaningful learning desired by both teachers and students. Such approaches often reinforce inequitable power dynamics between teachers and students, promote faulty reward systems that disincentivize creativity and risk-taking, and devalue important aspects of learning (including revision and feedback). Given this context, instead of a traditional approach to grading in which you do work that is evaluated solely by me, this course assumes that you ought to take ownership of and responsibility for your performance and engagement with the class. To make this happen, this course uses a “contract grading” scheme, which gives you a voice in the grading process, provides you with the agency to specify your intended course performance, and lets you share in the responsibility for evaluating whether or not you fulfilled your intended obligations. Please see the contract grading document (on Google Classroom) for a fuller explanation of this approach and how it will operate in the course.

Course Requirements:

How you choose to engage in this course—both in and out of class—is central to the success of your project and our collective learning. As such, participation constitutes a substantial part of the expectations in this course. Much of this will require individual reflection on your part, and be documented through a midterm and end-of-course peer- and self-evaluation. I will spend time talking about these expectations during class.

Group project:

Four groups of 2-3 students each will undertake projects with our community partners. Each group will determine how to structure their project; what forms of information are necessary and how to gather them; how to allocate and organize tasks; and how to produce the “deliverables” agreed upon with the community partner.

  • Project plan: In the first two weeks of the course, each project group develops a written plan in consultation with their community partner, the instructors, and their peers. The plan should include a clear statement of the particular questions you will address, how you will address them, and project milestones you will work towards. As part of your consultation with project partners, you will also develop a kanban board for the project that breaks the project down into specific tasks that will be added to throughout the project. Kanban boards are a common tool used by data science and software development teams to manage projects. You will use the kanban board to brainstorm, plan and allocate work - both temporally and among yourselves - to complete the project.

The reader of the project plan should understand why you are doing this project, what methods and approaches you will use to accomplish the project goals, and what materials, people, and organizations may be of assistance as you work on the project. The plan should indicate what the final product is for the partner, key milestones, and who the users for your final product will be as well as any additional potential efforts to disseminate the outcomes of your project. Ultimately, the reader should be convinced that you can professionally complete the work you outline in the time-frame indicated. I will give you lots of feedback on your plan and kanban board to help guide the project and give you aspects that will be reusable in your final report.
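
To make the kanban idea concrete, here is a minimal Python sketch of a board as a data structure. The column names, task titles, assignees, and the `Task` and `KanbanBoard` classes are all assumptions made for illustration; in practice your team would most likely use an existing kanban tool rather than code like this.

```python
from dataclasses import dataclass, field

# Column names are an assumption; teams may choose their own workflow stages.
COLUMNS = ("Backlog", "To Do", "In Progress", "Done")


@dataclass
class Task:
    title: str
    assignee: str = "unassigned"
    column: str = "Backlog"


@dataclass
class KanbanBoard:
    tasks: list = field(default_factory=list)

    def add(self, title, assignee="unassigned"):
        """Add a new task to the Backlog column."""
        task = Task(title, assignee)
        self.tasks.append(task)
        return task

    def move(self, title, column):
        """Move the task with the given title to another column."""
        if column not in COLUMNS:
            raise ValueError(f"unknown column: {column}")
        for task in self.tasks:
            if task.title == title:
                task.column = column
                return task
        raise KeyError(title)

    def by_column(self):
        """Group task titles by column, in board order."""
        return {c: [t.title for t in self.tasks if t.column == c] for c in COLUMNS}


# Hypothetical tasks, for illustration only.
board = KanbanBoard()
board.add("Draft questions for partner meeting", assignee="student A")
board.add("Clean survey data", assignee="student B")
board.move("Clean survey data", "In Progress")
print(board.by_column())
```

Grouping tasks by column gives the same at-a-glance view that a physical or online board provides: what is waiting, what is in progress, and who is responsible for it.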

  • Partner Questions and Notes: There will be an opportunity to meet and check in with your partners every two weeks. Ahead of each meeting you will brainstorm a list of questions for the project partners; during the meeting you will take notes on the answers and on any actions that come out of it. The notetaker role will rotate through the team, and part of your participation grade will be based on the questions you prepare and the notes you take.

  • Bi-weekly presentations and feedback: Each group will give bi-weekly informal presentations (around 15 minutes) to update the project partner on your progress to date, including key findings and accomplishments, and any roadblocks or challenges you encounter along the way. This is an opportunity to meet with project partners and get feedback and guidance to inform the final project.

  • During the presentation, groups will note what needs updating on the project and use these notes to update the tasks on the kanban board.
  • Final presentations and feedback: Each group must make a final presentation of their project to their community partner in week 9. Each group will give a “rough cut” of any core pieces of work that will go into the final presentation to the class early on in week 9, where we will give you substantial feedback ahead of your final presentation. Generous and honest feedback on your peers’ presentations is an essential and expected component of the course.

  • Final Written Report: Each group submits a final written report, including any final products and documented code, to both the community partner and the instructors. This is a primary outcome of the course and should represent top-level writing and professional standards of presentation. The format and structure of reports and the final product may vary, based on the nature of your particular project. All code must be well documented and organized in a way that the project partner can continue to build on the analysis/visualization. In addition to the final group report and final product, each individual student writes a final self- and peer-evaluation. All components – the group report, final products, and the evaluation – will be considered in grading the final written report. The group report will be sent electronically to the instructors. Each group should give their community partners electronic copies of the final report and all supporting documents, including code and references.

Note: Because the final reports from this class are put onto a website (this website!) that can be viewed by the public, you must pay particular attention to copyright issues in preparing your report. Any images you incorporate into the final report must be available in the public domain. Care must also be exercised in the usage and citing of textual material. If you have questions about whether inclusion of something in your report would infringe on copyright laws, you can refer to Bates' copyright policies.

  • Personal Development Plan and Reflections: As part of this project you will submit a plan for personal development. I will meet with each of you one-on-one during the first two weeks of class to discuss and form your project goals. Throughout the course there will be bi-weekly reflections on your progress towards these goals.

Key Course Policies:

Communication

Working collaboratively requires cultivating mutual respect and engaging in clear, honest, and thoughtful communication. My aim in this course is to create a collective learning experience animated by these values. I expect the kinds of practices in the classroom that would be expected in any professional situation where a large team and a number of sub-teams are working together with a shared mission and smaller team assignments that serve that larger mission. If you are struggling and unable to meet your commitments to the group, your partner, or the whole class — or if you have feedback for us that might help to improve your experience in the class and in your group work — please communicate these concerns to me directly in a timely manner. It is my intention to ensure that all students have a successful experience.

Course components

Sharing / reusing code

I am well aware that a huge volume of code is available on the web to solve any number of problems. Unless I explicitly tell you not to use something, the course’s policy is that you may make use of any online resources (e.g. StackOverflow), but you must explicitly cite where you obtained any code you directly use (or use as inspiration). You are welcome to discuss the problems together and ask for advice. You may share helpful code between teams, but each person on the team must be aware of what the code does and you must cite it accordingly.
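
As one illustration, a citation for reused code might look like the Python sketch below. The comment layout, the placeholder fields in angle brackets, and the `mean_or_none` function are hypothetical; this is one possible convention, not a required format.

```python
# Adapted from a StackOverflow answer (hypothetical placeholder citation):
#   Author:   <answer author>
#   URL:      <link to the answer>
#   Accessed: <date you copied the code>
# Changes made: renamed variables and added handling for empty input.
def mean_or_none(values):
    """Return the arithmetic mean of values, or None if values is empty."""
    if not values:
        return None
    return sum(values) / len(values)
```

Placing the citation directly above the borrowed code keeps the source and your changes visible to teammates and to the project partner who inherits the code.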

Academic Integrity (excerpt from Course Catalog)

By enrolling in an academic institution, a student is subscribing to common standards of academic honesty. Any cheating, plagiarism, falsifying or fabricating of data is a breach of such standards. A student must make it their responsibility to not use words or works of others without proper acknowledgement. Plagiarism is unacceptable and evidence of such activity is reported to the provost or their designee. Two violations of academic integrity are grounds for dismissal from the college. Students should request in-class discussions of such questions when complex issues of ethical scholarship arise.

Universal Learning and Learning in Community

Many of us learn in different ways. For example, you may process information by speaking and listening, so while lectures are quite helpful for you, some of the written material may be difficult to absorb. You might have difficulty following lectures, but are able to quickly assimilate written information. You may need to fidget to focus in class. You might take notes best when you can draw a concept. For some of you, speaking in class can be a stressful or daunting experience. For some of you, certain topics or themes might be so traumatic as to be disruptive to learning. The principle of Universal Design for Learning calls for our classrooms, our virtual spaces, our practices and our interactions to be designed to include as many different modes of learning as possible, and is a principle I take seriously in this class.

It is also my goal to create an inclusive classroom, which depends on community building, and which requires everyone to come to class with mutual respect, civility, and a willingness to listen to and observe others. As such the syllabus serves as a contract of some expectations between all members of the class, including myself.

If you anticipate or experience any barriers to learning in this course, please reach out to me and your student support advisor. If you have a disability, or think you may have a disability, please contact COA’s Disability Support Services, located within the Office of Student Life in Deering Commons, to develop a plan for your academic accommodations. You can find more information in the course catalog under Accommodating students with disabilities. If you have already been approved for accommodations through Disability Support Services, please let me know! We can meet 1-1 to explore concerns and potential options.

Wellbeing

I want to make sure that you learn everything you were hoping to learn from this class. If this requires flexibility, please don’t hesitate to ask.

  • You never owe me personal information about your health (mental or physical) but you’re always welcome to talk to me. If I can’t help, I likely know someone who can.

  • I want you to learn lots of things from this class, but I primarily want you to stay healthy, balanced, and grounded.

Personal Development Plan

Defining your learning goals and working towards them.

What is a Personal Development Plan?

As part of the course you will design your own personal development plan. Personal development plans (PDPs) are commonly used in industry and are typically planned and agreed upon with your manager. A PDP is about reflecting on your current skills and knowledge and thinking about the areas you want to grow in. For this course your PDP will consist of setting at least three personal goals for areas where you would like to grow in the class and putting together a plan for reaching those goals.

Creating Smart Goals

These goals should be SMART (Specific, Measurable, Achievable, Relevant, and Time-based). As far as possible you should break your goal down into realistic tasks that you can track. An example of a SMART goal might be:

Example Goal: Learn to create a shiny app

Plan (during course):

  • Work through Posit’s RShiny tutorials
  • Read the reactivity chapters in Mastering Shiny
  • Write comments in an existing RShiny app to identify what each part does.
  • Attend Shiny workshop and drop-in help sessions to work through materials.

Other goals could be:

  • Learn to use git from the command line.
  • Improve project management skills (lead sprint planning, sprint retrospectives, or meetings with partners).
  • Improve science communication writing skills.
  • Learn to create a Shiny app.
  • Learn and implement text analysis in your project.

Support

As part of your PDP you will identify readings and tutorials designed to help you work toward your goals. The R for Data Science Trello board and the course resources can help you find ideas and resources for technical skills. I can also help you identify resources during our meeting, but you should do some research in preparation.

Help

Most of you will need help at some point and I want to make sure you can identify when that is without getting too frustrated and feel comfortable seeking help.

  • Google Classroom Forum: Outside of project work sessions, please post any questions on Google Classroom. The best way to get answers about course content, technology, logistics, and policies is to post your question on the Google Classroom stream. You are encouraged to answer each other's questions there as well.

  • Help Sessions: We will be doing the majority of our coding project work in class. However, if you would like additional support, there will be one on one help sessions that you can schedule with the TA in class either as a group or individually.

  • Email: Please refrain from emailing any course content questions (those should go on Google Classroom), and only use email for questions about personal matters that may not be appropriate for the public course forum (e.g. illness, missed assignments).

  • For more general support and advice, please make use of the resources on Campus which you will find in the College course catalog. If you’re not sure where to go for help, just ask.

Resources

Course Materials:

Books:

All books for this course are freely available online as e-books and/or linked articles. There will be specific readings for each project.

For project management and reproducibility you might find the following useful:

For programming and statistics you might find the following useful:

Technology:

  • Bring a laptop to every class; we will be programming most days! You can choose whatever programming language you feel most comfortable using. However, some project partners may request a particular language based on the tool that they are most comfortable using and where they will continue the analysis. You will each have access to a Posit Cloud account where you can analyze your data. If you have any difficulties with accessing a computer, let me know.

Tools

Cheatsheets

People

Course organisers

Acknowledgements

This course was supported by a Henry David Thoreau Faculty Grant awarded to Laurie Baker. This course takes inspiration from Data Study Groups at the Alan Turing Institute, project management at the Data Science Campus, and a series of workshops on Data Science for Social Good programs sponsored by the eScience Institute at the University of Washington.

This course draws on several resources including the Turing Way and includes featured artwork by @AllisonHorst and Scriberia.

The course structure and grading system draw heavily upon course organization and policies in ENVR 417, Community Engaged Research and PIC MATH, and the work of Bates faculty Ethan Miller, Misty Beck, Francis Eanes, and Adriana Salerno, as well as scholars beyond Bates.

The website structure is inspired by the 2020 Data Science course website taught by Mine Çetinkaya-Rundel at the University of Edinburgh.