Data Science Ethics

University of Michigan via Coursera

Go to Course: https://www.coursera.org/learn/data-science-ethics

Introduction

### Course Review: Data Science Ethics on Coursera

In an era where data is heralded as the new gold, understanding the ethical implications of data science is more crucial than ever. The **Data Science Ethics** course on Coursera aims to equip learners with the necessary framework to analyze and navigate the complex ethical landscape surrounding data privacy, consumer control, and the larger societal impacts of data analytics.

#### Overview of the Course

As the digital landscape evolves, so too do the concerns regarding privacy, data breaches, and the ethical utilization of consumer information. This course delves deep into these issues, allowing participants to explore the consequences of data collection and management, while uncovering the principles of fairness, accountability, and transparency.

#### Course Syllabus Breakdown

1. **What are Ethics?** The course begins with a foundational understanding of utilitarian ethics. This module emphasizes achieving a collective agreement on moral intuitions, laying the groundwork for the ethical discussions that follow. It's crucial for students to approach these concepts with a unified mindset, ensuring productive discussions.
2. **History, Concept of Informed Consent** A historical perspective highlights the trials and tribulations of early human subject research, underlining the importance of informed consent. The course effectively debates the challenges faced with retrospective studies and the ethics of consent in the digital age.
3. **Data Ownership** In this module, the conversation shifts to the intricate question of data ownership. It engages with personal data rights, providing relatable examples such as online content ownership and the limits of data recording and usage. This topic encourages critical thinking about who truly owns the data we generate.
4. **Privacy** Understanding privacy as a fundamental human need is central to this module. It examines evolving societal values related to privacy, particularly among younger generations, and the impact of these shifts. The comparison of "data" and "metadata" elevates the discourse on how our information is utilized.
5. **Anonymity** Anonymity is another essential aspect of ethical data science explored in-depth in this course. Through the examination of blockchain and Bitcoin, students gain insights into both the possibilities and limitations of anonymous transactions.
6. **Data Validity** This module addresses a common pitfall in data science: the misuse of data leading to flawed conclusions. It broadens understanding of sampling techniques, emphasizing the need for representative samples in analysis, a must-know for aspiring data scientists.
7. **Algorithmic Fairness** One of the most pressing issues in data ethics, algorithmic bias, is tackled here. This module brings to light how human biases can infiltrate algorithmic processes, urging students to consider the implications of their data practices.
8. **Societal Consequences** In Module 8, the course takes a macro view, discussing the societal ramifications of data science. It examines challenges such as information asymmetry and the concept of ossification in algorithmic methods, stressing the importance of addressing systemic inequities.
9. **Code of Ethics** The course culminates in a synthesis of all discussed topics, presenting a two-point code of ethics for data practitioners. This practical guideline serves as a valuable takeaway for applying ethical considerations in real-world contexts.

#### Recommendation

The **Data Science Ethics** course is a comprehensive resource for anyone interested in the nuances of ethical data practices. It's suitable for data scientists, business analysts, and anyone interacting with consumer data in the digital age. The course's structure, combining theoretical knowledge and practical insights, encourages active engagement and critical thinking.

Whether you are a seasoned professional or a newcomer to data science, this course will illuminate the essential ethical considerations of working with big data. By the end, you will not only be better equipped to confront ethical dilemmas but also contribute positively to the discourse surrounding data responsibility and integrity. Enroll in **Data Science Ethics** on Coursera today to empower yourself with the knowledge that every data scientist needs in our data-driven world.

Syllabus

What are Ethics?

Module 1 of this course establishes a basic foundation in the notion of simple utilitarian ethics we use for this course. The lecture material and the quiz questions are designed to get most people to come to an agreement about right and wrong, using the utilitarian framework taught here. If you bring your own moral sense to bear, or think hard about possible counter-arguments, it is likely that you can arrive at a different conclusion. But that discussion is not what this course is about. So resist that temptation, so that we can jointly lay a common foundation for the rest of this course.

History, Concept of Informed Consent

Early experiments on human subjects were conducted by scientists intent on advancing medicine for the benefit of all humanity, often with disregard for the welfare of the individual human subjects. Often these were performed by white scientists on black subjects. In this module we will talk about the laws that govern the Principle of Informed Consent. We will also discuss why informed consent doesn’t work well for retrospective studies, or for the customers of electronic businesses.

Data Ownership

Who owns data about you? We'll explore that question in this module. A few examples we'll look at include copyrights for biographies, ownership of photos posted online, Yelp, Trip Advisor, public data capture, and data sale. We'll also explore the limits on recording and use of data.

Privacy

Privacy is a basic human need. Privacy means the ability to control information about yourself, not necessarily the ability to hide things. We have seen the rise of different value systems with regard to privacy. Kids today are more likely to share personal information on social media, for example. But while values are changing, this doesn’t remove the fundamental need to be able to control personal information. In this module we'll examine the relationship between the services we are provided and the data we provide in exchange: for example, the location for a cell phone. We'll also compare and contrast "data" against "metadata".

Anonymity

Certain transactions can be performed anonymously. But many cannot, including those that involve physical delivery of a product. Two examples related to anonymous transactions we'll look at are "blockchains" and "Bitcoin". We'll also look at some of the drawbacks that come with anonymity.

Data Validity

Data validity is not a new concern. All too often, we see the inappropriate use of Data Science methods leading to erroneous conclusions. This module points out common errors, in language suited for a student with limited exposure to statistics. We'll focus on the notion of a representative sample: opinionated customers, for example, are not necessarily representative of all customers.
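The point about opinionated customers can be sketched with a few lines of Python. The numbers below are invented purely for illustration and are not taken from the course: they assume that customers with strong opinions, especially unhappy ones, are more likely to leave a review than customers in the middle.

```python
# Hypothetical data, invented for illustration only.
# score (1-5) -> (number of customers, number who chose to leave a review)
CUSTOMERS = {
    1: (100, 80),
    2: (200, 60),
    3: (400, 40),
    4: (200, 40),
    5: (100, 50),
}


def weighted_mean(counts):
    """Mean score given a mapping of score -> number of people."""
    total = sum(counts.values())
    return sum(score * n for score, n in counts.items()) / total


# Split the table into the full population and the self-selected reviewers.
everyone = {score: n for score, (n, _) in CUSTOMERS.items()}
reviewers = {score: r for score, (_, r) in CUSTOMERS.items()}

population_mean = weighted_mean(everyone)
reviewer_mean = weighted_mean(reviewers)

print(f"all customers:  {population_mean:.2f}")  # 3.00
print(f"reviewers only: {reviewer_mean:.2f}")    # 2.70
```

Under these assumed review rates, an analysis based only on submitted reviews would understate average satisfaction, even though every individual review is genuine.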

Algorithmic Fairness

What could be fairer than a data-driven analysis? Surely the dumb computer cannot harbor prejudice or stereotypes. While the analysis technique may indeed be completely neutral given the assumptions, the model, the training data, and so forth, all of these boundary conditions are set by humans, who may reflect their biases in the analysis result, possibly without even intending to do so. Only recently have people begun to think about how algorithmic decisions can be unfair. Consider this article, published in the New York Times. This module discusses this cutting-edge issue.
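As a rough illustration of how a neutral technique can still reproduce human bias, consider the minimal sketch below. It is not the course's own example: the loan scenario, the `HISTORY` records, and the `train` function are all hypothetical. The "model" simply learns the majority past decision for each kind of applicant, and the past decisions are assumed to have been biased against one zip code.

```python
from collections import Counter, defaultdict

# Hypothetical historical loan decisions: (zip_code, qualified, approved).
# Assumption for illustration: past human reviewers approved qualified
# applicants from zip "A" but rejected equally qualified ones from zip "B".
HISTORY = [
    ("A", True, True), ("A", True, True), ("A", False, False),
    ("B", True, False), ("B", True, False), ("B", False, False),
]


def train(history):
    """Learn the majority historical outcome for each (zip, qualified) pair."""
    votes = defaultdict(Counter)
    for zip_code, qualified, approved in history:
        votes[(zip_code, qualified)][approved] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


model = train(HISTORY)

# Two equally qualified applicants who differ only in zip code.
decision_a = model[("A", True)]
decision_b = model[("B", True)]
print(decision_a, decision_b)  # True False
```

The learning rule contains no prejudice of its own, yet it treats two equally qualified applicants differently because the training data encoded that difference.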

Societal Consequences

In Module 8, we consider societal consequences of Data Science that we should be concerned about even if there are no issues with fairness, validity, anonymity, privacy, ownership or human subjects research. These “systemic” concerns are often the hardest to address, yet just as important as the other issues discussed before. For example, we consider ossification, or the tendency of algorithmic methods to learn and codify the current state of the world and thereby make it harder to change. Information asymmetry has long been exploited for the advantage of some, to the disadvantage of others. Information technology makes the spread of information easier, and hence generally decreases asymmetry. However, Big Data sets and sophisticated analyses increase asymmetry in favor of those with the ability to acquire or access them.

Code of Ethics

Finally, in Module 9, we tie all the issues we have considered together into a simple, two-point code of ethics for the practitioner.

Attributions

This module contains lists of attributions for the external audio-visual resources used throughout the course.

Overview

What are the ethical considerations regarding the privacy and control of consumer information and big data, especially in the aftermath of recent large-scale data breaches? This course provides a framework to analyze these concerns as you examine the ethical and privacy implications of collecting and managing big data. Explore the broader impact of the data science field on modern society and the principles of fairness, accountability and transparency as you gain a deeper understanding of the i

Skills

Probability and Statistics, Data Analysis, Data Ethics

Reviews

This course is really amazing for the data science professionals. We get to know so many things pertaining to Data Science and the ethics which should be practiced in this domain.

This course is short, slow, and easy, but I ranked it five stars because the content is important in today's growing reliance on data science.

The instructor really explained everything well and in detailed manner. I appreciate all the videos and case studies. Those tools help me to understand the subject in a deeper manner.

This course is special. The instructor is special. He is smart and intelligent. We need this kind of instructors and people in our world. Thank you very much

This course is very helpful about the ethics to be followed in data capturing, data sharing and data usage etc. Over all it's very useful for me to get understanding on Data ethics.