This course provides an introduction to critical and ethical issues surrounding data and society. It blends social and historical perspectives on data with ethics, policy, and case examples—from Facebook’s “Emotional Contagion” experiment to search engine algorithms to self-driving cars—to help students develop a workable understanding of current ethical issues in data science. Ethical and policy-related concepts addressed include: research ethics; privacy and surveillance; data and discrimination; and the “black box” of algorithms. Importantly, these issues will be addressed throughout the lifecycle of data—from collection to storage to analysis and application.
Course Objectives
Upon completion of the course, students will 1) identify and articulate some basic ethical and policy-based frameworks; 2) understand the relationship between data, ethics, and society; and 3) be able to critically assess their own work and education in the area of data science. In particular, course assignments will emphasize researcher and practitioner reflexivity, allowing students to explore their own social and ethical commitments as future data scientists and information professionals.
For more information on the Data Science Education Program at UC Berkeley, please visit databears.berkeley.edu
Contact
Instructor: Anna Lauren Hoffmann
Email: annalauren@berkeley.edu
Office Hours: Tuesdays 2:00-3:30 PM
Office Hours Location: South Hall 302
Course Schedule
Module 1- Situating "Data" I: What is data?
Objective: to "shake loose" the idea of data as an object for critical and ethical inquiry
Reading(s):
Kitchin, R. (2014). Conceptualising Data. In The data revolution (pp. 1-26). New York: SAGE.
Module 2- Situating "Data" II: A pre-history of data
Objective: to explore some historical precedents of today's "big data" moment
Case(s): Censuses & Area Codes
Reading(s):
Seltzer, W., & Anderson, M. (2001). The dark side of numbers: The role of population data
systems in human rights abuses. Social Research, 68(2), 481-513.
Garber, M. (2014, February 13). Our numbered days: The evolution of the area code. The
Atlantic. Retrieved from http://www.theatlantic.com/technology/archive/2014/02/our-
numbered-days-the-evolution-of-the-area-code/283803/
Module 3- Ethical Toolbox I: Research and applied ethics
Objective: introduce and explore applied ethical frameworks for thinking about data
Case(s): Facebook’s emotional contagion experiment; OK Cupid match rank testing
Reading(s):
The Belmont Report. (1979). The Belmont Report: Ethical principles and guidelines for the
protection of human subjects of research. Retrieved from
http://www.hhs.gov/ohrp/humansubjects/guidance/belmont.html
Gray, M. (2014, July 9). When science, customer service, and human subjects research collide.
Now what? Culture Digitally. Retrieved from http://culturedigitally.org/2014/07/when-science-
customer-service-and-human-subjects-research-collide-now-what/
Short Writing Assignment #1 due before the start of class
Module 4- Ethical Toolbox II: Concepts of privacy and publicity
Objective: explore basic concepts of privacy and anonymity (access, control, and context)
Case(s): 2006 AOL search data release; Facebook “Taste, Ties, Time” data release
Reading(s):
Tavani, H. (2012). Privacy and cyberspace. In Ethics and technology: Controversies, questions,
and strategies for ethical computing, 4th Edition (pp. 131-168). Hoboken, NJ: Wiley.
Barocas, S., & Nissenbaum, H. (2014, November). Big data’s end run around procedural privacy
protections: Recognizing the inherent limitations of consent and anonymity. Communications of
the ACM, 57(11), 31-33.
Module 5- Lifecycle of Data I (Part I): Issues in data collection and data mining
Objective: attend to ethical questions in the collection and mining of online data
Case(s): data mining, social games, consent, Terms of Service
Reading(s):
van Wel, L., & Royakkers, L. (2004). Ethical issues in data mining. Ethics and Information
Technology, 6, 129-140.
Willson, M., & Leaver, T. (2015). Zynga’s FarmVille, social games, and the ethics of big data
mining. Communication and Research Practice, 1(2), 147-158.
Module 6- Lifecycle of Data I (Part II): Issues in data collection and data mining
Objective: (cont’d from Module 5)
Case(s): data collection, personal fitness trackers, and the Quantified Self
Reading(s):
Crawford, K., Lingel, J., & Karppi, T. (2015). Our metrics, ourselves: A hundred years of self-
tracking from the weight scale to the wrist wearable device. European Journal of Cultural
Studies, 18(4-5), 479-496.
Eveleth, R. (2014, December 15). How self-tracking apps exclude women. The Atlantic.
Retrieved from http://www.theatlantic.com/technology/archive/2014/12/how-self-tracking-
apps-exclude-women/383673/
Watson, S.M. (2014, September 25). Stepping down: Rethinking the fitness tracker. The
Atlantic.
Short Writing Assignment #2 due before the start of class
Module 7- Lifecycle of Data II: Issues in data storage and security
Objective: explore ethical and privacy issues in data, information, and computer security
Case(s): educational data and students’ privacy
Reading(s):
Brey, P. (2007). Ethical aspects of information security and privacy. In M. Petković & W. Jonker
(eds.), Security, privacy, and trust in modern data management (pp. 21-36). New York: Springer.
boyd, d. (2015, May 22). Which students get to have privacy? The Message [Web log post].
Retrieved from https://medium.com/message/which-students-get-to-have-privacy-
e9773f9a064#.urtohca12
Module 8- Lifecycle of Data III (Part I): Issues in analyzing and exploring data
Objectives: building on discussions from Module 5, tackling ethical issues in data analysis
Case(s): “Spurious Correlations,” app design, data inclusion
Reading(s):
boyd, d., & Crawford, K. (2012). Critical questions for big data. Information, Communication,
and Society, 15(5), 662-679. (Introduction and Section 1 “Big Data changes the definition of
knowledge”)
Carr, N. (2014, April 16). The limits of social engineering. MIT Technology Review. Retrieved
from http://www.technologyreview.com/review/526561/the-limits-of-social-engineering/
Ananny, M. (2011, April 14). The curious connection between apps for gay men and sex
offenders. The Atlantic. Retrieved from
http://www.theatlantic.com/technology/archive/2011/04/the-curious-connection-between-
apps-for-gay-men-and-sex-offenders/237340/
Module 9- Lifecycle of Data III (Part II): Issues in analyzing and exploring data
Objectives: (cont’d from Module 8)
Case(s): Hurricane Sandy, marginalized populations, data exclusions
Reading(s):
Lerman, J. (2013, September 3). Big data and its exclusions. Stanford Law Review Online, 66, 55-
63.
Crawford, K. (2013, April 1). The hidden biases in big data. Harvard Business Review. Retrieved
from https://hbr.org/2013/04/the-hidden-biases-in-big-data/
Chalabi, M. (2014, July 29). Why we don’t know the size of the transgender population.
FiveThirtyEight. Retrieved from http://fivethirtyeight.com/features/why-we-dont-know-the-
size-of-the-transgender-population/
Final Essay Proposals due before the start of class
SPRING BREAK
Module 10- Lifecycle of Data IV (Part I): Ethics of algorithms and automated systems
Objectives: building on Modules 8/9, examining consequences of automation and
implementation
Case(s): algorithmic cruelty, data and discrimination
Reading(s):
Gillespie, T. (2012). Can an algorithm be wrong? Limn, 2, n.p. Retrieved from
http://escholarship.org/uc/item/0jk9k4hj
Slavin, K. (2011). How algorithms shape our world [video]. TEDGlobal 2011. Retrieved from
http://www.ted.com/talks/kevin_slavin_how_algorithms_shape_our_world
Wolf, C., & Polonetsky, J. (2014, November 5). Big data: Putting the heat on hate. Re/code.
Retrieved from http://recode.net/2014/11/05/big-data-putting-heat-on-the-hate/
Meyer, E. (2014, December 24). Inadvertent algorithmic cruelty. MeyerWeb.com. Retrieved
from http://meyerweb.com/eric/thoughts/2014/12/24/inadvertent-algorithmic-cruelty/
Module 11- Lifecycle of Data IV (Part II): Ethics of algorithms and automated systems
Objectives: (cont’d from Module 10)
Case(s): Google search, redlining, race, and gender
Reading(s):
Noble, S.U. (2012, Spring). Missed connections: What search engines say about women. Bitch
Magazine, 54, 37-41. Retrieved from
https://safiyaunoble.files.wordpress.com/2012/03/54_search_engines.pdf
Barr, A. (2015, July 1). Google mistakenly tags black people as ‘gorillas,’ showing limits of
algorithms. WSJ Bits Blog. Retrieved from http://blogs.wsj.com/digits/2015/07/01/google-
mistakenly-tags-black-people-as-gorillas-showing-limits-of-algorithms/
Mock, B. (2015, September 28). Redlining is alive and well—and evolving. CityLab. Retrieved
from http://www.citylab.com/housing/2015/09/redlining-is-alive-and-welland-
evolving/407497/
Module 12- Lifecycle of Data V: Issues in dissemination and evaluation of data
Objectives: Trace ethical challenges in the evaluation and communication of results
Case(s): Google Flu Trends, United Nations Waste Crimes report
Reading(s):
Harris, J. (2014, May 22). Distrust your data. Source.opennews.org. Retrieved from
https://source.opennews.org/en-US/learning/distrust-your-data/
Madrigal, A.C. (2015, October 6). The deception that lurks in our data-driven world. Fusion.
Retrieved from http://fusion.net/story/202230/true-data-can-lie/
Lepawsky, J., Goldstein, J., & Schulz, Y. (2015, June 24). Criminal negligence? Discard Studies:
Social studies of waste, pollution, & externalities. Retrieved from
http://discardstudies.com/2015/06/24/criminal-negligence/
Lazer, D., & Kennedy, R. (2015, October 1). What we can learn from the epic failure of Google
flu trends. Wired. Retrieved from http://www.wired.com/2015/10/can-learn-epic-failure-
google-flu-trends/
Module 13- Interrogating Data Science: Asking critical and ethical questions
Objectives: strategies for thinking pragmatically about ethics as an info professional
Student-generated session – readings to be chosen and lecture to be led by student groups, facilitated by instructor
Module 14- Data Futures: Thinking ethically, thinking ahead
Objectives: identify and explore ethical issues on data science’s horizons
Student-generated session – readings to be chosen and lecture to be led by student groups,
facilitated by instructor
Module 15- Course Reflection: Revisiting our Data Doubles
Final Essay Projects due
Assignments:
Course assignments revolve around short, accessible styles of writing. Emphasis is placed not on
writing for academics, but on writing for the broad potential audiences a data scientist or
information professional might have the opportunity to engage, for example co-workers,
clients, or the general public. Assignments include both reflective writing (to get students
thinking through their own ethical commitments and practices) and compelling argumentation
(as in the form of a critical or persuasive blog post or newspaper op-ed).
Short Writing Assignment #1 – Sketching Your Data Double: In this brief personal essay (300
words), students will identify a data-intensive online service they regularly use (e.g.,
Facebook, Google, or reddit) and describe how the site “sees” them; that is, students
should offer a description of who or what the site “thinks” they are like.
Short Writing Assignment #2 - Values Exploration: In this brief writing assignment (400-500
words), students will choose one of the three values identified in the Belmont Report
(respect for persons, beneficence, or justice) and 1) describe how it is defined in the Report, 2) conduct
light research to identify a different or alternative definition of the value, and 3) reflect on
what it might mean to heed this value in the context of data science.
Group Project – Student-Generated Session: Recognizing the value of diverse inputs and
community engagement, students will be split into four groups and tasked with developing
(with assistance from the instructor) content for one Module of the course. These student-
generated sessions will span two sessions, with individual groups being responsible for the
content (i.e., selected reading, reading notes, and short presentation) for half of one
session. Though students will have relatively broad license, they will be required to work
towards the pedagogical aims and content themes of the Module (see Modules 13 and 14
above).
Final Essay Proposal: Students will sketch a brief idea (300 word max) for their final essay
topic (see below). This proposal process will give the students an opportunity to think about
their final project in advance and allow the instructor to give them constructive feedback
early on.
Final Essay Project: Students (either alone or in pairs) will write an extended essay (1200
words) on the topic of their choosing (so long as the topic relates to ethics and data, broadly
construed). The essays should take the form of 1) a blog post, of the sort that might be
circulated online or on an internal company blog or 2) op-ed commentary for a newspaper
or online publication. Students should have a clear idea of their intended audience, and will
receive guidance on how to write persuasively in that direction. Overall, the assignment is
intended to help students learn to communicate ethical ideas and critical issues outside of
the classroom.
Academic Integrity
The high academic standard at the University of California, Berkeley, is reflected in each degree
that is awarded. As a result, every student is expected to maintain this high standard by
ensuring that all academic work reflects unique ideas or properly attributes the ideas to the
original sources. Individual departments often have their own ways of citing and attributing
work, so it is the responsibility of each student to seek that information out if it is not otherwise
provided through a syllabus, course website, or other means.
These are some basic expectations of students with regards to academic integrity:
Any work submitted should be your own individual thoughts, and should not have been
submitted for credit in another course unless you have prior written permission from
this instructor to reuse it in this course.
All assignments must use "proper attribution," meaning that you have identified the
original source and the extent of the words or ideas that you reproduce or use in your
assignment. This includes drafts and homework assignments!
If you are unclear about expectations, ask your instructor or GSI.
Do not collaborate or work with other students on assignments or projects unless you have
been given permission or instruction to do so. For more information visit:
http://sa.berkeley.edu/conduct/integrity
UC Berkeley Statement on Diversity
These principles of community for the University of California, Berkeley are rooted in a mission
of teaching, research and public service and will be enforced in our classroom this term.
We place honesty and integrity in our teaching, learning, research and administration at
the highest level.
We recognize the intrinsic relationship between diversity and excellence in all our
endeavors.
We affirm the dignity of all individuals and strive to uphold a just community in which
discrimination and hate are not tolerated.
We are committed to ensuring freedom of expression and dialogue that elicits the full
spectrum of views held by our varied communities.
We respect the differences as well as the commonalities that bring us together and call
for civility and respect in our personal interactions.
We believe that active participation and leadership in addressing the most pressing
issues facing our local and global communities are central to our educational mission.
We embrace open and equitable access to opportunities for learning and development
as our obligation and goal.
For more information, visit UC Berkeley's Division of Equity, Inclusion & Diversity page:
http://diversity.berkeley.edu/vcei
Learning Accommodations and Access
If you need accommodations for any physical, psychological, or learning disability, please speak
to me after class or during office hours.
Additional Campus Resources
These additional campus units may, at times, prove helpful during the course of the semester: