class: center, middle, inverse, title-slide .title[ # Overview of the Perspectives sequence ] .author[ ###
MACS 30000
University of Chicago ] --- <!-- Welcome to Perspectives on Computational Analysis! In this video I will outline the basic structure of the perspectives sequence and what you can expect to learn in this class, as well as offer some tips to be successful. --> # Perspectives core sequence 1. Perspectives on Computational Analysis * Introduction to computational social science research design 1. Perspectives on Computational Modeling * Statistical modeling techniques and machine learning algorithms 1. Perspectives on Computational Research * Conduct your own computational research project using tools and techniques from the first two terms * Guided introduction to conducting a computational social science research project from start to finish * MACSS students take the full sequence * Certificate students take the first two courses <!-- The perspectives sequence is designed to introduce students to a range of computationally-enhanced research designs and modeling techniques. These methods form the foundation for the field of computational social science. The first course, Perspectives on Computational Analysis, introduces students to core research designs in the social sciences and explores how computation enhances these methods and provides new approaches to asking and answering questions of interest. Perspectives on Computational Modeling surveys a range of statistical and machine learning techniques used to empirically answer questions using different types of data sets. Perspectives on Computational Research brings the methods from the first two quarters together as you develop your own computationally-enhanced research project, from identifying a topical research question all the way to communicating your findings to the academic community. 
--> -- ## What it is not * A sequence in programming * Actual programming courses * MACS 30111-112-113/30121-122-123 * MACS 30500 <!-- One thing I want to make clear is that the Perspectives sequence is not a set of courses intended to teach students how to program. No programming experience is necessary for the first course, while programming experience in a language such as Python or R is assumed for the latter two courses. If you want to learn how to program, you can take classes such as Computer Science with Social Science Applications or Computing for the Social Sciences. --> --- # Major learning objectives * Introduce major research paradigms in computational social science * Assess the strengths and weaknesses of competing research designs for a question of interest * Read and critique recent seminal papers * Develop an original and (more importantly) defensible research proposal <!-- In the first quarter, we will spend the bulk of our time reading and gaining understanding on the major approaches to quantitative social scientific research: observational studies, surveys, and experiments. We will do this through the main textbook for the class, Bit by Bit, which is an excellent primer to the field of computational social science. We will supplement this book with short recorded lectures, class discussions, and by reading and dissecting academic papers. Every week in your discussion sections you will critique one or two papers from the major disciplines of economics, sociology, political science, and psychology, which employ methods from computational social science. Remember that critiques are not just about criticizing papers and finding flaws. You will need to identify the strengths and weaknesses of these studies, and understand the trade offs involved in selecting one research design over another. In the final weeks of the quarter, you will develop your own original research proposal utilizing some form of computationally-enhanced research methods. 
--> --- # Broad course structure * Research design (6 weeks) * Review basic social scientific research methods * How does computation allow us to ask and answer questions that otherwise could not be addressed? * Research ethics (1 week) * How to be ethical scholars * Ethics and privacy in a digital age * The IRB * Developing a research question (3 weeks) * How to identify and define a social scientific research question * Construct a theory/model of behavior * Derive testable implications/hypotheses * Design a research study <!-- The quarter is divided into three major sections. In the first section, we will explore basic social scientific research methods and identify how computation allows us to ask and answer questions that otherwise could not be addressed. Remember that we do not use the tools of computational social science just because they exist. Instead, there must be some additional value generated by these techniques compared to already existing methodologies. We next spend time discussing research ethics and how to use the tools of computational social science appropriately and safely. Big data and machine learning have been widely declared as the next forefront of technological innovation, but serious concerns have been raised regarding the moral, ethical, and social implications of these methods. As part of the next generation of researchers employing these techniques, we want to know how to engage in ethical research. In the final weeks, you will work to develop a research proposal. This proposal will seek to answer some question of interest in the social sciences, using a computationally-enhanced methodology. You will work with peers and your discussion section leader to identify an appropriate question, develop an expectation or model of behavior, derive one or more hypotheses, and design a research study which could plausibly answer the question. 
--> --- # Tips for student success * Keep up with the assigned readings and exercises * Be active in class * Ask questions * Grow from a knowledge seeker to a producer <!-- To be successful in this class, you want to ensure you keep up with the readings. Each week you will typically read one chapter from Bit by Bit, along with one or two journal articles or book chapters. Compared to many seminars where you may need to read one book each week, this is not a lot. However, published research papers can be quite dense and hard to digest on the first read-through. To read an article thoroughly and be adequately prepared to discuss its contents, you may need at least one or two hours. Make sure not to push your readings off too late in the week. Be an engaged student! Ask questions, and come prepared with insights and thoughts on the readings. Your discussion sections are modeled on typical seminars in the social sciences. This is your opportunity to engage with your colleagues and have a civil discussion over research design, scientific theory, and empirical reasoning. If you don't have a background in the social sciences, this is a great opportunity to develop your knowledge of social scientific inquiry. For those coming in with prior training in the social sciences, use this opportunity to start identifying fields of study or topics which interest you. --> --- # What is computational social science? - *Computational* and *social science* - Social computing - Often involves ethical/privacy questions that are now considered complex <!-- In its contemporary form, computational social science was first defined in 2009 by Lazer and his co-authors. It envisions a new approach to studying social processes and behavior, leveraging rich new data sources of human behavior and advanced computational techniques to analyze and extract knowledge from this data. 
Unlike traditional methods for social science inquiry, which focus on strong theories, small datasets, and traditional statistical methods, computational social science blends the computational with the social sciences. James Evans, our faculty director, recently posited a new paradigm of social computing, identifying computational methods as an integral component of the scientific process. This approach more closely links the disciplines of computer science and the social sciences, forcing social scientists to explicitly incorporate computational techniques when designing studies, relaxing modeling assumptions and ceding more intelligence and creativity to algorithms. Theory remains at the forefront, but we can use computation to better sort through and identify the strongest theories based on the data. Computational social science frequently involves the ethical concerns of research on big datasets of human activity. As Ben Parker tells us, with great power comes great responsibility. Academics and data scientists have access to a wide realm of detailed records on human behavior. The tools we have developed can be harnessed for great discovery, but also have the power to hurt individuals and endanger marginalized populations. We need to strike a careful balance: realizing the benefits the digital age offers for understanding human behavior while minimizing the risks when we utilize these tools. --> --- # Is computational social science a fad? ![Figure 1.1 in *Bit by Bit*](images/bitbybit1-1_hilbert_worlds_2011_fig2_and_5.png) <!-- Perhaps you wonder -- is this even relevant? Will computational social science truly revolutionize how we engage in social scientific discovery? I think the answer is yes, simply because of the astronomical increase in digital data and computational power. If you think about it, so much of human activity is now recorded and digitized. 
Computers are everywhere: laptops, smart phones, cars, watches, thermostats, eyeglasses, drones, lawn mowers, and more. There are so many possibilities for how this data can be used to better understand human behavior, if only we have the access and the imagination. --> --- .pull-left[ ## Readymades <img src="images/duchamp.jpg" width="70%" /> ] .pull-right[ ## Custommades <img src="images/david.jpg" width="70%" /> ] <!-- Another important theme in computational social science is how we can leverage two styles of inquiry to better understand human behavior. Salganik identifies ready-mades as datasets which already exist, and that scientists can leverage to answer the questions they care about. Like Marcel Duchamp's Fountain, social scientists repurpose existing data to answer questions it was never intended to address. Likewise, custom-mades are data generated by a researcher starting with a specific question, then using the tools of the digital age to create the data needed to answer that question. Michelangelo's statue of David took three years of work and craft to construct from an ordinary slab of marble, but the result is exactly what he desired. Each of these techniques is a powerful tool for computational social science. What you need to learn is how to identify when each approach is appropriate for a given research question. Once you decide on the approach, then you can think about the exact details necessary to collect the data. --> --- # Readymade data - mobile call records - digital transaction records - social media posts - citation networks - and so on ... -- ## Why aren't they enough? --- # Custommade data - online surveys - digitally-enhanced surveys - digital experiments - simulated data -- # Sometimes, the distinction is blurry. - training a machine to read digital text - identifying objects from images - pooling data from different sources --- class: center, inverse, middle # How do we create a computational social science community? 
<!-- How do we create a computational social science community? For one, by creating programs such as this one to train the next generation of researchers in the fundamental skills of computational social science. --> --- class: center, middle # Social Scientists `\(\longleftrightarrow\)` Data Scientists <!-- More fundamentally, we need to bridge the gap between social scientists and data scientists. Each group contains many talented and inquisitive researchers; however, their training and approaches to inquiry differ in many important ways. --> --- ## What is Data Science? - Goes back to Tukey (1962) - Learning from data - Statistics, computer science, machine learning, and much more <!-- Data science may have become a hot buzzword over the past fifteen years, but its origins date back over fifty years to John Tukey's call for a reformation of the field of statistics to focus more explicitly on learning from data, or "data analysis". Nowadays data science has become the hottest professional field, with companies hiring hundreds of thousands of data scientists and academia reorganizing its curriculum to meet the needs and interests of students hoping to get into this lucrative field. Though perhaps an oversimplification, data science can be seen as a combination of statistics, computer science, and machine learning. Industry groups define a data scientist as "a professional who uses scientific methods to liberate and create meaning from raw data." What is typically missing from many data science degree programs is a component focused on applying these techniques to social processes and engaging in ethical work. To be fair, many activist groups have pushed the academy and industry to do better at ensuring data science is taught and performed ethically, and a growing number of organizations have risen to meet this challenge. But most data science programs still lack rigorous instruction and practice in social scientific inquiry. 
Likewise, most classical degree programs in the social sciences have, until recently, ignored the growth of computational methods and their potential application to social research. --> ## Come together, right now - Data science alone is not enough if we want to study social behavior - Social science alone is not enough if we want to use new data sources <!-- This leaves a significant gap between the disciplines that must be bridged. Data science alone is not enough if we want to study social behavior. And social science alone is not enough if we want to use new data sources. -->

--

Data science     | Social science
-----------------|--------------------
Study anything   | Study social things
Methods driven   | Question driven
Large found data | Small designed data
Prediction       | Explanation

<!-- At their heart, data science and social science seem to be very much at odds with each other. Data scientists believe they can study anything, while social scientists stick to the stuff they know (aka social processes). Data science tends to be driven by the methods, making every problem into something to be solved by XGBoost or deep learning, whereas social science typically leads with a question, waiting to choose a method until the question is well-defined. Data scientists leverage large observational datasets which can be repurposed to the problem at hand, whereas social scientists tend to generate their own original datasets at much smaller scales. Finally, the fundamental goals of the disciplines seem contrary to each other. Data science focuses its attention on prediction, being able to make an informed forecast of a specific behavior for a specific individual, whereas social science cares much more about explanation, identifying relationships between variables and identifying causality. 
--> --- .center[ ![](images/Glass_Half_Full_bw_1_cropped_and_smaller.jpg){fig-align="center"} ] <!-- If we consider the stereotypical personalities of each discipline, the data scientist would see this glass of water as half full. Data scientists tend to be optimists, believing that with the appropriate data and methods any problem can be solved. A social scientist, by contrast, would probably see this glass as half empty. Through their training in the scientific method, social scientists are much more critical and avoid sweeping statements. --> --- class: center, middle ## Social science `\(+\)` data science `\(=\)` Computational social science <!-- What we hope to develop in your generation of computational social scientists is a blend of social science and data science. Have the optimism of a data scientist and know the computational tools and techniques that can harness the digital age. And also have the caution and rigor of a social scientist, to know what makes a good question, when a certain method is appropriate, and when a study is ethically sound. --> --- class: center, inverse, middle # Mental Exercise for Thursday --- # Mental Exercise A **political ideology** is "a set of ideas, beliefs, values, and opinions, exhibiting a recurring pattern, that competes deliberately as well as unintentionally over providing plans of action for public policy making in an attempt to justify, explain, contest, or change the social and political arrangements and processes of a political community" (Freeden 2001). Propose three different ways to *operationalize* political ideology such that you could measure an individual voter's ideology. Try to come up with at least one method making use of *readymade* data and at least one method making use of *custommade* data. Think about the pros and cons of each method. Does it really measure ideology or perhaps something else? Suggested reading: Jost, J. T., Federico, C. M., & Napier, J. L. (2009). 
Political ideology: Its structure, functions, and elective affinities. Annual Review of Psychology, 60(1), 307-337. --- # Operationalization - Sometimes, the phenomenon we care about is abstract/general. - operationalization: turning abstract theoretical constructs into something measurable -- ## Example: How to measure religiosity? - observe/ask how many times in a month/year a person goes to church or reads the Bible/Quran - ask whether, or to what degree, a person believes in supernatural beings (e.g. "Do you believe in angels?" "How often do you think about the afterlife?") - observe/measure the sacrifices that people make (e.g. how long a person fasts during Ramadan) --- class: center, inverse, middle # Plagiarism and academic honesty --- # Plagiarism and academic honesty We take academic honesty very seriously and have absolutely zero tolerance for plagiarism at the University of Chicago. - Writing assignments must put in quotes and cite any excerpts taken from another work. - If the cited work is the particular paper referenced in the Assignment, no works cited or references are necessary at the end of the composition. - If the cited work is not the particular paper referenced in the Assignment, you MUST include a works cited or references section at the end of the composition. - Any copying of other students’ work will result in a zero grade and potential further academic discipline. When in doubt, please ask! It is far better to check with us prior to submitting an assignment than to wait until afterward. --- # Is it plagiarism or not? > A political ideology is a set of beliefs and ideas that exhibit a recurring pattern and provide plans of action for public policy making. One way to measure political ideology is through conducting social surveys ... -- `$$\\[1in]$$` > Freeden (2001) defines political ideology as a set of beliefs and ideas that exhibit a recurring pattern and provide plans of action for public policy making. 
One way to measure political ideology is through conducting social surveys ... --- # Correct practice > Freeden (2001) defines political ideology as "a set of ideas, beliefs, values, and opinions, exhibiting a recurring pattern [...] that provide plans of action for public policy making." One way to measure political ideology is through conducting social surveys ... -- ## Or > Political ideology can be defined as a system of interrelated beliefs and ideas that compete with other systems over providing guidance on designing public policies (Freeden 2001). One way to measure political ideology is through conducting social surveys ...