Allows anyone to host a project and anyone to volunteer on a project
Benefits/drawbacks to microtasks vs. volunteers?
What is human computation best for?
Best for
Easy task, big scale
Tasks not easily solved by computers (yet)
Tasks can be done by non-experts
Classification is not subjective; questions like these are poor fits:
Is this news story biased?
Is this bad policy?
Can be augmented and scaled by computer-assisted human computation
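One simple form of computer assistance is to collect several non-expert labels per item and combine them automatically. The sketch below assumes plain majority voting, which these notes do not specify; the item IDs, labels, and the `aggregate_labels` helper are all hypothetical.

```python
from collections import Counter, defaultdict

def aggregate_labels(annotations):
    """Combine redundant non-expert labels by majority vote.

    annotations: iterable of (item_id, label) pairs, one per volunteer judgment.
    Returns {item_id: (winning_label, agreement_share)}.
    """
    votes = defaultdict(Counter)
    for item_id, label in annotations:
        votes[item_id][label] += 1

    results = {}
    for item_id, counter in votes.items():
        label, count = counter.most_common(1)[0]
        results[item_id] = (label, count / sum(counter.values()))
    return results

# Hypothetical judgments: three volunteers label each image.
annotations = [
    ("img1", "spiral"), ("img1", "spiral"), ("img1", "elliptical"),
    ("img2", "elliptical"), ("img2", "elliptical"), ("img2", "elliptical"),
]
consensus = aggregate_labels(annotations)
```

A low agreement share can flag items that are ambiguous or subjective and may need more labels or expert review.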
Open Calls
Pose a problem asking for specific, measurable solutions from other people
Offer a reward/incentive for participation
Compare and evaluate the solutions using a consistent and measurable metric
Generate broad participation from a wide range of researchers
Netflix Prize
Netflix needed to predict which movies customers would enjoy
Internal research had plateaued
Released an anonymized dataset of 100 million movie ratings; entrants predicted 3 million held-out ratings
Anyone whose algorithm improved on the existing model by 10% or more would win $1 million
Clear and unbiased evaluation criteria
Solicited over 40,000 solutions
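The contest scored submissions by root mean squared error (RMSE) on the held-out ratings, which is what made the evaluation consistent and measurable. A minimal sketch of that kind of scoring follows; the ratings and both sets of predictions are made-up illustrations, not Netflix data, and the helper names are hypothetical.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences of ratings."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def relative_improvement(baseline_rmse, candidate_rmse):
    """Fractional reduction in RMSE relative to the baseline model."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse

# Hypothetical held-out ratings (1-5 stars) and two models' predictions.
actual = [4, 3, 5, 2, 1]
baseline = [3, 3, 4, 3, 2]
candidate = [4, 3, 4, 2, 2]

baseline_score = rmse(baseline, actual)
candidate_score = rmse(candidate, actual)
improvement = relative_improvement(baseline_score, candidate_score)

# The prize threshold described above: at least a 10% improvement.
meets_threshold = improvement >= 0.10
```

Because every entrant is scored with the same function on the same held-out data, solutions can be ranked without judging how they were built.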
Discussion
The best predictive models in the Netflix Prize open call were hybrids of multiple models (ensemble methods). What characteristic of one model relative to other models made it improve the overall prediction when blended with the other models?
In your opinion, what kind of tasks are better suited for open call contests? What kind of tasks are not?
What are the benefits to the researchers proposing the problem?
What are the benefits to the participants proposing the solutions?
Are open calls better tailored to questions of prediction or questions of explanation? How might we utilize open calls to tackle explanations?
Sometimes "what" questions are interesting enough, but often they are not.
Event/data-driven project: What questions are answerable? What variables are measurable? What do they measure?
It is not a bad idea to start with a descriptive/how question, but it is better to turn it into a causal/why question at some later point
Literature review helps.
Tips
Can derive dozens of questions from even a specific topic
Pick and choose between them - what is interesting/relevant?
Event/data-driven project:
Okay to start with a single case/event/platform/site
Focus on internal variation
Worry about generalizability later
Evaluate your questions
There’s no such thing as a dumb question, but not all questions are equally good
Avoid settled facts
Event/theory-driven project: What data exists?
Theory-driven project: Do you have a “fair” sample?
Turn into a research problem
“Nobody has studied it before”
could be your starting point
is almost never a good ending point
Your research must be valuable: Determine your potential audience - what will concern them?
Pure vs. applied research
Science is okay with pure research problems
We want understanding
In other contexts, application is far more important
Mental exercise for Thursday
Identify ONE research topic (as defined in Booth et al. [2016]) within the social sciences which you find personally interesting. Briefly explain the research topic and your interest. If it helps, feel free to use the framing device the authors suggest at the end of chapter 3 (e.g. what you are writing about -> what you don’t know about it -> why you want your reader to know and care about it).
Based on your proposed research topic, identify up to THREE specific research questions. You should be capable of answering each question within a single research paper. For each question, identify the problem it addresses and its target audience. Finally, explain how a computationally-enhanced research design could assist you in answering it.
In class on Thursday
Bring your research topic and questions to class
Group discussion (up to 30 mins)
Present your topic and questions to class
Discussion groups
Group 0: Abbey / Yuhan / Ertong
Group 1: Emma / Jiazheng / Thomas
Group 2: Andy / Kuang / Agnes
Group 3: Anny / Zhuojun / Max
Group 4: Daniela / Pritam / Cosmo
Group 5: Huanrui / Lorena / Yue
Group 6: Tianle / Kexin / Tian
Example
Topic: Polarization in consumption of science
Bigger picture
Theory-driven: Is science beyond politics?
Data-driven: What can we do with millions of online book co-purchases?
Research questions from general to specific
What/which science books do liberals and conservatives read?
Are liberals or conservatives more interested in reading science books?
Do liberals and conservatives read the same (or similar) science books?
How are liberals and conservatives different in their selection of science books?
Which disciplines are read more by liberals/conservatives?
What are the characteristics of the disciplines favored by liberals and conservatives?
When a scientific discipline attracts equal attention from both sides, is there any internal division within the discipline?
Within each discipline, what is the breadth of books read by liberals and conservatives?
What explains the difference? Applied vs. pure?
Answerable questions given co-purchase data
What/which science books are co-purchased with liberal/conservative political books?
Are science books co-purchased more with liberal or conservative books?
Which disciplines of books are co-purchased more with liberal/conservative books?
What percentage of books within each discipline has shared links with liberal/conservative books?
Within each discipline, what is the breadth of books co-purchased with liberal/conservative books?
Are political alignment/shared interest/difference in breadth correlated with any characteristics of disciplines?
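To make one of these answerable questions concrete, the sketch below computes, for each discipline, the share of its books that have at least one co-purchase link to a liberal or a conservative book. The book IDs, disciplines, leanings, and links are invented for illustration, and `linked_shares` is a hypothetical helper, not the method used in any particular study.

```python
from collections import defaultdict

# Hypothetical data: each science book has a discipline; each political book a leaning.
discipline = {"s1": "physics", "s2": "physics", "s3": "climatology", "s4": "climatology"}
leaning = {"p1": "liberal", "p2": "conservative"}

# Hypothetical co-purchase links between a science book and a political book.
links = [("s1", "p1"), ("s1", "p2"), ("s2", "p2"), ("s3", "p1")]

def linked_shares(discipline, leaning, links):
    """For each discipline, the share of its books linked to each political side."""
    sides_by_book = defaultdict(set)
    for science_book, political_book in links:
        sides_by_book[science_book].add(leaning[political_book])

    totals = defaultdict(int)
    linked = defaultdict(lambda: defaultdict(int))
    for book, field in discipline.items():
        totals[field] += 1
        for side in sides_by_book.get(book, ()):
            linked[field][side] += 1

    return {
        field: {side: linked[field][side] / totals[field]
                for side in ("liberal", "conservative")}
        for field in totals
    }

shares = linked_shares(discipline, leaning, links)
```

In this toy data, half of the physics books link to liberal books and all of them link to conservative books, while no climatology book links to a conservative book.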
Operationalization
What constitutes a co-purchase link?
Should links be weighted?
Given a set of linked books, how to define their breadth?
How to measure whether a discipline is more pure or applied?
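Each of these choices has to be written down as an explicit rule before it can be computed. The sketch below shows one possible set of answers, all of them assumptions: a link counts only if its weight (here, a co-purchase count) meets a threshold, and breadth is the Shannon entropy of the subfields of the linked books. The data and function names are hypothetical, and other definitions would be equally defensible.

```python
import math
from collections import Counter

def filter_links(weighted_links, min_weight=1):
    """Keep links whose weight (e.g. a co-purchase count) meets a threshold."""
    return [(a, b) for a, b, w in weighted_links if w >= min_weight]

def breadth(categories):
    """Shannon entropy (in bits) of the category distribution of linked books.

    0.0 means every linked book falls in one category; higher means broader.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical weighted links: (science book, political book, co-purchase count).
weighted_links = [("s1", "p1", 5), ("s2", "p1", 1), ("s3", "p1", 3), ("s4", "p1", 2)]
subfield = {"s1": "optics", "s2": "optics", "s3": "astrophysics", "s4": "mechanics"}

kept = filter_links(weighted_links, min_weight=2)
linked_subfields = [subfield[science_book] for science_book, _ in kept]
breadth_bits = breadth(linked_subfields)
```

Changing `min_weight` changes which links exist and therefore the measured breadth, which is why the operationalization should be fixed before interpreting results.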
Class exercise
As a class, identify a focused topic within #MeToo
In small groups and then as a class, identify
Up to three research questions
The significance of each question - that is, what problem is it trying to solve?
Using BibTeX or other citation managers ensures consistency in formatting
If you still have trouble understanding how to integrate your sources into your writing (e.g. when to cite, how to paraphrase) read chapter 14 in Booth or ask me.