Week 3. Jan. 24: Sampling, Bias, and Causal Inference with Deep Learning - Possibilities #8

ShiyangLai opened this issue Jan 19, 2025 · 22 comments

@ShiyangLai

Pose a question about one of the following articles:

“Double/Debiased Machine Learning for Treatment and Causal Parameters.” 2018. Victor Chernozhukov, et al. The Econometrics Journal 21(1): C1-C68.

“Dissecting racial bias in an algorithm used to manage the health of populations.” 2019. Z. Obermeyer, B. Powers, C. Vogeli, S. Mullainathan. Science 366(6464): 447-453.

“Semantics derived automatically from language corpora contain human-like biases.” 2017. A. Caliskan, J. J. Bryson, A. Narayanan. Science 356(6334): 183-186.

“The moral machine experiment.” 2018. Awad, Edmond, Sohan Dsouza, Richard Kim, Jonathan Schulz, Joseph Henrich, Azim Shariff, Jean-François Bonnefon, and Iyad Rahwan. Nature 563(7729): 59-64.

“Deep Neural Networks for Estimation and Inference.” 2021. Max H. Farrell, Tengyuan Liang, Sanjog Misra. Econometrica 89(1).

“BoTorch: A Framework for Efficient Monte-Carlo Bayesian Optimization.” 2020.

“Show and Tell: A Neural Image Caption Generator.” 2015.

@yangyuwang

In the paper "Semantics derived automatically from language corpora contain human-like biases," the authors examined how deep learning models trained on human data come to contain human-like biases. One question that arose from my reading is what a "biased model" really is and how we can cope with it.
The problem here, from my perspective, is not whether there is bias in the models, but how that bias is reproduced for those who use the models. The key here is "noise," as mentioned in chapter 7 of our orienting readings. The inputs and outputs of deep learning models carry little noise: given echo chambers, the input data will be biased, and given the uniform answers that models produce, the outputs will be biased as well.
So the question is how we can add noise to the models to make them less biased. Regularization could be one idea on the modeling side, but what are the others, especially on the data side? For example, could we adjust the original sampling of human data to make it more neutral? (And how would this be possible and ethical?) Is it possible to make the output layers generate more variance in their answers (like an error term in statistics)?
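
A minimal sketch of the data-side idea, assuming a pandas DataFrame with a sensitive-attribute column (all names here are hypothetical): resample the training data with inverse-frequency weights so that over-represented groups do not dominate. This is a crude way to "re-noise" the sampling relative to the original biased corpus, not a complete fix.

```python
import numpy as np
import pandas as pd

def rebalance(df: pd.DataFrame, group_col: str, seed: int = 0) -> pd.DataFrame:
    """Resample rows with inverse-frequency weights so each group in
    `group_col` contributes roughly equally to the training set."""
    rng = np.random.default_rng(seed)
    freqs = df[group_col].value_counts(normalize=True)
    weights = 1.0 / df[group_col].map(freqs)      # rarer groups get larger weights
    probs = (weights / weights.sum()).to_numpy()
    idx = rng.choice(df.index.to_numpy(), size=len(df), replace=True, p=probs)
    return df.loc[idx].reset_index(drop=True)

# Hypothetical usage: balanced_df = rebalance(train_df, group_col="group")
```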

@kiddosso

The Caliskan et al. (2017) paper suggests that word embeddings may capture historical biases present in human language. For the authors, this claim remains a hypothesis that requires further validation. However, papers in the coming weeks, such as "The Geometry of Culture," base their findings on the assumption that word embeddings can appropriately capture words' real meanings. To me, there is a strong tension between these two sets of research. So has Caliskan et al.'s conclusion been more or less confirmed? If not, why do the later papers take this assumption more or less for granted?

@ulisolovieva

“Semantics derived automatically from language corpora contain human-like biases.” 2017. A. Caliskan, J. J. Bryson, A. Narayanan. Science 356(6334): 183-186.

Given advances in AI, can we now use multi-word rather than single-word embeddings (e.g., differentiating "biology scientist" from "computer scientist" instead of just "scientist," to track field-specific ability beliefs and brilliance stereotypes)?

Is it possible for high cosine similarity to simply mark frequency of co-occurrence rather than stereotypes? For example, the "female" and "nurse" embeddings might have high cosine similarity merely because of other, unrelated word co-occurrences (I am trying to think of boundary cases for the robustness of the method). And how can we establish and identify causal pathways rather than just correlations (e.g., between embedding associations and biases in real-world data, as in the occupations example)?
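
A rough sketch of the multi-word idea, assuming we already have a pretrained word-vector lookup `emb` (a dict mapping words to numpy arrays; every name here is hypothetical): represent a phrase as the mean of its word vectors and compare cosine similarities, checking raw co-occurrence frequency separately so that similarity is not conflated with frequency. Contextual models (e.g., BERT-style encoders) would give phrase vectors directly.

```python
import numpy as np

def phrase_vec(phrase: str, emb: dict) -> np.ndarray:
    """Mean of the word vectors in a phrase -- a simple multi-word baseline."""
    vecs = [emb[w] for w in phrase.lower().split() if w in emb]
    return np.mean(vecs, axis=0)

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical usage, given some pretrained embedding dictionary `emb`:
# print(cosine(phrase_vec("biology scientist", emb), emb["brilliant"]))
# print(cosine(phrase_vec("computer scientist", emb), emb["brilliant"]))
```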

@Daniela-miaut

In "Semantics Derived Automatically from Language Corpora Contain Human-Like Biases," the authors discuss the measurement of biases derived from language corpora. I am curious what the possible ways are to add regularization to neural networks to avoid these biases. And how do we evaluate the effects of such regularization, as well as how much bias remains afterward, given that today's neural networks are more complicated and less interpretable than those available when the paper was written?

@psymichaelzhu

In the moral machine paper,
do participants' response patterns to different preferences (or combinations of them) reflect deeper underlying motivations or mental states, such as fairness or utility maximization?
Perhaps hierarchical Bayesian modeling or a neural network could be used to extract the latent components underlying these preferences.
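
One rough way to probe that idea (not the authors' own method): treat each respondent's estimated preference weights across the Moral Machine dimensions as a row of a matrix and fit a latent-factor model to it. A minimal sketch with synthetic, hypothetical data:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical matrix: one row per respondent, one column per estimated
# preference weight (spare-the-young, spare-more-lives, spare-humans, ...).
rng = np.random.default_rng(0)
prefs = rng.normal(size=(1000, 9))

fa = FactorAnalysis(n_components=2, random_state=0)
scores = fa.fit_transform(prefs)   # respondents' positions on 2 latent components
loadings = fa.components_          # how each stated preference loads on each component
```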

@baihuiw

In the "Double/Debiased Machine Learning for Treatment and Causal Parameters" Paper:
What are the potential limitations or challenges in applying the "double/debiased" machine learning framework across diverse fields, and how might these be addressed to ensure robust causal inference in high-dimensional settings?

@Sam-SangJoonPark

Transformer models have automated data learning and pattern recognition, bringing significant innovation to research processes. In social science research, are there any examples of research designs that have complemented traditional qualitative and quantitative methods using Transformers? If so, what are they? If not, how can we design such studies? Additionally, what specific considerations should be taken into account when integrating Transformers into research design?

@haewonh99

From "Semantics derived automatically from language corpora contain human-like biases": Once the machine learning model is trained, can we 'debias' it so that we could get a better representation of people's answers if they are free from biases? How can this be done-maybe with RoRa if we can get 'bias-free' dataset?

@yilmazcemal

Farrell et al. develop a deep neural network-based approach that can estimate "parameters" for each individual observation, generalizable to most structural economic models, giving insight into unit-level heterogeneity. From what I understand, this depends on "getting the model right" in the first place. Is it possible to develop measures or methods that can tell us when the model we have is a bad approximation of reality?

@christy133

In their paper, Obermeyer et al. (2019) critique the use of healthcare costs as a proxy for health needs in predictive algorithms. This choice introduces racial bias because costs reflect systemic inequities, such as differences in access to care or treatment patterns, rather than actual health conditions. They advocate for choosing alternative labels, such as measures of health (e.g., chronic conditions), that more accurately capture patients' needs and reduce bias in algorithmic predictions. The question then becomes: how can we determine whether the selected alternative label reflects the needs of all patients, especially those with less visible or unreported health issues such as mental health conditions, while accounting for social disparities?
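
A rough audit in the spirit of the paper's analysis (all column names hypothetical, not the authors' code): for patients at the same predicted risk level, compare a direct health measure, such as the number of active chronic conditions, across groups. Large gaps within the same risk decile suggest the chosen label is still missing real need for some patients.

```python
import pandas as pd

def audit_label(df: pd.DataFrame, score="risk_score",
                health="n_chronic_conditions", group="race") -> pd.DataFrame:
    """Mean health burden by group within deciles of the predicted score."""
    deciles = pd.qcut(df[score], 10, labels=False, duplicates="drop")
    return (df.assign(decile=deciles)
              .groupby(["decile", group])[health].mean()
              .unstack(group))
```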

@zhian21

For Caliskan, Bryson, & Narayanan (2017), how can biases in pre-trained AI models, such as those used in hiring, search engines, and content recommendations, lead to unintended discrimination or reinforce societal inequalities? What techniques can be implemented to detect and mitigate these biases while maintaining model accuracy and fairness?

@chychoy

In the article "Moral Machine Experiment," the authors highlight the importance of establishing a strong moral framework before deploying intelligent machines into a space where it shall face decisions without human supervision. However, as they have mentioned themselves, human decisions in real life are made with countless dimensions. For a model, is it physically possible to account for "enough" dimensions for the model to be "ethical" to deploy? Or does that even matter at all, whether or not the machine accounts for everything a "normal" person might account for during a decision? Furthermore, as we discussed in "Semantics derived automatically from language corpora contain human-like biases," how do human researchers parse for these biases (or should they?) while training a model, especially as the models become more and more complicated? I understand that different people might have different takes on this, but still interested to hear people's thoughts.

@DotIN13

Is it feasible or desirable for machine learning models, particularly those governing autonomous vehicles, to adhere to a universal moral framework, or should they instead adapt to the cultural and societal norms of the regions in which they operate? More broadly, should LLMs do so as well? How might combining diverse moral philosophies into a single model enhance or impair model performance?

@xpan4869

The article "Dissecting racial bias in an algorithm used to manage the health of populations" highlighted the importance of choosing the right label for prediction, which goes beyond model accuracy itself. What the model's prediction serves as a proxy for determines the bias it brings into real life. It was fortunate that the researchers found such complete data in a relatively well-documented industry like healthcare. I am wondering whether there are frameworks or methodologies that can be used to evaluate whether a chosen proxy aligns with ethical and equitable outcomes?

@CallinDai

Reading: Caliskan et al. (2017)

I find it intriguing to compare artificial intelligence with human intelligence, as AI can be viewed as a form of collective human intelligence, reflecting societal ideologies in a way. This collective nature often amplifies existing prejudices, mirroring how societies tend to adopt more extreme perspectives on issues, leading to biases and prejudices that can surpass those of individuals. I wonder whether this amplification in AI models might serve as a lens to better understand similar dynamics in human societies.

The authors (p.10) highlight that models trained on human data, such as language corpora, inevitably inherit the biases embedded within. This raises an interesting question: could debiasing technologies developed for AI be applied to mitigate polarized ideologies in social science contexts, potentially fostering less divisive community narratives? However, the applicability of these techniques brings up two critical concerns. First, as the authors emphasize, biases in language may be inherently difficult to “debias” because word meanings are relational and rooted in cultural and historical contexts. Second, there is the risk of losing valuable semantic information during debiasing, which could reduce a model’s ability to capture subtle but meaningful aspects of human language.

Another point worth exploring is whether human interaction with debiased LLMs could influence users’ own biases, similar to how human communication can shift perspectives. If debiased LLMs consistently reinforce less prejudiced or more neutral language patterns, could this interaction lead to a gradual change in societal biases?

@tonyl-code

Regarding the moral machine experiment, I was wondering whether there can be a two-way interaction in terms of morality between the model and the human. For example, thinking about language models like ChatGPT, would it be possible not only for the user to be influenced by preset policies within the model, but also for the morality of the model to be altered by the user after a period of interaction?

@JairusJia

In the paper "Dissecting racial bias in an algorithm used to manage the health of populations": if the algorithm's prediction target were changed from health care costs to patient health outcomes, would that affect the algorithm's overall performance and applicability?

@xiaotiantangishere

Even rational people can hardly reach a consensus on collective goals, so expecting AI to navigate these moral dilemmas seems inherently problematic. The black-box nature of DNNs and LLMs limits their interpretability, making their use in societal decision-making, such as autonomous driving, highly contentious and raising issues of accountability and fairness. Moreover, should AI attempt to replicate human morality, with all its contradictions and cultural biases, or should it prioritize rational optimization to assist human decision-making? How should we approach the future development of AI morality?

@shiyunc

In the "The moral machine experiment" paper, the authors emphasized that we are living an age where machines are allowed to make moral and ethical decisions, and "we need to have a global conversation to express our preferences to the companies that will design moral algorithms". However, as is shown in the study, there are many real-life dimensions that affect moral decision, and there are robust cultural differences in the preference. To design applicable machine ethics, should we prioritize the consistency of moral preferences or their diversity?

@youjiazhou

For the show and tell piece: Why was the combination of CNN and RNN chosen to solve this problem? Are there other possible architecture choices?
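
A stripped-down sketch of the encoder-decoder idea (PyTorch, hypothetical sizes; the actual paper uses a GoogLeNet-class CNN and an LSTM trained end to end): the CNN compresses the image into a fixed vector, which conditions an RNN language model that emits the caption word by word. Other architectures are certainly possible, e.g. attention-based decoders or, today, Transformer decoders.

```python
import torch
import torch.nn as nn

class CaptionModel(nn.Module):
    def __init__(self, vocab_size=10000, embed=256, hidden=512):
        super().__init__()
        # Toy CNN encoder standing in for a large pretrained image model.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, embed),
        )
        self.embed = nn.Embedding(vocab_size, embed)
        self.rnn = nn.LSTM(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, images, captions):
        img = self.encoder(images).unsqueeze(1)   # (B, 1, embed)
        words = self.embed(captions)              # (B, T, embed)
        seq = torch.cat([img, words], dim=1)      # image acts as the first "token"
        h, _ = self.rnn(seq)
        return self.out(h)                        # next-word logits
```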

@CongZhengZheng

"Double/Debiased Machine Learning for Treatment and Structural Parameters":

The paper uses debiased or double machine learning (DML) techniques to address the biases introduced by high-dimensional nuisance parameters when estimating a target parameter. How does the application of Neyman-orthogonal moments and cross-fitting in DML help achieve N^{-1/2}-rate (root-N consistent) parameter estimation, and how does this approach compare to traditional estimation methods that might fail in high-dimensional settings?
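
A bare-bones sketch of the cross-fitting/partialling-out recipe, using random forests as nuisance learners (an illustrative simplification, not a substitute for libraries such as DoubleML or EconML; y, d, and X are assumed to be numpy arrays): residualize the outcome and the treatment on the controls with models fit on the other folds, then regress the outcome residuals on the treatment residuals to recover theta at the root-N rate.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

def dml_plr(y, d, X, n_folds=5, seed=0):
    """Cross-fitted DML for the partially linear model y = theta*d + g(X) + e."""
    y_res = np.zeros(len(y))
    d_res = np.zeros(len(d))
    for train, test in KFold(n_folds, shuffle=True, random_state=seed).split(X):
        g = RandomForestRegressor(random_state=seed).fit(X[train], y[train])
        m = RandomForestRegressor(random_state=seed).fit(X[train], d[train])
        y_res[test] = y[test] - g.predict(X[test])   # residualize outcome
        d_res[test] = d[test] - m.predict(X[test])   # residualize treatment
    theta = (d_res @ y_res) / (d_res @ d_res)        # final-stage no-intercept OLS
    resid = y_res - theta * d_res
    se = np.sqrt(np.sum(resid**2 * d_res**2)) / (d_res @ d_res)  # robust SE
    return theta, se
```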

@siyangwu1

How do double/debiased machine learning methods, particularly through the use of Neyman-orthogonal moments and cross-fitting, manage to maintain reliable parameter estimates in high-dimensional settings, and how do they differ from conventional approaches that might fail or yield biased estimates when faced with numerous nuisance parameters?
