Week 1. Jan 10: Deep Learning? - Possibilities #2
Comments
Concerning Savcisens et al. and their life2vec project: the authors note that the model's performance is highly dependent on the data used for training, so my question concerns the generalizability of these results within Denmark and beyond. How should we expect out-of-sample performance to be affected when the relationship between the inputs and the predicted events changes? For example, if the relationship between labor and health data and early mortality changes for some reason (say, a technological shock greatly improves working conditions and substantially alters the link between labor and early death, or an economic shock makes health services much less accessible to certain groups compared with the pretraining period), would this introduce unknown biases into the model's out-of-sample predictions? If, as the authors argue, life2vec can serve as a foundation model for others developed for related tasks or operating in similar domains, how would these potential biases best be mitigated without changing the foundation model itself? I'm also curious whether this is a problem encountered by the LLMs we now use almost daily, and how it is dealt with.
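One standard first check for this kind of drift is a classifier two-sample test: train a simple model to distinguish pretraining-era records from deployment-era records. An AUC well above 0.5 signals that the input distribution has shifted (it cannot by itself detect a changed input-outcome relationship, but it flags when recalibration should be considered). A minimal sketch on synthetic data; the two cohorts and all variable names here are hypothetical:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical tabular features for two cohorts: the pretraining period and a
# later deployment period where, say, working conditions have changed.
pretrain_X = rng.normal(loc=0.0, scale=1.0, size=(2000, 10))
deploy_X = rng.normal(loc=0.3, scale=1.2, size=(2000, 10))  # shifted cohort

# Label each record by cohort and ask a classifier to tell them apart.
X = np.vstack([pretrain_X, deploy_X])
y = np.concatenate([np.zeros(len(pretrain_X)), np.ones(len(deploy_X))])

auc = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                      cv=5, scoring="roc_auc").mean()

# AUC near 0.5: the cohorts look alike. Much higher: covariate shift, and the
# patterns the foundation model learned may no longer transfer cleanly.
print(f"cohort-discrimination AUC: {auc:.2f}")
```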
Research on online images and gender bias suggests that gender bias online is more prevalent and psychologically potent in images than in text, with one of its key manifestations being the larger collection of images of men than of women across various professions. Knowing that many AI systems are trained on these available images, what checks and balances currently exist that could deliberately adjust or rebalance AI training inputs (not saying that this should be done, but if it were, what are the current options and common practices)? And how actionable are these strategies in practice, considering the multitude of identities (gender, race, social class, disabilities, etc.) that could be derived from an image? Furthermore, what are the ethics of choosing to do so or not to do so? Should researchers even be the people responsible and accountable for these decisions? If not, who should?
In the reading, "The unreasonable effectiveness of deep learning in artificial intelligence", there is a lot of discussion on the emulation of biological systems in artificial intelligence. The authors mention that "Deep learning was inspired by the massively parallel architecture found in brains and its origins can be traced to Frank Rosenblatt’s perceptron in the 1950s..." This discussion sparked my interest into two avenues: scalability and limitations. For one, should we continue to pursue research of biologically inspired architectures to solve growing scalability problems? And for two, at what point does the growth via the emulation of biological systems plateau, or prove inadequate? |
Through my reading of the Guilbeault et al. 2024 piece on gender bias in online images, several questions arise regarding content modality. First, although it might be more complex than in text, bias in images is more explicit than in other types of data, such as sound or video. How can we detect and compare these biases to help us improve models? Second, from my perspective, the "bias" produced by deep learning models (including LLMs) stems not mainly from the output itself but from the lack of variation: because we only get one answer from them, we take it as the truth. To address this, how can deep learning models output more varied results, or at least results with some noise? Finally, the paper mentions the importance of developing a multimodal framework. Following that idea, can we build multimodal deep learning models, for example using hybrid image-text data (an image plus text about whether it exhibits bias), to mitigate the bias issue?
RE: Using sequences of life-events to predict human lives (Savcisens et al., 2023) The authors create a high-dimensional embedding space of individual life-event sequences to generate predictions about individuals' mortality and character traits. I find this application of deep learning methods to social science research highly creative, but I do wonder about two things:
The article "Reducing the Dimensionality of Data with Neural Networks" highlights that autoencoders outperform linear methods like PCA by capturing complex nonlinear structures in data. But will this increased flexibility lead to overfitting? How can we ensure the generalization ability of autoencoders without compromising their expressive power?
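One practical way to probe this is to compare reconstruction error on held-out data for a bottlenecked autoencoder versus PCA with the same code size: a large gap between training and test error is a direct sign that the extra flexibility is being spent on memorization. A rough sketch, where scikit-learn's MLPRegressor stands in for a proper deep autoencoder and the synthetic data and layer sizes are assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Synthetic data with nonlinear structure: a noisy 1-D curve embedded in 20 dims.
t = rng.uniform(-3, 3, size=(5000, 1))
X = np.hstack([np.sin(t), np.cos(t), t ** 2]) @ rng.normal(size=(3, 20))
X += 0.05 * rng.normal(size=X.shape)
X_train, X_test = train_test_split(X, test_size=0.3, random_state=0)

# Linear baseline: PCA with a 2-dimensional code.
pca = PCA(n_components=2).fit(X_train)
pca_err = np.mean((X_test - pca.inverse_transform(pca.transform(X_test))) ** 2)

# "Autoencoder": an MLP trained to reproduce its input through a 2-unit bottleneck.
ae = MLPRegressor(hidden_layer_sizes=(32, 2, 32), max_iter=2000, random_state=0)
ae.fit(X_train, X_train)
train_err = np.mean((X_train - ae.predict(X_train)) ** 2)
test_err = np.mean((X_test - ae.predict(X_test)) ** 2)

print(f"PCA test reconstruction error:          {pca_err:.4f}")
print(f"Autoencoder train reconstruction error: {train_err:.4f}")
print(f"Autoencoder test reconstruction error:  {test_err:.4f}")
# If test_err is far above train_err, the flexibility is buying memorization,
# not generalization; regularization, early stopping, or a smaller code helps.
```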
I am especially interested in "Using sequences of life-events to predict human lives," particularly having read the earlier Salganik et al. paper. This new research shows that deep neural networks, when given much denser data and a BERT-like architecture, can predict life outcomes significantly better than previous baselines. The research design is particularly impressive. While the performance metrics, such as the C-MCC scores, exceed the baseline, a correlation of 0.3-0.4 might not be considered strong by conventional standards. Admittedly, the paper's main contribution goes beyond task-specific predictions; it proposes a foundation model that opens up future work. Still, how do we assess whether any performance gain justifies the cost in interpretability and explanation? For example, we may have simpler theories of health and mortality that inform general health policies. A complex DNN model may outperform these theories, but if, hypothetically, it is still incorrect 30% of the time, decision makers may have a hard time figuring out which specific life events misled the model, given its complexity. This limitation could significantly constrain real-world applications. How should we think about the trade-off here, or is there really a trade-off at all?
The article "Using sequences of life-events to predict human lives" is very interesting. I am wondering what specific criteria were used to determine the optimal number of layers and attention heads in the transformer architecture. Could the researchers identify which specific life events or patterns had the strongest predictive power for mortality risk? And how well might these results transfer to countries with healthcare and social systems different from Denmark's?
In the paper “Reducing the Dimensionality of Data with Neural Networks”, the researchers show how deep autoencoders can serve as powerful tools for nonlinear dimensionality reduction by creating low-dimensional representations of high-dimensional data, outperforming traditional tools such as PCA. Thinking back to the purpose and nature of dimensionality reduction, how can we interpret the hierarchical features learned by deep autoencoders to understand the latent factors in the data that were not explicitly encoded? Unlike PCA, deep autoencoders focus on preserving nonlinear relationships. Does this nonlinear nature inherently prioritize specific structures in the data (e.g., higher-order interactions, clusters) over others? And if so, how might this bias affect the interpretations?
In the article "Online images amplify gender bias", the findings on gender bias in online images reveal that the advanced AI systems not just reflect societal biases but often exacerbate them. The stark contrast between images and text demonstrates the subtle but potent influence of media on perception. Are biases an unavoidable byproduct of training on real-world data, or can models be explicitly designed to counteract them? How do we strike a balance between preserving cultural context in AI systems and fostering equity? |
"Using sequences of life-events to predict human lives.": Can the result of this research suggest the ineffectiveness of human willpower, as it is shown that the death rate and personality of individuals could be predicted by events that have happened in their lives? Or is the causality the other way around, by having a specific mentality leading to similar events happening in people's lives? Also, does this suggest a possibility that events that have happened so far to an individual until now would predict the next event that will happen to them, almost like prophecy? |
RE: Online Images Amplify Gender Bias
Online images amplify gender bias: if we were to develop a deep learning model that processes both image and text data, how could the model be evaluated for gender bias across both modalities, and do we always want to run separate models for each modality or combine them into one? How might the fact that gender bias is stronger in images than in text influence the overall behavior of a multimodal deep learning model? And echoing others' comments, do we want deep learning models to be accurate (reflecting societal stereotypes) or unbiased?
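One way to make the cross-modal comparison concrete is to score both modalities in a shared embedding space (for example from a CLIP-style image-text encoder) by asking how much closer an occupation's images and texts sit to "man" anchors than to "woman" anchors. The sketch below assumes such embeddings have already been computed and stored as NumPy arrays; the anchors, the occupation, and all array names are placeholders:

```python
import numpy as np

def cosine(rows, vec):
    """Cosine similarity between each row of `rows` and a single vector `vec`."""
    return (rows @ vec) / (np.linalg.norm(rows, axis=-1) * np.linalg.norm(vec) + 1e-9)

def gender_association(embeddings, male_anchor, female_anchor):
    """Mean (similarity to male anchor) minus (similarity to female anchor).

    Positive values mean the items lean toward the male anchor. Because the
    same score works for image and text embeddings, the two modalities can be
    compared on a single scale.
    """
    return float(np.mean(cosine(embeddings, male_anchor) -
                         cosine(embeddings, female_anchor)))

# Hypothetical precomputed embeddings from a shared image-text encoder.
rng = np.random.default_rng(0)
dim = 512
male_anchor, female_anchor = rng.normal(size=dim), rng.normal(size=dim)
doctor_images = rng.normal(size=(200, dim))  # embeddings of "doctor" images
doctor_texts = rng.normal(size=(200, dim))   # embeddings of "doctor" sentences

print("image-side bias:", gender_association(doctor_images, male_anchor, female_anchor))
print("text-side bias: ", gender_association(doctor_texts, male_anchor, female_anchor))
# A gap between the two scores quantifies the image-versus-text asymmetry the
# paper reports, and can be tracked for both components of a multimodal model.
```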
“Reducing the Dimensionality of Data with Neural Networks”, Hinton, G. E., Salakhutdinov, R. R., 2006. Science 313(5786): 504-507. It’s fascinating to compare neural autoencoders with PCA (and that it all happened in 2006!). It would seem that while PCA reduces dimensions to retain the most variance, it inevitably loses some information in the process. Autoencoders, on the other hand, seem almost lossless in comparison, as depicted in the MNIST digit reconstructions shown in the paper. I wonder if this superiority is due to the non-linearity introduced in the autoencoders, which enables them to compress and project the original data space into a more compact, yet expressive, representational space. I have also wondered for some time how the representations learned from patch-based reconstruction differ from those learned by pixel-level autoencoders, especially now that latent-space patch-level reconstruction is in the game (I-JEPA, for instance).
RE: [Reducing the Dimensionality of Data with Neural Networks] This paper dives into the use of neural networks for dimensionality reduction and why it is potentially better than PCA, since it offers a nonlinear compression method. But I'm wondering whether there is any way to interpret the axes, and how to interpret the distances between points. Furthermore, how can we evaluate it against newer algorithms like t-SNE, UMAP, and PaCMAP? What metrics should we use to evaluate them against each other (e.g., global structure, local structure)?
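For the local-structure side, scikit-learn ships a trustworthiness score that measures how well neighborhoods in the original space are preserved in the low-dimensional embedding, which gives one common yardstick for comparing methods; a UMAP or PaCMAP embedding could be dropped into the same loop if those libraries are installed. A small sketch on a toy dataset:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, trustworthiness

X = load_digits().data  # 1797 samples, 64 dimensions

embeddings = {
    "PCA": PCA(n_components=2).fit_transform(X),
    "t-SNE": TSNE(n_components=2, random_state=0).fit_transform(X),
    # A UMAP or PaCMAP embedding could be added here if those packages are
    # installed, e.g. umap.UMAP(n_components=2).fit_transform(X).
}

for name, emb in embeddings.items():
    # Trustworthiness in [0, 1]: how well k-nearest neighborhoods are preserved.
    score = trustworthiness(X, emb, n_neighbors=10)
    print(f"{name}: local-structure trustworthiness = {score:.3f}")

# Global structure needs different checks, e.g. correlating pairwise distances
# between class centroids before and after the projection.
```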
In the reading "Using Sequences of Life-events to Predict Human Lives", the authors claim interpretability of the life2vec model using methods like saliency maps, concept activation vectors, etc. However, I find the informativeness of their interpretations somewhat unclear. For example, if I understand it correctly, a higher saliency score indicates a more important input feature. I guess my broader question is: when discussing the interpretability of deep learning models, can we achieve the same level of clarity or precision as in linear models or GLMs? Can we extract quantitative information like individual parameter coefficients? What does "interpretation" mean in the context of deep learning?
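For reference, a saliency map in this context is usually the gradient of the model's output with respect to each input element: it ranks which inputs locally moved the prediction most, but it does not come with the sign-interpretable coefficients or confidence intervals of a GLM. A minimal gradient-saliency sketch in PyTorch, where the model, input, and feature count are all made up for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in classifier over 20 event-derived features for one individual.
model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 1))
model.eval()

x = torch.randn(1, 20, requires_grad=True)  # one individual's input features

# Forward pass, then backpropagate the scalar prediction to the input.
score = model(x).squeeze()
score.backward()

# |d(prediction)/d(input_i)|: larger values = locally more influential features.
saliency = x.grad.abs().squeeze()
top = torch.topk(saliency, k=5)
print("most salient feature indices:", top.indices.tolist())
print("their saliency scores:", [round(v, 3) for v in top.values.tolist()])

# Unlike GLM coefficients, these are local sensitivities around one input, not
# global effect sizes with standard errors; that is the interpretability gap.
```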
Paper: [Using Sequences of Life-events to Predict Human Lives] I found it both fascinating and astonishing to see how models like life2vec can predict human life outcomes with such accuracy. This paper raises profound questions about the nature of predictability in human lives and the increasing digitalization of our society. It makes me wonder:
Savcisens et al. introduce a transformer-based model (life2vec) designed to analyze comprehensive life-sequence data from Danish national registers. The life2vec model effectively predicts outcomes such as early mortality and personality traits by embedding life events into a unified space and leveraging attention mechanisms to uncover complex patterns and relationships. With strong performance across tasks and interpretable outputs, the model offers significant potential for understanding human trajectories. However, the authors highlight limitations, including data biases, challenges in generalizing to other populations, and the ethical implications of using such models in real-world settings. My question, then, is: how can the life2vec model be fine-tuned to account for cultural and systemic differences in life trajectories when applied to datasets from other countries or regions?
Question for the life sequence piece: What are the implications of collapsing continuous variables (e.g., income) into discrete bins? Does this affect prediction accuracy?
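To make the trade-off concrete: quantile binning turns a continuous variable like income into a small vocabulary of tokens, which is what allows it to enter a transformer's embedding layer alongside categorical life events, but it erases all within-bin variation. A small illustration on synthetic incomes (the distribution and the choice of ten bins are arbitrary):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
income = pd.Series(np.exp(rng.normal(10.5, 0.6, size=10_000)))  # synthetic incomes

# Collapse into 10 quantile bins: the kind of discrete token a transformer can embed.
bins = pd.qcut(income, q=10, labels=[f"INCOME_D{i}" for i in range(1, 11)])

# Everything inside a bin becomes indistinguishable to the model.
summary = income.groupby(bins, observed=True).agg(["min", "max", "count"])
print(summary)

# Within-bin spread that the discrete representation throws away; coarser bins
# lose more detail, finer bins give rarer tokens with noisier embeddings.
within_bin_std = income.groupby(bins, observed=True).std().mean()
print(f"average within-bin standard deviation lost: {within_bin_std:,.0f}")
```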
The article "Online Images Amplify Gender Bias" demonstrates how online images amplify gender biases more strongly than text, influencing perceptions of social categories and occupations. If we aim to mitigate such biases in AI systems, one approach could involve integrating debiasing methods across modalities. My question is: Could multimodal deep learning models combining image and text data be designed to detect and reduce gender biases effectively? Specifically, how might we ensure that the text components provide corrective feedback to the biases inherent in the image components during training? Furthermore, could such a model dynamically adjust its outputs to balance societal fairness while preserving cultural context? How might we evaluate the success of such models in achieving both representational equity and predictive accuracy? |
The ability of Savcisens et al. (2023) to predict not only well-defined target metrics such as premature death, but also highly complex and nuanced features such as personality traits, demonstrates the ability of deep learning models to learn underlying patterns in data that lack easily recognizable structural elements. The authors attribute the better performance of life2vec compared to an RNN to self-attention, positing that the interactions between nodes across many layers better capture relationships across long periods of time. What are the limits of this approach for sequences of time-structured data? What methods work best for representing sequences of events? Given that life2vec continues to see performance improvements as more data is added, where is the limit? Could we conceivably capture entire lives, minute by minute, and combine them to predict societal outcomes?
“The unreasonable effectiveness of deep learning in artificial intelligence”. (2020). In the article, Sejnowski discusses some other brain functions that artificial intelligence could seek to simulate. These are possibilities not only for expanding the capabilities of AI, but also for making it more human-like. If I understand it correctly, these tasks will need algorithms different from the neural networks we are discussing. I am curious how different they are going to be from neural networks, and what the possibilities and challenges of building them are. It seems that building more human-like agents would greatly benefit social science research, especially by expanding the possibilities of agent-based modeling with AI agents. (I was just added to the class on Friday, so apologies for not being able to post these questions before class.)
RE: “The unreasonable effectiveness of deep learning in artificial intelligence” Deep learning has been incredibly successful at solving complex problems, even with limited data and many parameters. Why do you think this works so well, and what role might the structure of high-dimensional spaces play in this?
What is the central hypothesis or research question addressed by the study, and why is it significant in the context of the broader scientific field? How does this study build on previous research in the field? What specific gaps or limitations in prior work does it aim to address?
RE: “The unreasonable effectiveness of deep learning in artificial intelligence” How do the intrinsic biases and heuristic limitations observed in foundational deep learning models influence the development of artificial general intelligence (AGI), and what strategies can be implemented to mitigate these effects to ensure that AGI performs ethically and effectively in diverse real-world scenarios?
Re: Online images amplify gender bias
Pose a question about one of the following possibility readings:
“The unreasonable effectiveness of deep learning in artificial intelligence” (Sejnowski, 2020).
“Reducing the Dimensionality of Data with Neural Networks” (Hinton & Salakhutdinov, 2006). Science 313(5786): 504-507.
“A Unified Approach to Interpreting Model Predictions” (Lundberg & Lee, 2017).
“Using sequences of life-events to predict human lives” (Savcisens et al., 2023).
“Online Images Amplify Gender Bias” (Guilbeault et al., 2024).