
Week 4: Jan. 31: Text Learning, Transformers, and Interpretability - Possibilities #9

ShiyangLai opened this issue Jan 28, 2025 · 25 comments


@ShiyangLai

ShiyangLai commented Jan 28, 2025

Pose a question about one of the following articles:

“Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.” 2022. Wei et al. NeurIPS. “Chain of thought” prompting controls and improves how LLMs reason.

“The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings.” 2019. A. Kozlowski, M. Taddy, J. Evans. American Sociological Review.

“Aligning Multidimensional Worldviews and Discovering Ideological Differences”. 2021. J. Milbauer, A. Mathew, J. Evans. EMNLP.

“Who Sees the Future? A Deep Learning Language Model Demonstrates the Vision Advantage of Being Small.” 2020. P. Vicinanza, A. Goldberg, S. Srivastava.

“Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases.” 2020. W. Guo, A. Caliskan. arXiv:2006.03955.

“Can Pandemics Transform Scientific Novelty? Evidence from COVID-19.” 2020.

“Application of Deep Learning Approaches for Sentiment Analysis.” 2019.

“A deep learning model for detecting mental illness from user content on social media” (2020)

“Using Word Embeddings to Analyze how Universities Conceptualize "Diversity" in their Online Institutional Presence” (2019)

“A mathematical theory of semantic development in deep neural networks” (2019)

“Transformer Circuits Thread”. 2024. Anthropic Interpretability Group.

“Processing Sequences Using RNNs and CNNs”, “Natural Language Processing with RNNs and Attention”, Hands-On Machine Learning with Scikit-Learn, Keras & Tensorflow, chapters 15-16.

“Neuronpedia: Interactive reference and tooling for analyzing neural networks”. 2023. Lin, J., & Bloom, J.

@yangyuwang

In “The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings,” Kozlowski et al. show how word embeddings can carry cultural meaning by comparing them to survey variables. Since the corpus used in the paper mainly came from published books, I was wondering whether it would omit word meanings in subcultures and return only one flattened meaning per word (since each word gets only a single word2vec vector).

For example, the word "queer" had an extremely negative meaning before a certain point and has increasingly turned into a positive umbrella term. According to discourse theory, there may be a struggle between different usages of "queer," but word2vec can only see changes in a single vector across time.

An additional question regarding the uniformity of word2vec is how LLMs handle these kinds of issues. Sticking with the example of "queer," ChatGPT can easily report four different meanings of the word. Is that because of the architecture of LLMs, such as self-attention and deep layers?
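
To make the contrast concrete, here is a minimal sketch (not from the paper) of the difference: a word2vec model stores one vector for "queer" regardless of the sentence, while a contextual model like BERT produces a different vector for each usage. The toy corpus, sentences, model name, and the assumption that "queer" is a single wordpiece token are all illustrative.

```python
# A minimal sketch contrasting a static word2vec vector with contextual BERT
# embeddings; the toy corpus, sentences, and model name are illustrative.
import torch
from gensim.models import Word2Vec
from transformers import AutoTokenizer, AutoModel

# Static embedding: one vector per word type, regardless of context.
sentences = [["queer", "theory", "reclaimed", "the", "term"],
             ["queer", "was", "used", "as", "a", "slur"]]
w2v = Word2Vec(sentences, vector_size=50, min_count=1, seed=42)
static_vec = w2v.wv["queer"]  # the same vector for every occurrence

# Contextual embedding: the vector for "queer" depends on the sentence.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def token_vector(sentence, word):
    # Assumes `word` is a single wordpiece in the tokenizer's vocabulary.
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state[0]
    idx = enc.input_ids[0].tolist().index(tok.convert_tokens_to_ids(word))
    return hidden[idx]

v1 = token_vector("Queer theory reclaimed the term as an umbrella identity.", "queer")
v2 = token_vector("Decades ago the word queer was used only as a slur.", "queer")
print(torch.cosine_similarity(v1, v2, dim=0))  # < 1: different contexts, different vectors
```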

@chychoy

chychoy commented Jan 31, 2025

From the paper "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models," it seems that asking the model to break down larger questions into smaller checkpoints significantly improves performance. Does this type of prompting require additional computing, and if this were a back-and-forth conversation, would this hinder its performance in extracting conversation context as the conversation gets longer?
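
For reference, the extra cost is visible in the prompt format itself: a chain-of-thought exemplar replaces the bare answer with a worked rationale, and the model must also generate such a rationale at inference time. A minimal sketch (the exemplar is adapted from Wei et al.; word count is only a rough proxy for token count and compute):

```python
# A minimal sketch of the prompt-length overhead of chain-of-thought prompting.
standard_exemplar = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: The answer is 11.\n\n"
)

cot_exemplar = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

question = ("Q: The cafeteria had 23 apples. They used 20 to make lunch and "
            "bought 6 more. How many apples do they have?\nA:")

for name, prompt in [("standard", standard_exemplar + question),
                     ("chain-of-thought", cot_exemplar + question)]:
    print(f"{name}: {len(prompt.split())} words in the prompt")
# The CoT variant also makes the model generate a rationale before the answer,
# so every turn of a long conversation adds more tokens to re-process.
```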

@christy133

In Guo & Caliskan 2020, the authors demonstrate that intersectional biases are actually stronger and more insidious than single-category biases. Specifically, neural language models showed the highest magnitude of bias when dealing with intersectional groups like African American women and Mexican American women, compared to biases based on single categories like gender or race alone. Could this suggest that current debiasing methods (which often focus on single categories like gender or race) might be insufficient for addressing how bias manifests in reality? And if so, what would an effective intersectional approach to debiasing actually look like?

@lucydasilva

The Chain of Thought essay was interesting -- it demonstrated how prompting can encourage a model to describe its "rationale" for going from a task/problem to a conclusion/solution. Beyond the performance improvements CoT prompting offers, I'm really interested in the fact that explicable rationality is what grounds intelligence and frameworks of knowing. But I think that idea is very underdeveloped here, and treating explicable rationality as a means to increased performance (rather than an end in itself!) misses the whole point. The fact that a model is capable of explicable rationality is, to me, far more exciting than its ability to predict answers. After all, you don't actually know anything until you can explain it to a five-year-old. I am wondering if anyone else thinks that intelligence in this context could be better understood if we treated it as an end rather than a means? (Obviously it would never get funding unless limited to utility, but I'm asking the question in an ideal world, which is probably committing some fallacy; still, if there's ever a time to be speculative it seems to be now.)

@Sam-SangJoonPark

Sam-SangJoonPark commented Jan 31, 2025

Using Word Embeddings to Analyze how Universities Conceptualize "Diversity" in their Online Institutional Presence

This reading analyzes how U.S. universities focus on external demographic diversity (e.g., race, gender, ethnicity) while giving less attention to internal intellectual diversity (e.g., perspectives and beliefs). If these findings contradict commonly accepted notions of diversity, should we trust the analysis or rely on existing beliefs? Furthermore, is society prepared to accept diversity based on internal factors like perspectives and beliefs, rather than visible external differences?

External diversity is easily recognized and more straightforward to address in policy, but internal diversity is harder to perceive and may be more challenging to accept. Even when analytical results are logically sound, how people interpret and react to them is a separate issue. If such results disrupt social norms, they may provoke resistance or distorted interpretations. Therefore, it is crucial to approach these findings with both a focus on evidence and an understanding of how they will be perceived and received. How can we deal with this problem?

@DotIN13

DotIN13 commented Jan 31, 2025

Is Chain-of-Thought reasoning solely applicable to text-based models, or can it also enhance multi-modal scenarios? For instance, in models processing both text and images, could Chain-of-Thought be used to guide reasoning about the text/image input before generating a caption or even synthesizing a new image via diffusion?

@zhian21

zhian21 commented Jan 31, 2025

Guo & Caliskan (2006) describe methods for detecting biases in word embeddings, including intersectional and emergent biases, using techniques like the contextualized embedding association test and emergent intersectional bias detection. While effective, these methods are primarily post-hoc analytical tools applied after model training, raising the question of whether they can be integrated into training processes to proactively mitigate biases rather than merely identifying them. Given that biases develop dynamically, could these detection methods be adapted for real-time bias monitoring or data augmentation strategies to ensure that neural language models actively reduce bias during training rather than requiring correction afterward?

@psymichaelzhu

(Kozlowski et al., 2019)
The research primarily used the Google Ngrams corpus, which is mainly based on book texts rather than news, social media, or personal communication. Do different text sources affect the expression of the class-related cultural dimensions?
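
One way to probe this would be to rebuild the paper's class dimension on embeddings trained from different corpora and compare the projections. A minimal sketch of that construction, assuming pretrained vectors are available; the file paths, antonym pairs, and probe word are placeholders, not the paper's exact lists.

```python
# A minimal sketch of the "cultural dimension" construction applied to
# embeddings from different corpora (hypothetical vector files).
import numpy as np
from gensim.models import KeyedVectors

affluence_pairs = [("rich", "poor"), ("wealthy", "impoverished"),
                   ("affluent", "destitute"), ("luxury", "cheap")]

def cultural_dimension(kv, pairs):
    # Average the antonym-pair difference vectors to get a class axis.
    axis = np.mean([kv[a] - kv[b] for a, b in pairs], axis=0)
    return axis / np.linalg.norm(axis)

def project(kv, word, axis):
    v = kv[word] / np.linalg.norm(kv[word])
    return float(v @ axis)  # positive = "rich" pole, negative = "poor" pole

# kv_books = KeyedVectors.load("ngrams_vectors.kv")        # hypothetical path
# kv_social = KeyedVectors.load("social_media_vectors.kv")  # hypothetical path
# for name, kv in [("books", kv_books), ("social media", kv_social)]:
#     axis = cultural_dimension(kv, affluence_pairs)
#     print(name, project(kv, "opera", axis))
```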

@JairusJia

The article “The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings” relies mainly on book text data (Google Ngrams). I wonder how we could combine more contemporary text data, such as social media, news, and film and television dramas, to enhance the analysis of changing perceptions of social class?

@Daniela-miaut

“Aligning Multidimensional Worldviews and Discovering Ideological Differences” introduces a method that attends to multidimensional ideological differences instead of reducing them to a "left" vs. "right" dichotomy. I am curious how the method (and word-embedding methods more generally) has been received in the political science community, and also how current LLMs could be incorporated into the measurement of ideologies.

@xpan4869

The Geometry of Culture: Analyzing the Meanings of Class through Word Embeddings (Kozlowski et al., 2019)

The authors use word embeddings to map semantic relationships, arguing that words are positioned based on shared context rather than direct co-occurrence. How might we critically evaluate the reliability of this method? What potential limitations exist in using computational linguistics to map cultural meanings, and what validation strategies could further strengthen this approach?

The researchers acknowledge their corpus is limited to a "literary public" and may not represent marginalized voices. How can we develop more comprehensive computational methods to capture semantic structures across diverse linguistic and cultural contexts? What strategies could address potential representational biases in computational text analysis?

@baihuiw

baihuiw commented Jan 31, 2025

How does the effectiveness of Chain-of-Thought prompting vary across different languages, and what challenges arise when applying it to non-English or low-resource languages?

@haewonh99

From “A mathematical theory of semantic development in deep neural networks” (https://www.pnas.org/doi/10.1073/pnas.1820226116): it's really interesting and quite surprising that neural networks resemble semantic development and cognition. Is there an explanation for why this happens, and, vice versa, are there suggested methodologies for copying human development to enhance the performance of models (apart from activation functions, which are widely known to model neurons)?
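
Part of the paper's answer is mathematical: in a deep linear network, the modes of the input-output correlation matrix are learned in order of their singular values, so broad categorical distinctions (large singular values) emerge before fine ones, echoing staged semantic development in children. A minimal sketch of that dynamic on a toy hierarchical dataset; the items, attributes, and hyperparameters are assumptions, not the paper's setup.

```python
# A minimal sketch of deep *linear* network learning dynamics: broad distinctions
# (animal vs. plant) should fall before fine ones (fly/swim, bark/petals).
import numpy as np

rng = np.random.default_rng(0)
X = np.eye(4)  # one-hot items: canary, salmon, oak, rose
# Attributes: [is_animal, is_plant, can_fly, can_swim, has_bark, has_petals]
Y = np.array([[1, 0, 1, 0, 0, 0],
              [1, 0, 0, 1, 0, 0],
              [0, 1, 0, 0, 1, 0],
              [0, 1, 0, 0, 0, 1]], dtype=float)

h, lr = 16, 0.1
W1 = rng.normal(scale=1e-3, size=(h, 4))  # small initialization gives stage-like learning
W2 = rng.normal(scale=1e-3, size=(6, h))

for step in range(1, 601):
    err = X @ W1.T @ W2.T - Y              # prediction error, shape (items, attributes)
    gW2 = err.T @ (X @ W1.T) / len(X)
    gW1 = W2.T @ err.T @ X / len(X)
    W2 -= lr * gW2
    W1 -= lr * gW1
    if step % 100 == 0:
        broad = np.abs(err[:, :2]).mean() / np.abs(Y[:, :2]).mean()
        fine = np.abs(err[:, 2:]).mean() / np.abs(Y[:, 2:]).mean()
        print(f"step {step}: broad error remaining {broad:.2f}, fine error remaining {fine:.2f}")
```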

@tonyl-code

For the deep learning model that detects mental illnesses, I was wondering what it really means to infer a person-level trait from a single message. Wouldn't a better method be to average, or somehow aggregate, over a person's entire chat history before doing inference? In that case, we would also need some longitudinal analysis, I suppose. Also, for The Geometry of Culture, would there be a way to incorporate transformers?
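
A minimal sketch of one such person-level aggregation, with hypothetical column names, scores, and decision threshold; a real longitudinal analysis would add proper time-series modeling on top of this.

```python
# A minimal sketch of aggregating message-level predictions to the person level.
import pandas as pd

posts = pd.DataFrame({
    "user_id":   ["u1", "u1", "u1", "u2", "u2"],
    "timestamp": pd.to_datetime(["2020-01-01", "2020-02-01", "2020-03-01",
                                 "2020-01-15", "2020-02-15"]),
    "p_label":   [0.2, 0.7, 0.8, 0.1, 0.2],  # per-post probability from the classifier
})

# Person-level score: mean (or max) over the posting history.
user_scores = posts.groupby("user_id")["p_label"].agg(["mean", "max", "count"])
user_scores["flagged"] = user_scores["mean"] > 0.5  # illustrative threshold

# A simple longitudinal angle: a rolling mean per user to look for trends.
posts = posts.sort_values("timestamp")
posts["rolling_p"] = (posts.groupby("user_id")["p_label"]
                           .transform(lambda s: s.rolling(2, min_periods=1).mean()))
print(user_scores)
```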

@kiddosso

For the word2vec models, I wonder if there is any chance that the curse of dimensionality would affect them. These word-to-vector models often have very high dimensions. Are they prone to problems, or are those vectors reliable?

@CallinDai

[Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. 2022. Wei et al.]

Since 2022, LLMs have become significantly better at multi-step reasoning. How have state-of-the-art models (DeepSeek, GPT-4, o1) integrated and improved upon Chain-of-Thought prompting? Do they use CoT in novel ways beyond manual few-shot prompting? (We often see a similar "thinking process" from them before they generate the output shown to us.)

Are modern LLMs internally structuring their reasoning like CoT, even when not explicitly prompted to?

@youjiazhou

In Using Word Embeddings to Analyze how Universities Conceptualize “Diversity” in their Online Institutional Presence, the authors manually identify words that capture the two overarching categories of diversity. This usage seems to depend heavily on prior knowledge about the topic. I am wondering what other tasks word2vec could do. Could it handle tasks that are more about exploration than examination? For example, if I wanted to identify themes but didn't know in advance what they might be, could word2vec do that?
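
Word2vec can support exploration, for example by clustering the vocabulary's vectors and reading each cluster as a candidate theme (closer in spirit to topic modeling). A minimal sketch with a tiny placeholder corpus; in practice the vectors would be trained on the institutional texts themselves, and the clusters would need manual interpretation.

```python
# A minimal sketch of exploratory use: cluster word2vec vectors into candidate themes.
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

corpus = [["diversity", "inclusion", "race", "gender", "equity"],
          ["research", "faculty", "scholarship", "innovation", "ideas"],
          ["students", "campus", "community", "belonging", "support"]] * 50

w2v = Word2Vec(corpus, vector_size=50, min_count=1, seed=42)
words = w2v.wv.index_to_key
vectors = w2v.wv[words]

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(vectors)
for k in range(3):
    theme = [w for w, label in zip(words, kmeans.labels_) if label == k]
    print(f"candidate theme {k}: {theme[:8]}")
```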

@xiaotiantangishere

[“Chain-of-Thought Prompting Elicits Reasoning in Large Language Models.”]

The current evaluation approach mostly measures the end result rather than the reasoning process itself. I'm curious how a model could directly assess the correctness of each reasoning step. In addition, the paper suggests that CoT prompting does not positively impact performance for small models, but for smaller, domain-specific models, could external fact-checking, symbolic reasoning, or knowledge retrieval be integrated to enhance CoT accuracy and reliability?

@ulisolovieva

ulisolovieva commented Jan 31, 2025

“Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases.” W. Guo, A. Caliskan. arXiv:2006.03955.

The paper introduces significant methodological and theoretical innovations for studying intersectional bias. It demonstrates that increasing the contextualization of embeddings, such as presenting the same role or job across different social groups and contexts, reduces bias. However, does this reduction in bias stem from censorship or positivity bias rather than genuine model neutrality? And how do we tease these apart?

@yilmazcemal

The possibility readings explain how LLMs and other textual analysis methods are useful for understanding the social world. For this task, do the LLMs we have today offer absolute superiority (for embeddings, classification, and other tasks) over "older" methods like word-level embeddings or dedicated classification models? Namely, are we better off today using a huge LLM for, e.g., classification than training a custom model for our classification task? Or are there trade-offs beyond open vs. closed models and running locally vs. through APIs?
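
One concrete way to frame the trade-off is to keep a cheap classical baseline in the comparison. A minimal sketch of that baseline, with toy placeholder data; the empirical question is then whether an LLM's accuracy gain justifies its cost, latency, and loss of local control.

```python
# A minimal sketch of the classical baseline a large LLM would have to beat:
# TF-IDF features plus logistic regression on toy placeholder data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

texts = ["great product, loved it", "terrible service, never again",
         "works exactly as expected", "broke after one day",
         "fantastic support team", "complete waste of money"] * 20
labels = [1, 0, 1, 0, 1, 0] * 20

baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                         LogisticRegression(max_iter=1000))
scores = cross_val_score(baseline, texts, labels, cv=5)
print("TF-IDF + logistic regression accuracy:", scores.mean())
# An LLM (zero-shot or via its embeddings) may win on subtle labels, but this
# baseline is cheap, fast, local, and fully inspectable.
```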

@tyeddie

tyeddie commented Jan 31, 2025

Regarding the article A deep learning model for detecting mental illness from user content on social media, Kim et al. trained a series of deep learning classifiers to detect mental illness based on Reddit posts. However, the authors seem uninterested in interpreting the mechanisms by which the neural networks make their decisions. I wonder what procedures we could add to obtain interpretable results from the models?
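
One lightweight, model-agnostic procedure that could be added is word-level occlusion: drop each token from a post and measure how much the predicted probability changes. A minimal sketch, where `toy_model` is just a stand-in for the trained classifier; attention visualization, gradient saliency, or SHAP/LIME would be heavier-weight alternatives.

```python
# A minimal sketch of occlusion-based interpretability for a text classifier.
def occlusion_importance(text, predict_proba):
    # Score each token by the probability drop when it is removed.
    tokens = text.split()
    base = predict_proba(text)
    scores = []
    for i in range(len(tokens)):
        reduced = " ".join(tokens[:i] + tokens[i + 1:])
        scores.append((tokens[i], base - predict_proba(reduced)))
    return sorted(scores, key=lambda t: -abs(t[1]))

def toy_model(text):
    # Hypothetical stand-in for a real classifier: probability rises with keywords.
    keywords = {"hopeless": 0.4, "exhausted": 0.2, "alone": 0.2}
    return min(0.1 + sum(v for k, v in keywords.items() if k in text.lower()), 1.0)

print(occlusion_importance("I feel hopeless and exhausted lately", toy_model))
```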

@shiyunc

shiyunc commented Jan 31, 2025

In the "Aligning Multidimensional Worldviews and Discovering Ideological Differences", the researchers invented an unsupervised cultural analysis method that avoided the predetermined framework of defining values or the subjectivity in manual coding. I think the emphasis on the multi-dimensionality of belief systems is very important, because the semantic meaning of identity concepts themselves (e.g., left, right) could be changing over time, and make the definition unclear. This method helps us better detect the dimensions and contents of contemporary worldviews. A question might be: can we categorize the dimensions detcted by the unsupervised model and form theoretical explanations?

@siyangwu1

How can we ensure that word embeddings and LLMs accurately capture the diversity of meanings in words that shift across time and cultural contexts? Given that word2vec models assign a single vector per word, how do newer architectures like transformers and contextual embeddings address polysemy and semantic change more effectively?

@CongZhengZheng

"Chain-of-Thought Prompting Elicits Reasoning in Large Language Models": Why is it hard for the model to directly translate all of the semantics into a single equation, even if the equation appears simple? Would adding an external calculator to alms be a fair design, since it significantly boosts the performance of chain-of-thought prompting? Why do models have the problem of reading semantics wrong or even inventing answers? How do models arrive at the correct answer via incorrect reasoning paths and how do we fix it?

@siyangwu1

How can computational approaches like word embeddings be adapted to better capture semantic variation across different social groups and subcultures, especially for words that hold distinct or contested meanings? Would incorporating multi-source corpora (e.g., social media, oral histories, niche publications) or leveraging contextual embedding models (like BERT) provide a more nuanced representation of cultural meaning?
