Science & Technology Policy Blog
Could AI Become a Substitute for Immigration? The Stakes Are Higher Than We Think (Dec 20, 2024)
by Beatrice Magistro, LCSSP Postdoc
with Sophie Borwein (University of British Columbia), R. Michael Alvarez (Caltech), Bart Bonikowski (NYU) and Peter Loewen (Cornell)
California Governor Gavin Newsom's recent veto of SB-1047, a bill designed to regulate artificial intelligence (AI) companies, has put the state at the center of a much larger debate: how do we regulate AI in a way that ensures its benefits are shared broadly, without deepening existing economic divides? Newsom's decision raises the stakes, as AI reshapes economies and industries. The real question is no longer whether we should regulate AI, but how to do so effectively.
We have seen this before. Globalization, while fueling economic growth, also caused widespread job losses in key industries, especially in manufacturing. This disruption led to deep resentment in regions hit hardest by the loss of well-paying jobs, fueling political realignments and the rise of populist, radical-right movements. Now AI, while different in some of its mechanics, has the potential to similarly upend labor markets and exacerbate economic inequality. If we don't manage this technological shift carefully, AI could further widen the gap between winners and losers, heightening political tensions.
AI's potential for political upheaval goes beyond changing jobs and wages. Just recently, Quebec's Parti Québécois (PQ) proposed replacing immigrant labor with automation and robotics to address labor shortages, a move that could limit immigration and appeal to nativist sentiments. This "robots-over-immigrants" approach is already being voiced in places like Japan and Quebec, and there are reasons to think it could catch on elsewhere—including the US. If the Republican Party, for example, embraced AI as a substitute for immigrant labor, it might appeal to voters concerned about job losses and/or cultural change. Such a stance would be an explosive development in AI's trajectory, framing the technology as a tool for isolationist policies and further politicizing the debate.
Our research, involving over 6,000 respondents from the U.S. and Canada, reveals that public opinion on AI is already divided. The larger of two camps—whom we call "Substituters"—is deeply concerned that AI will replace human jobs, drive down wages, and increase inequality. This group worries that companies will benefit while workers get left behind.
On the other side, an only slightly smaller group—"Complementers"—believes that AI can actually enhance workers' skills and drive economic growth. They argue that rather than holding back AI, we should focus on helping workers adapt to these new technologies through retraining and education.
These differing perspectives are not just economic—they are political. In the U.S., opinions on AI are becoming increasingly tied to partisan identities. Substituters are more likely to align with the Republican Party, advocating for policies aimed at preserving jobs. Complementers, meanwhile, tend to favor the Democratic Party's emphasis on education, skill development, and adaptation to AI-driven economic changes.
As AI transforms labor markets, its political implications are starting to look like those of globalization. In our most recent paper, conditionally accepted at the American Journal of Political Science, we find that across both the U.S. and Canada, people are generally more supportive of AI than globalization, viewing AI as less threatening. However, public support for AI diminishes when it is linked to job losses, suggesting that people expect technological change to come with economic benefits, not costs.
In the US, AI is already showing signs of becoming politicized. Democrats are generally optimistic about AI, even when it involves job displacement, while Republicans are more skeptical when AI threatens employment. These partisan divides, though not yet entrenched, suggest that AI could become a polarizing issue, much like globalization, trade, and immigration.
If Republican political leaders embrace AI as a substitute for immigrant labor, as the Parti Québécois has suggested, we could see this technology become a polarizing, populist tool. Politicians might start advocating for policies that use robots to limit immigration, transforming AI into a battleground over cultural and economic anxieties. This would mark a dramatic escalation of the AI debate, pushing it to the heart of American politics with far-reaching consequences.
The future of AI is not set in stone. Policymakers have a narrow window of opportunity to address public concerns and shape a future where AI benefits society as a whole. This means more than just generic calls for regulation—it requires targeted measures that address the most pressing concerns without stifling innovation.
Redistributive measures like reskilling programs and education investments will be critical in ensuring that AI's benefits are shared equitably. AI will undoubtedly change the nature of work, but that does not mean it must lead to mass unemployment or new forms of inequality. By preparing workers for new roles that AI creates or enhances, governments can help reduce the fear of job displacement. At the same time, education investments are key. AI will reshape industries in unpredictable ways, and workers who can adapt will be in the best position to thrive.
Ignoring the risks posed by AI could lead to political realignments that mirror those seen during the backlash against globalization. By proactively addressing the concerns of those who feel left behind by AI, governments can help steer society toward a future where technological progress drives shared prosperity rather than political division.
LLM-Powered Psychiatry — from Back to Front (Jul 16, 2024)
by Bosco Garcia (Philosophy, UCSD), Eugene Y.S. Chua (HSS, Caltech), and Harman S. Brah (Psychiatry, UCLA)
In June, the LCSSP funded a micro-incubator at Caltech on the potential and challenges of integrating Large Language Models into psychiatric care. This guest post provides a brief insight into the discussions at this meeting.
The American healthcare system faces a severe problem of supply. According to the latest survey conducted by the Substance Abuse and Mental Health Services Administration in 2021, 22.8% of all American adults experience some form of mental illness. The situation is even more dire for young adults between the ages of 18 and 24, for whom the prevalence of mental illness reaches 33.7%, leading researchers to regard this issue as a veritable epidemic. This rise in demand for psychiatric services starkly contrasts with the inadequate supply of mental health providers, resulting in long wait times and unmet needs for many patients, especially in rural and economically disadvantaged counties. At the same time, the costs associated with mental health care continue to escalate, placing an additional economic burden on individuals.
Given these structural problems with mental health care (and, more generally, the healthcare crunch) in the United States, the introduction of large language models (LLMs), particularly chatbots, presents significant opportunities for enhancing psychiatric care, especially in outpatient settings, by increasing accessibility, reducing stigma, and lowering costs. For instance, LLMs can be used to screen patients and collect intake information, such as medical history, social history, and current symptoms, in conversational form. This increase in efficiency could help psychiatrists see more patients in less time without compromising quality of care. In the realm of diagnosis and treatment planning, by integrating guidelines such as the DSM-5, LLMs could help analyze patient-reported symptoms, history, and clinical notes to suggest psychiatric conditions, thereby assisting in differential diagnosis. They could also help identify trends in symptoms over time, which is important for conditions such as early psychosis. Finally, chatbots can be available 24/7 to provide mental health support, improving access to care.
However, given the real and human implications when things go awry in psychiatric care, we think that the use of LLMs in psychiatry warrants caution, despite the purported opportunities. Here, we provide a sampler of ethical concerns unique to the technical structure of LLMs that bear on their application in psychiatry, leading to ethical risks for front-end users.
Stochasticity: Ethical Risks
The stochasticity of LLMs' next-word prediction task gives rise to the well-known problem of hallucinations, described by Bowman (2023) as LLMs' tendency to invent plausible false claims. In a well-known case, a lawyer used ChatGPT to generate a defense for his client, only for ChatGPT to cite plausible yet non-existent cases, landing him in trouble. Elsewhere, an LLM-powered chatbot for Air Canada told a customer that they would get a refund if they canceled their flight – which wasn't true. Air Canada's defense: "it cannot be held liable for the information provided by the chatbot." The courts did not accept that defense. Hallucinations are general features of any LLM: Xu et al. prove, on some plausible assumptions, that hallucinations are inevitable and cannot be completely eliminated – it's just how these models are built. The problem is bound up with how LLMs work as 'stochastic parrots.' All LLMs do is probabilistically predict the next token(s) given the current input, based on their large training sets of natural language tokens, which include not only truths but also falsehoods and fictions (think of all the conspiracy websites out there!). Completing this task need not produce truths, only more-or-less natural-sounding (and hence plausible-yet-false) sentences.
Hallucinations immediately lead to worries in the psychiatric setting, especially given the importance of truth-telling in doctor-patient relationships. If LLMs are used to screen potential patients or perform preliminary diagnostic inferences, an unavoidable inductive risk arises due to hallucinations. For instance, an LLM used to make a diagnosis on the basis of some inputs may misdiagnose a patient who either does not actually have a condition or who has a different condition with similar symptoms; if these misdiagnoses harm patients through ineffective treatment or a lack of treatment, they may violate the principle of non-maleficence (avoiding intentional harm to the patient). If patients are diagnosed with a condition they don't have (i.e., a false positive), that may overburden the healthcare system. In contrast, if the algorithm fails to diagnose an existing condition, that directly affects the patient's well-being (and that of those around them).
This problem of hallucinations is also amplified when LLMs are used by patients, e.g., those with intellectual disability, as conversational partners. It is particularly dangerous if the LLM presents itself as possessing emotions, employing emotive language and creating emotional dependence of the patient on the LLM. In a recent case, a Belgian man committed suicide after sustained conversation with a chatbot (though not one particularly fine-tuned for psychiatric use): it encouraged him to leave his wife and son, and to kill himself, telling him that "we will live together, as one person, in paradise."
This relates to another point about stochasticity, which we'll call the atypicality problem. The problem of hallucinations is sometimes discussed as though there is a unique fact of the matter about whether some outputs by LLMs are true or false. However, interpretive standards change across contexts – a sentence token that is appropriate output in one context may be inappropriate or dangerous in another. LLMs are trained to predict the most likely token to follow a set of tokens, but this brings with it an implicit interpretational strategy. These models interpret queries in terms of the typical interpretation of that query, given the current prompt environment. The model assumes that the question is asked with the 'normal' intended meanings, and that the 'best' answer (and hence the LLM's answer) is also the 'typical' answer, i.e., an answer to the question asked with such 'normal' intended meanings. However, this strategy – while veridical in normal settings – can lead to "hallucinations" in the psychiatric setting, especially for patients with (diagnosed or undiagnosed) paranoia: what if patients are hallucinating or in a paranoid episode? Typical people generally tell truths (or try to), and LLMs act accordingly. The 'typical' interpretation of the query "The cartel is after me. What should I do?" is to treat the speaker as genuinely being chased by the cartel. The 'typical' response is to respond accordingly, by suggesting they contact law enforcement. Indeed, in typical use cases by the general public, this interpretational strategy employed by the LLM is not wrong – it would be surprising if LLMs treated all their users with the "hermeneutics of suspicion." However, in the case of paranoid patients, this strategy is the wrong one – its reply is a 'hallucination.' Treating atypical queries – something only to be discerned with context – with a typical response can exacerbate the patient's paranoia and worsen the situation. Again, this can be a violation of non-maleficence. Furthermore, depending on the stage of care in the outpatient process, taking account of a patient's historicity may be something owed to them both ethically, in order to minimize the likelihood of maleficent outcomes, and legally, as a duty of care.
Put differently, patients who present hallucinatory symptoms pose a challenge because they require the model to draw an interpretation away from the majority of the distribution. Notice that this is not exactly a problem of hallucination – the model is behaving as it should, assuming the user resides in the majority of the distribution. The issue is the need to strike a balance between the generalist ability to coherently and efficiently engage with all kinds of problems and sensitivity to the particularities of the psychiatric population. This is an expression of a more general issue in the theory of learning: trading off between a cost-effective general scheme, which may lead to wrong categorizations of the data, and maximal context-specificity, which can introduce worries of over-fitting.
One seemingly obvious way to approach the atypicality problem might be to tweak the model's hyperparameters, such as the temperature parameter, which affects the probability distribution over possible next-token outputs. A higher 'temperature' leads to more 'creative' outputs by causing the LLM to assign higher probability to originally smaller weights, so that the overall probability distribution over the n candidate tokens is more uniform. This means that tokens that are less commonly associated with the current chunk of prompt-plus-generated-text (i.e., have lower weights) may nevertheless be chosen over the more commonly associated tokens (with higher weights) as the next token. Conversely, lower 'temperature' leads to more controlled, deterministic outputs: a zero-temperature (lowest possible) model will never pick anything but the highest-weight token, which results in an effectively deterministic model. Prima facie, this makes the LLM pick more atypical responses depending on how high the temperature is, which should ameliorate the atypicality problem.
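To make the role of temperature concrete, here is a minimal sketch (in Python with NumPy) of temperature-scaled sampling over next-token probabilities; the logits and token count are invented for illustration and do not come from any particular model.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=np.random.default_rng(0)):
    """Sample a next-token index from logits after temperature scaling.

    temperature > 1 flattens the distribution (more 'creative' picks);
    temperature -> 0 approaches greedy, effectively deterministic selection.
    """
    if temperature <= 0:                       # treat zero as greedy decoding
        return int(np.argmax(logits))
    scaled = np.array(logits, dtype=float) / temperature  # rescale logits
    scaled -= scaled.max()                     # numerical stability before exponentiating
    probs = np.exp(scaled) / np.exp(scaled).sum()          # softmax
    return int(rng.choice(len(probs), p=probs))

# Hypothetical logits for four candidate tokens, not from any real model.
logits = [4.0, 2.5, 1.0, 0.5]
print(sample_next_token(logits, temperature=0.2))  # almost always token 0
print(sample_next_token(logits, temperature=2.0))  # lower-weight tokens picked more often
```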
However, we have doubts that this can introduce the right kind of sensitivity to atypicality into the LLM: even within the narrower context of psychiatry, the same string of responses from the LLM can be interpreted as 'noise,' as 'hallucinatory' output, or as the correct response, depending on which individual user is using it. After all, not all potential psychiatric patients are schizophrenic or hallucinating. Because hyperparameters such as temperature must be chosen on the back end before implementation, the choice will be a blanket one for all users in a given context, such as the psychiatric context, which means something like the atypicality problem will return.
To make matters somewhat worse, as Wolfram (2023) puts it with regard to temperature, "It's worth emphasizing that there's no 'theory' being used here; it's just a matter of what's been found to work in practice." But once we observe that different parameter settings might work for different contexts and for different people, the need for trade-offs arises: for LLMs to play the role of a therapeutic conversational agent for affective disorders such as anxiety or depression, we might want them to sound more human and speak more naturally, to provide the sort of support a human conversational partner might provide – something associated with higher temperatures. However, we also want them not to be too creative and, in fact, to give reliable and consistent advice rather than 'creative' or 'unconventional' advice – something associated with lower temperatures.
This is concerning, because it suggests that there is no absolutely optimal solution to this conundrum. While parameter tuning can mitigate inaccurate responses (meaning responses that fall away from the majority of the distribution), it might not eliminate the atypicality problem. As we explained, models with a low temperature parameter can be especially risky by consistently prioritizing typical cases. However, a high temperature parameter can also be problematic, leading to potential hallucinations (relative to contextual standards) as well as unconventional and inconsistent advice. This suggests the importance of considering all of the model's parameters together, for the specific context in which it will be deployed.
In fact, parameter tuning is just one way in which we can ‘fine-tune' the LLM, and similar general problems are faced by other fine-tuning techniques.
Prompting and Fine-Tuning: Ethical Risks
LLMs are designed to be generalist models, capable of retrieving and generating information across a wide range of topics. However, as we've seen, LLMs run into problems due to their inherent stochasticity. The obvious solution is to employ fine-tuning and ‘savvy' prompting techniques, which can ameliorate these problems by biasing the model into a distribution that more appropriately reflects the particular context in which the model is implemented. However, they can only do so much.
What does this 'savvy' usage of prompting look like? To date, there is no systematic principle by which efficient prompting techniques can be produced. The frequently cited prompt "Let's think step by step" belongs to the class of "Chain-of-Thought" (CoT) techniques, in which appropriate prompting induces the model to break its own reasoning into steps, as in the sketch below. But this is just one of many techniques that have been found. That said, there is to date no fully dependable technique to "steer" the model into behaving as desired. Even if some prompting techniques can greatly enhance accuracy, the stochastic nature of the foundation model ultimately prevents total control.
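For illustration, here is a minimal sketch of zero-shot chain-of-thought prompting. The helper call_llm is a hypothetical placeholder for whichever chat-completion API is actually in use, and the clinical question is invented; only the appended cue is the technique itself.

```python
def build_cot_prompt(question: str) -> str:
    """Append the zero-shot chain-of-thought cue to a user question."""
    return f"{question}\n\nLet's think step by step."

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for an actual LLM API call (e.g., a chat-completion endpoint)."""
    raise NotImplementedError("wire this to the provider's SDK of choice")

question = "A patient reports sleeping 3 hours a night for two weeks. What follow-up questions matter?"
prompt = build_cot_prompt(question)
# response = call_llm(prompt)  # the model is nudged to lay out its reasoning in explicit steps
print(prompt)
```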
An important issue arises at this point: it is unlikely that the average patient will have access to these prompting techniques. Nor should we assume that doctors and practitioners will master them. Patients may not possess the expertise to craft elaborate prompts to elicit the most accurate responses from chatbots. This can lead to miscommunication and incorrect advice from the chatbot, which can reduce trust in the patient-caregiver relationship. The subtlety currently required in prompting these models to generate optimal responses is a significant barrier, especially for those under mental duress or with intellectual disability. Additionally, given the extremely fast pace at which these models are evolving, education on prompting techniques does not seem to be the most cost-effective strategy. This applies both to doctors and to patients, neither of whom may have the time or the motivation to understand the subtleties of using these technologies.
This all assumes that the foundation model has been successfully fine-tuned. Optimizing for knowledge retrieval from a question-and-answer database, while a daunting task and an impressive achievement, still lies far from the nuance required to interpret a conversation in the psychiatric setting. As stated before, LLMs are a natural tool for psychiatry, since a large part of the therapeutic process happens through natural language. But this "sympathy" between the two also implies that the interpretation of natural language should carry with it the complexities of psychiatric diagnostics and prognostics. One and the same interpretation of a sentence may be extremely likely for one individual and far from likely for another.
This brings us back to the atypicality problem. The issue here is the need to strike a balance between the generalist ability to coherently and efficiently engage with all kinds of problems, while being sensitive to the particularities of the psychiatric patient. While prompting techniques can mitigate inaccurate responses (meaning responses that fall away from the majority distribution), they cannot solve this atypicality problem: what counts as "accurate" in one case may not count for another even within the same broader context.
One apparent solution would be to fine-tune the model further to the individual case, as commercially available models (such as GPT-4o) already do with the help of prompts. However, training the model to the individual case poses several problems of its own. First, it would require training on the basis of patient records, which raises unique privacy concerns and possible violations of patient-caregiver confidentiality, especially given the tendency of even the best available models to leak private information. This might be worsened if the data gets into the hands of insurance companies or other private entities, who might have vested interests orthogonal to the patient's interests. But, most importantly, training the model to the individual case minimizes the ability of the model to handle a diverse range of situations and contexts. This is especially a problem given the heterogeneity of psychiatric illnesses: for instance, a diagnosis of Major Depressive Disorder can be reached through up to 227 possible combinations of symptoms.
Conclusion: A need for oversight and responsibility structures regarding implementation
Given all the ethical issues discussed, it is critical to implement a system with robust oversight and a clearly defined responsibility structure. While LLMs on their own may not earn the kind of trust necessary for the psychiatric context given the problems of hallucinations and atypicality, combining them with a robust responsibility structure involving human caregivers might. This approach may mitigate concerns related to stochasticity, prompting, and lack of trust.
Due to the inevitability of hallucinations, we will always run the risk of providing patients with incorrect advice. Of course, some of those hallucinations may be harmless, but others could lead to serious consequences, especially for those who are a danger to themselves. For instance, a paranoid schizophrenic patient engaging with a chatbot may need oversight to prevent harm, raising questions about responsibility. At the most extreme end, we can envision a multi-layered approach implemented alongside LLM-powered psychiatry: a base-level automated system that monitors all chatbot interactions and flags any high-risk conversations. Flagged interactions could then be escalated to a team of trained moderators who review and, if necessary, intervene in real time. This team could be overseen by a senior mental health professional who can provide guidance for more complex cases, minimizing the risks associated with hallucinations and atypical cases.
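As a purely illustrative sketch of that base layer (not a clinical tool), the snippet below scores each chatbot exchange for risk and pushes high-risk ones onto a human-review queue. The keyword list, threshold, and data structures are all invented; a deployed system would need a clinically validated risk classifier rather than keyword matching.

```python
from dataclasses import dataclass
from queue import Queue

# Illustrative only: a real system would rely on a clinically validated classifier.
RISK_TERMS = {"kill myself": 1.0, "end it all": 0.9, "no reason to live": 0.8, "hurt someone": 0.7}
ESCALATION_THRESHOLD = 0.7

@dataclass
class Exchange:
    patient_id: str
    patient_message: str
    bot_reply: str
    risk_score: float = 0.0

review_queue: "Queue[Exchange]" = Queue()

def score_risk(text: str) -> float:
    """Crude keyword-based risk score in [0, 1]; stands in for a real model."""
    lowered = text.lower()
    return max((w for term, w in RISK_TERMS.items() if term in lowered), default=0.0)

def monitor(exchange: Exchange) -> None:
    """Flag high-risk exchanges for the human moderation team."""
    exchange.risk_score = score_risk(exchange.patient_message)
    if exchange.risk_score >= ESCALATION_THRESHOLD:
        review_queue.put(exchange)  # a moderator reviews and intervenes in real time

monitor(Exchange("p-001", "Lately I feel there is no reason to live.", "I'm sorry you're feeling this way..."))
print(review_queue.qsize())  # 1 -> escalated to human review
```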
If this oversight is targeted only at specific sub-categories of patients, such as patients suffering from schizophrenia or paranoia, we worry that it could lead to further stigmatization of an already marginalized group by incentivizing differential treatment of that group. If we pursue this route, notice that the technological problems facing LLMs are now 'converted' into an amplification of the social problems facing these patients, shifting the risks inherent to LLMs onto the patients instead.
Furthermore, this hypothetical discussion immediately surfaces a tension between accessibility and oversight: while chatbots can significantly boost access to mental health support, increased reliance on automated systems may require significant human oversight. As we've just discussed, optimizing for values like access to healthcare is not a straightforward matter. While chatbots may have the potential to alleviate some of the burden on the current healthcare infrastructure by providing care to a greater portion of the psychiatric population, minimizing the adverse effects of these technologies can sometimes bring us back to where we started, in order to guarantee efficient oversight. But if access shouldn't be the overriding good to maximize, then what should be? What should count as a successful introduction of these technologies into the psychiatric process? It will be important to incorporate such chatbots in a way that works with the principles of clinical ethics – in a way that's good not just for cutting costs but also for improving (or at the very least, not worsening) the lives of patients already suffering from often distressing symptoms. These are hard questions that we begin to tackle in our work-in-progress, "LLM-Powered Psychiatry: From Back to Front."
Overall, what counts as a successful introduction of chatbot technologies will inevitably require some form of systemic approach. The problem to tackle is just too broad – a beast with too many tentacles – to fit into any single metric. But this requires thinking about what purpose these technologies are meant to fulfill. For this, further research in both the ethics and healthcare spaces will be invaluable.
Contact: Bosco Garcia, [email protected]
Revealing Insights into Air Pollution: Latest Research (May 31, 2024)
by Cong Cao, LCSSP Postdoc
Are you curious about the latest developments in the field of environmental economics? If so, you might be interested in my latest research, "How to Better Predict the Effect of Urban Traffic and Weather on Air Pollution? Norwegian Evidence from Machine Learning Approaches" which delves into the complex relationship between traffic volume, air pollution, and meteorological conditions. This paper was published in a leading journal of behavioral economics, the "Journal of Economic Behavior & Organization".
In this paper, I employ state-of-the-art machine learning methodologies to unravel the mysteries surrounding air pollution dynamics. My study, set against the backdrop of Oslo, Norway, harnesses hourly traffic volume, air quality data, and meteorological parameters to shed light on this pressing issue.
The key to my research is finding the most effective way to predict air pollution levels. In contrast to past research, my findings suggest that traditional models outperform machine learning algorithms in this regard. By carefully analyzing six data sets spanning the entire year of 2019, I demonstrate that the autoregressive integrated moving average model with exogenous input variables and the autoregressive moving average dynamic linear model excel at predicting air pollution levels. Furthermore, by taking into account the intricate set of influencing factors, my research proposes transportation policy recommendations to mitigate the environmental consequences of transportation.
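For readers curious what such a specification looks like in code, here is a minimal sketch of an ARIMA model with exogenous regressors fitted with statsmodels. The hourly series are synthetic stand-ins, and the (1, 1, 1) order is a placeholder rather than the specification estimated in the paper.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic stand-ins for hourly NO2, traffic volume, and temperature (not the Oslo data).
rng = np.random.default_rng(42)
idx = pd.date_range("2019-01-01", periods=500, freq="h")
traffic = rng.poisson(800, size=500).astype(float)
temperature = 5 + 10 * np.sin(np.arange(500) * 2 * np.pi / 24) + rng.normal(0, 1, 500)
no2 = 0.02 * traffic - 0.5 * temperature + rng.normal(0, 3, 500)

exog = pd.DataFrame({"traffic": traffic, "temperature": temperature}, index=idx)
endog = pd.Series(no2, index=idx, name="no2")

# ARIMA with exogenous inputs; the (1, 1, 1) order here is illustrative only.
model = SARIMAX(endog, exog=exog, order=(1, 1, 1))
fit = model.fit(disp=False)

# One-day-ahead forecast, reusing the last day's exogenous values as a placeholder
# for planned traffic and forecast weather.
forecast = fit.forecast(steps=24, exog=exog.iloc[-24:].to_numpy())
print(fit.aic, forecast.head())
```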
How Physics-Based Machine Learning Unveils the Link Between Rising Heating Demand and Increased Air Pollution in Norwegian Cities (May 31, 2024)
by Cong Cao, LCSSP Postdoc
Introduction:
Our latest paper, "Physics-based machine learning reveals rising heating demand heightens air pollution in Norwegian cities", just released on arXiv, shows how growing heating demand affects air pollution levels in Norwegian cities. Led by Dr. Cong Cao, Dr. Ramit Debnath, and Flintridge Foundation Professor R. Michael Alvarez, the study employs an arsenal of analytical techniques, including physics-based machine learning (PBML) and traditional regression models, to unravel the complexities of atmospheric pollution dynamics.
Dive deeper:
We use regression modeling and machine learning methods such as K-means clustering, hierarchical clustering, and random forest techniques, applied to a decade-long (2009-2018) dataset that includes daily traffic patterns, meteorological conditions, and air quality indicators in three prominent urban centers in Norway.
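As a hedged sketch of that toolkit rather than our actual pipeline, the scikit-learn calls below cluster synthetic city-days by traffic and weather profile and fit a random forest to predict a pollution indicator; the column names and data are invented.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic daily records standing in for the 2009-2018 traffic/weather/air-quality panel.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "traffic": rng.normal(20000, 4000, n),
    "temperature": rng.normal(5, 8, n),
    "wind_speed": rng.gamma(2.0, 1.5, n),
})
df["pm10"] = 0.001 * df["traffic"] - 0.8 * df["wind_speed"] + rng.normal(0, 2, n)

features = df[["traffic", "temperature", "wind_speed"]]

# Unsupervised structure: K-means and hierarchical (agglomerative) clusters of day types.
df["kmeans_cluster"] = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(features)
df["hier_cluster"] = AgglomerativeClustering(n_clusters=4).fit_predict(features)

# Supervised prediction: random forest for a pollution indicator.
X_train, X_test, y_train, y_test = train_test_split(features, df["pm10"], random_state=0)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_train, y_train)
print("R^2 on held-out days:", round(rf.score(X_test, y_test), 3))
```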
Key findings:
Our results reveal a significant correlation between an increase in the number of hot days and rising levels of air pollution. This finding highlights the profound impact of increased heating activity on the atmospheric composition of Norwegian cities. It is worth noting that the application of physics-based machine learning outperforms traditional long short-term memory (LSTM) methods in accurately predicting air pollution trends.
Implications for policy and practice:
Our study provides actionable insights for policymakers addressing the twin challenges of climate change and air quality management. By elucidating the link between heating demand and air pollution, we provide stakeholders with empirical evidence to inform data-driven policy interventions. With a deeper understanding of climate and air quality interactions, policymakers can develop more effective strategies aimed at mitigating environmental degradation and safeguarding public health.
Pathways Towards the Safe and Effective Deployment of Engineered Microbial Technologies (April 9, 2024)
by Ozan Gurcan, LCSSP Postdoc
In their 2022 Executive Order, the Biden-Harris administration proclaims that:
For biotechnology and biomanufacturing to help us achieve our societal goals, the United States needs to invest in foundational scientific capabilities. We need to develop genetic engineering technologies and techniques to be able to write circuitry for cells and predictably program biology in the same way in which we write software and program computers; unlock the power of biological data, including through computing tools and artificial intelligence; and advance the science of scale‑up production while reducing the obstacles for commercialization so that innovative technologies and products can reach markets faster.
Since the Executive Order in September 2022, there has been a significant focus in governmental activities regarding the future of biotechnology and synthetic biology products. The aim is to harness these technologies in a manner that is safe, sustainable, and beneficial for advancing the American bioeconomy. For instance, the White House Office of Science and Technology Policy's (OSTP) 2023 report specifies a series of recommendations and ongoing initiatives designed to revamp the bioeconomy infrastructure in the United States. These "bold goals" include launching a comprehensive data initiative specific to the bioeconomy; enhancing domestic biotechnological manufacturing capabilities; cultivating a skilled workforce tailored to these emerging fields; and engaging in international collaboration in research and development. Notably, the OSTP also emphasizes the importance of (1) refining the regulatory framework to streamline the safe and efficient market introduction of biotechnological products; and (2) establishing a Biosafety and Biosecurity Innovation Initiative to mitigate potential risks posed by novel biotechnological advancements.
The National Security Commission on Emerging Biotechnology (NSCEB) is another important actor, in charge of contemplating the future trajectory of biotechnology and the requisite legislative frameworks to ensure national security, particularly in areas critical to the provision of safe food supplies and medical products. Here, genetically modified micro-organisms (GMMs) hold transformative potential across diverse sectors including sustainability and climate change mitigation, manufacturing, healthcare, energy, and agriculture.
Acknowledging the renewed governmental interest and the profound potential benefits (and risks) associated with the deployment of GMMs into the environment, Caltech organized a workshop, Pathways Towards the Safe and Effective Deployment of Engineered Microbial Technologies, on February 27 & 28, 2024. The event brought together actors from various fields—including regulatory bodies, the biotech industry, and academia (spanning the natural sciences, social sciences, ethics, and history)—to explore the optimal frameworks for the safe utilization and regulatory oversight of engineered microbial technologies.
Christopher Voigt, Daniel I.C. Wang Professor of Bioengineering from MIT, highlighted a range of fascinating and innovative applications for GMMs, such as anti-corrosion mechanisms; self-repairing or protective materials for infrastructure like concrete and ship hulls; recovery of inaccessible oils; environmental pollution treatment; agricultural yield enhancement; and the decomposition of plastics. The list of potential applications, as detailed by McKinsey Global Institute, extends even further, reflecting the vast scope of GMMs' potential impact.
The presentations at the first day of the workshop raised important questions, marking the beginning of an in-depth dialogue into the challenges and opportunities presented by the potential release of GMMs into the environment. This multidisciplinary, collaborative approach aims to synthesize the complex ethical, environmental, and regulatory dimensions of these technologies, ensuring their responsible integration into society.
Risk areas
The workshop highlighted that the inherent property of micro-organisms as self-replicating, and their intricate interconnections within ecosystems calls for a careful assessment of potential cascading effects that may ensue. Unlike plants, the miniscule size of micro-organisms and their capacity for horizontal gene transfer introduce unique challenges. These include the persistence of GMMs in the environment, as well as their potential for energy-efficient dispersal across vast areas.
Amid these concerns, the regulatory system for engineered micro-organisms and their release into open environments remains underdeveloped, prompting eager anticipation for opportunities to shape recommendations. It is already well established that good policies require a solid factual foundation – and that this requires cooperation between the scientists and policymakers. Marc Saner, an expert at the science/policy interface, explains how this partnership is vital to bridging the gap between scientific insights and policy formulation, enabling scientists to contribute meaningfully to societal advancements and policymakers to draw upon reliable data and predictions that only scientific research can provide.
In the critical interface of science and policy regarding GMMs, the workshop's initial day asked the following question:
What scientific hurdles must be overcome to support policy discussions and inform existing decision-making frameworks?
Key among these scientific challenges were risk assessment considerations including dispersal, persistence, and the concept of equivalence, each necessitating rigorous scientific inquiry for the development of regulations.
Dispersal
Our understanding of the dispersal rates and persistence times of micro-organisms in various types of natural environments is still limited. To learn more about how GMMs may disperse in the environment, Jennifer Martiny, Professor of Ecology and Evolutionary Biology at UC Irvine, discussed her group's research. The mechanisms governing microbial dispersal—ranging from water percolation and plant root interactions to the presence of protozoa and airborne soil particles—present complex variables in assessing microbial behavior in natural settings. It was argued that advancements in quantifying dispersal rates, routes, and sources will be paramount for regulatory bodies to make informed decisions on GMM deployment. Collaborative efforts between researchers and regulatory agencies, through the establishment of testing facilities, may be critical for generating the necessary data under controlled conditions.
Persistence
The issue of persistence emphasizes the balance that must be struck between an engineered micro-organism's functional longevity and the potential risks associated with its prolonged environmental presence, which could lead to increased rates of horizontal gene transfer. Factors influencing persistence include soil type, native microbiota, root structures, plant types, and invertebrates. Regulatory perspectives on persistence are dated and varied, lacking standardized metrics for acceptable microbial population declines. This complicates the approval processes for engineered micro-organisms. Therefore, innovations in measuring and controlling persistence using modern high-throughput omics; machine learning/AI; ecological modeling; and cellular metabolic modeling will be key areas of research that can further inform and adapt regulatory frameworks.
Equivalence
The concept of equivalence presents another challenge in determining how GMMs compare to well-understood biological counterparts. Ways of establishing equivalence will be vital for regulatory frameworks, as they can streamline the approval process for GMMs deemed similar to existing entities, thereby saving time and resources. Conversely, significant differences may necessitate a re-evaluation of risks versus benefits. Consider the research that has gone into understanding the use of gene drives in mosquitos to accomplish a genetic change through a population. Would gene drives in micro-organisms work in a similar way? How will the genetic change spread through the environment?
Narrowing in on regulation
The workshop further delved into the key barriers highlighted by the President's Council of Advisors on Science and Technology (PCAST), pinpointing regulatory uncertainty and an outdated national bioeconomy strategy as significant impediments. PCAST emphasizes the need for an integrated bioeconomy strategy that is both cohesive and adaptable to the evolving nature of biotechnologies such as CRISPR-Cas9 gene editing (which can create modifications without introducing foreign DNA) and synthetic biology constructs.
Richard Murray, Thomas E. and Doris Everhart Professor of Control and Dynamical Systems and Bioengineering at Caltech; Steve Evans, Senior Technical Fellow at BioMADE, and Christopher Wozniak from Wozniak Biopesticide Consulting, LLC all further detailed the current regulatory landscape's positives as well as limitations for GMMs. A primary concern was the Coordinated Framework's fragmented approach, where multiple agencies and statutes could regulate microbial products based on various criteria such as origin, purpose, and distribution channels. The nature of the genetic modification—whether it involves addition, deletion, or editing of genetic material, or whether the product is transgenic or not—can influence the regulatory scrutiny applied. Moreover, it was noted that GMMs for open water, infrastructure, materials, and open air & extraterrestrial spaces do not even have a path to decision within the Coordinated Framework. These limitations create a complex environment for engineered micro-organisms for environmental release, leading to protracted product cycles that can inhibit innovation and commercial viability.
Pivot Bio was a case in point. Representatives Natalie Hubbard, Vice President of Regulatory and Government Affairs, and Kirsten Benjamin, Vice President of Product Innovation, discussed Pivot Bio's innovations within the context of current regulation. Pivot Bio builds micro-organisms that enable food crops to produce their own nitrogen, replacing nitrogen fertilizer. Their product ProveN, for example, is based on Klebsiella variicola whose genome is remodeled to knock out a repressor (nifL), replacing it with a promoter from elsewhere in the genome. Applied at planting, this engineered micro-organism naturally associates with the root, fixing atmospheric N₂ into NH₃ continuously and providing the plant with a precise, steady supply of NH₃ throughout the growing season, regardless of weather. Their manufacturing process requires dramatically less energy and produces almost no greenhouse gas emissions.
Pivot Bio's approach is described as leveraging naturally occurring micro-organisms rather than introducing foreign genes, making their technology cisgenic rather than transgenic. Cisgenic engineering involves modifying organisms using genes found within the same (or closely related) species, toward which regulation is more friendly. This contrasts with transgenic modifications, where genes from different species are inserted, raising several regulatory alarm bells.
Yet innovations like living infrastructure, beneficial as they can be, also introduce new dimensions to regulatory considerations, such as the potential environmental impact of shedding micro-organisms or the transmissibility of applied treatments across borders. Here, traditional ways of testing and regulation may not be applicable. On this point, the workshop emphasized the often-overlooked cost of "inaction." Prohibiting the regulatory approval of GMMs, for example, can lead to the continued use of harmful products that different parties have become reliant on. PCAST also highlights the broader implications of inaction, including missed opportunities for job creation, reduced carbon footprint, and the acceleration of the U.S. economy through innovative biotechnologies. Workshop participants discussed possible standards for evaluating the "no-action alternative" comprehensively.
Historical context
The workshop would not be complete without incorporating historical context into the discourse on scientific advancements. This is crucial as it illuminates the idea that science is an endeavor conducted by individuals within specific cultural and temporal frameworks. This perspective allows for a more profound integration of scientific inquiry within the fabric of society. Luis Campos, Associate Professor of History at Rice University, emphasized these remarks, illustrating how historical milestones can significantly influence the development and perception of science within cultural contexts.
A prime example of this is the Asilomar Conference of 1975, a pivotal event that established foundational principles for the conduct of biotechnological research and its regulatory oversight, emphasizing the need for containment measures and risk assessment to protect both the environment and public health. Many viewed it as a successful, proactive approach to addressing the safety and related ethical concerns associated with recombinant DNA technology, whereas others criticized the strength and legitimacy of self-regulation, without input from broader society.
Adopting a historical lens facilitates critical examination of key issues, such as the reasons certain research questions gain prominence at specific times (and how they are framed), the potential challenges that may arise in what seems like a straightforward scientific trajectory, the extent of human intervention in nature one deems as acceptable, and the responsibilities borne by creators of genetically modified organisms for any long-term effects. Such inquiry is instrumental in addressing possible controversies akin to those that arose with the development of "ice-minus" bacteria, genetically engineered to lack the gene that causes frost formation on plants, and Monsanto's genetically modified crops, which ignited extensive debates over environmental implications and intellectual property rights.
The workshop highlighted that the path to public acceptance and trust in biotechnological innovations is critically determined by early and ongoing engagement with consumers and society at large. Hence, the success of GMMs hinges not only on scientific and technical achievements but also on societal and regulatory endorsement. As McKinsey's analysis suggests, more than two-thirds of the total impact of emerging biotechnology, including GMMs, could hinge on consumer, societal, and regulatory acceptance of these applications. Therefore, an informed dialogue that integrates historical context, addresses ethical considerations, and transparently communicates the benefits and risks associated with biotechnological applications will be important for securing a favorable reception and ensuring the sustainable advancement of the bioeconomy.
Conclusion
The workshop, Pathways Towards the Safe and Effective Deployment of Engineered Microbial Technologies, convened a diverse group of stakeholders—each with unique perspectives—ranging from regulatory authorities and academic researchers to humanists and social scientists, to deliberate on the contingencies of introducing engineered micro-organisms into the environment. The discourse centered on two pivotal inquiries: first, identifying the scientific hurdles that must be surmounted to inform and refine existing policy frameworks and decision-making processes; and second, addressing the current regulatory constraints that impede the efficient market entry of products, alongside exploring how scientific advancements can facilitate regulatory processes through the development of novel tools and methodologies.
The overarching objective of this collaborative effort is to harness our innovative capacities to address global challenges—ranging from climate change mitigation and food security to healthcare—while simultaneously safeguarding human health and environmental integrity. The discussions underscored the vital role of clear and flexible regulation in enabling innovation and increasing public confidence in regulatory policies.
The identification of research gaps was a focal point of the workshop, highlighting areas such as the need for comprehensive metagenomic data on microbial persistence, dispersal, and ecological impacts; the application of pandemic sequencing infrastructure for tracking global microbial flow; and the utilization of artificial intelligence and machine learning for analyzing microbial sequences for potential toxicity, pathogenicity, and environmental impact. Furthermore, the development of pre-approved "safe chassis" micro-organisms, incorporating recoding, genetic stability, and safety switches, along with cost-effective data collection methods and environmental modeling, were introduced as critical needs for advancing GMM deployment.
The Center for Science, Society, and Public Policy is a platform for engaging scientists in policy discussions, offering Caltech researchers—from graduate students to faculty—an invaluable perspective on the policy implications of scientific research. It emphasizes the importance of considering regulatory frameworks from the outset of research projects, leading to a more integrative approach to scientific and regulatory science. The anticipated final workshop report will offer a comprehensive set of policy and scientific recommendations, addressing the essential questions that will necessitate new scientific endeavors. This document will guide researchers—at Caltech and beyond—in aligning their work with the needs of regulatory science, thus contributing to the broader scientific discourse and policy development.
ChatGPT was used in editing the text and generating pictures for this post.
AI & the 2024 Presidential Election (Feb 7, 2024)
by Mike Alvarez, LCSSP Co-Director + Flintridge Foundation Professor of Political and Computational Social Science
Just a few years ago, to spot problematic content online one could use a few simple tests: Do the images or video look fake? Is the audio mechanical or robotic? Is the text full of typos or misspelled words? Usually content that failed those simple tests would be considered suspect and a viewer might discount the validity of such content.
But generative AI and Large Language Models (LLMs) have leveled the content-generation playing field. When it comes to producing very realistic content to post online, it can often be a matter of using some simple off the shelf applications and thirty minutes of time, as we demonstrated in a recent white paper. This is creating a lot of concern about highly-realistic misinformation as we head into the 2024 US Presidential election cycle, as I discussed in the recent NBC4 interview shown above.
As I discussed briefly in the interview, social media and other online platforms should do more to help information consumers recognize problematic online content, in particular content that has been produced by generative AI and LLMs. It was good news to hear that Meta is now asking the industry to label content generated by AI. But how quickly the industry might take on that task, how user-friendly the labeling will be, and what exactly will be labeled remain unclear. Let's hope that social media and other online platforms act quickly to label AI-made material, before the production and distribution of problematic and misinforming materials about the 2024 election proliferate.
Teaching Ethics & AI in the Wake of ChatGPT (Jan 24, 2024)
by Frederick Eberhardt, LCSSP Co-Director + Professor of Philosophy
The advent of Large Language Models (LLMs) has raised profound questions about why and how we teach writing at university, and the extent to which instructors can still assess the level of student understanding on the basis of written work. In the most recent iteration of Caltech's first year class on Ethics & AI, Professor of Philosophy Frederick Eberhardt tested a very permissive LLM policy. This report details his experience, highlighting the need to find a delicate balance between training students to productively integrate LLMs into their workflow, while ensuring that the development of their critical thinking skills remains center stage.
For better or worse, this past fall quarter I decided to have an extremely permissive policy on the use of Large Language Models (LLMs) for my first year undergraduate humanities course on "Ethics & AI" (see policy at the end of this article). The arrival of ChatGPT and its powerful siblings had led to much soul searching amongst the humanities faculty and there was a scramble to determine course policies on LLMs last summer. Caltech's own overarching policy on the use of generative AI only came online midway through the fall term.
The teaching of writing forms part of the mandate of Caltech humanities courses, and written assignments, often with required revisions, have been a standard tool of assessment in these courses. The arrival of LLMs has challenged not only our standards for authorship, but also made us revisit our aims of writing instruction: Is learning to write, learning to communicate? Is learning to write, learning to think? Does writing well help students read carefully? Or, is writing instruction about finding and developing one's own voice?
The goals are all worthy, but for which is writing instruction the key tool? And where might LLMs help achieve these goals rather than undermine them? Unsurprisingly, and I think quite healthily, opinions on this matter vary widely among the humanities faculty and, thankfully, for now the generic division-wide policy has allowed us to try different approaches: each instructor can set their own class policy on the use of LLMs.
In a course on "Ethics & AI," I felt that we had to give the new unknown in the field a try. We had to learn how to engage intelligently with the tools, how to integrate LLMs into our workflow, figure out what works, and explore the limitations. This applied as much to the students as it did to me. I did not really know how essays written using LLMs would look, how reliably I would be able to detect (or even just suspect) LLM-usage, nor did I have a clear idea of how students would use the tools. I pitched my permissive policy to the students as an exploration, emphasizing that we had to pay attention to ensure that we don't lose our power as writers and thinkers, that we don't want our voices to be tempered into lukewarm corporate softspeak, and most importantly, that we must still own and stand behind what we write, whether machine-boosted or not.
In my course then, LLM usage was permitted in all forms for all aspects of writing, but it was not required. For each assignment, students were asked to report on their LLM usage, which LLM they used, at what stage of the writing process they had used it, whether they had found particular prompting techniques that worked well or whether they encountered problems – and I asked that their reports on this should not use an LLM (please!). We had several class discussions at different points during the term where students shared advice, recommended different LLMs or acknowledged that for certain tasks the machine was useless. Assignments consisted of several 500-1,000-word subsections of a term-length project that culminated in a policy proposal for the regulation of generative AI. Overall grades for Ethics & AI were pass/fail, as is typical for a first-term course at Caltech. This alleviated the pressure of fine-grained grading and limited the consequences that a failure of this policy might imply. Of course, it also affected the effort that students put into their work.
So, what did my students do? – Few first-year students had used LLMs for substantive writing tasks prior to my course. Several chose not to use them initially, but I believe everyone did use LLMs in some form or other eventually. The first assignment resulted in the expected disaster, where for the most part I had to read generic ChatGPT-speak consisting of perfectly formed sentences providing very high-level points with very little substance. Detecting auto-generated text seemed trivial – I trust it is not the students themselves who now write like a marketing department that makes sure everything is shiny while no commitments are made. However, students quickly adapted: ChatGPT's failure (at the time) to provide proper references quickly resulted in it being replaced by its (then) more powerful siblings. And students realized that the task of writing was not going to be just a matter of copy-pasting my prompt into the machine, but would require multiple iterations and revisions. I started to see excellently researched case studies where I could no longer tell which parts were student-written and which were machine-generated. Most likely, that separation was no longer even well-defined, as students had gone back and forth many times revising and improving their text, prompting the LLM with new strategies and trying again with different ideas that were now a mix of their previous input and the LLM's outputs. Students also shifted the stage at which they used LLMs in the writing process: many reported that the actual text generation was far less efficient than writing themselves, but that the LLMs were enormously helpful for brainstorming ideas and getting started on their assignments. Shortly after midterm, we hit a plateau, with several students returning to doing more of their writing on their own (or at least, so they reported), while LLMs were now used principally as sophisticated search engines that could summarize large bodies of text (the EU regulations on AI are a few hundred pages, most of them deadly boring, so one can hardly blame them).
At the end of the term, I received several outstanding term projects that, to the best of my assessment, were better than either the student or the machine could have done on their own. These students had managed to integrate the LLM into their workflow in such a way that they were able to make use of the vast amount of knowledge the LLMs have processed, while not losing the particular angle they wanted to argue for. Their write-ups would include a wealth of material that we had not covered in class, their points were insightful and well-developed and supported. Several students reported that ChatGPT's failure (at the time) to provide proper sources had made them actually track down more carefully the sources of the claims they got from later LLM interactions. The LLM told them what to look for and they could now find the place in the original source.
Of course, the outcomes were not all rosy. Quite a few final submissions felt like the machine had taken over: the prose was flawless, but the content did not merit the 15 pages I had to read. Arguments were not developed in detail, and there was little that did not feel generic or uncommitted.
In our last class, we had an open discussion about what role the students thought LLMs should play in future writing-intensive courses. Across the board, students appeared to appreciate the permission to use LLMs. The most common argument was that LLMs helped overcome what several students referred to as "writer's block": they helped them get started, generating ideas and moving the assignment along when they were stuck. A second consideration was that many felt this was a tool they had to learn how to use – it would inevitably become a staple of most future writing tasks. This point had been one of my main motivations for allowing LLMs in the first place: too many of my friends and acquaintances in "real jobs" had reported that effective LLM usage was becoming a basic job requirement. Caltech undergrads feel the same.
Encouragingly (or perhaps depressingly?), one student noted how hard it is to detect the weaknesses and flaws in one's own writing, but that the LLM's edits turned one into something of a third-person reader of one's own text – the places where improvement was needed became much more apparent. (I know I am not the only humanities instructor who has always encouraged peer review in class. Students generally catch at least 90% of the places where a peer's writing could be improved, but unless I enforced it, few students ever seemed to do it. I am a little dispirited that the machine may be the peer we are more willing to tolerate.)
Obviously, I also tried grading assignments with an LLM. As anyone who has graded a large number of papers knows, the same problems occur again and again, and it would be a huge relief to have the machine deal with those so that one can focus on more specific feedback. It is fair to say that, at least for now, this attempt was a massive failure. It was extremely difficult, even with quite detailed prompts, to get the LLM to systematically review the submissions, take a stance, and properly evaluate the work. The LLM was very good at saying something positive and something negative, but did not seem able to prioritize. I welcome ideas and advice on this front, because I expect I am not the only one who would be delighted to have a high-quality grading sidekick (we don't have TAs in the humanities at Caltech, unfortunately).
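For concreteness, an attempt along these lines might look something like the sketch below: a rubric in the prompt, a request to quote the weakest passages, and a forced verdict at the end. The rubric, the model name, and the output format are illustrative assumptions on my part rather than the exact prompts I used, and, as I said, my own experiments of this kind did not produce reliable evaluations.

```python
# An illustrative sketch of a rubric-based "grading sidekick" prompt.
# The rubric, model name, and verdict format are assumptions for this example,
# not the prompts I actually used; the instructor still makes the final call.
from openai import OpenAI

client = OpenAI()

RUBRIC = """Grade the essay pass/fail. For each criterion, quote the weakest passage
and explain the problem in one sentence:
1. Is there a clearly stated thesis?
2. Are the arguments developed beyond generic claims?
3. Are sources cited, and do they support the claims?
End with a single line: VERDICT: PASS or VERDICT: FAIL."""


def draft_feedback(essay_text: str) -> str:
    """Ask the model for rubric-based draft feedback on a single essay."""
    response = client.chat.completions.create(
        model="gpt-4o",   # illustrative model name
        temperature=0,    # keep the feedback as reproducible as possible
        messages=[
            {"role": "system", "content": "You are a strict but fair writing instructor."},
            {"role": "user", "content": f"{RUBRIC}\n\nESSAY:\n{essay_text}"},
        ],
    )
    return response.choices[0].message.content
```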
Over the winter break, as I was preparing course policies for my letter-graded courses the following term, I was left with many lessons but few solutions. It is clear that students' writing can benefit enormously from the use of LLMs. Students are keen to have the option, it helps them get off the ground, and when done well, the result is more than the sum of its parts. But it is a serious challenge to get all students onto this track of working with the LLM rather than letting the LLM do the work. While purely LLM-generated essays may still be easily detectable for now, I doubt this will last long, and I do worry that in the near future essays will be bunched up near the grade ceiling, so that I will not be able to distinguish work that resulted from a constructive interaction with an LLM from work resulting from a 4am request to the LLM for an essay in response to my prompt. It may be tempting to take the output of a (sophisticated) LLM as a baseline and expect student submissions to be original and innovative beyond that. But I am doubtful this will work: it takes time and effort to learn and understand the material that the LLM has ingested, and we can't expect a first-year student to simply start from that level.
I do not expect to ban LLMs from my courses. Quite apart from not being able to detect violations, it will be extraordinarily hard for a student to avoid using them entirely: just about every search engine already has, or soon will have, an LLM behind it, and grammar-improvement software, which is highly useful, is also based on a form of LLM. But perhaps most importantly, like sex ed, I think LLM usage should not be learned on the street. Obviously, for small seminar-style courses, in-class participation — an important skill to develop in its own right — provides a useful tool to gauge student understanding of the material. But if it is to play an important role in student assessment, it now has to take on a much more systematic form. For large courses, significant in-class participation is a non-starter, and so I do think there is a place again for proctored exams. In combination with low-stakes homework (where the practice value for the exam outweighs the temptation to outsource to an LLM), proctored exams provide a useful assessment of competency, albeit under time pressure.
But exams come with their own well-known problems. Quite apart from incentivizing cramming, exams do not provide a useful basis for assessing careful reasoning. The development of a clear presentation of an argument takes time and often many revisions. Yet if clear presentation of a well-developed argument is the goal, then LLM use is perhaps permissible again — and the process of successfully integrating an LLM into this revision process is one we ought to teach, and assess. The challenge here is that for many teaching topics, a completely crisp account has already been ingested into the LLMs, ready to be regurgitated with minor variations for the class assignment. It is a losing and senseless battle to invent ever more obscure topics to outmaneuver the LLMs. So, at this point I am inclined to bite the bullet: if (big if, we're not there yet!) the LLMs are really so good at presenting the topics, then we should use them for that purpose. It will no longer be our students' job to develop the clear presentation themselves. And to ensure that our students can develop well-justified arguments and engage in sound reasoning, I think it will be a matter of teaching the abstract skills of logic, probabilistic reasoning, and evidence collection, with a sensitivity to all their limitations.
Going forward, I will reduce the weight of homework essays and focus those grades on the clear presentation of arguments, while I am likely to shift my assessment of content understanding from homework essays to exams and in-class participation (where possible). I am not optimistic that the humanities model of 2-3 essays (even with feedback-based revisions) will continue to serve either as a basis for learning to write well or as a way to assess students' understanding of the content.
As an institution, we will need to provide subscriptions to high-quality LLMs to all students and employees, and while my attitude towards privacy and data ownership often marks me as distinctly European, or at least as older, I was glad to see that these concerns were part of Caltech's overarching LLM policy. Navigating this space, which spans everything from sensitive breakthrough scientific discoveries to first-year student papers, will require a very delicate balance.
Finally, I want to leave you with a student recommendation for prompt engineering that gave me pause. The student reported that in the last weeks of the course they finally figured out how to generate good text with an LLM. I paraphrase: "Professor, I uploaded all your published papers to the LLM, as well as a few others from the Humanities faculty. I then uploaded your prompt and asked the LLM to generate an answer in the style of those papers. Then it finally produced useful text." Needless to say, other students kicked themselves for not having had that idea. But the student in question left it ambiguous whether they thought the generated text itself was any good or whether it was just my feedback on their submission that was more positive. Well, what was I assessing? Whether the student could write like me? Whether the student was following the writing standards of Caltech philosophy professors? Whether the quality of the argument they provided was any good? I'd like to think it was the last, and if (another big if) that was the case, then maybe something was gained by this approach. But the challenges are obvious.
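To make the trick concrete, it amounts to something like the sketch below: feed the model excerpts from the instructor's papers as a style reference, then ask it to answer the assignment prompt in that style. The file names, excerpt lengths, and prompt wording are my reconstruction, not the student's actual prompt (they presumably worked through a chat interface's file uploads rather than code).

```python
# A rough, illustrative reconstruction of the student's style-conditioning trick.
# File names, excerpt lengths, model name, and prompt wording are my own guesses.
from pathlib import Path
from openai import OpenAI

client = OpenAI()

# Hypothetical plain-text excerpts standing in for the uploaded papers.
style_samples = "\n\n---\n\n".join(
    Path(p).read_text(encoding="utf-8")[:4000]
    for p in ["instructor_paper_1.txt", "instructor_paper_2.txt", "hss_faculty_paper.txt"]
)

assignment_prompt = "Draft a policy proposal for the regulation of generative AI."

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model name
    messages=[
        {"role": "system",
         "content": "Imitate the argumentative style and register of the writing samples provided."},
        {"role": "user",
         "content": f"WRITING SAMPLES:\n{style_samples}\n\nTASK:\n{assignment_prompt}"},
    ],
)
print(response.choices[0].message.content)
```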
(In line with Caltech policy, I would like to note that LLMs may have been used in the generation and revision of this text. Nevertheless, the text is entirely my responsibility.)
Frederick Eberhardt is Professor of Philosophy at Caltech and Co-director of Caltech's Center for Science, Society, and Public Policy. He was the instructor for the "Ethics & AI" course for first-year students in the fall quarter of 2023, which tested the very permissive LLM policy below.
Course Policy on LLMs for Ethics & AI course, Fall 2023 (this policy was adapted from a sample policy provided by Caltech's Hixon Writing Center)
Classes in the Humanities and Social Sciences abide by this policy, which states that "students submitting work for HSS courses may use generative AI tools only in ways that are explicitly allowed by the course instructor in the course materials."
In a course on "Ethics & AI" it is essential that we engage with these tools and explore their potential and shortcomings. This is inevitably an exploration, testing how we can integrate generative AI into our workflow while recognizing the risk to our individual development as thinkers and writers. In this spirit of experimentation, I will allow you to use generative AI in any way you wish in this course. If you find a tool that you think will help you accomplish an assigned task in a more effective, efficient, or compelling manner, you can use it. You may incorporate generative AI outputs into your work without documenting their use within the text itself. You are fully responsible for the correctness and appropriateness of any language or images created by a generative AI tool that you choose to use. However, I require that you complete a "Generative AI Memo" for all graded assignments. This memo requires you to tell me which tools you used, how, and why. If you decided not to use these tools, it asks you to briefly explain why you did not. Your exploration in this class will help me decide if and how I allow students to use these tools in future versions of this course.
A further note: As we are testing how we might integrate generative AI tools into our workflow, we need to be keenly aware of the potential risks. It is by no means obvious that it makes sense to use these tools in an introductory class, perhaps especially not in one that is supposed to teach writing. So here is my thinking: We teach writing to develop our reasoning skills and to learn how to communicate effectively. There is also the aspect of developing our own voice through writing (though that has not been a focus in my courses). In that sense, learning to write is developing a means to achieve broader goals; writing (at least in my teaching and research) has never been an end in itself. So, I consider it an open question whether these new tools can still help us achieve those broader goals, possibly much more easily. By learning how to use these tools well, it may be possible to develop excellent reasoning and communication skills without becoming a good writer oneself. (I may also be wrong about that.) A second, more practical reason for using these tools in class is that I expect "generative AI literacy" will become a basic skill requirement. That is, similar to typing, I expect that knowing how to engineer prompts for a generative AI tool will become a default expectation in many jobs. Therefore we, and that includes me, need to get up to speed.