Rodrigo Valadao

Great IDeaS for Networking and Fun

By Tanja Ohlson (August 12, 2020)


What’s missed most at conferences in this Covid summer is the opportunity to meet new people and network. The lack of personal interaction makes it even harder for new and upcoming communities like the IDeaS group to grow and flourish – but thanks to Maggie Cascadden and Rodrigo Valadao, both doctoral students at the University of Alberta, young scholars interested in interpretive data science still managed to network and make new contacts.

Following up on a PDW at the (virtual) Academy of Management Annual Meeting (AoM), Rodrigo and Maggie organized an IDeaS group hour of networking and fun outside of the official AoM program and invited young scholars from both the PDW and the IDeaS conference held in Edmonton in October 2019. The hour-long session resembled a cocktail hour at a conference, with an ice-breaker game and breakout groups for getting to know others interested in the field. The ice-breaker, “two truths and a lie”, which every participant had submitted about themselves before the session, helped the conversation flow more openly right away and created a level of trust that is usually absent in academic discussions at virtual conferences.

 

The most valuable part of the session might have been the “homework” we were sent away with: Rodrigo and Maggie strongly encouraged everyone to contact at least one other participant with similar interests or needs and to set up a call in the days after the networking session. AoM might be over, and the next IDeaS conference is still some time away, but I am now looking forward to several calls with young scholars from all over the world during the next week!


IDeaS 2019 Conference: Perspectives (II)

By Bandita Deka Kalita


Day 1

Kyle Murray and Dev Jennings flagged off Day 1 of the IDeaS conference by talking about how big data can transform organizations and how it can be interpreted. On the topic of interpreting big data, Dev emphasized a theme that resonated throughout the two days of the conference: that rendering corpora is inherently interpretive in nature. He then invited Laura Nelson and Tim Hannigan to share insights from their respective projects.

Laura Nelson presented her ongoing project on rendering social movement strategy. She explained that empirical studies of social movements generally use count data on events. However, she reminded the audience that care must be taken when interpreting various forms of a certain action, because they may not mean the same thing. In fact, in emphasizing that meanings lie in discourses, Laura laid the groundwork for the core tenet undergirding much of the discussion in the sessions that followed. The CHC’ (Computation – Human interpretation – Computation) technique she discussed during her talk also prompted discussion of the implications of the sequencing of computation and human interpretation (“CHC’ or HCH’?”) for particular research contexts. Laura and several other researchers in the room stressed the importance of context in meaning-making.

Tim Hannigan’s work on rendering the kernel of the British MP scandal provided a nice segue into deeper discussions about meanings and the tools available to social scientists for uncovering meanings from textual data. Tim gave a candid account of the various challenges and opportunities that arose in the course of the project, and of how the team eventually uncovered clusters of meaning by visualizing the “kernel” of the scandal: the first seven days of textual data from newspapers. Overall, the work of Tim and his colleagues highlighted the utility of topic modeling as an emerging approach that equips qualitative and quantitative researchers with a powerful technology for exploring data and theorizing in management.

Mark Kennedy’s presentation was themed around hopes for a growing community spanning qualitative and quantitative research methodologies. In emphasizing the “back and forth between counts and stories”, Mark reminded the audience that it is the stories we tell with numbers that really matter. His spirited presentation was aptly titled “We could be Friends”.

Throughout the two days of the conference, participants enjoyed meeting and discussing projects with colleagues from various places over the coffee and networking sessions. In the late morning of the first day, six simultaneous round-table discussions brought together diverse scholars working on their respective projects with similar methods. The discussions were themed around qualitative studies of deviance & crowds, qualitative studies of open strategy, blockchain and distributed trust, topic modeling and field emergence, applied topic modeling, and quantitative studies of entrepreneurship and technology battles.

After lunch, Dev Jennings moderated a session for sharing the solutions that emerged from the round-table discussions described above. Discussion in the room centered on the question of qualitative data archives and the sharing of such data. Other critical issues raised concerned copyright in big data, citation practices for code, and the utility of topic modeling and semantic analysis in coding qualitative data. Marc-David Seidel moderated the next session, on publishing papers with computational methods. He sparked interesting discussions around topics such as human limitations in interpreting data and how computational methods can bridge those limitations.

Joe Porac’s presentation on meaning and interpretation was themed around the difference between big data and small data. His fascinating talk on the embeddedness of meaning in the social world took the audience to several places – from Peter Winch’s theses of rules, practicality, and participation to Jacques Derrida’s “aporias” of undecidability, the point at which a text has been deconstructed to such an extent that its meaning is undecipherable.

Day 2

The second day of the conference started with an address by Trish Reay, who made a poignant reference to the distinct history of big data usage concerning the First Nations of Canada, continuing the first day’s thought strands about marginalized contexts. She introduced keynote speaker Wendy Espeland, who delivered a fascinating talk on “Governance by Numbers”. She spoke about the increasing pressure to code meanings and value in terms of numbers, in the context of university ranking systems. The association of desired traits with measurement and quantification has implications for meaning, and hence for governance systems, and Wendy’s work investigates the matters of rationality and inclusion arising from these implications. She described some of her work on governance issues in the Yavapai community and the associated matters of power and accountability.

Chris Steele moderated a panel on “the politics of data”, consisting of Wendy Espeland, David Kirsch, Dev Jennings, and Joel Gehman. As a precursor to the panel discussion, Chris also presented some of his work on the “ecology of facticity”: how is it that certain things come to be accepted as fact? The panel discussion that ensued picked up on some of the questions inspired by Chris’s and Wendy’s presentations. Many of the interesting points of conversation concerned the “what” of the data to be interpreted – how institutionally or bureaucratically freighted the data may be, and what this implies for stories of power and marginalization.

Marc-David Seidel presented a session on HIBAR research and described its suitability for making a difference by solving socio-technical problems. He urged the participants to engage with the following question: if you had access to all of Google Scholar’s data, what data would you like to see included in the evaluation of “metrics” for individual academics?

The two optional sessions in the afternoon of the second day drew an impressive turnout. These sessions were on “Deep Learning”, led by Muhammad Abdul-Mageed, and a “practicum on creating corpora, topics, and artifacts”, led by Tim Hannigan and Rodrigo Valadao.


IDeaS 2019 Conference: Perspectives (I)

By Stephanie Habersang


Session 1: Interpretive data science: Rendering meaning ‘in the wild’

Vern Glaser kicked off the first session of this fantastic workshop. The aim of the session was to give the audience an idea of what type of research we can do with topic modeling. Laura Nelson, Tim Hannigan, and Mark Kennedy reflected on different empirical examples in which they used topic modeling to build theory in social science research. The first empirical example was provided by Laura Nelson. She presented a compelling example of how topic modeling helped her identify new and overlooked tactics in environmental social movements, and how these tactics were used by the movements to achieve change at different levels of society. Her research challenged the current assumption that political ideology is the key dimension distinguishing different environmental social movements. Rather, she postulates that it is a movement’s goal orientation, that is, the level of society at which the movement claims responsibility for initiating environmental change (e.g. individual, collective, or institutional). This example was extremely interesting, as it showed the potential of topic modeling to challenge current assumptions and build new, robust theory.

Another interesting application was presented by Tim Hannigan. In his research he rendered the kernel of a scandal in the British parliament by studying the micro-processes of stigma. He showed that the extent of the scandal and MPs’ resignations did not depend on the degree of the scandal itself but rather on the laughability of the scandal in the first seven days after disclosure. What I found particularly interesting is that this research was not initially conducted as a project to understand the micro-processes of stigma. Instead, it was the iterative back and forth between research question, data analysis, and theory building that finally led to the framing of the paper. Hence, this example nicely illustrated how topic modeling enables abductive reasoning and resonates with a qualitative, interpretive approach to theory building. Last but not least, Mark Kennedy provided an insightful reflection on both presentations. He advocated “teaming up” and building stronger bridges between qualitative and quantitative research. Teaming up and being a community means that we can manage both the risks and the opportunities of big data in a more reflective and fruitful way.

From my perspective, the three panelists not only provided fascinating insights into the types of issues interpretive data science can address, but also discussed some fundamental implications for how to use topic modeling. First, Laura Nelson emphasized the importance of context. Understanding the context in which the data is embedded is essential for interpreting the results. She made it very clear that interpretive data science does not seek to identify universal patterns or physical laws that can be applied to all contexts. Rather, interpretation is rooted in a qualitative understanding of the data. This understanding must be used to give voice to marginalized topics and issues in the data, and to show diversity rather than uniformity. In line with this argument, Laura Nelson skillfully concluded what interpretive data science should be all about: (1) meaning-making, not universal patterns; (2) understanding, not social laws; and (3) contextual, not universal, understanding. Second, another very important implication came from Tim Hannigan’s presentation: often we can best grasp the meaning of a large data set through one powerful exemplary story or case. In his example it was the illustrative case of one MP. However, a compelling single story must be supported by strong visualizations and representations. This involves not only creativity but also exploration and computational skills. Finally, Mark Kennedy emphasized that the differences between qualitative and quantitative researchers are less profound than we often think. By combining our complementary skills and using the community to enhance our toolkit, we might be able to better explain, understand, and predict social phenomena. Overall, this introductory panel was the perfect kick-off for an inspiring workshop that fostered an inclusive climate for developing a multi-disciplinary community.

Session 2: Publishing papers with computational methods

Publishing with computational methods was the subject of the second session. The panel speakers Mark Kennedy, Richard Haans, Hovig Tchalian, and Muhammad Abdul-Mageed had a vibrant discussion about the possibilities and pitfalls of publishing interpretive data science. To begin, the panel discussed the “coolest thing” each had recently seen on topic modeling, sharing examples from the material sciences, from discourse studies on Brexit, from the field of deep learning, and from management research, where topic modeling is increasingly used as a first step toward an abductive leap in grounded theory methodology. The panel then discussed how topic modeling may help us in doing research. Computational methods are definitely helpful in enhancing human coding procedures, identifying general patterns (that we might not see in smaller datasets), and challenging existing frames. Similarly, computational methods can help us reduce type II errors and decrease the likelihood that we miss interesting findings.

However, the panelists also acknowledged the challenges that come with using a new method and communicating it to the general reader. The panel highlighted a couple of strategies for convincing editors and reviewers to publish a paper that builds upon a new method: (1) using computational methods to validate previous findings (theory testing and validation); (2) showing that results persist even when models change (e.g. as additional robustness checks for new theory building); (3) using online appendices to explain complex methodological issues and keeping the actual method section simple; (4) publishing a methodological paper beforehand; (5) optimizing and actively managing the reviewer pool to get fair and proficient feedback; and finally (6) presenting the paper draft as often as possible to many different people before submitting (getting ideas out to potential editors and reviewers early on). An important learning from this session was that we should not take institutions (e.g. journal standards) for granted. Although most journals change very slowly and stick to tried-and-true methods, many editors are becoming increasingly open to new methodological ideas and representations. As such, the overall recommendation of the panel was to build a community and to dare to publish interpretive data science in general management journals as well.

Session 3: Meaning and interpretation: Is big data any different than small data?

In the afternoon, Joseph Porac drew on his deep expertise in socio-cognitive dynamics as well as computerized text analysis and gave a highly interesting talk titled “Meaning and Interpretation: Is Big (Text) Data any Different than Small Data?” In his talk he presented some very convincing examples of what we can learn from the interpretation of small data in order to better interpret and translate big data. Drawing on two extreme concepts of translation – Derrida’s non-referential deconstruction, in which translating any text into stable meaning is almost impossible, versus Google Translate, in which universal translations are easily generated in a “quick-and-dirty” or “good-enough” fashion – Joseph Porac exemplified the difficulties we face when we talk about meaning-making and translation. He pointed out that basically everything is always about meaning-making: how we interpret the results of an experiment, how we interpret a stylized questionnaire to develop meaningful questions for a certain research setting, and how we attach (or cannot attach) meaning to things that we have not experienced ourselves.

He then drew on Peter Winch’s 1958 book “The Idea of a Social Science and its Relation to Philosophy” and introduced three theses to illustrate that the issues of meaning and interpretation are deeply embedded in the idea of social science – independent of big or small data. The first thesis he discussed was the “rule thesis”. This thesis postulates that understanding human language use involves seeing the rules or properties in accordance with which it is produced, not just regularities in its production. If we wish to understand how a person represents things, in particular what they say about things, then we need to know the rules that govern their thoughts and words: what would make it right for them to say what they say, and what would make it wrong. In this sense, the context from which these rules emerge is fundamental for understanding – and this line of thinking is equally applicable when we want to interpret big data. Second, the “practicality thesis” states that understanding human language use does not mean just grasping the intellectual ideas that permeate it but, more deeply, cottoning on to the practical orientations of the actors. While human language and action essentially involve rule-following, the rules in question cannot all be grasped in an intellectual manner; the main take-away here is that rule-following ultimately rests on a foundation of practice. Finally, the third thesis is the “participation thesis”. It states that understanding human language use involves participating in the society of the agents, at least in imagination, not just standing back and surveying what they are doing. This is closely related to the point that interpretation involves empathy for those we research. As such, whether we interpret small or big data, we as researchers cannot and should not act as detached observers.

The important take-away from this talk was that these issues of meaning and interpretation will not “go away” with the rise of big data. On the contrary, with big data they may be exacerbated. And although big data is becoming increasingly important in social science, small data and the insights we can draw from it will not be going away anytime soon. Hence, Joseph Porac closed his talk by emphasizing the necessity of an interpretive data science.

 

Session 5: What happens when we govern with numbers?

While the previous day was all about interpretive data science, the second day of the workshop focused more broadly on the politics of big data. The morning session started with the amazing Wendy Espeland and her talk “What happens when we govern with numbers?” – how do people do things with numbers, and what are the sociological consequences? While Prof. Espeland acknowledged the positive side of quantification, for example that it can make previously invisible groups visible (e.g. large-scale studies of LGBT movements), she also emphasized, more critically, the performative aspect of numbers. Instead of attributing essential value to governing with numbers, we must understand its implications for power. Wendy Espeland used three powerful examples to reflect critically on governance with numbers: sentencing guidelines, university rankings, and cost-benefit analysis. University rankings, for example, were initially developed to provide information and transparency, helping people make better decisions about which school to attend. Over time, however, university rankings have evolved into something new: a regime of surveillance. This regime locks universities into a competitive system that forces deans to care more about a “winning season” than about working toward long-term impact. While this was an unintentional shift in governance with numbers, the consequences are very real and irreversible. Hence, quantification can reorganize power structures, just as the emergence of college rankings chipped away at the power of university deans.

The main take-away from this speech was that numbers do things: they organize social life. As such, we should be alert and sensitive to the unintended consequences of governance by numbers. Once we quantify something and it gets out, we cannot control what other people do with it. Thus, Wendy Espeland challenges us to consider five rules when we study governance by numbers: (1) follow the number over time; (2) ask what happens to power and accountability; (3) ask what happens to status; (4) ask what happens to visibility; and finally (5) ask how the number can be challenged. The powerful message that remains from this talk is that ethics, morality, and politics are fundamentally intertwined with numbers. All of us who work with and interpret numbers have a responsibility to constantly ask: “What is there that we don’t see? How can we make the invisible visible? Who benefits and why?”

 

Session 6: The politics of big data

In the panel session “The politics of big data”, Chris Steele discussed truth(s) and fact(s) in the context of big data with Wendy Espeland, David Kirsch, Dev Jennings, and Joel Gehman. Chris Steele started with an interesting introduction to the question “How are facts made?”, using practice-driven institutionalism to introduce an ecology of fact-making. For the study of data per se, he concluded that revealing the political ecology of facticity within which data is made, and within which its consequences arise, becomes fundamentally important to our field.

The main take-away from this panel discussion was that the sheer amount of available big data can easily lead us to overestimate what we can know. All data (big and small) reveal some things and exclude others. Therefore, the panel called for a strong emphasis on reflexivity in collecting, analyzing, and interpreting big data. It is essential that we critically examine what we see and what we do not see or exclude in the data. It is our responsibility as scholars to understand the taken-for-granted assumptions that underpin our data (collection) and shape our interpretation. One way to reflect upon our own taken-for-granted assumptions is to constantly ask: what must be true for the data that we collect? It is important to think about the potential biases that we build into the data, not only during data collection but also by using the wrong methods.

Session 7: HIBAR Research – Exploring how research can make a difference in the real world

Marc-David Seidel introduced the HIBAR approach in the very last session. HIBAR stands for highly integrative basic and responsive research. This approach seeks to combine a desire for discovery with a desire to solve major problems (often related to grand challenges) through collaboration between academic and non-academic experts. The HIBAR approach highlights collaboration that is interdisciplinary and transdisciplinary, and it supports diverse expert teams in making a real difference in the world. This keynote provided an inspiring close, as Marc-David Seidel reminded us to think about solving important problems and daring to cross disciplinary boundaries. In this regard, interpretive data science might offer a promising opportunity to bring together scholars from various fields (e.g. management, information systems, sociology, sustainability) with different abilities (e.g. qualitative or quantitative methods) to tackle important real-world problems.
