Gender bias and statistical fallacies, disinformation and mutual intelligibility

Recent research

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

New study claims to have found quantitative proof of gender bias in Wikipedia's deletion processes – but has it?

Almost half a century ago, officials at the University of California, Berkeley became concerned about apparent gender bias against women at their institution's graduate division: 44% of male applicants had been admitted for the fall 1973 term, but only 35% of female applicants – a large and statistically significant difference in success rates. The university asked statisticians to look into the matter. Their findings,^[supp 1] published with the memorable subtitle

"Measuring bias is harder than is usually assumed, and the evidence is sometimes contrary to expectation."

became famous for showing that not only did such a disparity fail to provide evidence for the suspected gender bias, but that on closer examination the data in that case even showed "small but statistically significant bias in favor of women" (to quote from the Wikipedia article about the underlying paradox). The Berkeley admissions case has since been taught to generations of students of statistics, to caution against the fallacy that it illustrates.
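The mechanism behind that paradox can be shown with a few lines of arithmetic. The sketch below uses invented numbers (not the actual 1973 Berkeley figures) and two hypothetical departments: women are admitted at a higher rate than men in each department, yet at a lower rate overall, simply because they apply mostly to the more selective one.

```python
# Hypothetical admissions data (NOT the actual 1973 Berkeley figures).
# Structure: department -> {group: (applicants, admitted)}
applications = {
    "dept_easy": {"men": (800, 480), "women": (100, 65)},
    "dept_hard": {"men": (200, 40), "women": (900, 225)},
}


def rate(applicants, admitted):
    """Admission rate as a fraction between 0 and 1."""
    return admitted / applicants


def per_department_rates(data):
    """Admission rate of each group within each department."""
    return {
        dept: {group: rate(*counts) for group, counts in groups.items()}
        for dept, groups in data.items()
    }


def aggregate_rates(data):
    """Admission rate of each group after pooling all departments."""
    totals = {}
    for groups in data.values():
        for group, (applicants, admitted) in groups.items():
            prev_app, prev_adm = totals.get(group, (0, 0))
            totals[group] = (prev_app + applicants, prev_adm + admitted)
    return {group: rate(*counts) for group, counts in totals.items()}


by_dept = per_department_rates(applications)
overall = aggregate_rates(applications)

for dept, rates in by_dept.items():
    print(dept, {group: round(r, 2) for group, r in rates.items()})
print("overall", {group: round(r, 2) for group, r in overall.items()})
```

In this toy data set the pooled comparison points in the opposite direction from every within-department comparison, which is why an aggregate disparity alone cannot settle the question of bias.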

But not, apparently, to Francesca Tripodi, a sociology researcher at the UNC School of Information and Library Science, who received a lot of attention on social media over the past month (and was interviewed on NPR by Mary Louise Kelly) about a paper published in New Media & Society, titled "Ms. Categorized: Gender, notability, and inequality on Wikipedia"^[1]. Her summary of one of the two main quantitative results mirrors the same statistical fallacy that had tripped up the UC Berkeley officials back in 1973:

"I sought to compare if the overall percentage of biographies about women nominated for deletion each month was proportionate to the available biographies about women. If the nomination process was not being biased by gender, the proportions between these datasets should be roughly the same. [...] From January 2017 to February 2020, the number of biographies about women on English-language Wikipedia rose from 16.83% to 18.25%, yet the percentage of biographies about women nominated for deletion each month was consistently over 25%." [my bolding]

And while Tripodi correctly points out that this overall discrepancy between articles about male and female subjects is statistically significant (just like the one in the Berkeley case), further arguments in the paper veer towards p-hacking (a term for a kind of data misuse that consists of repeating an experiment or measurement multiple times, cherry-picking those outcomes that resulted in a significant result in the expected direction, and dismissing those that did not):

"In January 2017, June 2017, July 2017, and April 2018, women’s biographies were twice as likely as men’s biographies to be miscategorized as non-notable (p < .02 for each month). The statistical significance and the real significance of the observed difference of these findings strongly support the patterns identified during my ethnographic observations. Wikipedians trying to close the gender gap must work nearly twice as hard to prove women’s notability [...] Only once (June 2018) were notable men more frequently miscategorized, but this was not statistically significant (p > .15). Three times over the three-year period my data could not reject the null hypothesis. The proportion of miscategorized biographies was equal between men and women in October 2018, November 2018, and May 2019. However, these proportions were not statistically significant (p > .85)."
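To illustrate why a handful of significant months among many is weak evidence on its own, here is a minimal sketch of the multiple-comparisons arithmetic. It assumes (this is not stated in the paper) that one independent test was run for each of the 38 months from January 2017 to February 2020, each at the p < .02 threshold quoted above. Real monthly samples are unlikely to be fully independent, so the numbers are only indicative.

```python
from math import comb


def prob_at_least(k, n_tests, alpha):
    """P(at least k of n_tests independent tests are 'significant')
    when every null hypothesis is actually true."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


N_MONTHS = 38  # assumption: one test per month, Jan 2017 to Feb 2020 inclusive
ALPHA = 0.02   # threshold quoted for the highlighted months

# Expected number of months that look significant purely by chance.
expected_false_positives = N_MONTHS * ALPHA

# Chance of seeing at least one such month even if there is no effect at all.
p_any = prob_at_least(1, N_MONTHS, ALPHA)

# Bonferroni-corrected per-test threshold that keeps the overall
# false-positive rate at ALPHA across all monthly tests.
bonferroni_alpha = ALPHA / N_MONTHS

print(f"expected chance findings: {expected_false_positives:.2f}")
print(f"P(at least one chance finding): {p_any:.3f}")
print(f"Bonferroni per-test threshold: {bonferroni_alpha:.5f}")
```

Under these assumptions, at least one "significant" month is more likely than not even when no effect exists, so singling out individual months calls for a correction such as Bonferroni's. The sketch does not show that the reported months are false positives, only that they cannot be read in isolation.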

Does this mean that disparities such as the one found by Tripodi here can never be evidence of gender bias? Of course not. But (again quoting from the aforementioned Wikipedia article), it requires that "confounding variables and causal relations are appropriately addressed in the statistical modeling" (with several methods being used for this purpose in bias and discrimination research) – something that is entirely lacking from Tripodi's paper. And it is easy to think of several possible confounders that might have a large effect on her analysis.

For example, the ratio of female article subjects among biographies of living people, or those born within, say, the last 50-70 years, is much larger than the ratio among English Wikipedia's biographies as a whole (as would be expected from historical considerations) – and at the same time, it is very plausible that issues of notability are less likely to be settled for living subjects.
Another plausible confounder is the age of the article itself: More recently created articles are presumably more likely to be scrutinized for notability (for example as part of the New pages patrol) than those that have survived for many years already. And as Tripodi points out herself, "the number of biographies about women on English-language Wikipedia rose from 16.83% to 18.25%" in the timespan analyzed (plausibly at least in part thanks to "activists [who] host 'edit-a-thons' to increase the visibility of notable women" and increase this ratio, as highlighted in the paper's abstract). But this indicates that the female ratio of newly created articles was much higher during that time than in the existing article corpus that forms the reference point of Tripodi's headline comparison.
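The last step of that argument is simple arithmetic, sketched below. The two percentages are the ones quoted from the paper; the total numbers of biographies at the start and end of the period are made-up placeholders (the paper's actual totals are not given here), and the calculation ignores deletions and recategorizations.

```python
def share_among_new(share_start, share_end, total_start, total_end):
    """Share of women among net newly added biographies, given the overall
    share and the total number of biographies at both ends of a period."""
    added = total_end - total_start
    if added <= 0:
        raise ValueError("total_end must exceed total_start")
    women_start = share_start * total_start
    women_end = share_end * total_end
    return (women_end - women_start) / added


SHARE_START = 0.1683  # January 2017, as quoted from the paper
SHARE_END = 0.1825    # February 2020, as quoted from the paper

# Hypothetical totals, chosen only to make the arithmetic concrete.
TOTAL_START = 1_500_000
TOTAL_END = 1_750_000

new_share = share_among_new(SHARE_START, SHARE_END, TOTAL_START, TOTAL_END)
print(f"share of women among net new biographies: {new_share:.1%}")
```

Whatever the true totals were, the share among new biographies has to exceed the end-of-period overall share for the overall share to rise; how far it exceeds it depends on how much the corpus grew, which is exactly the figure assumed here.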

It is also noteworthy that several previous research publications that started from similar concerns as Tripodi (e.g. that the gender gap among editors – which is very well documented across many languages and Wikimedia projects, see e.g. this reviewer's overview from some years ago – would cause a gender bias in content too) but applied more diligent methods, e.g. by attempting to use external reference points as a "ground truth" against which to compare Wikipedia's coverage, ended up with unexpected results:

Consider, for example, the paper we previously reviewed here: "Notable women "slightly overrepresented" (not underrepresented) on Wikipedia, but the Smurfette principle still holds".
Or a paper titled "Exploring Systematic Bias through Article Deletions on Wikipedia from a Behavioral Perspective", which started out from the observation that "Malicious forms of bias towards women on Wikipedia has been well-documented in numerous accounts of online harassment" but found contrary to the authors' expectations "that content of supposed interest to men is more likely to be nominated for CSD [which] runs contrary to common ideas regarding biases in content [...] Bluntly, there does not appear to be significant qualitative differences in the rates of AfD or CSD for articles of supposed interest to women compared to articles of supposed interest to men."

To be sure, other papers found evidence for bias in expected directions, for example in the frequency of words used in articles about women. But overall, this shows that Tripodi's conclusions should be regarded with great skepticism.

Tripodi's second quantitative result, the "miscategorization" concept highlighted in the paper's title, is likewise more open to interpretation than the paper would like one to believe. The author found that once nominated for deletion, articles about women have a higher chance of surviving than articles about men. She interprets this as evidence for sexist bias against women (apparently taking the eventual AfD outcome as a baseline, i.e. postulating the English Wikipedia community as a whole as a non-sexist neutral authority against which to evaluate the individual AfD nominator's action). Other researchers have taken the exact opposite approach, counting it as evidence for bias against women if pages about them were more likely to be deleted than pages about men, e.g. Julia Adams, Hannah Brückner and Cambria Naslund in the paper reviewed here (which also, as Tripodi acknowledges, "found that women academics were not more likely to be deleted" in a sample of 6,323 AfD discussions – in contrast to Tripodi's sample, where women in general were deleted less often than men).

The quantitative results only form part of this mixed methods paper though. In its qualitative part, Tripodi draws from extensive field research, namely

hundreds of hours of ethnographic observations at 15 edit-a-thons from 2016 to 2017. Edit-a-thons are daylong events designed to improve the representation of women on Wikipedia while also providing a safe space for new editors—primarily women—to learn how to contribute to Wikipedia [...]. In addition to edit-a-thons, I also attended two large-scale Wikipedia events, smaller meetups, happy hours, and two regional chapter meetings. In-depth interviews with 33 individuals (23 Wikipedians and 10 new editors) were conducted outside participant observation spaces.

Tripodi's report about the impressions and frustrations shared by these participants is well worth reading. For example:

"In interviews following the event, newcomers said that they enjoyed the process, but would not likely edit on their own because they still found the experience too frustrating. Most had attended the event in the hopes of adding hundreds of women. They were dismayed to learn that adding just part of an article had taken the entire day. Only one person I interviewed recalled their username/password just days following their participation in an edit-a-thon and none of the new editors had added the articles they created to their “watchlist” ..."

Still, even the validity of some of the paper's qualitative observations has been questioned by Wikipedians. For example, Tripodi opens her paper with a misleading summary of the Strickland case:

"On March 7, 2014, a biography for Donna Strickland, the physicist who invented a technology used by all the high-powered lasers in the world, was created on Wikipedia. In less than six minutes, it was flagged for a “speedy deletion” and shortly thereafter erased from the site. This decision is part of the reason Dr. Strickland did not have an active Wikipedia page when she was honored with the Nobel Prize in Physics four years later. Despite clear evidence of Dr. Strickland’s professional endeavors, some did not feel her scholastic contributions were notable enough to warrant a Wikipedia biography."

However, this deletion within minutes did not at all rely on examining "evidence of Dr. Strickland’s professional endeavors" – rather, it was done based on the "Unambiguous copyright infringement" speedy deletion criterion, as can be readily inferred from the revision history that Tripodi cites here.

It is worth noting that the author of this deeply flawed paper has testified twice before the U.S. Senate Judiciary Committee in the past, on different but somewhat related matters (bias in search engine results in particular).

Briefly

See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.
This edition of Recent research/the Wikimedia Research Newsletter marks the tenth anniversary of our inaugural issue. Thank you for reading and contributing over this decade, and consider following the WikiResearch feeds on Twitter, Facebook or Mastodon for more frequent updates – the Twitter account celebrated the milestone of 15,000 followers earlier this month.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

"Wikipedia successfully fended off disinformation" on COVID-19

From the abstract:^[2]

"...we asked which sources informed Wikipedia’s growing pool of COVID-19-related articles during the pandemic’s first wave (January-May 2020). We found that coronavirus-related articles referenced trusted media sources and cited high-quality academic research. Moreover, despite a surge in preprints, Wikipedia’s COVID-19 articles had a clear preference for open-access studies published in respected journals and made little use of non-peer-reviewed research up-loaded independently to academic servers. Building a timeline of COVID-19 articles on Wikipedia from 2001-2020 revealed a nuanced trade-off between quality and timeliness, with a growth in COVID-19 article creation and citations, from both academic research and popular media. It further revealed how preexisting articles on key topics related to the virus created a frame-work on Wikipedia for integrating new knowledge. [...] Lastly, we constructed a network of DOI-Wikipedia articles, which showed the landscape of pandemic-related knowledge on Wikipedia and revealed how citations create a web of scientific knowledge to support coverage of scientific topics like COVID-19 vaccine development. [...] Wikipedia successfully fended of disinformation on the COVID-19 [sic]"

"The Influence of Multilingualism and Mutual Intelligibility on Wikipedia Reading Behaviour – A Research Proposal"

From the abstract:^[3]

"This article argues for research on the effects of multilingualism and mutual intelligibility on Wikipedia reading behaviour, focusing on the Nordic countries, Denmark, Norway, and Sweden. Initial exploratory analysis shows that while residents of these countries use the native language editions quite frequently, they rely strongly on English Wikipedia, too."

Using Wikidata to help organize the COVID-19 research literature

From the abstract:^[4]

"... the Covid-on-the-Web project aims to allow biomedical researchers to access, query and make sense of COVID-19 related literature. To do so, it adapts, combines and extends tools to process, analyze and enrich the "COVID-19 Open Research Dataset" (CORD-19) that gathers 50,000+ full-text scientific articles related to the coronaviruses. [...] The dataset comprises two main knowledge graphs describing (1) named entities mentioned in the CORD-19 corpus and linked to DBpedia, Wikidata and other BioPortal vocabularies, and (2) arguments extracted using ACTA, a tool automating the extraction and visualization of argumentative graphs, meant to help clinicians analyze clinical trials and make decisions. On top of this dataset, we provide several visualization and exploration tools ..."

"Unveiling the veiled: Wikipedia collaborating with academic libraries in Africa in creating visibility for African women through Art+Feminism Wikipedia edit-a-thon"

From the abstract:^[5]

"Findings showed that the library has created or edited digital content for various categories of women, such as women in academia, industry and politics. These entries have received more than eight million views over a period of two years, which shows that the entries are being utilised. However, the editing exercise had been confronted with challenges such as accessing reliable citations in terms of the notability and verifiability policy of Wikipedia amongst others."

How much does Wikipedia really diverge from traditional, "authoritative" encyclopedias?

From the abstract:^[6]

"Scholarship and journalism about Wikipedia often consider the ways it carries forward, diverges from, or takes to an extreme the various qualities commonly ascribed to encyclopedias. In doing so, it is taken for granted that encyclopedias are authoritative sources of summarized knowledge based on values like accuracy and comprehensiveness, and the question becomes how Wikipedia compares. Through this dissertation, I argue that these commonly held beliefs about encyclopedias are not inherent in the text but the result of centuries of external associations and internal efforts to cultivate a particular kind of authority. Encyclopedias have had close relationships with powerful institutions throughout their history and use a variety of techniques to frame the ways readers should think about them. Furthermore, these cultivated 'encyclopedic virtues' obscure the way that encyclopedists negotiate competing priorities and influences in the knowledge production process. Rather than being perfect, neutral summaries of the world, they often reflect nationalist, religious, or capitalist interests, sometimes even requiring the consent of the powerful in order to be published at all, or in rare cases, they can even prioritize direct critique of those same institutions."

The author is an experienced editor on the English Wikipedia (as User:Rhododendrites) and former longtime employee of the Wiki Education Foundation.

References

^ Tripodi, Francesca (2021-06-27). "Ms. Categorized: Gender, notability, and inequality on Wikipedia". New Media & Society: 14614448211023772. doi:10.1177/14614448211023772. ISSN 1461-4448.
^ Benjakob, Omer; Aviram, Rona; Sobel, Jonathan (2021-03-01). "Meta-Research: Citation needed? Wikipedia and the COVID-19 pandemic". bioRxiv: 2021–03.01.433379. doi:10.1101/2021.03.01.433379.
^ Meier, Florian Maximilian (2021). "The Influence of Multilingualism and Mutual Intelligibility on Wikipedia Reading Behaviour – A Research Proposal". Proceedings of the 16th International Symposium for Information Science (ISI 2021).
^ Michel, Franck; Gandon, Fabien; Ah-Kane, Valentin; Bobasheva, Anna; Cabrio, Elena; Corby, Olivier; Gazzotti, Raphaël; Giboin, Alain; Marro, Santiago; Mayer, Tobias; Simon, Mathieu; Villata, Serena; Winckler, Marco (November 2020). "Covid-on-the-Web: Knowledge Graph and Services to Advance COVID-19 Research". International Semantic Web Conference. Athens, Greece.
^ Ukwoma, Scholastica Chizoma; Osadebe, Ngozi Eunice; Okafor, Victoria Nwamaka; Ezeani, Chinwe Nwogo (2021-01-01). "Unveiling the veiled: Wikipedia collaborating with academic libraries in Africa in creating visibility for African women through Art+Feminism Wikipedia edit-a-thon". Digital Library Perspectives. ahead-of-print (ahead-of-print). doi:10.1108/DLP-08-2020-0079. ISSN 2059-5816.
^ McGrady, Ryan Douglas: "Consensus-Based Encyclopedic Virtue: Wikipedia and the Production of Authority in Encyclopedias". Dissertation in Communication, Rhetoric, and Digital Media, North Carolina State University, 2020-10-29 https://repository.lib.ncsu.edu/handle/1840.20/38333

Supplementary references and notes:

^ P.J. Bickel, E.A. Hammel and J.W. O'Connell (1975). "Sex Bias in Graduate Admissions: Data From Berkeley" (PDF). Science. 187 (4175): 398–404. doi:10.1126/science.187.4175.398. PMID 17835295.

Discuss this story

I did not read Tripodi's paper, but I did listen to her interview and it did seem like she was interpreting her findings farther than the data might have warranted. I think one thing that needs to be taken into account is that the WiR edit-a-thons attract a lot of novice editors who are likely to be frustrated by much more mundane things than sexism—simply the difficulties of Wikipedia's mechanics. Also, since so much of the new women material is being churned out by novice editors, it may be more likely that their quality isn't as good, or they aren't written in a way that obviously establishes the topic's notability, thus more women articles are shipped off to AfD for further inspection. -Indy beetle (talk) 04:52, 26 July 2021 (UTC)[reply]
Agreed. Buffs (talk) 15:14, 26 July 2021 (UTC)[reply]

Also, since so much of the new women material is being churned out by novice editors, it may be more likely that their quality isn't as good This is very true. I actually find it highly problematic that there is such a drive for novices to churn out biographies on women, particularly academics: the inevitable result is a flurry of AfDs on women, which is bad for the subjects and bad for our reputation. A large proportion of the articles created through the UW WikiEd course that seemed to focus on "uncommon STEM leaders" are/were on women with no evidence of meeting BASIC, let alone NPROF, with many or most seemingly chosen either by scanning the UW people directory for minority names (there was at least one page made on a Latina with an entirely non-academic administrative position in one of the UW STEM schools--someone who by every indication is a low-profile private citizen and would be mortified to see a biography on herself), or by choosing obscure subjects who were very likely connected to the student editor (like an article on a current grad student at an east coast university who was name-dropped in two news pieces covering local activism). BLPs are the trickiest pages to create PAG-wise, and NPROF is probably the most opaque/complex SNG; throwing students who almost certainly aren't even interested in the subject into navigating this area is bad enough, but adding in the constraint of profiling a demographic (an intersectional one at that!) whose presence and treatment on/by Wikipedia is already lambasted by the (wiki-policy-ignorant) media just seems like a swiss cheese recipe that starts out with more holes than cheese. JoelleJay (talk) 01:04, 31 July 2021 (UTC)[reply]

I would agree that it is problematic for there being a drive for said novices to churn out these biographies. I used to think of the WiR activist model as a good method for procuring content on under-covered areas, but not anymore. Either experienced editors need to be encouraged to write more about women (which is not likely to happen, as no one is obligated to write about something they don't want to) or novices who want to create new articles about women should be encouraged to practice more by doing regular editing before creating an entirely new page (especially a BLP) by themselves. -Indy beetle (talk) 01:44, 2 August 2021 (UTC)[reply]

It would have been nice to see a response from Tripodi to the accusations leveled here. I'm also shocked by the abysmal retention of new editors who joined through edit-a-thons. --Dutchy45 (talk) 08:55, 26 July 2021 (UTC)[reply]
I'm not shocked in the slightest. The bureaucracy and standards are significantly more difficult than they used to be. Buffs (talk) 15:14, 26 July 2021 (UTC)[reply]
- I'm not shocked about the retention, but it's a somewhat different reason. Low editathon retention has been reported in "The Signpost" before. But the overall retention is also incredibly low. We might want to start with the hypothesis that only 1 in 100 people in the world are attracted to writing encyclopedia articles for a hobby. There's nothing really strange about that idea anyway. Smallbones_(smalltalk) 17:26, 26 July 2021 (UTC)[reply]
  - Please see Operation successful, patient dead: Outreach workshops in Namibia (December 2016) Smallbones_(smalltalk) 17:55, 26 July 2021 (UTC)[reply]
    - Thank you Smallbones for the article. I would also like to add my lack of shock. I've only ever attended one edit-a-thon, and it was a rather unremarkable affair. I was the only dedicated Wikipedia editor there aside from the organisers, and all the other people who showed up were students eager to get the extra credit one of their professors had attached to their participation in the event. They all seemed relatively disinterested in what they were doing and quite confused by Wikipedia's mechanics. -Indy beetle (talk) 02:25, 27 July 2021 (UTC)[reply]

I wrote to Ms Tripodi on 28 June, pointing out factual errors in her paper (different to those detailed above), regarding her analysis of the biography of Lois K. Alexander Lane, saying, in part:

You wrote:
"According to edit history, her biography was pushed out of the main space by a Wikipedian who deemed Lane 'a person not yet shown to meet notability guidelines'."
At the time of that edit, the article had never been in main space; it was in the Article for Creation process, and a request to move it to main space was rejected.
Also at that point, the article contained only two sources, used in seven citations, not the seven sources claimed.
While the volunteer making that rejection could have been more proactive in improving and then publishing the draft, they were correct that notability (in Wikipedia terms) had not been established *in the draft as submitted*. It is significant that the comment says "a person not yet shown to meet notability guidelines", as opposed to, say "a person who does not meet notability guidelines"
As a result, the article was improved so that notability was shown to exist, by the addition of a third source, the Adam Bernstein article "Lois Alexander Lane; Founder Of Harlem Institute of Fashion".

At the time of writing I have not had a reply (other than an automated out-of-office acknowledgement saying she would return on 6 July). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:42, 26 July 2021 (UTC)[reply]

Experienced Wikipedians running editathons know to guide people to work on improving existing articles, rather than starting new ones. For new articles, it's necessary to very carefully verify the likely notability for any list of potential new articles, ensure that new contributors do not try to make articles on themselves or their relatives, and inspect the work as it is being done before it goes live. Just like writing articles, running editing sessions takes experience. It's my impression that some editathons to add coverage of under-represented groups have not at first done this vetting adequately--and this is not to blame them, because they need time to learn; and, from what I see, they have been learning. DGG ( talk ) 03:27, 27 July 2021 (UTC)[reply]
- I think I count as an experienced Wikipedian. I run editathons, and many of the participants have success in writing a new article, with virtually zero subsequent deletions. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 09:51, 27 July 2021 (UTC)[reply]

I did / am doing a survey of (so far) 350 articles of all types from the "random article" button. Including that was exploring the mix of male vs. female, recent (active in the last 15 years) vs non-recent, and also, because sports bios are by far the most prevalent category, sports vs. non-sports. The breakdowns are:

Sports on individual people: 32% All other articles on individual people 68%

Non-recent sports: Male 100% Female 0%
Recent sports: Male 83% Female 17%
Non sports, non recent: Male: 85% Female 15%
Non sports, recent: Male 47% Female 53%

IMO the last split best dials out the realities of history and sports and best addresses any Wikipedia systemic bias question regarding article topics. North8000 (talk) 11:51, 27 July 2021 (UTC)[reply]

@North8000:

Some old data along the same line. There's some description at User:Smallbones/1000 random results with a link to the data. This is from the time we just hit 5 million articles. It might give you something to compare to. Are we making progress? There were 278 bios (out of 1001 randomly selected articles), with only 41 bios of women. "BDP,F (sports)" has obviously not made any progress: 0% in 2015, compared to your "Non-recent sports: Male 100% Female 0%". Contact me if you have any questions. Smallbones_(smalltalk) 02:47, 28 July 2021 (UTC)[reply]

It's not even remotely shocking how many non-recent sports figures are men vs women. Women's sports prior to 1900, beyond a trivial nature, are relatively unknown. That doesn't mean we can't have them, but there is scant information on them. If you want more, you need to produce more. I see no barriers to that other than history. Buffs (talk) 18:48, 28 July 2021 (UTC)[reply]

Quoting: From January 2017 to February 2020, the number of biographies about women on English-language Wikipedia rose from 16.83% to 18.25%,

Some time ago I started the following table in my user page, to see how I am doing well with women compared to the rest of Wikipedia :-)
- f / f+m bio ratio:
  - 09/25/2019: 23:(23+87)=0.209 (cf. 0.1804 in Wikipedia)
  - 01/28/2020: 29:(29+93)=0.238 (cf. 0.1823 in Wikipedia)
  - Friday 13 March: 40:(40+93)=0.301 (cf. 0.1827 in Wikipedia)
  - 08/15/2020: 47:(47+99)=0.322 (cf. 0.1854 in Wikipedia)
  - 12/06/2020: 52:(52+104)=0.3333333 (cf. 0.1866 in Wikipedia)
  - 25/04/2021 55:(55+105)=0.34375 (cf. 0.1886 in wp)
  - 27/07/2021 60:(60+107) = 0.35928 (cf. 0.1899 in wp)
From the above I have an impression that Wikipedia is "underperforming" in terms of the relative growth rate of women's bio share despite all its editathons. I am wondering whether someone is skilled in presentations and can draw a timeline curve to see how well this ratio is doing?

P.S. I started tracking this, because one of my wikignoming jobs was the creation of surname articles. In doing this I've been consulting non-en-wikis and was unpleasantly surprized with big numbers of clearly notable foreign "women in red", so I started creating reasonable stubs for them in order to "protect" their entries in the {{surname}} lists I created. Lembit Staan (talk)

My own conclusions from the limited work I did are that

History has a bias - historically women have been less involved in the things that sources write about. And Wikipedia goes by sources.
Sports dominates anything numerical in Wikipedia, and the low "did it for a living for one day" sports SNG criteria means that professional sports bios are heavily influential on any bio numbers. And professional sports is still numerically dominated by males, doubly so if viewed over history.
So the real world, looked at over history, has a male bias. You could call going by sources a Wikipedia "systemic bias" but other than that I don't think that Wikipedia introduces any gender bias.

BTW IMHO the fact that Wikipedia is such a mean and vicious battleground environment for editors does introduce a systemic bias against female editors. But that's a different question. North8000 (talk) 14:24, 28 July 2021 (UTC)[reply]

A lot of the comments above are really interesting, and I don't have much to add on the statistics or gender bias lines, but this sentence really stuck out to me: Most had attended the event in the hopes of adding hundreds of women. They were dismayed to learn that adding just part of an article had taken the entire day. The reason people have these expectations is because Wikipedians are invisible. Most readers do not know how the site is written. Most readers who know have this fictitious impression that a small number of people can simply mash a few keys and pop out an article, rather than understanding that every segment of content that takes a minute to read took 10 minutes or an hour or ten hours of community action to build. Most readers don't understand how much upkeep there is and how much necessary logistical work behind the scenes there is. So it's no wonder that people are put off by the realisation of reality. And it explains so many other phenomena on this site, such as people's readiness to vandalism—they don't understand how long we spend on fixing it—or people's reluctance to contribute—"how many people do they need, I'm sure they've got enough". — Bilorv (talk) 16:26, 28 July 2021 (UTC)[reply]

No doubt others have commented on this elsewhere, and it is alluded to in some of the comments above, but to what extent does Wikipedia replicate systemic gender bias versus to what extent does it exacerbate that bias? I suspect for many (most?) editors the first is a sort of natural, shrug of the shoulders, that's obvious, response. However, to my mind, there are ways in which the nature of contributing to Wikipedia in a long term, consistent manner, provides far more opportunity for men, in particular older, professionally educated men, the opportunity to contribute. Our culture/principle of volunteerism (which is venerated and defended with as close to complete consensus of any principle here) per se provides more opportunity for men; every single study shows a gender inequality with regard to access to free time. Access to technology, wages, income in retirement; all these mean men are more likely to have time and means to contribute. The more one moves away from the Euro-American world, the more stark these differences become. So, I find this response somewhat missing the forest for the trees; I'm not saying there's a simple solution, but I think we should welcome attempts which try to understand how Wikipedia processes exacerbate gender inequality, rather than simply dismiss the problem as beyond our capacities to confront (or worse, deny there is a problem). Regards, --Goldsztajn (talk) 03:49, 29 July 2021 (UTC)[reply]

I think you've made a key point but in a way that hides your point. IMO Wikipedia is systemically biased against female EDITORS which is a different topic than the one being discussed here.North8000 (talk) 12:09, 6 August 2021 (UTC)[reply]

If my point was not clear, my apologies. To clarify: this review criticises and claims to refute a paper about gender bias in Wikipedia, it includes claims that other research has not shown gender bias to exist (or not to be as bad as claimed) and makes no comment otherwise. For me, this reads as a defence of the status quo; ie, Wikipedia simply reflects the world's gender bias (inter alia), rather than also containing structures and processes which exacerbate that bias (eg the vast over-representation of military and sports related material, the variability of the SNG, are a reflection of Wikipedia's own built bias not simply a broader social bias). Regards, --Goldsztajn (talk) 22:21, 13 August 2021 (UTC)[reply]

Update: I did / am doing a survey of (so far) 500 articles of all types from the "random article" button. Including that was exploring the mix of male vs. female, recent (active in the last 15 years) vs non-recent, and also, because sports bios are by far the most prevalent category, sports vs. non-sports. The breakdowns are:

Sports on individual people: 33% All other articles on individual people 67%. So sports is heavily influential on all biography numbers

Non-recent sports: Male 100% Female 0%
Recent sports: Male 82% Female 18%
Non sports, non recent: Male: 87% Female 13%
Non sports, recent: Male 52% Female 48%

Lots of great analysis on this page. @North8000, thought this recent Slate article might be relevant to you: How to Use Wikipedia When You’re Watching the Olympics, which discusses the gender gap with respect to our Olympic SNG. czar 02:58, 30 July 2021 (UTC)[reply]
It always amazes me when people mess up the first part of the Donna Strickland fiasco. It takes all of literally two seconds of reading the deletion log, and even if you're unregistered and the log doesn't show up when you click on the red link, the link to it is right there to click on. Some crack reporting there. The Blade of the Northern Lights (話して下さい) 02:14, 6 August 2021 (UTC)[reply]

@The Blade of the Northern Lights: ...or if logs are too difficult you can just Google her name and this shows up: wikimediafoundation.org, written by The ed17. Polygnotus (talk) 04:15, 1 October 2024 (UTC)[reply]
