Wikipedia talk:Article size

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Onetwothreeip (talk | contribs) at 07:44, 28 June 2022 (Survey). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Clarification needed for "article splitting activists"

There seems to be a recent trend of a couple of people (@Blubabluba9990, @Zsteve21, and @Onetwothreeip) using the Wikipedia:Database_reports/Articles_by_size page and going around to each page and trying to split articles or edit them in some ways incorrectly to try to shrink the size. Is there any description that can be added to this page such that it can be clarified that simply trying to split articles because they're relatively large and ONLY because they're relatively large is not good editing etiquette? This is especially a problem when the splitting is done by non-subject-matter experts, who seem to commonly make mistakes when splitting, and without consultation of the regular editors of the pages. Ergzay (talk) 04:01, 15 October 2021 (UTC)[reply]

This guideline already states that such editorial decisions should obtain consensus. The rest seems to fall somewhat within WP:BOLD. If an editor is being perhaps too bold, the best course of action is probably direct engagement with the users. CMD (talk) 06:09, 15 October 2021 (UTC)[reply]
I've split and reduced many articles over the last few years, mostly without any controversy at all. I can't stress enough that the vast majority of articles >450,000 bytes that I have split have been without any opposition from other editors. I'm sorry if some editors supporting such actions have been uncivil, but the etiquette is clearly a matter of how it is done rather than it being done at all and I have always sought to uphold the highest standards of civility, even when faced with spurious accusations of vandalism, sockpuppeting, bad-faith editing or other abuses. Sometimes I have disagreements with editors over splitting or condensing articles, and that's fine, we work them out. I am willing to offer advice or assistance to Blubabluba9990, Zsteve21 or any other editors that wish to help in the size area, but ultimately they will be accountable for their own actions.
Most of all I would like to stress to everyone that civility should be of the highest importance. Sometimes editors feel that they own a certain article, and can feel offended when other editors seek to make the article congruent with Wikipedia's guidelines and the vast majority of other articles. This should be considered, although obviously we don't let editors make decisions for an article as if they are the owner(s). An editor who has never edited a particular article has as much right to make changes as an editor who has done most of the work on it.
I would also be the first person to say that editors who have worked on articles for a significant amount of their time are often those who know best the most optimal way to split an article, or to otherwise reduce its size. These articles may not be split or reduced in the way that one might anticipate, but it happens eventually in some way or another. Onetwothreeip (talk) 06:39, 15 October 2021 (UTC)[reply]
@Onetwothreeip "can feel offended when other editors seek to make the article congruent with Wikipedia's guidelines and the vast majority of other articles" Except this is not true. You're not trying to make articles congruent with Wikipedia's guidelines. You're trying to make articles congruent with your own opinion that many articles should be much smaller than they are now. You've made your own guidelines that you think should be followed, and that is fine, but then you go on to assert that those personal guidelines are Wikipedia's guidelines which is simply a form of gaslighting. Ergzay (talk) 20:52, 15 October 2021 (UTC)[reply]
The articles we are talking about are the extremely long articles, several times larger than the average article size. Those articles are inconsistent with the great majority of Wikipedia articles which are much smaller. Articles being split when they get large is a normal process on this project. Onetwothreeip (talk) 21:59, 15 October 2021 (UTC)[reply]
So they're several times larger than the average article. So what? Are they several times larger than the average well-developed, comprehensive article? And even if so, again: so what? Different topics have different needs. And are you really using Wikipedia:Database_reports/Articles_by_size, which reports the source size of each page, not the amount of readable prose? This is the worst kind of gnoming.
You say Articles being split when they get large is a normal process on this project -- yeah, a normal process when carried out by people who have an interest in a topic and have thought about how it might be best presented, not drive-bys who fancy themselves working "in the size area". I'll say it again: worst kind of gnoming.
Exhibit A: Talk:Glossary_of_engineering#Splitting_this_article was a complete waste of time -- yours; that of anyone else interested in the article; and that of anyone wanting to use the article, since you've uselessly broken it into two pieces so that readers have to jump around. You also broke intra-article links while you were at it. Tell us what you achieved there? And while you're at it, convince the rest of us that you even understand the difference between source size and rendered size (or, if you like, readable size). EEng 23:10, 15 October 2021 (UTC)[reply]
Yes, they are several times larger than the average well-developed comprehensive article, and their excessive size is either an issue itself, caused by another issue, or both. I don't know what you mean by using that particular page, that's simply a weekly summary of the largest articles by the size of the source code. I did not split that particular article you are mentioning, but I'm happy to defend the splitting of any articles I've split myself, or any other issues to do with this area. All I am concerned with is that the articles and Wikipedia itself are improved. Onetwothreeip (talk) 23:21, 15 October 2021 (UTC)[reply]
I've had a look at the split of that article and it seems fine to me. There don't seem to be any issues with intra-article links being broken. One of the two halves hadn't been renamed yet, but I've done that now. It looks like the only issue in this example was that editors were too concerned about process. Onetwothreeip (talk) 23:33, 15 October 2021 (UTC)[reply]
Onetwothreeip: Before we go on... where do you get your statistics on the average size of well-developed, comprehensive articles? You say you didn't split Glossary_of_engineering -- that's right, you merely told others it was a good idea [1][2], and now say that splitting it into two arbitrary halves "seems fine". So I'm going to insist that you defend that decision. You have still failed to give any indication of what the benefit was, so I repeat the challenge: how did it help anything? Because here are eight ways it hurt:
  • (1) Readers have to think about which of two arbitrary subpages (A-L, M-Z) has the entry they're looking for;
  • (2) If you're searching for a word or phrase, you have to do it on two different pages;
  • (3) Intra-article links are broken (contrary to what you say -- if you think they're not, then you're not competent to be splitting articles);
  • (4) Even once the intra-article links are fixed, it will take significantly longer to follow such links (in 1/2 the cases);
  • (5) Countless incoming inter-article links are now broken, and I don't see you rushing to find and fix them;
  • (6) Fixing (5) will create pointless churn of watchlists;
  • (7) Adding new links from other articles is now harder, since editors have to remember how the list is split;
  • (8) Everyone's time has been wasted marveling at this personal crusade you've created for yourself so that you can feel you're doing something useful, which you're not.
Now, again: what was the benefit of the split? And, specifically, when do you bunch plan to find and fix all the broken intra-article links and incoming links? EEng 04:06, 16 October 2021 (UTC)[reply]
By comparing the sizes of these super-large articles, which are often but not always those with the most source code, with the sizes of what are considered our better articles, such as featured articles. I did express that it would be good for the article to be split, but that doesn't endorse any possible split. The splitting that did indeed take place of that article, I support.
This is not the right place to discuss the merits of splitting the article, and I'm happy to discuss that on my user talk page. I will briefly address the points you raise. (1) assumes the reader is looking for a specific entry, which is not true. If they wanted the definition of one specific word or phrase, they would use the main search function. (2) is essentially the same point as (1). (3), you'll have to be specific which links you're referring to, but you are admitting in (4) that it's a fixable problem and I don't accept that it takes longer. The same can be said of (5), keeping in mind that I didn't split the article myself. If I did, I would be attentive to particular issues arising from the split. (6), added activity on watchlists is negligible, (7) is not true as the previous links still apply, and (8) it's up to you if you want to spend your time discussing this, that's not my fault or the fault of anyone splitting the article. There are thousands, if not millions, of articles that could use my attention or the attention of any editor, and since I don't have the capacity to address all of the articles we have, I decide which articles I focus on. Onetwothreeip (talk) 06:07, 16 October 2021 (UTC)[reply]
I did express that it would be good for the article to be split, but that doesn't endorse any possible split. -- What??? EEng 06:51, 16 October 2021 (UTC)[reply]
  • (1) Of course they may be looking for a particular entry. By your reasoning we ought to have a thousand individual pages instead of one (or, I guess, two) consolidated pages.
  • (2) Your response makes no sense at all. Let's say I'm interested in engineering terms related to the word heat sink. I have to search two different pages.
  • (3) No, I don't have to be specific what links I'm referring to. If you can't find them without my help then (I repeat) you're not competent to be dealing with article splits.
  • (4) So it's someone else's job to fix the broken intra-article links (I guess because you don't even know how to find them). And of course it takes longer in half the cases, since now half of the intra-article links are now inter-article links, so that you have to load a new page to follow it. Do you really not grasp that?
  • (5) The point remains.
  • (6) I guess watchlist churn is unimportant to you, but to those who actually tend to articles it's a significant timewaster.
  • (7) What are you talking about? Someone wanting to add a link to a particular entry on what used to be a single page now has to go look to see how the page was split. Many will perhaps be completely unaware that it was split, and unknowingly link to the old article, which no longer exists.
  • (8) What about the participants at WP:Administrators'_noticeboard/IncidentArchive1026#Undiscussed_split? Was that up to them as well? Are you just an innocent onlooker, or are you the editor whose activities are raising so much concern?
I'll note again that, for all the above threadbare excuses for why nothing too bad resulted from the split, you still haven't responded to the most important question asked: What was the benefit?
While you struggle to find an answer to that, let's look (as you suggest) at an article you yourself did split. This article [3] was a handy collection of statistics on the 2021 German elections. Apparently because its source was 400K+ (which is a result of every line of every table carrying an external link as a source, not because there's unusually much material in the article, for an article of this kind) you decided to split off one arbitrary piece [4]. Why that piece? How does that better serve the reader? In fact, do you have any idea of how that material relates to the rest of the material? Do you have even the foggiest idea of the significance of what you did, or how it might affect a reader interested in the elections? Let me guess: no.
Pinging in Rosguill, who closed the ANI discussion linked in (8).
EEng 06:59, 16 October 2021 (UTC)[reply]
The opposite of splitting something into a thousand individual articles is to combine a thousand different articles into one. My comment on a talk page saying that it would be good for an article to be split doesn't mean I support every possible way to split an article. Articles should neither be too small or too large, but often the large size is because of another problem.
I'm willing to take this extensive discussion to my talk page, but I think you should take a break from your computer as you're getting needlessly heated. To respond briefly, on 1 and 2, readers search using the search bar in the top right. I can't address which links you're talking about in 3, 4 and 5 if you don't tell me which links you're talking about. 6 is bizarre, because edits shouldn't be discouraged on the basis that they appear in watchlists. It only takes a few edits to fully split an article anyway. 7, the old article destination has links to both.
This next article you mention was never "a handy collection of statistics on the 2021 German elections". It was and remains an article about opinion polling for a federal German election. Why that piece? It was an especially large part of an article which was not the core content for the article and worthy of its own article. How does that better serve the reader? Both the content that remains in the main article and the article split off are more accessible to readers. In fact, do you have any idea of how that material relates to the rest of the material? Yes, the content is about opinion polling for the election; voting intention polling and favourability of the lead candidates. That article is one I have been reading for years and is currently on my watchlist. If you wish to follow up about this article, I invite you to take the discussion to my talk page. Onetwothreeip (talk) 07:26, 16 October 2021 (UTC) (Note: Much of the comment which this is a response to, timestamped 06:59, was added in subsequent edits after I had first read EEng's comment, so my response didn't cover all of what they added afterwards. Onetwothreeip (talk) 02:27, 17 October 2021 (UTC))[reply]
@Onetwothreeip Note, normally with opinion polling you include a constituency prediction based on the opinion polling. They go hand in hand and splitting that article was incorrect based on how the two pieces of information are normally presented together (look elsewhere on Wikipedia where similar information is presented and those pieces of information are on the same page). I'm going to revert that split of the German article. Ergzay (talk) 15:51, 16 October 2021 (UTC)[reply]
Good idea. Replace the old split page with #REDIRECT [[destination page]] {{R from merge}}. EEng 19:06, 16 October 2021 (UTC)[reply]
That is not true. Constituency results, predictions and polling are typically separate from the other articles in an election series when there is enough content to justify a separate article. Onetwothreeip (talk) 21:21, 16 October 2021 (UTC)[reply]
We're having the discussion here, now, because the real issue isn't any particular article, but your idea that arbitrary splits based on size, and with little or no attention to the effect on the presentation of the material, are somehow helpful. We are trying to help you see that, but you seem unable to engage the issues I've raised -- for example, after doing years of splits you still don't seem to know how to find links broken by a split, and when I've referred to watchlist churn caused by fixing broken links, you responded by saying it only takes a few edits to fully split an article, which shows you still don't understand the issue. So we'll put that stuff aside to focus on this one thing: I've asked over and over what the benefit was of these splits, and the best you've come up with is Both the content that remains in the main article and the article split off are more accessible to readers. Sorry, but that makes no sense. How in the world does splitting the article make any content "more accessible to readers"? EEng 19:06, 16 October 2021 (UTC)[reply]
Sometimes it's appropriate to split large articles, but that's not always what is the best solution. Often there are other solutions not only to the issue of an article being exceptionally large, but other issues which also happen to greatly increase the source size of the article.
You can't say I haven't engaged with what you've been saying, I've taken each point you've made and responded. What you mean to say is that I am not agreeing with the opinions you've presented.
It is not controversial at all to say that articles being extremely large are harder to read for readers, and harder to edit for editors. You can read the Wikipedia guidelines to see more on that. Onetwothreeip (talk) 21:25, 16 October 2021 (UTC)[reply]
Other editors will decide whether you've engaged with my concerns. I'm assuming your statement that articles being extremely large are harder to read for readers, and harder to edit for editors is an attempt to answer my request that you explain how (as you claimed) splitting articles makes them more accessible to readers. Let's say that's true, at least all other things being equal (which they rarely are). But how does that apply to the engineering glossary, which isn't "read", or to the German polling, which also isn't "read" (though someone might want to use it to find trends and so on -- a use case you've neatly hobbled by isolating a big part of the data from all the rest)? Please explain. EEng 02:01, 17 October 2021 (UTC)[reply]
Both those articles are read, viewed and accessed by readers. Those verbs can be used interchangeably with my previous use of "read", which should cover those articles. Only one of those articles you mention have I actually split, and I've very easily defended it (and also the split of another article by a different editor). Even if you disagree with an article split I have made, what you should have done is reverted the split or raise it with me. In the few circumstances that I have made a split that was contested, this is what editors who opposed it have done. Then we go to the talk page and work it out, coming to an agreeable conclusion as per WP:BRD. Onetwothreeip (talk) 02:21, 17 October 2021 (UTC)[reply]
Read, viewed, and accessed are certainly not interchangeable -- you don't "read" a glossary the way you might read the bio of some senator. But in any event, you still haven't said in what way the split makes it easier for a reader to read, view, or access the material, especially given that you've broken it into two pieces that can't be considered together. Again, please explain. EEng 18:18, 17 October 2021 (UTC)[reply]
@EEng: Wow that split to glossary of engineering is horrendous. Is there any way to revert these types of things? Ergzay (talk) 15:08, 16 October 2021 (UTC)[reply]
That's going to be a bit harder. More urgent is to put a stop to all this ongoing split nonsense. EEng 19:06, 16 October 2021 (UTC)[reply]
  • Referring to 123IP, after years of disruption and IDHT refusals to accept the concerns of myriad editors, I think the only solution is a topic ban against splitting of any kind, including discussion of the subject, as much of the disruption is on talk pages. This excessive focus on article size is weird and counterproductive. -- Valjean (talk) 20:18, 16 October 2021 (UTC)[reply]
    Not true at all Valjean, I have a long record of collaboration on article talk pages with editors I disagree with. You've lied about me before, which you admitted to after being called out by other editors, so I don't think you are being or will be constructive in advising me. I would much prefer to have disagreements over content than whatever personal issues you may have with me, stemming from our interactions on contentious articles. Onetwothreeip (talk) 21:29, 16 October 2021 (UTC)[reply]
    Well I've never run into you before, and my analysis is exactly the same. I think we should wait to hear from the admin who closed the ANI thread on this two years ago, and then decide how to move forward. EEng 02:01, 17 October 2021 (UTC)[reply]
    You should've raised any concerns you had with any of my edits on the talk pages of those articles. You're overreacting. Onetwothreeip (talk) 02:08, 17 October 2021 (UTC)[reply]
    EEng, I see quite a bit of heated discussion about article splitting philosophy, and a handful of editors asserting that specific edits were poor. I don't have a strong opinion about the topic. Onetwothreeip has made an adequate effort to respond to several complaints here. It's clear that several editors disagree with Onetwothreeip about article organization, and that Onetwothreeip's changes are discovered by said editors long after they have been made, creating a scenario where you're objecting to a pattern of behavior rather than challenging individual edits. I would need to see much stronger consensus that Onetwothreeip's recent edits were undesirable and reckless to justify a sanction. signed, Rosguill talk 06:02, 17 October 2021 (UTC)[reply]
    I should have been clearer that it's indeed the pattern of behavior that's of concern here. Obviously a sanction (read: topic ban, as was proposed at ANI last time) would need careful evidence and a community discussion. I was just interested in your thoughts about the situation, given that you closed that discussion (and so perhaps have the best sense of its gestalt); I wasn't suggesting that you do anything. EEng 18:18, 17 October 2021 (UTC)[reply]
    123IP, that's an oddly hypocritical personalization, considering you're charging me with actually lying about you. I suggest you strike that and stick to the issue, which happens to be your attitude toward article splitting. -- Valjean (talk) 02:06, 17 October 2021 (UTC)[reply]
    Your entire comment was about myself personally. I would much rather discuss issues to do with editing articles. Onetwothreeip (talk) 02:10, 17 October 2021 (UTC)[reply]

This entire discussion shows that there needs to be additional guidance on article size. Although I think it is not likely that this is a coordinated effort, there is a group of editors whose objective seems to simply be to split articles. It would be instructive to look at how the list of longest articles has evolved over the last year, especially the editors and their rhetorical tactics and creative use of the Wikipedia policies. Just some examples from a recent "split battle":

  • "The largest, second largest and third largest articles should be split, or in some way have their size reduced." [Obviously an impossibility–there will always be a largest article.]
  • "The article is almost at 500,000 bytes, so it is not consistent with WP:SIZE." [Again, obviously wrong, as the reference only discusses readable prose]
  • "Our size guidelines do allow for articles to exceed 100,000 bytes, but this article is a few times larger than that." [The first part contradicts the second bullet above, the second part is irrelevant.]
  • "The reason for splitting this article is best summarised as making it easier for readers to access and view the overall content, which may be better done over more than one article." [A segue to the alternate argument, pointing out a problem that doesn't exist.]
  • "I don't think you would be convinced by anything I would show you." [The fallback approach when the others are failing.]
  • "The prose size limits are there for the ease of the reader in reading the main content of the article, which is typically the written prose for most articles." [A made-up "rule"]
  • "When assessing the size of a prose article, we typically don't consider tables, images and other elements to be the primary content of the article, but that's obviously not tenable for articles which primarily contain those elements." [Another made-up rule, where the second part contradicts the first part.]

As I said, there needs to be some written policy on this because once started, the assault never stops. VarmtheHawk (talk) 16:19, 17 October 2021 (UTC)[reply]

Well summarized. Can I trouble you for diffs for the above, or links to the discussions? EEng 18:23, 17 October 2021 (UTC)[reply]
Talk:List_of_Falcon_9_and_Falcon_Heavy_launches VarmtheHawk (talk) 05:23, 18 October 2021 (UTC)[reply]
Those quotes all appear to be mine. In the first one, I was referring to the articles that are currently the very largest, not all articles. All the other quotes are correct in their context. What's most important is to take an approach that evaluates each article's needs separately, so for example if an article's content is mostly in tables, we would evaluate the size of the content within the tables. Onetwothreeip (talk) 21:21, 17 October 2021 (UTC)[reply]
Thank you for explaining that "The largest, second largest and third largest articles should be split..." referred to "articles that are currently the very largest, not all articles." Those of us with a public education had trouble figuring that one out. You might note that this is in direct opposition to your last sentence above. Maybe you could enlighten us as to what your position is on this issue, quantitatively if possible. VarmtheHawk (talk) 05:23, 18 October 2021 (UTC)[reply]
See Tall_poppy_syndrome#Etymology. EEng 14:33, 25 October 2021 (UTC)[reply]
Great analogy. But, as noted below, some tall poppies are not really tall at all; all are equal but some are more equal than others. And, as I frequently point out, there will always be a largest article. VarmtheHawk (talk) 17:52, 25 October 2021 (UTC)[reply]

Discussion about proposed solution

We obviously need some clearly stated official wording to guide editors when the idea of splitting is broached. We are not talking about normal content editing here. Splitting is rarely necessary and should always be preceded by a thorough discussion and near 100% consensus for splitting. It should NEVER be a BOLD move.

With other content changes that may be considered controversial, BOLD does not apply, but sometimes a passing editor is not aware of any controversy and they make a BOLD controversial edit. In such cases, they should follow BRD when their edit is reverted and not restore their change. They should allow the status quo version to remain untouched until a discussion has produced a very solid consensus. With normal content editing, BOLD is okay once, but if there are objections, caution should then rule. We are not talking about normal content editing here.

Proposal: It should be plainly stated here and at BOLD that:

Article splitting (which is never a normal content type edit) is a de facto controversial change that excludes appeals to BOLD. Splitting is too consequential a change to do as normal editing and using BRD. The possibility of edit warring over a split should be excluded. A split should only happen after an official RfC reaches a very clear consensus, determined by outside observers, not the one wishing to do the splitting.

Let's discuss and improve this suggested wording. -- Valjean (talk) 19:01, 17 October 2021 (UTC)[reply]

Wikipedia doesn't have a problem of bold edits which split articles. Any objected bold edit which splits articles gets reverted and they don't get reinstated unless there's consensus. I would certainly self-revert a bold edit splitting an article if I was asked to do so. Onetwothreeip (talk) 21:24, 17 October 2021 (UTC)[reply]
History has shown that to not be the case. BOLD splits have often caused problems. This proposal would prevent the many debacles that have led to much debate, edit wars, disruption, wasted time, and strong warnings which have been ignored. Let's plug that open pit so more people don't fall into it and even more editors have to waste time pulling them out and cleaning up the mess they have made. We would not be here if this proposal had been our guideline for splits. We're here because it hasn't been. Following this proposal would also prevent the need to threaten topic blocks for BOLD edits that did not enjoy consensus and the ensuing, long, IDHT discussions that have often followed. That's history. Let's enforce the basic principle that is supposed to work here, which is collaborative editing. Let's stop the kind of solo editing that creates problems. Splits are not normal content editing, so they should be treated differently. -- Valjean (talk) 02:14, 18 October 2021 (UTC)[reply]
Splits are not normal content editing – That's a really good point. In many cases they're more akin to RMs, and should be treated as such (in -- I repeat -- many cases, but by no means all). EEng 06:00, 18 October 2021 (UTC)[reply]
Moves are actually often very routine and unremarkable. RMs are in effect only for contested moves. Onetwothreeip (talk) 06:20, 18 October 2021 (UTC)[reply]
What part of in many cases ... I repeat -- many cases, but by no means all do you not understand? You have an extremely annoying habit of responding to fragments of what others say. EEng 17:47, 18 October 2021 (UTC)[reply]
You're here because you want to be here, in your case because you saw a comment on my talk page. "Bold splits" have a very simple solution when they are contested: they are reverted. That is what's happened every single time they were contested before and is normal process. If someone is edit warring over it, that's a specific matter solved through our usual processes. Onetwothreeip (talk) 02:51, 18 October 2021 (UTC)[reply]
I'm not quite sure what the first sentence means, but simply repeating the current policy doesn't really add much to the discussion. Valjean has raised a valid point, and I don't think the policy anticipated the destructive editing that is occurring. Case in point is zsteve21. He says, and I quote: "I am just a novice Wikipedia editor on my own who wants to make articles have manageable markup sizes." Notwithstanding the bizarre nature of that statement, do we want a novice editor making bold split decisions as he has numerous times in his impressive 2-month experience with Wikipedia? I don't much care about "List of Hallmark Movies" but don't think an article like "Glossary of Engineering" should be messed with without a discussion with the large number of expert contributors. VarmtheHawk (talk) 05:23, 18 October 2021 (UTC)[reply]
At this point I don't know if we need a guideline change, or a handful of topic bans, or both. Your last example there is pretty scary. EEng 05:46, 18 October 2021 (UTC)[reply]
If a particular editor is making bad edits, that's a specific matter and not a flaw of policy. It would be helpful if editors could raise what they think are the bad edits. These are all pretty solvable simply by reverting such bad edits. Onetwothreeip (talk) 06:22, 18 October 2021 (UTC)[reply]
This is not about "good" or "bad" edits. Splits can be either. They are bad when made as BOLD edits without a pre-existing consensus for "if" and exactly "how" it should be done.
Splits are not normal editing. They are very different and should be governed by different rules. They should not be subject to back-and-forth and BRD editing procedures. All the preliminary work should be done on the talk page, with no attempts to perform the split until a consensus is reached. -- Valjean (talk) 17:35, 18 October 2021 (UTC)[reply]

Summary to-date. Since this discussion appears to be winding down, I thought it would be useful to summarize the major points that have been made.

  • There is a group of editors whose mission is to split the largest articles, regardless of merit.
  • This group will use a myriad of arguments, generally untrue, irrelevant or exaggerated, and will continue to recycle these arguments until the contributors of the article are worn down.
  • Counterarguments by the subject matter experts to these arguments are met with derision or requests to prove their counterarguments by comparing their work to other Wikipedia articles; responses are never good enough.
  • They support the destructive process of "bold splitting" and believe that it is easy to counter, despite evidence to the contrary.
  • They are very familiar with Wikipedia policies, much more so than an average editor working on an article.
  • They do not support any changes to policy.
  • They are particularly adept at using the SIZE argument, making it mean whatever they feel like at the time.

Please feel free to add to this list or to show that any of them are untrue.VarmtheHawk (talk) 17:41, 18 October 2021 (UTC)[reply]

There may be more but that's a great start. Perfect description of the behavior in display at Talk:Glossary_of_engineering:_A–L#Reverting_the_split. A friend suggests that What's needed is much more prominent guidance at WP:SPINOUT that this should only be done to well-established articles when there is both a strong consensus and adequate subject expertise to make a sensible subdivision. WP:HASTE is too wishy-washy about how maybe you might think for five seconds before breaking out the chainsaw. I think that's a great framework to work from. EEng 17:02, 19 October 2021 (UTC)[reply]
In looking into the background of this issue, I've noticed that, in the last six months, the top 10 longest articles have been targeted and split by this group, relegating them to a lower spot on the list. Of the current top 20, almost all are identified as targets for splitting. The first two, List of chess grandmasters and List of Falcon 9 and Falcon Heavy launches are subject to fierce debate (in addition to the revisit of Glossary of engineering). Yet the third, List of The Amazing Spider-Man issues has no such comments. Even more interesting is the fact that #7 and #18 are about Donald Trump and his presidency and both exceed 100k in readable prose. I wish someone would comment on this dichotomy. Perhaps a list of longest articles by readable prose?
What often goes unsaid is that an article once split may again be subject to another split as the list narrows. The attempt to call a moratorium on further discussion of splitting List of Falcon 9 and Falcon Heavy launches was, of course, met with derision.VarmtheHawk (talk) 17:54, 19 October 2021 (UTC)[reply]
When you say "top 10 longest", you mean longest per that database report Ergzay mentioned in his OP at the top of this thread? EEng 18:04, 19 October 2021 (UTC)[reply]
Hopefully fixed.VarmtheHawk (talk) 18:53, 19 October 2021 (UTC)[reply]
My point is that that report is about wikisource size, which is completely irrelevant. EEng 01:24, 21 October 2021 (UTC)[reply]
The longest articles purely by prose would mostly be related to Donald Trump and recent American politics, in my experience. Onetwothreeip (talk) 06:49, 20 October 2021 (UTC) Onetwothreeip (talk) 06:16, 20 October 2021 (UTC)[reply]

That's not even remotely true (see, for example, Douglas MacArthur), but I think the picture on this issue is getting clearer:

  • The arguments usually presented towards splitting an article are frequently misstated, and the articles in question are almost always in compliance with WP:AS.
  • There is not a "one size fits all" approach that works.
  • Many articles should be split or otherwise reduced in size, but any argument for doing that should include a valid reason. For example, if one section is considerably more detailed than the rest of the article, that may be a candidate.
  • Appearance on Special:LongPages (and similar reports based on wikisource size) has zero weight in arguing for an article to be split (e.g., List of The Amazing Spider-Man issues);[further explanation needed] all arguments should relate to the amount of material the reader sees, distinguishing article prose vs. tabular (and similar) material vs. notes and references vs. images and other visuals, etc., and take into account the distribution of material into sections, the nature of the topic, ways readers are likely to approach the material, and so on.
  • The list of prose-size breakpoints (e.g. "almost certainly split" at 100K) is just something someone wrote on the back of an envelope 15 years ago, yet certain editors treat it like the Ten Commandments.
  • Because of WP:HASTE, articles should not be split boldly. If an editor feels that an article needs to be split, they should make a concrete proposal and consensus reached. Significant weight should be given to the opinions of the subject matter experts. If the vote is against, the issue should be put to rest until the article has major changes.

In particular, this practice of constantly throwing up arguments, with the corresponding scramble to respond, really should stop.VarmtheHawk (talk) 23:52, 20 October 2021 (UTC)[reply]

Subject to User:VarmtheHawk's approval, I've made some changes to the above list. EEng 01:37, 21 October 2021 (UTC)[reply]
Yes, no problem. The comment above reflects that the List of The Amazing Spider-Man issues is not materially different from the lists being contested and yet no one has a problem with it. What I meant to say on the proposed deletion was: "If the vote is against, the issue should be put to rest. Should the article significantly change through the addition of material, the issue could be revisited." Either way is fine with me.VarmtheHawk (talk) 01:48, 21 October 2021 (UTC)[reply]
There's always a general discouragement of re-raising a question too soon, but specific language saying "You can't raise this again until X" would be very unusual, and there's no special reason for it here. It would become a point of contention, trust me. EEng 01:58, 21 October 2021 (UTC)[reply]
VarmtheHawk, I agree, and would like to add that article splitting is so consequential that long discussions by advocates should be avoided. They should only try to split articles where they meet no resistance from other editors. The need for splitting, and manner of doing so, should be readily apparent to all and uncontroversial. If there is much resistance, they should move on and not try to press the point. -- Valjean (talk) 15:19, 21 October 2021 (UTC)[reply]

Rewrite of WP:SIZERULE section

I did an initial rewrite of the WP:SIZERULE as we seemed to be making no progress in the discussion. It's been rewritten to use words instead of byte size, as humans don't read bytes, we read words. I took the previous values and used a length of "5" for the word length, which is intentionally small to also uprate the length of articles to be considered and also factor in spaces and other punctuation characters. Ergzay (talk) 01:51, 26 October 2021 (UTC)[reply]

Switching from bytes to words has one VERY important effect, which is to stop people from stupidly looking at the size of the source instead of the amount of readable prose. EEng 02:09, 26 October 2021 (UTC)[reply]
Can we recommend how people can find this info? Maybe using Wikipedia:Prosesize?VR talk 21:24, 28 October 2021 (UTC)[reply]
Ergzay I'm not sure if the "5" factor makes sense. I'm looking at today's main page article and it's 27,527 bytes or 2,345 words, meaning a factor of 12. Yesterday's main page article had a factor of 39.VR talk 21:32, 28 October 2021 (UTC)[reply]
For God's sake, the 27k is the wikisource size, not the readable prose size, which is 14k. If we can't get stuff like this straight this conversation is doomed. EEng 05:39, 29 October 2021 (UTC)[reply]
@Vice regent I wasn't trying to match it exactly. I intentionally was doing it to slightly expand the maximum size of articles as the rule was written back when computers and phones in general were less performant and couldn't handle large page sizes. (The sizes were originally added before 2006 when flip phones were the norm even in the US.) Exploring the page history is illuminating. Ergzay (talk) 23:11, 28 October 2021 (UTC)[reply]
@Ergzay: ok but I don't see this as merely a slight expansion. The last few main page FAs have had this many words: 3339, 2345, 4799, 3308, 3003, 3931, 2055, 2869, for an average of ~3,200 words/featured article. So maybe we should recommend splitting a lot earlier than 20,000 words.VR talk 00:24, 29 October 2021 (UTC)[reply]
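The average quoted above is easy to check. The following is only an illustrative Python sketch: the variable names are invented for this example, the word counts are the eight figures listed in the comment, and 20,000 words is the threshold under discussion rather than any measured value.

```python
from statistics import mean

# Word counts quoted above for the last few main-page featured articles.
fa_word_counts = [3339, 2345, 4799, 3308, 3003, 3931, 2055, 2869]

# Arithmetic mean of the sample (eight articles, so a very small sample).
average_words = mean(fa_word_counts)

# The word threshold being discussed, compared with that sample average.
proposed_limit = 20_000
ratio = proposed_limit / average_words

print(f"average: {average_words:.0f} words")
print(f"{proposed_limit} words is about {ratio:.1f}x the sample average")
```

A sample of eight articles says little about featured articles in general, so this only confirms the arithmetic, not the conclusion drawn from it.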
I don't think the factor of bytes to words should matter. The point of the size rule is to separate out the byte count as the reason to split the article, as the byte count includes tables, references, re-worded links, and other things that should not be counted for the reason to split articles. Wikipedia uses XTools to calculate prose words and characters, and I verified by copying over the article page for 1989 (Taylor Swift album) to Microsoft Word, deleting tables and photos, and using the Word Count tool on the Review tab. I got a word count of 5,132 words, for 34,053 characters, for an average word size of 6.64, compared to XTools calculation of 4,819 prose words and 30,423 prose bytes or characters with an average word size of 6.31, not the factor of 39 mentioned above. This is a reasonable size article that is not close to needing to be split. If we compare a word count of 4,819 to an article byte size of 188,818 for those stuck on byte size, then if this article grew to 20,000 words, the article might have a byte size of 784,000 bytes, maybe an imposingly large article size, about 50% larger than article sizes of 500,000 that the article editors have been pursuing. I suppose we could compromise on 15,000 words, which if we extrapolated the Taylor Swift album article, would take it to 588,000 bytes, in range with other extra large articles, and almost 95,000 prose bytes, at which point it is a good idea to recommend splitting the article. Mburrell (talk) 03:11, 29 October 2021 (UTC)[reply]
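The ratios and extrapolations in the comment above can be reproduced with a few lines of Python. This is a rough sketch under a stated assumption: every size is taken to grow in direct proportion to the prose word count, which is the same linear scaling the comment uses and not something Wikipedia guarantees. The helper names `per_word` and `scale` are hypothetical, and the input numbers are the ones quoted for 1989 (Taylor Swift album).

```python
def per_word(size: float, words: int) -> float:
    """Average size (bytes or characters) per word."""
    return size / words


def scale(size: float, words: int, target_words: int) -> float:
    """Linearly extrapolate a size from `words` up to `target_words`."""
    return size * target_words / words


# Figures quoted in the comment above.
word_words, word_chars = 5_132, 34_053            # Microsoft Word count
xtools_words, xtools_prose_bytes = 4_819, 30_423  # XTools prose size
wikitext_bytes = 188_818                          # full wikitext size

print(round(per_word(word_chars, word_words), 2))            # chars per word (Word)
print(round(per_word(xtools_prose_bytes, xtools_words), 2))  # prose bytes per word
print(round(per_word(wikitext_bytes, xtools_words), 1))      # wikitext bytes per word

for target in (20_000, 15_000):
    wikitext_at_target = scale(wikitext_bytes, xtools_words, target)
    prose_at_target = scale(xtools_prose_bytes, xtools_words, target)
    print(target, round(wikitext_at_target), round(prose_at_target))
```

The third ratio shows where the "factor of 39" comes from: it is wikitext bytes per prose word, not prose bytes per word.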
15,000 words is nearly 5 times more than the average featured article that has ~3,200 words (1989 (Taylor Swift album) is the biggest FA; I'm using FAs on main page in last week as a random sample). I'm seeing plenty of other FAs also in the 2,000-5,000 word count range. I think moderately sized articles are easier to read and maintain and that's what we should strive for.VR talk 04:25, 29 October 2021 (UTC)[reply]
I'm going to draw a line in the sand right here and now on this (and I'm sorry if you feel picked on -- not my intention):
(a) Our convenience in editing is of ZERO consequence. All that matters is what serves our readers best.
(b) Articles are very rarely "read". Most, er, readers read the lead, read or skim the first section or two, and then dip in here and there according to level of interest or what they're after, possibly using the TOC as a guide. Talking about making articles "easy to read" (read top to bottom, that is) is a red herring.
EEng 06:00, 29 October 2021 (UTC)[reply]
@Vice regent To be frank I'm in favor of deleting the section entirely. The rule was originally written in a time frame of the internet when it was dominated by low performance devices with very low amounts of internal memory. That is not the norm now even for the most underpowered of Android phones. I changed it to words to "repurpose" the size section for something useful, as splitting on fixed byte size in this day and age is frankly ridiculous. Ergzay (talk) 05:54, 29 October 2021 (UTC)[reply]
I think there still needs to be byte-size considerations, particularly when you get to pages like tables and lists that do not use a lot of prose. While it is important to not have extensively long prose articles and thus reasons to split, we also don't want pages that are extremely large in byte-size for readers on slower/limited connections (5g and fast connections are *still* not universal). You probably need to have both word count and byte size, though word count should be the leading reason to split. --Masem (t) 23:14, 28 October 2021 (UTC)[reply]
I think this suggestion conflates two different discussions, because lists and tables are excluded from the paragraph about readable prose. The Lists, tables and summaries section does not have a size limit specified, but these days there is an unofficial splitting logic that reduces tables and lists when they approach 500,000 characters.Mburrell (talk) 03:11, 29 October 2021 (UTC)[reply]
Let's make the unofficial, official. And I'd prefer splitting much before 500,000 characters.VR talk 04:27, 29 October 2021 (UTC)[reply]
Characters are not bytes though. Wikimarkup isn't readable characters either. None of these metrics are good as they are all open to interpretation. It's better to have no rule at all, or better to have a "no extreme articles" rule and "split where appropriate" rules, rather than constantly trying to chop articles to smaller sizes purely based on their size. Some topics need lots of references and sources which inflate the size to extreme sizes despite the page itself being small. Others have massive amounts of prose with few sources and could often have sections split out and summarized when they become too bloated. There is no hard and fast rule based on size on when something should be split, so writing it down in a page like this just gives excuses to trolls to come and split a page you're working on (I had the poor experience to encounter such a troll recently which is what caused me to come to this page and start this effort to fix this page). Ergzay (talk) 06:12, 29 October 2021 (UTC)[reply]
@Masem When you say "not 5G", I don't even have 5G nor does anyone I know. 5G isn't even relevant. Byte size is a historical remnant of the time when 2G (or slower) was the norm and devices had memory sizes that were given in terms of single digit megabytes. The size limits were originally added even before the iPhone 1 came out with 128 MB of onboard RAM (and little left for the web browser) which was huge for the time. The era of trying to limit page sizes to such an extreme extent is long past. Modern websites even clock into the megabytes (which I agree is too much), but trying to chop webpages at the 100kb or 200kb mark is just absurd. Ergzay (talk) 06:00, 29 October 2021 (UTC)[reply]
  • While I think this whole guideline is in dire need of reform, I vigorously object to etching in stone a new set of numbers coming off the back of some envelope in 2021, to replace the previous written-in-stone numbers that came off the back of some other envelope 15 years ago. And I absolutely cannot believe there's still talk about "byte size", meaning the size of the wikisource -- which is absolutely irrelevant to what the reader sees, the cost of downloading, or anything else. And even if you fixed that goofup in the discussion and talked about HTML (etc.) size instead, that's still irrelevant to download cost. Are there people in this day and age who don't realize that images completely dominate download cost?
    And is there anyone who thinks this guideline will be substantially changed without an eventual RfC? EEng 04:56, 29 October 2021 (UTC)[reply]

I'm going to muscle in here to order you all to do something, on pain of excommunication (and I've got a personal pipeline to the pope, so I can arrange that if really necessary). Before anyone says one more word, everyone needs to go to Preferences > Gadgets > Browsing and check the box that says Prosesize: add a toolbox link to show the size of and number of words in a page. That adds a "Page size" link to the toolbox to the left of each article. When you click that, it barfs back a bunch of statistics for the article you're looking at, including

Prose size (text only): XX kB (YYYY words) "readable prose size"

Those, and only those, are the numbers we should be discussing (at least when we're talking about prose, not tables and quotes and stuff).

After everyone does the above, I'll allow discussion to resume. The Great and Powerful Oz has spoken! EEng 05:46, 29 October 2021 (UTC)[reply]


  • I have reverted the change, which did more than a slight upgrade to length, it actively changed the level of guidance so that it was far more encouraging of longer articles. Humans read words not bytes, but bytes are used as a proxy here, on the assumption that 10,000 words is around 50,000 bytes. The SIZERULE section is a supplementary rule of thumb for the rough 10,000 word guideline. Further, the edit removed mention of two tools which can measure the byte proxy, with no replacements. Regarding overall technical size, the byte size of the SIZERULE section explicitly refers to prose size, so it does not correspond to the download size, the loading size, or similar considerations. If there is a technical issue relating to overall technical size (eg. Wikipedia:Template limits), it would need to be reflected in the Markup size section. CMD (talk) 05:38, 29 October 2021 (UTC)[reply]
    I made the change and I frankly agree. I'm primarily in favor of deleting the section entirely, but I kept the section and rearranged it as a concession. If the preference is that it's entirely bad I'd prefer to delete it. Ergzay (talk) 06:01, 29 October 2021 (UTC)[reply]
    I've deleted the section as an alternative rather than trying to massage it into something more appropriate for this day and age. If this is agreeable (or no comments in a few weeks) I'll also go fix all the now broken links to the section. Ergzay (talk) 06:04, 29 October 2021 (UTC)[reply]
    I love your enthusiasm, and personally I'm for it, but just killing all numbers is never going to fly without substantial discussion. Or maybe it will -- wouldn't that be wonderful? EEng 06:15, 29 October 2021 (UTC)[reply]
    Right now I'm trying to drive more discussion on why people think fixed size limits are a good idea at all. As far as I'm aware, if you have internet at all wikipedia web page sizes are going to be smaller than the rest of almost anywhere else on the internet. Even the largest pages (that don't have images). I'd like one person to arrive with real numbers that can justify such limits. As so far it's just been hand waving. Ergzay (talk) 06:19, 29 October 2021 (UTC)[reply]
    I think EEng's doctrinal invocation should be noted here. The presence and absence of images is not relevant to the SIZERULE subsection, and overall web page sizes similarly do not relate to the subsection's purpose or intention. If the issue relates to images and overall web pages, I am not sure why SIZERULE is being edited in any direction. CMD (talk) 06:44, 29 October 2021 (UTC)[reply]
    I think you're agreeing with me but to be honest I'm not sure what you just said. EEng 06:47, 29 October 2021 (UTC)[reply]
    @Chipmunkdavis My primary impetus for starting this discussion is people (gnomes/trolls) abusing SIZERULE to go around to many disparate articles that they are not involved with, trying to split them, often ignoring discussion or not seeking to bring in the usual editors of the page to consult their opinions. Then if you go against their splitting of the article they immediately point to SIZERULE and use the wikimarkup size as proof that they are doing good work by chopping other people's articles into smaller pieces. I wish to stop this behavior. How we get there I am not particular on. Cutting off their incorrect use of a very old article written in the days of 2G and sub 100 MB memory phones by deleting/modifying/etc the section they are using seems like a good start. In either case the article should be changed because it is outdated for the modern era. Ergzay (talk) 07:54, 29 October 2021 (UTC)[reply]
    Anyone pointing to wikimarkup size to support SIZERULE is not applying SIZERULE, and if done consistently should be handled as disruptive behaviour as in any other area of the wiki. With regards to 2G and sub 100 MB, such factors are not relevant to SIZERULE, which is more or less a part of MOS, and not too attached to how modern our era is. CMD (talk) 08:22, 29 October 2021 (UTC)[reply]
    @Chipmunkdavis How is it part of MOS? Isn't it just something that people just got used to as it encrustified into "this is just how we've always done things"? Look at the section I created below. It definitely started as a technical limitation when the rule was originally created. Ergzay (talk) 11:43, 29 October 2021 (UTC)[reply]
    It is similar to MOS in how it is treated, providing general parameters for the formatting of our articles. MOS is indeed pretty encrustified. CMD (talk) 12:34, 29 October 2021 (UTC)[reply]
    Well I'm up for getting MOS changed. It deserves to be with regards to this. Ergzay (talk) 12:51, 29 October 2021 (UTC)[reply]
    That's fine, but it should have a wide discussion involving the areas that use it, such as the GAN and FAC processes. CMD (talk) 05:52, 30 October 2021 (UTC)[reply]
  • Today's FA is Climate change, which is a complex topic with scientific, political, and economic dimensions. Despite its complexities, it is covered in prose size of only 53,000 bytes (8298 words). Each of the article's main sections has its own article, and what's left behind is a summary (WP:SUMMARYSTYLE). To me limiting article size is not about bandwidth limitations, it's about article quality.VR talk 15:50, 31 October 2021 (UTC)[reply]
    Took a look at Climate change. You are correct that the article has a prose size of 53 kB, 8294 words. It has a wiki-text of 263 kB. So if we took the 8294 words, scaled it up to 20,000 words proposed for a size limit, we would have a prose size in bytes of 128 kB, and a total wiki-text size of about 634 kB, not that we are trying to use wiki-text size, so just using that to compare to currently enforced unofficial standards. This makes it a little larger, but not excessively larger, so if I am reading your statement as a comment on article size, it seems to say that a proposed 20,000 word limit would be acceptable? Or are you suggesting that a smaller 15,000 word limit would be more acceptable (a scaled 95 kB prose text, 475 kB wiki-text)? Maybe I am missing the thrust of your argument on the discussion on article size. Are you saying article size does not matter, as long as every article is written to the quality of a featured article standard? Could you expand on what you are trying to state in terms of article size? Thanks. Mburrell (talk) 20:50, 31 October 2021 (UTC)[reply]
    @Mburrell: yes I think 15,000 words should be the limit. Although personally I'd prefer even lower, as the policy page does quote 10,000 words as ideal from a human attention span perspective[5]. Smaller pages force us to summarize content, which is incredibly useful to the average reader (they can always go to the spun-off article if they want more detail).VR talk 23:11, 31 October 2021 (UTC)[reply]
    Or they could keep reading the current article if they want more detail. I'm sorry, but this discussion is built on sand. I've just removed the passage asserting that articles should be X words at most because humans read at Y words per minute and can only concentrate for 40 minutes -- cited to a book on management (not psychology, or education, or anything like that) -- and which at the same time links to the article attention span -- which, interestingly, says that adults can concentrate for 5 to 6 hours. It's all a mess of conjecture and OR, founded only on a few random editors' unsupported assertions about what our readers want or need. EEng 00:56, 1 November 2021 (UTC)[reply]
    Can we have an RfC to decide this? Guidelines should reflect a broader and stronger consensus than we have here.VR talk 01:15, 1 November 2021 (UTC)[reply]
    I too would like to see an RfC. I am mostly in agreement with the changes that User:EEng is doing, but I agree that we need a broad and strong consensus to change a project page. I would not mind a fuzzy upper limit where reducing or splitting a prose article should be discussed, and I would not mind setting the relative (not absolute) limit at either 15k or 20k, and I wouldn't mind an upper fuzzy limit on lists and tables as well, but it should be based on community agreement, or real size logic, and not the hand-waving logic that EEng has been excising. I think the current modifications to the article give a good discussion point for the RfC, but I would like to see community buy-in on the changes. Mburrell (talk) 02:13, 1 November 2021 (UTC)[reply]
    As I note in a separate section below, Wikipedia:Splitting is a parallel page that very much overlaps this one (or the way this one was until the axe was taken to it recently) re the triggers and considerations for splitting, plus it gives detailed how-to on carrying out splits. I think the thing to do is to take this conversation over there, or get the participants there over here, and hash it out among us -- before opening any RfC. On the down side, I now have an external commitment that's going to take up a lot of time, and may not be as attentive as I know all of you would love me to be. EEng 02:52, 1 November 2021 (UTC)[reply]
    As there's no recent discussion there, best to continue it here. And yes the RfC would affect the wording both here and at Wikipedia:Splitting. What exactly will we be asking in an RfC? a) no size guideline, b) prose-based size guideline of a max of 10 or 15 or 20 k words, c) a guideline that is based on both prose and other markup? Sorry just hypothesizing.VR talk 16:35, 1 November 2021 (UTC)[reply]
    There shouldn't be any use of markup in the size guideline, if we have guideline at all, as that reason existed historically only for technical reasons that no longer exist.
    @EEng I know you said you'd be busy soon, but you're leading this very well so far with your edits. If you want additional people to comment just ping me however and I'll stop by. I'll be watching this page but probably not regularly looking. Ergzay (talk) 05:37, 2 November 2021 (UTC)[reply]
    It's impossible to overemphasize the following: There are clearly editors over at WP:Splitting who would be interested in what we've been doing here, and my guess is they don't have this page watchlisted, which explains why there's been so little comment so far. They need to be brought into the discussion before we start thinking about a project-wide RfC. It's just that I've been hesitant to open that door given my other commitments. EEng 11:05, 2 November 2021 (UTC)[reply]
    @EEng: Done. VR talk 15:10, 2 November 2021 (UTC)[reply]
  • Comment - so I came to this page this morning, looking for the usual article size guide table, only to find it's gone completely... and on the back of a few bold edits by just two or three editors, which has already been reverted once by Chipmunkdavis. So I've reverted again, pretty much for the identical reasons given by CMD above. The guidelines on article length are a longstanding and highly-used aspect of the MOS, and I cite the 60kb "probably should be split" guidance frequently at FAC and elsewhere. Of course, there are exceptions, and the guidance already gives advice about not being hasty, but the general guidance is sound, and it's not just about length of time to load the page (something which is still a factor for those in the global south who don't enjoy the advanced internet connections that we do), it's also a simple issue of readability and good article design. If changes of this magnitude are to be effected, it needs to be via a sitewide RFC, and with extremely good reasons set out as to why having long articles is suddenly fine and dandy, when it never has been before.  — Amakuru (talk) 10:17, 4 November 2021 (UTC)[reply]
    @Amakuru They're longstanding because they've been forgotten about with the advancement of technology. Please see my summary in the documentation down below. The rules are abused by trolls/gnomes to chop articles on the basis of wiki markup size rather than some reasonable standard about clarity or topic coverage being too wide. There are numerous long-standing articles that have been ruined by the adventurism of these types of people. The size rules date to the era of 2G pre-smartphone phones, when many people had dialup internet, and should be discarded. Ergzay (talk) 15:26, 4 November 2021 (UTC)[reply]
    Also, do note @Vice regent recently put an item on the talk page for starting discussion to head into an RFC. Ergzay (talk) 15:27, 4 November 2021 (UTC)[reply]
    It really doesn't matter how the guidelines came into being 15 years ago, the point is that they are in effect today and they are used regularly to inform size decisions and I see no evidence that the rules of thumb contained in those tables are not relevant now. As noted previously, the recent FA articles on climate change and Earth both come in at significantly below 60kb of prose size, yet these are among the most complex topics that one could possibly seek to write an article on. So it is not only possible to write articles that aren't too long, it is also desirable. From a stylistic standpoint as much as from a technology one. And, on that topic, the "advancement of technology" you mention may be significant in the western world, but as someone with experience working in Africa, I can assure you that bandwidths and data rates there can still be limited.
    If the guidance re bytesize is misunderstood by those whom you characterise as "trolls/gnomes", then the solution is to clarify the language around this guidance so that it's crystal clear that we refer to prose sizing rather than Wiki markup sizing. The solution is not to throw the whole guidance out altogether, just because a few people misunderstand it. Also, the concept of prose size as a byte count is well-established and already used to ensure minimum article sizes in processes such as WP:DYK and destubathons, so let's not pretend this is an archaic and little-understood metric. Cheers  — Amakuru (talk) 15:57, 4 November 2021 (UTC)[reply]
    Confusion re source size vs prose length is the least of it: the history shows that the numerical limits are simply made up, the end product of a cascade of arbitrary transformations applied to an original, real, 32K limit on source size (which of course no longer applies), leavened by some nonsense about human attention span. They're built on nothing, and while it may be comforting to feel you're guided by some kind of authority [6], it's not healthy for articles to be cut apart on such a basis.
    There were dozens of changes made, the substantive ones explained in edit summaries. If you think something should be restored or changed, do that (after duly considering the reason offered for the original change, of course), but blindly reverting everything because you miss the comfort of someone telling you what to do instead of deciding for yourself, no. EEng 18:12, 4 November 2021 (UTC)[reply]
    The edits start here [7]. I propose we keep them. If no one says what they don't like about them in the next few days I'll be putting them back. EEng 06:35, 10 November 2021 (UTC)[reply]
    I didn't like the edit here.VR talk 13:07, 10 November 2021 (UTC)[reply]

Documentation of the history of WP:SIZERULE

Here I will document the history of SIZERULE and show how little it has been updated in recent years. SIZERULE first appeared with the creation of this page on March 7th, 2003. At the time the max page size was given as 30K. It was explicitly written at that time as a technical limitation of browsers of that period. Smartphones didn't exist in 2003, and data plans for phones, such as they were, often used 2G or worse. Some people had cable internet but most people still used dialup. At some point in the intervening years the value was tweaked to 32K, likely to be a power of 2, and there were additional changes clarifying that the limit was only for the article, and not for lists, as the meaning started to drift away from it being for a technical reason. Some time before 2005 or so a clarification was added that said that mobile browsers and some web browsers crop any pages longer than 32KB and refuse to load any more. On January 17th, 2006 the limit was increased to 50kb. On February 22nd, 2007 the limit was increased to 100kb. And there it has sat for 13 years: with a technological and digital revolution happening around it, we now keep chopping articles to 100kb in wikitext length for "technical reasons".

Does this not strike anyone else as utterly ridiculous? Ergzay (talk) 08:19, 29 October 2021 (UTC)[reply]

It's not only utterly ridiculous, but completely and totally ridiculous as well. And here's more ridiculousness: that early guideline was talking about the size of the wikisource [8], but then suddenly someone apparently just stuck in the words of readable prose, thereby completely changing the meaning [9].
Then in 2006 someone actually proposed (AND I AM NOT MAKING THIS UP) a "Mandatory breakup committee":
First, an editor tries to establish consensus: the issue is brought up on the talk page, and it is suggested that the regulars break up the article into subtopics, with short summary paragraphs (w/ main article attachments), see thermodynamics as an example, so that the main page gets below a certain limit. Second, if plan #1 stifles out in argument and indecision to act, for a number of consecutive weeks, then a breakup arbitration committee notice is placed on the talk page, putting an ultimatum deadline, such that either the regulars break up the page to below a certain limit by that date or an external breakup committee, enforced by a team of administrators, will do so.
This cookie-cutter approach persists to this day (see elsewhere on this very page) and must be resisted at all costs. As one editor put it (elsewhere on the page just linked):
The persons providing the justification for limits on article size are predominantly "techies" for whom the writing part is a chore compared to the joy of formatting pages, blocking miscreants and otherwise engaging in the plumbing aspects of html page production. These are the folks who theorize that readers will get bored with articles that are longer than x kb (notice how the limits are in kb and not words - very instructive) often because of their own inadequacies in that department. What is lost in this discussion is that some articles are well written and can hold the readers' interest far longer than much of the mediocre prose found in other entries.
Just so! I absolutely support removing the numerical limits, which are fashioned from whole cloth and based on no evidence whatsoever about what readers want or need. EEng 13:47, 29 October 2021 (UTC)[reply]
I'm afraid you're both mistaken. There are many content creators, including yours truly and SandyGeorgia, who emphasize the importance of writing concise articles and using summary style when they get too long. Anything over 10,000 words is unlikely to pass at FAC. (t · c) buidhe 10:34, 7 January 2022 (UTC)[reply]
  • Note to self and to my fellow editors: Don't forget Wikipedia:Splitting, which repeats the stupid character count cutoffs, claiming they're based on an assertion that readers can concentrate for 30-40 minutes, citing our very own article Attention span -- which says nothing like that, rather says 5-6 hours. We all know this varies tremendously depending on the reader, motivation, nature of material, and 50 other things, and figures like 30-40 minutes (or 5-6 hours, for that matter) are just pulled out of the air.
    This page and that page need to be harmonized somehow; they're really both trying to do the same thing. I know! Let's merge them! EEng 22:55, 30 October 2021 (UTC)[reply]
I agree that the attention span argument is quite unfounded; in my experience many readers don't even read a full Wikipedia article from top to bottom anyway; they are looking for a specific piece of information. But I think that's not the only reason why you'd want to limit how long articles can get. Extremely long articles have longer load times and are harder to navigate. I've spent some time at Wikipedia talk:Manual of Style lately and that page is so long that I constantly get lost. If we want to aid readers in finding information then splitting extremely long articles up into ones that are overseeable seems like a good idea to me. ―Jochem van Hees (talk) 16:41, 2 November 2021 (UTC)[reply]
I'm afraid I must disagree with some of what you say. The load-time argument is completely fallacious given that most articles' download time is driven almost entirely by images. And saying something about articles based on your experience with MOS is like saying you drive a compact car because you had trouble finding the bathroom on a 747. They're completely different animals. EEng 22:44, 2 November 2021 (UTC)[reply]
Sorry I only noticed your reply just now. I'm not sure which fallacy my argument has, nor where you got that statistic for download time from. I'm not at all an expert on this but I did a quick test by loading the page Border control, which is not only huge in page size but also has loads of images. According to Chrome's network devtools, loading the page content took longer than any of the images; the content took 833ms while the images were anywhere between 10ms and 130ms, and they were downloaded in parallel. (That's with my relatively good wired connection; it will take significantly longer on a weak mobile wireless connection.)
In any case, if you're really that dissatisfied with me using the MOS as an example, I only used that example because that happened recently and was therefore quickly on my mind. I have had similar issues during the UEFA Euro 2020 when I often wanted to look up the latest developments but had to scroll all the way down each time; especially on mobile it's hard to find stuff. Or an article that I have worked on myself, List of Eurovision Song Contest entries, which was ginormous before we split it into two. ―Jochem van Hees (talk) 12:13, 5 November 2021 (UTC)[reply]
@Jochem van Hees The mobile site being a bad user experience is mostly a result of them collapsing every table section heading by default. You can scroll from the top of the page to the bottom of the page in an instant, though, as a single flick can move very quickly from top to bottom. Ergzay (talk) 16:17, 6 November 2021 (UTC)[reply]
  • I was just trying to replace some references in an article whose entire size (not just prose) is 131K. It was slowing my browser down to edit the entire article, so I had to edit it section by section. Does this happen to others too? VR talk 13:08, 10 November 2021 (UTC)[reply]
    @Vice regent I've never had that problem personally, even with large articles (the one I commonly edit is over 400K). It sometimes takes a couple of seconds to load the page and then submit the edit, but there is no problem browsing or editing the page. I've heard that the "visual editor" is extremely bad/slow for Wikipedia. Are you using that? Ergzay (talk) 20:09, 10 November 2021 (UTC)[reply]
    No, always source editor. Maybe I have too many windows or tabs open? VR talk 20:32, 10 November 2021 (UTC)[reply]
    I'm not sure. I'm on Firefox on a more recent Macbook M1 but I had no problems on my 2015 Macbook Pro I used to use. Ergzay (talk) 21:25, 10 November 2021 (UTC)[reply]
    Yes. I live in a first world country and use a recent laptop with an updated browser with a fast broadband connection. Editing starts to slow down around 40-50k wikitext and when it gets much longer than that, either you have to live with a lot of lag or go section by section. (t · c) buidhe 10:31, 7 January 2022 (UTC)[reply]
    Then go section by section. That's what section editing is for. It's incomprehensible that you're putting the convenience of your editing over the needs of the reader. EEng 00:32, 8 January 2022 (UTC)[reply]
    And I am also in a developed country with modern technology yet am finding J.K. Rowling hard to edit. There are still very good reasons for the size limits, not all related to technology, and one of the key historical issues left out of the initial analysis is attention span and average time to read the page. I suggest the page has not changed because it is still useful as is. SandyGeorgia (Talk) 00:27, 8 January 2022 (UTC)[reply]
    one of the key historical issues left out of the initial analysis is attention span and average time to read the page – And the evidence about attention span, and types of users who want to "read the page" versus use it in other ways is ... where? EEng 00:32, 8 January 2022 (UTC)[reply]
    Our experience at WP:FAC and WP:FAR (where you don't contribute) shows that too-long articles accumulate bloat, and are very difficult to write and maintain, which has a detrimental downstream effect on the article quality and therefore the reader experience compared to an article kept to an appropriate length. A reader who wants more detail on a specific aspect should visit a sub-article. (t · c) buidhe 00:41, 8 January 2022 (UTC)[reply]
    That a "too long" article is ... well, too long, is a tautology. The question is: how long is too long? At what point should a particular article have a chunk split off? It's self-evident that articles should be, just as you say, an appropriate length, but that takes judgment based on the topic, not some stupid one-size-fits-all table based on, AFAICT, just something someone arbitrarily wrote down 15 years ago.
    I really appreciate the where you don't contribute throwaway, because it gives me a chance to remind everyone that FAC reliably produces articles which conform to a checklist of mindless rules but which are often pretty awful, sometimes laughably so. As well expressed in the essay User:Physchim62/Situation Normal: All FACked up, "ensuring that featured articles meet the featured article criteria is NOT the end in itself." EEng 01:19, 8 January 2022 (UTC)[reply]

wp:size under discussion

The redirect page wp:size is currently being discussed at WP:RFD. --George Ho (talk) 10:13, 23 December 2021 (UTC)[reply]

Change size guideline proposal

I propose changing the "Probably should be divided" in Wikipedia:Article size#size guideline from 60kB to 100kB and the "Almost certainly should be divided" from 100kB to 200kB. This rule was made in 2007, when devices didn't have the capacity to navigate long articles smoothly. 15 years later, we have advanced technology and this rule is ridiculous. Ak-eater06 (talk) 20:15, 25 June 2022 (UTC)[reply]

The rule does not only relate to computer processing speed; it relates to reading and attention span. I think it fine as is. SandyGeorgia (Talk) 20:24, 25 June 2022 (UTC)[reply]
The idea that any more than 1/10 of 1% of visitors have the desire to read an article from top to bottom is absurd, as is reasoning based on such an idea. Overall article size is just one of many considerations in trying to answer the following question: What structure (of this one article, or of a group of related articles) best allows readers of various kinds to satisfy their knowledge-needs? Stupid size formulas, blindly applied, are not the answer to that question. EEng 20:50, 25 June 2022 (UTC)[reply]
User:EEng#s I agree, this size rule is infuriating. Take Barack Obama's presidency for example, his policies are split into a DOZEN or so articles (economic policy, energy policy, East Asia policy, space policy, etc.) in the name of the presidency article being "too long". It just makes it more disorganized. Ak-eater06 (talk) 21:01, 25 June 2022 (UTC)[reply]
Our stats Research:Which parts of an article do readers read Moxy- 04:20, 28 June 2022 (UTC)[reply]
I agree that the guideline shouldn't be changed. Reading and attention spans have not substantially increased during the existence of Wikipedia. While articles do not need to be written as briefly as possible, they should not be difficult to read in their entirety. Onetwothreeip (talk) 03:10, 26 June 2022 (UTC)[reply]
It's really unbelievable to see the attention span argument trotted out over and over and over. There's no evidence anything on this page is more than stuff a few editors made up one day. EEng 21:37, 26 June 2022 (UTC)[reply]
  • My proposal is changing the "Probably should be divided" from 60kB to 80kB. Ak-eater06 (talk) 19:30, 26 June 2022 (UTC)[reply]
    While I'm happy to see even the tiniest move in the direction of sanity, it's really just deck chairs on the Titanic. As you can see from various comments on this page, there's a core group (a) utterly dedicated to the ridiculous idea that any significant proportion of visitors to an article have the intention of reading it top to bottom and (b) willing to translate that myth into word counts derived via baseless "attention span" and "reading rate" numbers from low-quality, one-size-fits-all sources. I hope your change sticks, but if it doesn't just do what I do: ignore this page's nonsense and structure articles according to the needs of each topic. Let those with no judgment of their own apply this Procrustean bed to articles unlucky enough to attract their attention. EEng 21:37, 26 June 2022 (UTC)[reply]
EEng if you want an example of people defending article-splitting insanity, check out this discussion where my proposal to merge the four articles on Stephen Harper's tenure into one was shot down 4-2. Four users opposed merging due to this stupid size rule. Stephen Harper's tenure as Canadian prime minister is divided into four separate articles (Premiership, domestic policy, foreign policy and environmental policy)! Same with his successor, Justin Trudeau (Premiership, domestic policy, and foreign policy).
You say to "ignore this page's nonsense and structure articles according to the needs of each topic" and while I try to, these users who voted against merging in the discussion I linked to always prevent me from merging due to WP:MERGEPROP and when I do follow WP:MERGEPROP my merger proposals get shot down due to people citing the size guideline.
Coming back to my Harper example, it is extremely frustrating to know people have to flip between four different pages when we can easily have his tenure in one, clean article. Ak-eater06 (talk) 06:01, 27 June 2022 (UTC)[reply]
@Ak-eater06: I haven't read all the arguments on the relevant article talk pages but I think you would find success in merging the Premiership, Domestic policy and Environmental policy articles, while keeping the Foreign policy article separate. Onetwothreeip (talk) 08:57, 27 June 2022 (UTC)[reply]
User:Onetwothreeip I did that and got reverted. They cited WP:Mergeprop and once again the stupid size rule. Ak-eater06 (talk) 17:31, 27 June 2022 (UTC)[reply]
I feel for you, but as things stand, reforming this page is one of those situations in which the ratio (effort to overcome mindless idiocy) / (benefit) is just too high. I wish you luck. EEng 12:22, 27 June 2022 (UTC)[reply]
  • EEng and Ak-eater06 are right. "Too long" is not the issue it once was, so the proposed change makes sense. To understand this, imagine a book. No matter the size of a normal book, it is always easier to find a bit of content as long as it's between the book's covers. Those who want to know everything about the topic/story will read the whole book/article, regardless of length. Splitting articles into separate locations (different book volumes) makes it much more difficult to find stuff, and increases the chance of important info never being seen by the reader.
Nowadays most readers search a page for information, so keeping it all in one place makes most sense. Few read the whole article. They may read the whole lead, and may skip to interesting parts, but that's all, unless they search for key words and phrases.
One editor's single-minded, pathological, obsession with splitting long articles is usually very destructive and contrary to the needs of 95% of our readers. Almost no one benefits from it. (Yes, you know who you are.)
EEng is right: "What structure... best allows readers of various kinds to satisfy their knowledge-needs? Stupid size formulas, blindly applied, are not the answer to that question." -- Valjean (talk) (PING me) 22:48, 26 June 2022 (UTC)[reply]

User:Valjean and User:EEng I updated it again to reflect common sense more...hope my change sticks. Thanks for your efforts :) Ak-eater06 (talk) 01:06, 28 June 2022 (UTC)[reply]

A good change that brings us out of the dark ages. -- Valjean (talk) (PING me) 01:09, 28 June 2022 (UTC)[reply]

Let's scrap the size guideline

The size guideline rule was made in 2007, when devices didn't have the capacity to navigate long articles smoothly. 15 years later, we have advanced technology and this rule is ridiculous. Let's remove it altogether. People can split articles without being influenced by this obsolete rule.

Some other arguments:

The idea that any more than 1/10 of 1% of visitors have the desire to read an article from top to bottom is absurd, as is reasoning based on such an idea. Overall article size is just one of many considerations in trying to answer the following question: What structure (of this one article, or of a group of related articles) best allows readers of various kinds to satisfy their knowledge-needs? Stupid size formulas, blindly applied, are not the answer to that question.

— User:EEng

"Too long" is not the issue it once was, so the proposed change makes sense. To understand this, imagine a book. No matter the size of a normal book, it is always easier to find a bit of content as long as it's between the book's covers. Those who want to know everything about the topic/story will read the whole book/article, regardless of length. Splitting articles into separate locations (different book volumes) makes it much more difficult to find stuff, and increases the chance of important info never being seen by the reader.

Nowadays most readers search a page for information, so keeping it all in one place makes most sense. Few read the whole article. They may read the whole lead, and may skip to interesting parts, but that's all, unless they search for key words and phrases.

One editor's single-minded, pathological, obsession with splitting long articles is usually very destructive and contrary to the needs of 95% of our readers. Almost no one benefits from it.

It's time. Ak-eater06 (talk) 04:18, 28 June 2022 (UTC)[reply]

Survey

  • I support removing the size guideline. Wrote my reasoning above. Ak-eater06 (talk) 04:18, 28 June 2022 (UTC)[reply]
  • I also support the removal of numerical "limits", though it might be helpful to give editors an idea what the distribution of articles sizes is. And there's plenty of room for guidelines about how to usefully think about article length as an important aspect of topic organization and presentation. I suggest other editors review the earlier threads on this page to get an idea of the recent history of this issue. EEng 04:27, 28 June 2022 (UTC)[reply]
  • Oppose simple scrapping, without any alternatives being proposed. There is no hard limit as is, there is a range of guidelines which can be applied differently if situations warrant it (depending on how the overall topic is structured with regards to WP:SUMMARYSTYLE for example). As it stands, we have articles whose topics fill multiple books. Unless the proposal is to have book length articles (which is one reading of one of the quotes above I suppose), then there are obviously going to be some guidelines on the matter. The quote from EEng above is correct that size is "one of many considerations" and that this guideline shouldn't be "blindly applied", however those are not reasons to scrap this guideline, they apply inherently to all guidelines on en.wiki. CMD (talk) 05:22, 28 June 2022 (UTC)[reply]
    I'm sympathetic to most of what you say, but the problem is that this particular guideline is peculiarly susceptible to being mindlessly "enforced". EEng 06:43, 28 June 2022 (UTC)[reply]
  • Oppose it's not about "devices [not having] the capacity to navigate long articles smoothly", it's about brains not having the capacity to navigate long articles smoothly. Indeed, the attention span of our average user now is likely even less than it was when Wikipedia was first formed, back in the year 6 B.I. ("before iPhone"), at a time when everybody read books (for fun! not just for school!) instead of being glued to their smartphones. Which, of course, have now colonized brains with the idea that a "full page of information" is whatever fits on a six-inch (15cm) diagonal screen. If it were only about device capacity, then articles could be a thousand times longer than they are now, and a couple years from now, a million times longer. Mathglot (talk) 07:28, 28 June 2022 (UTC)[reply]
  • Oppose These are guidelines, not rules, designed to say that Wikipedia articles should be neither too large nor too small. They should contain as much information as a narrow subject can allow, and when there is the opportunity to expand the overall content by splitting an article into two or more, that opportunity is usually taken. Guidelines like these are necessary to promote good editorial standards, such as articles neither being "book-length" when it is more appropriate for a topic or list to cover more than one article, nor many articles being one sentence long when they could be reasonably merged into one article under a broader topic. Onetwothreeip (talk) 07:44, 28 June 2022 (UTC)[reply]

Discussion

Is there any good research on readability size? We have editor retention stats for Wikipedia, but is there accessibility data? On a side note, a site-wide RfC should take place, as mentioned in previous talks and edit summaries. Moxy- 04:37, 28 June 2022 (UTC)[reply]