Wikidata:Requests for comment/Frequency of YouTube follower count data
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Option A: 3, Option B: 2, Option C: 7, Option D: 3. A majority supports Option C: update items with Wikipedia articles whenever their subscriber count has changed by at least 10% or has surpassed a new factor of 10 milestone (100k, 1M, 10M, etc.); update other items once per year. --Ameisenigel (talk) 08:30, 13 March 2022 (UTC)
An editor has requested the community to provide input on "Frequency of YouTube follower count data" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.
If you have an opinion regarding this issue, feel free to comment below. Thank you!
THIS RFC IS CLOSED. Please do NOT vote nor add comments.
At English Wikipedia, editors have expressed openness about having infoboxes for YouTubers use subscriber counts imported from Wikidata and updated via bot. If that happens, it would be a great step toward closer integration of the two projects, a matter of existential concern for us.
However, I suspect that in order for a proposal there to find consensus, our data will need to be updated frequently enough to remain reasonably current and deter the IP editors that love to constantly update the numbers for channels they follow. The current rate of updates is only once per year, done by BorkedBot, and there are no provisions for channels that experience an explosion in popularity.
At previous discussions, editors have proposed alternative methods for updating the counts. This RfC seeks consensus on which to use. Here are some (non-exhaustive) options:
- Option A (status quo): Update all items once per year
- Option B: Update all items once per N time period (please include your preference for N in your !vote)
- Option C: Update items with Wikipedia articles whenever their subscriber count has changed by at least 10% or has surpassed a new factor of 10 milestone (100k, 1M, 10M, etc.); update other items once per year
- Option D: Option C, but with some other value instead of 10% (please specify your preferred value in your !vote)
Regards, {{u|Sdkb}} talk 22:45, 14 October 2021 (UTC)
Courtesy pinging prior participants: @Alexis Jazz, Berrely, Brojam, BrokenSegue, Jura1, MisterSynergy: feel free to weigh in. {{u|Sdkb}} talk 22:51, 14 October 2021 (UTC)
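For illustration, the trigger described in Options C and D can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the logic of BorkedBot or any existing bot: the function name `should_update` and the parameter `threshold_percent` are hypothetical, a decrease of the threshold size is assumed to count as a change, and reaching a milestone exactly (e.g. 100,000) is assumed to count as surpassing it. The yearly fallback and the check for a Wikipedia article are left out.

```python
def should_update(stored: int, current: int, threshold_percent: float = 10) -> bool:
    """Return True if the stored subscriber count should be refreshed.

    Hypothetical helper: `stored` is the last count saved on the item,
    `current` is the count just read from the platform.
    """
    if stored <= 0:
        # Assumption: with no usable stored value, any different count is worth saving.
        return current != stored

    # Relative change of at least threshold_percent, in either direction.
    # Cross-multiplying avoids a division.
    changed_enough = abs(current - stored) * 100 >= threshold_percent * stored

    # A new factor-of-10 milestone (100k, 1M, 10M, ...) is reached exactly when
    # the count gains a decimal digit.
    new_milestone = len(str(current)) > len(str(stored))

    return changed_enough or new_milestone
```

With the default of 10%, a channel stored at 50 million is refreshed at 55 million but not at 54 million, and a channel stored at 95,000 is refreshed on reaching 100,000 even though that is only about a 5% change. Passing `threshold_percent=5` gives the Option D variant suggested in the discussion.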
Discussion
- Option C. This strikes a good balance between ensuring that counts are updated often enough that they'll be reasonably current and able to be used on Wikipedia and not bloating Wikidata. I'd also be okay with Option D with some reasonable range (between 5% and 25%). {{u|Sdkb}} talk 22:51, 14 October 2021 (UTC)
- Option C If we only look at items on enwiki with youtube accounts associated we are talking about 38 thousand items. That number overstates the number of impacted items because many of those items are not going to have accounts with a substantial youtube following (we exclude small youtube channels). I think increasing integration between wikis and wikidata is good and worth a little pain. BrokenSegue (talk) 01:07, 15 October 2021 (UTC)
- Option D/C though I suspect that the query to find out the current subscriber/view count is more "expensive" than actually updating the value here. I'd say updating when there is a 5% difference or the last update is more than, say, 6 months ago would be reasonable. — Alexis Jazz (talk or ping me) 01:21, 15 October 2021 (UTC)
- Option D/C, per Alexis Jazz. I think 10% is a bit too significant (though if lower is too expensive to query then I'm fine with it); it would mean a YouTuber with 50 million subscribers would only have an update once they reach 55 million; in that time they'd likely reach the 1 year/6 months threshold in advance. Berrely (talk) 06:01, 15 October 2021 (UTC)
- Option C/D I'm comfortable with 10%, or a bit higher than that. At 10%, each doubling of a channel's subscribers would produce about 7 updates. Some channels may well expand by a factor of 1000 over time, and having 72 points over that expansion phase seems on the high side to me. --99of9 (talk)
- Option C or anything similar to it. For subscriber counts (not only at Youtube), ballpark figures matter much more than frequent updates. Option C takes this into consideration and provides a good balance. —MisterSynergy (talk) 09:19, 18 October 2021 (UTC)
- Option B on a monthly visitation basis because if a bot has to check for +10% stats, it may as well just update new stats directly into Wikidata whilst it has the data. Perhaps it should overwrite the previous statement if it was within the same year. Thus over time, historical stats will show for each year as of date XX Dec XXXX. If historical stats are shown at all, there should always be an archive URL (P1065) link provided as a reference so historical stats aren't lost forever for referencing. --Dhx1 (talk) 15:10, 18 October 2021 (UTC)
- Option C sounds like a balanced approach to it. Ainali (talk) 15:28, 23 October 2021 (UTC)
- Option A or Option B – In principle, I am satisfied with the status quo. Certainly you can perhaps reduce the gap to half a year, but in my opinion you get a good trend curve for the development of the number of followers. --Gymnicus (talk) 16:59, 3 November 2021 (UTC)
- Option A I don't see a benefit in maintaining very precise follower counts. If someone is interested in the precise count they'd use the link to scrape youtube themselves. The use cases I can see for follower counts in Wikidata don't require the data to be refreshed frequently. -- Dr.üsenfieber (talk) 14:22, 5 November 2021 (UTC)
- Does the definition of "benefit" exclude other Wikimedia projects? I wouldn't agree with that—we should want our data to be widely used, especially within Wikimedia—and the benefits to Wikipedia are clear per the nom. {{u|Sdkb}} talk 22:23, 22 November 2021 (UTC)
- Option A I agree with Dr.üsenfieber. I also think using a percentage does not scale well in terms of desirability at either end of the subscriber count spectrum. It may provide a more granular view of the growth rate in some instances, but thinking on a long term scale that is not super significant. Sampling at a consistent interval is simple and effective. --SilentSpike (talk) 22:14, 19 November 2021 (UTC)
Odd options
- Comment 1: I think option A misrepresents the status quo, please see Wikidata:Property_proposal/social_media_followers: "one value per media and calendar year, not more than one per quarter.". --- Jura 13:11, 25 October 2021 (UTC)
- My understanding of option A comes from Special:Diff/1502261262. {{u|Sdkb}} talk 01:45, 26 October 2021 (UTC)
- Comment 2: It's unclear why we should set different rules for YouTube than for others. --- Jura 13:11, 25 October 2021 (UTC)
- I noted the impetus for this proposal above; for OTHERSTUFF, we can have separate discussions as needed. {{u|Sdkb}} talk 01:51, 26 October 2021 (UTC)
- Comment 3: overwrite (update) or add new date? Wikidata:Property_proposal/social_media_followers (after several other proposals), added new data periodically, rather than overwrite. --- Jura 13:11, 25 October 2021 (UTC)
- Personally, I would prefer adding rather than overwriting. --Gymnicus (talk) 16:31, 3 November 2021 (UTC)
- Comment 4: Every 10% leads to countless values. How many +10% are there from 10000 to 10,000,000? Exactly. --- Jura 13:11, 25 October 2021 (UTC)
- Not sure if that was a real question, but the answer is not "countless", but 72. I would guess you wouldn't ever catch every 10% jump exactly so the actual number of values over a 1000-fold growth period would probably be more like 50 at most. ArthurPSmith (talk) 14:30, 25 October 2021 (UTC)
- If it goes for youtube, what happens to other websites? So it would be 72 (or less) for each. --- Jura 14:49, 25 October 2021 (UTC)
- I guess I am ok giving deference to platforms whose data is needed/requested by the wikipedia community. Essentially: only store detailed data if it would help advance other wikimedia projects. BrokenSegue (talk) 15:07, 26 October 2021 (UTC)
- Comment 5: Some social media websites round values to one or two digits, so what would be the use case to store such detailed data here? --- Jura 13:11, 25 October 2021 (UTC)
- Not sure how this impacts things? The rounding is small enough to not impact something like a 10% rule anyways. For a yt channel with 100k followers the bottom 1000 follower data is truncated. We wouldn't have updated for a change that small anyways. BrokenSegue (talk) 15:07, 26 October 2021 (UTC)
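The figures quoted in the discussion (about 7 updates per doubling, 72 over a 1000-fold growth from 10,000 to 10,000,000) can be checked with a short calculation. This is a minimal sketch that assumes an update fires at exactly every +10% step, which is an upper bound, since a real bot would rarely catch each step exactly; the function name `updates_for_growth` is hypothetical.

```python
import math


def updates_for_growth(factor: float, step: float = 0.10) -> int:
    """Count the whole +step increases that fit into growth by `factor`.

    Solves (1 + step) ** n <= factor for the largest integer n.
    """
    return math.floor(math.log(factor) / math.log(1 + step))


print(updates_for_growth(2))     # 7: one doubling at 10% steps
print(updates_for_growth(1000))  # 72: 10,000 -> 10,000,000 at 10% steps
```

Changing `step` shows how the Option D values trade off: a smaller step such as 0.05 roughly doubles the number of stored values over the same growth.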