Wikidata:Requests for permissions/Bot/MerlIwBot
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Since there are no local bureaucrats and no local bot approval group, I'm just going to close this as a regular user. The consensus appears to be Bot approved, but not for aliases. While there is broad support for the bot itself, there is as-yet-unresolved concern about how the bot handles aliases. If the bot operator wishes to add aliases back into the task, he should reach a consensus with the people below as to how best to do that, and file a new request here. This is also Sven Manguard 15:25, 8 November 2012 (UTC)[reply]
- Operator: w:de:User:Merlissimo
- Task: automatically create items
Hi, I am Merlissimo and have an interwiki bot running on many Wikimedia wikis. It has a bot flag (either by local or global right) on all Wikipedias and some other projects. It was mainly developed to automatically solve easy langlink conflicts. For this I internally use a connection graph and Tarjan's algorithm to detect strongly connected components.
In June the Wikidata developers needed more items on their test wiki. Because of my data model it was easy to extend the framework to support Wikidata. Since June my bot has been continuously editing on the test wiki (with only some outages after API interface changes): http://wikidata-test-repo.wikimedia.de/wiki/Special:Contributions/MerlIwBot .
Of course I have improved my bot over the last months, and I think it now creates really usable items. Items contain sitelinks to strongly connected, conflict-free langlink groups, labels and aliases. For labels it uses the displaytitle or, if not available, the page title with bracketed or comma-separated suffixes removed, since these exist only for local disambiguation. Additionally, differing page titles from wikis in the same language (like enwiki, simplewiki, specieswiki, commonswiki) are added as aliases. Local redirects are also added as aliases, but only those that differ by more than letter case.
Items are created while running as a normal interwiki bot. But I also have database scripts that can check whether there are pages not linked by an item. Merlissimo (talk) 01:20, 31 October 2012 (UTC)[reply]
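For readers unfamiliar with the approach named in the request, here is a minimal sketch of grouping langlinks into strongly connected components with Tarjan's algorithm and then keeping only groups with at most one page per wiki. It is not the bot's actual code: the `(wiki, title)` node representation, the function names, and the "one page per wiki" reading of "conflict free" are all assumptions made for illustration.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. `graph` maps a node to an iterable of successors.

    Nodes are assumed (hypothetically) to be (wiki, title) tuples, and an
    edge A -> B means page A carries a langlink to page B.
    """
    index = {}        # discovery order of each visited node
    lowlink = {}      # lowest index reachable from the node's subtree
    stack = []        # nodes of the components still being built
    on_stack = set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # `node` is the root of a component: pop it off the stack.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in list(graph):
        if node not in index:
            visit(node)
    return components


def is_conflict_free(component):
    """Assumed meaning of 'conflict free': at most one page per wiki."""
    wikis = [wiki for wiki, _title in component]
    return len(wikis) == len(set(wikis))
```

With this sketch, pages that link to each other in both directions (directly or through other wikis) end up in one component, while a page that only links into a group without being linked back stays in a component of its own. The recursive form is fine for small examples; a real crawl over large link graphs would likely need an iterative variant to avoid Python's recursion limit.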
- All local redirects are added as aliases? I'm not sure that's a good idea. There are plenty of redirects that aren't just other names for the entity. --Yair rand (talk) 02:38, 31 October 2012 (UTC)[reply]
- Also, displaytitles don't actually remove the brackets, typically. And the bracket contents are also sometimes part of the name, and not just for disambiguation. Anyway, there are plenty of other instances of names being moved around to avoid duplicates, which we don't need to follow here. --Yair rand (talk) 02:44, 31 October 2012 (UTC)[reply]
- The brackets are also removed from displaytitles, of course. Removing an alias is easy for humans, because it's only one click. Adding additional aliases is much more work. All automatically imported items must be reviewed some day. Currently you can find them because of missing descriptions, and I hope FlaggedRevs will be enabled someday. Merlissimo (talk) 11:44, 31 October 2012 (UTC)[reply]
- I don't think it makes sense to add millions of items a large percentage of which will be wrong. Articles very frequently will not have the same labels as page titles minus bracketed and post-comma text. See for example Q1321, which is properly labeled "Spanish", but the English Wikipedia article is "Spanish language", because "Spanish" can also mean other things. Take a look at any significant page's redirects, and you'll see plenty of things which are not acceptable aliases, like "Spanish etymology" for "Spanish" or "About the Earth" for "Earth". I fully support this bot adding interwikis, but its adding of labels and aliases would just produce literally millions of damaging errors. --Yair rand (talk) 21:21, 31 October 2012 (UTC)[reply]
- Support - but please make some test edits first, I am curious to see what that will look like. Romaine (talk) 16:34, 31 October 2012 (UTC)[reply]
- I started on new article titles at dewiki from today. Merlissimo (talk) 18:29, 31 October 2012 (UTC)[reply]
- Comment I think we will eventually need interwiki bots, because having everyone constantly adding sitelinks floods the RecentChanges, and humans make more interwiki mistakes than well-programmed bots do. Thus, I strongly support this request, but I think we should wait to allow people to get used to Wikidata and its interface. PiRSquared17 (talk) 21:13, 31 October 2012 (UTC)[reply]
- I don't know why I am requesting a bot flag. My script has been developed and tested over months. Many people are already running unconfirmed scripts programmed by themselves or by others. The edit rate of my bot and the number of new items it creates would be much lower than those of many other "wikidataians" because my script must check every local page. Should I simply run this script on my main user account? I do not see the difference. Merlissimo (talk) 21:37, 31 October 2012 (UTC)[reply]
- I Support running this bot; it is apparently well-maintained. There is no huge difference if humans or bots just create tons of pages for the starting of Wikidata. --MF-Warburg (talk) 22:32, 31 October 2012 (UTC)[reply]
- This bot's test edits include way too many errors. Its edits need to be reviewed for problems. --Yair rand (talk) 00:18, 1 November 2012 (UTC)[reply]
- Actually, for added aliases, it would probably be better just to delete all of them. They're really riddled with mistakes, and I don't know of a good way to easily verify the correct ones. --Yair rand (talk) 00:22, 1 November 2012 (UTC)[reply]
- I excluded aliases from simplewiki because it contains spelling-mistake redirects which are not marked as such. Merlissimo (talk) 02:26, 1 November 2012 (UTC)[reply]
- Most mistakes were not from simplewiki. Copying redirects from other wikis does not result in a usable collection of aliases. --Yair rand (talk) 03:11, 1 November 2012 (UTC)[reply]
- Re Aliases: Can you check whether a redirect links to the page with or without an anchor link (e.g. MicroSoft redirects to Microsoft vs. Windows redirects to Microsoft#Windows)? I would exclude redirects which point to anchors. They won't be an alias of the item in most cases. --Saint-Louis (talk) 10:23, 1 November 2012 (UTC)[reply]
- YairRand, I don't understand your general opposition to the application. It's obvious that the main workload of migrating interwikis from the current system to Wikidata needs automated support by bots. Merlissimo has developed changes to his bot to make it compatible with Wikidata. He cooperated with the developers. As far as I'm aware he's the only one (might be wrong). For me it's obvious that he gets the flag, regardless of whether the bot creates aliases from redirects. This question is to be discussed separately. --Saint-Louis (talk) 11:36, 1 November 2012 (UTC)[reply]
- I do not object to the bot getting the bot flag to add interwikis at all. I object to it adding aliases and labels, whether it's flagged or not. Is this not something that should be discussed during the flag request before the bot becomes operational? Or is the issue of the actions of the bot considered separate? Should I open a discussion on Project chat about bot-added aliases and labels instead of discussing it here? --Yair rand (talk) 22:02, 1 November 2012 (UTC)[reply]
- Support trusted user @xqt 11:48, 1 November 2012 (UTC)[reply]
- Support trusted user and responds positively to suggestions.
@ Saint-Louis: Well, there is pywikidata. Currently the bot operator of pywikidata needs to type in every single interwiki link that he wants to add, and pywikidata does not check for interwiki conflicts. Does that make pywikidata compatible with Wikidata? I would say no.--Snaevar (talk) 13:24, 1 November 2012 (UTC)[reply]
- Support Well known and trusted user Raymond (talk) 17:16, 1 November 2012 (UTC)[reply]
- Support like MF-Warburg.--CennoxX (talk) 17:42, 1 November 2012 (UTC)[reply]
- Support, could be useful here. Ajraddatz (talk) 20:37, 1 November 2012 (UTC)[reply]
- Support --Beta16 (talk) 23:44, 1 November 2012 (UTC)[reply]
- Support Hazard-SJ ✈ 16:29, 3 November 2012 (UTC)[reply]
- Support -- Bertrand GRONDIN → (écrire) 20:13, 5 November 2012 (UTC)[reply]
- Support of course --Bene* (talk) 13:12, 6 November 2012 (UTC)[reply]
- Support trusted user Photograpers (talk) 21:53, 6 November 2012 (UTC)[reply]
- Support Good job. Jmvkrecords Intra Talk 08:13, 7 November 2012 (UTC)[reply]
- Support, but maybe it is better that the bot doesn't add local redirects (User_talk:MerlIwBot#Wrong_aliases). --Stryn (talk) 19:40, 7 November 2012 (UTC)[reply]
- Support I agree with Merlissimo: aliases can be deleted with a simple click, while adding them is much more difficult. In fact, I think that the current task forces are a waste of human resources, and I also think that they could be used instead to check bot'ted items, a faster, more comfortable and less tedious job. --Dalton2 (talk) 04:40, 8 November 2012 (UTC)[reply]
Bot flag set at Meta-Wiki. — MarcoAurelio (talk) 15:45, 8 November 2012 (UTC)[reply]
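The label and alias heuristics debated above can be sketched roughly as follows. This is only an illustration under stated assumptions, not MerlIwBot's implementation: the function names, the regular expression, and the `(redirect_title, target)` input format are hypothetical. It combines the rule from the request (strip a trailing bracketed or comma-separated suffix, skip redirects that differ only by case) with Saint-Louis's suggestion to skip redirects pointing to an anchor. As the objections in the discussion point out, such heuristics will still produce wrong labels and aliases in many cases.

```python
import re

# Assumption: a disambiguator is a single trailing "(...)" group.
_BRACKET_SUFFIX = re.compile(r"\s*\([^()]*\)\s*$")


def label_from_title(title):
    """Derive a candidate label from a page title (hypothetical helper)."""
    stripped = _BRACKET_SUFFIX.sub("", title)
    # Drop everything after the first comma, e.g. "Paris, Texas".
    stripped = stripped.split(",", 1)[0].strip()
    return stripped or title  # never return an empty label


def alias_candidates(label, redirects):
    """Filter redirects into alias candidates.

    `redirects` is an iterable of (redirect_title, target) pairs, where
    `target` may contain a "#anchor" part (assumed input format).
    """
    seen = {label.casefold()}
    aliases = []
    for source, target in redirects:
        if "#" in target:
            continue  # points to a section, probably not the same entity
        key = source.casefold()
        if key in seen:
            continue  # differs from the label or an earlier alias only by case
        seen.add(key)
        aliases.append(source)
    return aliases
```

For example, under these assumptions a redirect "Windows" targeting "Microsoft#Windows" would be dropped, as would "MicroSoft", since it matches the label "Microsoft" when compared case-insensitively.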