Wiktionary:Grease pit/2012/September

Autoformat bots


Currently we only have one autoformat bot, KassadBot (talk • contribs). It struggles a bit, both because it keeps breaking down, and just the sheer volume of pages which need formatting. Could anyone possibly run the same code with another bot? As long as it's the same code, we can run as many such bots as we like, they will never 'revert' each other. Mglovesfun (talk) 10:32, 1 September 2012 (UTC)Reply

Are you intending to run it, or are you asking for volunteers? --Μετάknowledge^{discuss/deeds} 03:33, 2 September 2012 (UTC)Reply

Was it designed to be run on more than 1 machine at a time? DTLHS (talk) 03:46, 2 September 2012 (UTC)Reply

I wouldn't know how; I'm asking. User:DTLHS not sure what you mean. Mglovesfun (talk) 10:28, 2 September 2012 (UTC)Reply

Well, would two bots running this code actually help each other, each editing the entries the other isn't? Or would they just get in each other's way, both trying to edit the same entries, and leaving the same entries untouched? —Ruakh_TALK 12:41, 2 September 2012 (UTC)Reply

Could one work from a list of, say, entries with the oldest most-recent-edits. Would the other know not to bother to check those? Breadcrumbs? DCDuring TALK 14:27, 2 September 2012 (UTC)Reply

Maybe we should dedicate one to working on [[water]]. No, really. What we would learn from making that one run might be useful for other giant entries. And having one skip water would keep that one running longer. DCDuring TALK 02:32, 5 September 2012 (UTC)Reply

Yes, KassadBot seems to die after editing water. I find a certain irony there. Mglovesfun (talk) 03:45, 5 September 2012 (UTC)Reply
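On the question above of whether two bots would just get in each other's way: one common way to avoid that is to split the work deterministically by page title, so each bot only touches its own share. The following is a minimal sketch of that idea, not KassadBot's actual code; the function names and the use of a SHA-256 hash are assumptions made for illustration.

```python
import hashlib


def owner(title, n_bots):
    """Return the index (0 .. n_bots-1) of the bot responsible for a title.

    A cryptographic hash is used only because it is stable across machines
    and Python versions (the built-in hash() is not).
    """
    digest = hashlib.sha256(title.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_bots


def my_share(titles, bot_index, n_bots):
    """Filter a list of titles down to the ones this bot should edit."""
    return [t for t in titles if owner(t, n_bots) == bot_index]


titles = ["water", "iron", "jaguar", "cougar", "fixture"]
share_a = my_share(titles, 0, 2)
share_b = my_share(titles, 1, 2)
# Every title lands in exactly one share, so the bots never collide.
```

With a split like this, neither bot needs to know what the other has already done; they only need to agree on the number of bots.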

Bantu noun class indications in translations


The {{t}} template supports a third parameter to indicate the gender, like {{t|ca|dona|f}}. The Bantu languages also have a system that could be considered 'genders' but they are called noun classes and given numbers rather than names. Would it be possible for the template to support those, as well? The gender templates could probably be named {{c1}}, {{c2}} and so on. —CodeCat 12:32, 2 September 2012 (UTC)

I thought noun-classes in Bantu languages are mostly assigned pairs of numbers; as in, there's a separate noun-class-number for the singular as for the plural. So presumably we'd actually want {{c1-2}}, {{c3-4}}, and so on. No? —Ruakh_TALK 12:40, 2 September 2012 (UTC)Reply

That's true only if the noun is countable; an uncountable noun only has one class, and so does a plurale tantum. And even then, the plural class isn't always predictable. Usually the plural is formed with the next class (odd classes being singular, even being plural) but there are nouns in Zulu in class 11 that have a plural in class 10, and there are also some class 1/6 and 9/6 nouns (not to mention the noun iso which has a suppletive plural). It's likely that other Bantu languages have similar 'irregular' pairs, but each language probably has different exceptions, which may make it unfeasible to have templates for all classes (20 or more) and for all possible/existing combinations of odd and even classes as well. It is probably easier to write the templates so that {{c1|c2}} results in c1/c2 (rather than {{m|f}} which gives m or f). —CodeCat 13:01, 2 September 2012 (UTC)
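The display rule proposed above (any combination of classes joined with a slash, rather than one template per pair) is simple enough to sketch. This is only an illustration of the intended output in Python, not template code; the function name is made up.

```python
def format_noun_classes(*classes):
    """Join noun class numbers as 'c1/c2', accepting any combination."""
    if not classes:
        raise ValueError("at least one noun class is required")
    return "/".join("c%d" % n for n in classes)


print(format_noun_classes(1, 2))    # regular singular/plural pair
print(format_noun_classes(11, 10))  # irregular pair like the Zulu case above
print(format_noun_classes(5))       # uncountable noun: a single class
```

Because the classes are passed as free parameters, irregular pairs such as 11/10 or 9/6 need no special handling.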

Another bot idea


Low priority, but for anyone who might feel like doing it: how about automatically adding those plurals that have lazily not been created because the entry already exists in another language? There are tons. It would bump up the English word count a bit too! Equinox ◑ 14:29, 2 September 2012 (UTC)Reply

The bot would have to rely on the accuracy of the templates in the entries. We know it's not foolproof, wrong entries have been deleted before. Mglovesfun (talk) 18:44, 3 September 2012 (UTC)Reply

Indeed. If there were some way to greenlink such entries, though, that'd be nice.—msh210℠ (talk) 19:48, 4 September 2012 (UTC)Reply

Frequencies of Words - we need them!


Everyone knows of the indispensable use and immense importance of Wiktionary to language learners and researchers, but there is one critical feature we need: frequency of words. Language students need to know the first 5000 lexemes in a language to survive, and given that this is a very large number considering the time it takes per lexeme, it's vital to pick the right ones. Frequencies of words are easily found and many of course are listed in Wiktionary (though admittedly I don't know where there are frequency lists of other units). I'm not a tech person so I don't know, but my question is, would it be possible and if so difficult to match frequency data to all or as many of the articles in Wiktionary as possible so entries on a page can be sorted with frequency? Thanks! — This unsigned comment was added by 82.71.66.78 (talk) at 00:40, 3 September 2012 (UTC).Reply

Frequency of words in what, though? Different words are used less or more depending on where you look. —CodeCat 01:05, 3 September 2012 (UTC)

Many major languages have definitive corpora - I know that such exist for various dialects of English, for Esperanto, for Mandarin, for Latin, and other languages. I would welcome this addition, and I would be glad to help find corpora and get them approved if someone took up the technical end. --Μετάknowledge^{discuss/deeds} 01:10, 3 September 2012 (UTC)Reply

Although much more work is needed, see Category:Basic_word_lists_by_language. --BB12 (talk) 01:52, 3 September 2012 (UTC)Reply

Corpora are nice, but word frequency varies wildly depending on what segment of the population you're looking at, and what medium, etc. Corpora based on written sources already are skewed away from colloquial and slang, and there are so many specialized vocabularies that aren't represented in proportion to their importance. I think word frequency can be misleading as a guide to which words to learn. Chuck Entz (talk) 02:43, 3 September 2012 (UTC)Reply

Well, the frequency information would have a short disclaimer in the template. Moreover, lists not exceeding a few hundred will be almost identical for the written and spoken sublects of a single dialect in most languages, because various slang terms tend to be localized, and thus the spoken variant taken as a whole is more standardised than it might seem to be. --Μετάknowledge^{discuss/deeds} 03:07, 3 September 2012 (UTC)Reply

@Chuck Entz: Indeed. Also time-period: the 500th most common word one hundred years ago is not necessarily the same as the 500th most common word today. —Ruakh_TALK 03:27, 3 September 2012 (UTC)Reply

Here's a concrete example: the International Corpus of English, which is majority transcribed spoken English. COCA and BNC would be good for dialectal information. Obviously statistics can never be perfect. However, COCA is basically complete in terms of modern AmEng, and its limitations are discussed thoroughly on its WP page. Would that be acceptable (with a templated disclaimer)? --Μετάknowledge^{discuss/deeds} 03:34, 3 September 2012 (UTC)Reply

How does licensing work? Can one of us just take their corpus, compute word-frequency rankings, and upload a list of the top 5000 lexemes to a page in the Appendix namespace? Or do we need to get some sort of permission for that? —Ruakh_TALK 03:50, 3 September 2012 (UTC)

COCA lets one download certain lists for personal use. Given that our user pages are themselves subject to our wiki-licensing, I don't see how we could download them to Wiktionary or systematically use them. We could synthesize the various lists that we get access to, but we would probably lose a lot in the synthesis. DCDuring TALK 04:06, 3 September 2012 (UTC)

The COCA project folks are jerks about sharing their data. They claim to have a robot which scours the net to make sure that you are providing a link back to their site if you use their data. They presumably also own the copyright to every piece of text they used in their corpus, hypocrites! What do they think they are protecting?! If we do use a corpus in this way, for frequency markings, I would like to suggest that we not use any disclaimer. That is needlessly complicated. Instead we should simply be very straightforward and precise about exactly what data we are providing. Instead of saying that a word is common, and then explaining how we decided that and mumbling indecisively about the caveats of such a claim, we might simply state that a word ranks #420 on a particular frequency list and allow the user to investigate the list if they want to know more. For COCA we might abbreviate this as "COCA #420" and provide a complete sentence and a link in a footnote. We could also use categories to link members of the list~~, and have the articles listed in the category as "1 the", "2 be", "127 try"~~. I believe a template can handle that. We can use COCA data but we could not include a copy of the actual list, because, like I said, they are jerks and hypocrites. Metal.lunchbox (talk) 05:14, 14 September 2012 (UTC)

Sorry, I meant use the category sort order with the rank from the frequency list. Metal.lunchbox (talk) 05:20, 14 September 2012 (UTC)Reply
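One detail worth noting about using the rank as a category sort order: sort keys are compared as text, so a bare rank would put 127 before 2. A zero-padded key avoids that. The sketch below only illustrates the padding; the function name and the width of 5 digits are assumptions.

```python
def rank_sort_key(rank, width=5):
    """Return a zero-padded sort key so text ordering matches numeric order."""
    return str(rank).zfill(width)


ranks = {"the": 1, "be": 2, "try": 127}

# Sorting the bare ranks as text gives the wrong order.
print(sorted(str(r) for r in ranks.values()))
# → ['1', '127', '2']

# Sorting by the padded key gives the intended order.
print(sorted(ranks, key=lambda w: rank_sort_key(ranks[w])))
# → ['the', 'be', 'try']
```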

Weird


Suddenly I can’t access the Wiktionary page tête de nœud. Instead, I seem to be getting a Wikisaurus page Wikisaurus:imbécile. In the tabbed languages, it lists French twice, but I can’t click either of them. If I click on the topmost edit button, I get a Wikisaurus page, but if I click on the EDIT tab at the very top of the page, then and only then can I open the page tête de nœud. But when I edit and save it, I still can’t see it; I only see the Wikisaurus page. —Stephen ^(Talk) 07:08, 3 September 2012 (UTC)

Someone used braces instead of square brackets so the Wikisaurus page was transcluded instead of linked to. Now fixed. --Yair rand (talk) 07:14, 3 September 2012 (UTC)Reply

A request I cannot meet


A message from my talk page:

Help with template

I am working on a template in my sandbox for use on another Wiktionary. Will you help me by changing the template in my sandbox (User:Trade/Template) so that it gives, for example, Rhymes: -ɪsən instead of Rhymes: -ɪsən, i.e. removing the link? I have both the template in the sandbox and the related doc. Good day. --Trade (talk) 11:46, 3 September 2012 (UTC)

I'm sorry, but I don't understand anything about templates.--Makaokalani (talk) 11:50, 3 September 2012 (UTC)Reply

Done. --Yair rand (talk) 12:41, 3 September 2012 (UTC)Reply

How to change the assisted translation adding tool?


I would like to add noun classes to the tool so that they can be added in languages that use them. Noun classes are numbered and often come in singular/plural pairs, so ideally there should be two textboxes to enter them in. Can someone help me with this? —CodeCat 19:38, 3 September 2012 (UTC)

Also, transliterations aren't wikified, but I think they're supposed to be... --BB12 (talk) 21:00, 3 September 2012 (UTC)Reply

Well, only for languages like Chinese. But that would be a change to {{t}}. --Μετάknowledge^{discuss/deeds} 01:03, 4 September 2012 (UTC)Reply

They're also normally wikified for Gothic. Anyway, I've found out how to edit the tool and I've (hopefully) made it work now. —CodeCat 11:04, 4 September 2012 (UTC)

Help with Telugu template


I’ve tried to make a "plural of" template at te:మూస:plural of that also drops the word into a Plurals category. I used <noinclude>[[]]</noinclude> around the category name, but when I look at a word that has the template, te:ఏనుగులు, the category does not appear. Is <noinclude> incorrectly formatted, or is it the wrong command? —Stephen ^(Talk) 09:37, 7 September 2012 (UTC)Reply

Wrong command. noinclude is for things that are only supposed to be on the template page, includeonly is for things only to be displayed when transcluded. --Yair rand (talk) 09:40, 7 September 2012 (UTC)Reply

That’s fixed it. Thanks. —Stephen ^(Talk) 09:45, 7 September 2012 (UTC)Reply
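To make the noinclude/includeonly distinction above concrete, here is a deliberately simplified model of it in Python. It is not how MediaWiki is implemented, it ignores other tags such as onlyinclude, and the category name is a placeholder.

```python
import re


def render(template_source, transcluded):
    """Model which parts of a template's source are used in each context.

    transcluded=True  : the template is used on another page.
    transcluded=False : the template page itself is being viewed.
    """
    if transcluded:
        # noinclude content is dropped; includeonly content is kept.
        text = re.sub(r"<noinclude>.*?</noinclude>", "", template_source, flags=re.DOTALL)
        return re.sub(r"</?includeonly>", "", text)
    # On the template page it is the other way round.
    text = re.sub(r"<includeonly>.*?</includeonly>", "", template_source, flags=re.DOTALL)
    return re.sub(r"</?noinclude>", "", text)


source = (
    "plural of X"
    "<includeonly>[[Category:Plurals]]</includeonly>"
    "<noinclude>Documentation</noinclude>"
)
print(render(source, transcluded=True))   # category reaches the entry
print(render(source, transcluded=False))  # template page shows only the docs
```

A category wrapped in noinclude therefore never reaches the entries that use the template, which matches the symptom described above.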

But now the te:మూస:plural of template has a different problem. When I use this template, for example, at te:మనుషులు, it will not accept the use of the # sign at the start of the line. I can put the # sign, but it is ignored. If I put the # sign followed by regular text, it works correctly, but when I place the template on the line, the # is ignored. If I add some regular text before the template, for example, "# a {{plural of| }}", the "# a" part works, but the template is wrapped to the next line. —Stephen ^(Talk) 06:47, 8 September 2012 (UTC)

I don't know much about templates, but that sounds like it goes to a new line before starting the text. That would mean that # followed by the template would be a # with nothing after it. Another problem: your template is setting the script for the whole page, not just its own contents. If I didn't know better, I would have thought that the page was at Telugu Wiktionary. Chuck Entz (talk) 07:12, 8 September 2012 (UTC)

Hmm, maybe. I’m not sure what that means or how to fix it. But yes, the page is at Telugu Wiktionary. I think it’s a simple template, using a span class (whatever that is), which is what our regular {{plural of}} template uses. But {{plural of}} allows the use of the # sign. My template te:మూస:plural of has something wrong with it, but I don’t know what. —Stephen ^(Talk) 07:20, 8 September 2012 (UTC)Reply

I fixed the problem. -- Liliana • 07:29, 8 September 2012 (UTC)Reply

Wow, thanks. Such a little thing to cause such a big problem. —Stephen ^(Talk) 07:35, 8 September 2012 (UTC)Reply

User:Conrad.Irwin/editor.js


Can somebody please add a feature so that if somebody tries to add a translation that includes the characters (, ), [, or ], the translation will be rejected and the edit will fail? I've seen IPs try to add translit in parentheses, thus causing a redlink. Thanks! --Μετάknowledge^{discuss/deeds} 18:30, 8 September 2012 (UTC)Reply

I think it's probably better to detect and fix these cases ourselves, after the edit has been made. Rejecting the translation will probably not end up with the result we want: either they won't add the translation at all, or they'll work around the problem in some less-than-ideal way. —Ruakh_TALK 18:40, 8 September 2012 (UTC)Reply

There are some legitimate entries with commas in the title, that's the problem. Mglovesfun (talk) 19:35, 8 September 2012 (UTC)Reply

Maybe we should use a warning/reminder, like we do for entry names starting with caps. Chuck Entz (talk) 19:45, 8 September 2012 (UTC)Reply

That sounds way better. Mglovesfun (talk) 19:46, 8 September 2012 (UTC)Reply

A warning would also be great. Does anybody know how to do this (or should I try and possibly break it)? --Μετάknowledge^{discuss/deeds} 19:51, 8 September 2012 (UTC)Reply

Please don't try-and-possibly-break-it; because it's all client-side JavaScript, with client-side caching, you can't even properly test a change after making it unless you know what you're doing. —Ruakh_TALK 22:01, 8 September 2012 (UTC)Reply

I would much rather not try-and-possibly-break-it. (I would, of course, rather that someone knowledgeable (like you) do it.) But I think clearing my cache would suffice to let me test it, and that is in fact what I would do were I to TAPBI (again, I'm not really planning to). --Μετάknowledge^{discuss/deeds} 22:04, 8 September 2012 (UTC)Reply
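The warning approach suggested above (warn rather than reject) can be sketched as follows. The real tool is client-side JavaScript in User:Conrad.Irwin/editor.js; this Python version only illustrates the check itself, and the function name and message wording are invented.

```python
SUSPECT_CHARS = "()[]"


def translation_warning(term):
    """Return a warning string if the term contains suspect characters, else None.

    The edit is never blocked here; the caller decides how to show the warning.
    """
    found = sorted({ch for ch in term if ch in SUSPECT_CHARS})
    if not found:
        return None
    return (
        "Translation contains " + " ".join(found)
        + ". Transliterations belong in their own field, not in the term."
    )


print(translation_warning("вода"))         # no warning
print(translation_warning("вода (voda)"))  # warning mentioning ( and )
```

Returning a message instead of raising an error leaves the contributor free to save anyway, which avoids the workaround problem raised earlier in the thread.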

ķīmisks


For some reason I don't understand, the above link (which I entered using {{l|lv|ķīmisks}}) works fine here -- I click on it and go to the right page -- but it doesn't on the entry page ķīmija. If I scroll down to the "Derived terms" section of ķīmija and click on the link to ķīmisks, I'm taken to ķīmija instead of to ķīmisks. Does this happen to anyone else? If so, why? --Pereru (talk) 01:18, 9 September 2012 (UTC)Reply

You will probably facepalm at this: diff —CodeCat 01:24, 9 September 2012 (UTC)

Indeed. 0_0... I must have opened that page ten times, and I didn't notice the double bar. Reminds me of when our local Alliance française had 1000 t-shirts made with the words "La Tour Eifflel" on them before someone pointed out it should be "Eiffel", not "Eifflel"... Thanks! --Pereru (talk) 01:37, 9 September 2012 (UTC)Reply

Adding words to my watchlist hangs (and other bugs)


When I click the star button at the top left to add a page to my watchlist, the star just keeps spinning and I don't get a message saying I've added the page. It does actually add the page, though. It seems that it has started doing this since the recent software update. Is anyone else having this problem? —CodeCat 19:23, 9 September 2012 (UTC)

Actually yes. Mglovesfun (talk) 19:28, 9 September 2012 (UTC)Reply

Ditto. And just before this happened, the success popup was looking so nice... --Μετάknowledge^{discuss/deeds} 19:34, 9 September 2012 (UTC)Reply

Tritto? (Ditto thirded!) - -sche (discuss) 20:51, 9 September 2012 (UTC)Reply

I have noticed that when I click "unwatch" on a particular page, it seems to hang on the "unwatching..." caption and never produces the success message. I have not checked whether the unwatch actually succeeds. Equinox ◑ 20:44, 9 September 2012 (UTC)Reply

It does (for me at least). —Angr 20:49, 9 September 2012 (UTC)

My impression is that the change is submitted to the database, but that the display isn't updated. Refreshing the page or closing it and reopening it will show that the action succeeded. Also, when I get the spinning icon, it's not really a "hang", because I can click links, use the search box, etc. Chuck Entz (talk) 20:54, 9 September 2012 (UTC)

Yeah, I don't know how the JavaScript/AJAX stuff actually works, but I didn't mean it hangs the browser: it doesn't. The thread/process/whatever that is reporting the progress of the unwatch operation just doesn't seem to proceed beyond "unwatching...". Equinox ◑ 21:13, 9 September 2012 (UTC)Reply

Bugzilla:40103. --Yair rand 21:07, 9 September 2012 (UTC)

I think I noticed another problem. When I delete a page, I get a Wikimedia error page. But the page itself does get deleted. Maybe these two problems are related, and the page that is loaded 'behind the scenes' by the watchlist button also triggers this error, but we don't see it? —CodeCat 00:10, 11 September 2012 (UTC)

No, the watchlist star problem is a javascript issue. The removal of $.parseHTML caused an error in the code for displaying the little box, so it breaks before it can stop the star spinning (I think). --Yair rand (talk) 01:05, 11 September 2012 (UTC)Reply

It works again. :) That is, the star stops spinning after a short time and informs me that the page has been added to my watchlist. - -sche (discuss) 21:05, 12 September 2012 (UTC)Reply

Extracting an IPA table from English Wiktionary


Hello folks,

I am working on a project that's related to teaching English to Chinese teachers and students. I have a use for a table of English words and their pronunciation transcribed in IPA. Does such a table exist here on Wiktionary? Is it possible to produce one? If not, any suggestion about where I might go for such a thing would be appreciated. Licensing doesn't really matter because it would be public domain information and not subject to copyright, so any place where you think I might be able to scrape the data would also be a helpful suggestion. I would also be posting a copy of such a table in CSV format on the project's website in case any other developers have a use for it. Any ideas that you think will help are welcome.

Metal.lunchbox (talk) 06:58, 10 September 2012 (UTC)Reply

Appendix:English pronunciation? Mglovesfun (talk) 08:52, 10 September 2012 (UTC)Reply

Thanks for the tip, but I'm looking for words, not phonemes. What I'm looking for is a machine-readable dictionary with English words matched with their pronunciation transcribed in IPA. So far I've found CMUdict, which uses arpabet. Translating this to IPA is not impossible but it means I have to accept certain inaccuracies, like not marking syllable stress the way the IPA prescribes. That would require teaching the machine to guess syllable boundaries, which I am not confident I can do accurately. I ask here, because I wonder if there is a way to extract pronunciation data from Wiktionary. Does anyone know? Metal.lunchbox (talk) 14:35, 10 September 2012 (UTC)Reply

Check http://dumps.wikimedia.org/backup-index.html for this site's most recent dump; get the pages-articles file; search it for ==English== followed by (without an intervening ==Anything== with only two equal signs on each side) {{IPA| and then, before any }, the specific IPA symbol you seek.—msh210℠ (talk) 16:33, 10 September 2012 (UTC)Reply

Thank you. I'll look into approaching this task this way. Your suggestion might be the best option available. It's just so surprising that there isn't already a good table of such information available. So many applications could use IPA to describe English pronunciation and despite the 15 years it took, Unicode is actually everywhere. Metal.lunchbox (talk) 03:57, 11 September 2012 (UTC)

I was able to extract a table of English words and their IPA transcriptions using the above tips, but there must be something wrong with my regular expression skills or the logic I'm using to analyze the dump. Out of 112 million lines of text I found a little over 300,000 which contain an IPA tag. Looking at only those in mainspace articles (not talk pages, templates, etc.) in the "English" pronunciation section and not labeled as representing a language other than English I found a little over 32,000 transcriptions representing 26,579 distinct English words. That means that only 26,579 English Wiktionary articles have proper IPA transcriptions. At first I thought this was wrong, but after looking at the dictionary and my data some more, I think that this is simply an area where Wiktionary can continue to be improved. So I added an IPA transcription for fixture. That's my first contribution to Wiktionary! Hooray for collaboration! Metal.lunchbox (talk) 04:47, 13 September 2012 (UTC)
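The recipe described in this thread (find ==English==, stop at the next level-2 header, collect {{IPA|...}}) can be sketched per page as below. This is a minimal illustration, not the script actually used above; it assumes the wikitext of one page is already in a string, it treats any template argument containing "=" as a named parameter such as lang=en, and the sample page text is made up.

```python
import re

# A level-2 header has exactly two '=' on each side, e.g. ==English==.
LEVEL2 = re.compile(r"^==([^=].*?)==\s*$", re.MULTILINE)
IPA_CALL = re.compile(r"\{\{IPA\|([^{}]*)\}\}")


def english_ipa(wikitext):
    """Return the IPA transcriptions found in the ==English== section."""
    headers = list(LEVEL2.finditer(wikitext))
    for i, header in enumerate(headers):
        if header.group(1).strip() != "English":
            continue
        end = headers[i + 1].start() if i + 1 < len(headers) else len(wikitext)
        section = wikitext[header.end():end]
        found = []
        for call in IPA_CALL.finditer(section):
            for arg in call.group(1).split("|"):
                arg = arg.strip()
                if arg and "=" not in arg:  # skip named parameters
                    found.append(arg)
        return found
    return []


sample = """==English==

===Pronunciation===
* {{IPA|/ˈfɪkstʃə/|lang=en}}

==French==

===Pronunciation===
* {{IPA|/fik.styʁ/|lang=fr}}
"""
print(english_ipa(sample))
```

Stopping at the next level-2 header is what keeps transcriptions from other language sections on the same page out of the result.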

Showing usage of a word graphically


Hello,

how is the idea proposed that each English term gets a little balk of how much it is spoken.

For the 100 basic English terms I'd suggest 5 balks.
For the 1000 basic English terms I'd suggest 4 balks.
For the English terms with at least one normal meaning I'd suggest 3 balks.
For English terms with at least one archaic meaning and no normal meaning I'd suggest 2 balks.
For English terms with only obsolete meanings I'd suggest 1 balk.

I don't know yet what balk we should use this for, but firstly is this possible?

Greetings HeliosX (talk) 05:13, 11 September 2012 (UTC)Reply

What sense of balk is this? I don't understand what you mean.—msh210℠ (talk) 05:31, 11 September 2012 (UTC)

As HeliosX is a native German speaker, I think the word he's going for is Balken (“beam, bar”) (cognate with defn. 2 of balk), used not only of wooden beams in German but also of little bars like on mobile phones or on progress bars in software applications. —Angr 12:52, 11 September 2012 (UTC)

We could use this list from COCA of the top 60,000 lemmas in their corpus to provide the data for such a project. Some of the items included seem suspect and some might not meet CFI. They have some caveats about PoS as well.

A scheme of at least one positive mark for anything in their list and additional marks for higher frequency terms (up to three) seems adequate and feasible. I think our obsolete and rare tags provide enough warning on the other side. DCDuring TALK 23:20, 12 September 2012 (UTC)
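The last scheme (one mark for anything in the list, more for higher-frequency terms, three at most) could look roughly like this. The cut-off ranks of 1,000 and 10,000 are purely illustrative assumptions, as is the function name; nothing in the discussion fixes them.

```python
def frequency_marks(rank, thresholds=(1000, 10000)):
    """Map a frequency-list rank to a number of marks.

    rank is the 1-based position in the list, or None if the term is absent.
    Terms in the list get one mark, plus one for each threshold they beat.
    """
    if rank is None:
        return 0
    return 1 + sum(rank <= t for t in thresholds)


print(frequency_marks(420))     # very common term
print(frequency_marks(5000))    # mid-frequency term
print(frequency_marks(50000))   # in the list, but rare
print(frequency_marks(None))    # not in the list at all
```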

Babel templates 4 knowledge levels of the Ethiopic script R missing


Who is willing 2 help? I was sent here 2 ask 4 help since I M no expert in designing templates and M hereby claiming the need 4 templates 4 Ethiopic: --IM Serious (talk) 10:41, 11 September 2012 (UTC)Reply

Aedini miscategorized


I don't understand why this Translingual term is categorized in Category:English terms derived from New Latin. AFAICT no other such term is so categorized. I think I have not templated this in any unique way. I could not find a recent change to the templates it transcluded. DCDuring TALK 19:35, 11 September 2012 (UTC)

Fixed; see Aedini?diff=18112881. —Ruakh_TALK 19:48, 11 September 2012 (UTC)Reply

I was afraid that it was a mistake I couldn't see. Thanks. DCDuring TALK 19:51, 11 September 2012 (UTC)Reply

Bug in Random Entry feature...


This is a new bug that was not occurring a few months ago or any time prior to that.

If you go to Random Entry, by language, and select English, it utilizes this link. For years, this has provided a reliable way to find random words. However now, ~90% of the results are words that start with the letters 'ab', another ~7% of the results are words that start with 'a' and then a letter other than 'b', and ~3% are words that don't begin with 'a' (but are not very random: they are often months, or the words 'raven', 'crow', 'raining cats and dogs', 'thesaurus', 'adjective', and related words).

Kingturtle (talk) 01:11, 12 September 2012 (UTC)Reply

See Wiktionary:Grease pit/2012/August#"Random in English" only returns words starting with "ab". --Μετάknowledge^{discuss/deeds} 01:18, 12 September 2012 (UTC)Reply

I really miss the "nearby words" feature. It showed me a lot of good stuff. What can we do about this? Equinox ◑ 01:35, 12 September 2012 (UTC)Reply

This was brought up on IRC and I asked a few times for an example of a nearby link but got no answer. I can't find them. Have they been removed/disabled because they don't work? In particular I'm wondering if they use the toolserver at all and how. (The randompage link does use the toolserver.) --Jeremyb (talk) 12:16, 12 September 2012 (UTC)

Re: "I can't find them": They're an optional feature; to turn them on, visit Wiktionary:Per-browser preferences, check "Use the preferences set on this page" and "Add links to previous and next pages", and click on the "Save settings" link. Re: "I'm wondering if they use the toolserver at all and how": Yes, they use http://toolserver.org/~hippietrail/nearbypages.fcgi?langname=...&term=...&num=4&callback=wiktNearby.callback or http://toolserver.org/~hippietrail/nearbypages.fcgi?langname=...&term=...&num=4&seq=...&callback=wiktNearby.callback. (See User:Hippietrail/nearbypages.js.) —Ruakh_TALK 12:27, 12 September 2012 (UTC)Reply

watchlist deletions


Why doesn't the watchlist show when pages on your watchlist are deleted? Wikipedia's does. --WikiTiki89 (talk) 14:06, 13 September 2012 (UTC)Reply

It's bug 33591. --Yair rand (talk) 15:19, 13 September 2012 (UTC)Reply

That's from March. Is anyone actually working on fixing it? --WikiTiki89 (talk) 15:21, 13 September 2012 (UTC)Reply

Nope. The bug is currently assigned to "Nobody". --Yair rand (talk) 15:22, 13 September 2012 (UTC)Reply

IPv6 addresses


Hi there. Is User:2A02:2F02:C021:F012:0:0:4F76:F560 one of these new IP addresses, even though it looks like a User name?

If so (and it does indeed seem to be), do we have any advice on blocking them. I imagine that range-blocks would be a nightmare. SemperBlotto (talk) 07:53, 15 September 2012 (UTC)Reply

Answering myself, I have updated Wiktionary:Range blocks accordingly. SemperBlotto (talk) 08:07, 15 September 2012 (UTC)Reply

In case someone should wonder how to check out whether an ID is an IP or a user name: do a regular search on it, and see where it takes you. 2A02:2F02:C021:F012:0:0:4F76:F561 is an IP (lands at a contributions page), 2A02:2F02:C021:Å012:0:0:4F76:F561 is not (lands at a regular search result page). That said, I don't know whether usernames that look like IPv6 actually are possible/permitted. Njardarlogar (talk) 15:29, 15 September 2012 (UTC)Reply

We don't allow it, though it's possible to slip it by the system's restrictions by using slightly different characters. When the new IPs first came out on Mediawiki, I permablocked just such a fake IP. As for how to tell: no need to search- IPs are bluelinks that go directly to the contributions, while non-IPs go to a user page and can be redlinks (there is at least one template that erroneously creates redlinks for IPs, though). Chuck Entz (talk) 15:53, 15 September 2012 (UTC)Reply

We could always just block users from registering names with more than, say, four : characters. I doubt that it will cause any problems for legitimate user names, and if it does, they could always try another name. —CodeCat 18:17, 16 September 2012 (UTC)
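Both checks discussed in this thread can be sketched offline in Python: whether a name parses as a real IPv6 address, and the proposed "more than four colons" rule for registration. This uses the standard ipaddress module and is only an illustration; it is not how MediaWiki decides, and the function names are made up.

```python
import ipaddress


def is_ipv6(name):
    """True if the name parses as a valid IPv6 address."""
    try:
        ipaddress.IPv6Address(name)
    except ValueError:
        return False
    return True


def too_many_colons(name, limit=4):
    """The proposed registration rule: reject names with more than `limit` colons."""
    return name.count(":") > limit


real = "2A02:2F02:C021:F012:0:0:4F76:F561"
fake = "2A02:2F02:C021:Å012:0:0:4F76:F561"  # lookalike from the thread above

print(is_ipv6(real), is_ipv6(fake))
print(too_many_colons(fake))  # the colon rule would still catch the lookalike
```

The lookalike fails the address parse but still trips the colon rule, which is the case the rule is meant to cover.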

Template:yi-personal pronouns


Er, stupid question, but what do I do with all the whitespace? Is there a graceful way to make it disappear (i.e., not by stretching out the plural sections)? --Μετάknowledge^{discuss/deeds} 07:26, 16 September 2012 (UTC)

Actually, I'd think stretching out the 3rd person plural section would be a graceful (and appropriate: emphasises that the plural applies to all genders) way of filling it. Or do you just want to colour it? - -sche (discuss) 07:52, 16 September 2012 (UTC)

My dream was to combine it with {{yi-possessive pronouns}}, actually, but that looked even worse. I guess I'll just stretch it out for now, but if you get any ideas... --Μετάknowledge^{discuss/deeds} 17:25, 16 September 2012 (UTC)Reply

You could add in just the lemma form (like מײַן (mayn)) for each person and still keep {{yi-possessive pronouns}} as a separate template, to which, by the way, you need to add gender distinctions. Even though gender distinctions are lost when used before a noun, they are kept in other cases (e.g. אוי מאַמע מײַנע!, דאָס איז מײַנע.). --WikiTiki89 (talk) 10:50, 18 September 2012 (UTC)Reply

Will do. Please forgive my foolish mistake, because although I have heard Yiddish my whole life, I don't know a single person who is fluent, and I only started learning in earnest last week. --Μετάknowledge^{discuss/deeds} 23:54, 18 September 2012 (UTC)Reply

Well, my thought process was that I'd leave the template with lemma forms only, and use adjective declension templates for the pages themselves. Lo and behold, the adjective templates are a total mess, mainly composed of a very incomplete and rather idiosyncratic few written by Dick Laurent (talk • contribs). Currently, I'm trying to write templates for every possible declension as outlined by yiddishdictionary.com, but I'm having a bit of trouble. Can somebody please tell me what's wrong with {{yi-adj-final}} that's making the transclusion at מענלעך behave so strangely? --Μετάknowledge^{discuss/deeds} 01:10, 19 September 2012 (UTC)

Re: {{yi-adj-final}}: the main problem was that various instances of what should be {{{1}}} and {{{2}}} were instead {{1}} and {{2}}. (This was made less-than-obvious by the fact that {{1}} actually exists, and creates a link based on the current pagename.) I've fixed it now. —Ruakh_TALK 01:41, 19 September 2012 (UTC)Reply

That's funny. I noticed that problem myself, and I went through and fixed them (they were a negligent copy-paste error), but I obviously missed most of 'em, now that I re-examine the diffs. I thought that I was plateauing in terms of my ability to handle template syntax, but this thread seems to show that instead, I'm plateauing in my ability to handle simple, stupid problems that I theoretically know how to fix (and thinking up ridiculous reasons for PIBCAKs like this). Thank you! --Μετάknowledge^{discuss/deeds} 02:46, 19 September 2012 (UTC)Reply
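
For anyone chasing the same copy-paste error in other templates, here is a minimal Python sketch of the check described above: it flags `{{1}}`, `{{2}}` and so on where `{{{1}}}`, `{{{2}}}` were probably intended. The function names are made up for illustration, and the sketch assumes that a bare number in double braces is never intentional in the template being checked (which is not true everywhere, since {{1}} is a real template), so the output should be reviewed by hand.

```python
import re

# Matches {{1}}, {{2}}, ... with exactly two braces on each side,
# i.e. NOT the inner part of a proper parameter like {{{1}}}.
DOUBLE_BRACE_NUMBER = re.compile(r"(?<!\{)\{\{(\d+)\}\}(?!\})")


def find_suspect_parameters(wikitext):
    """Return (line_number, matched_text) pairs for likely mistyped parameters."""
    hits = []
    for line_number, line in enumerate(wikitext.splitlines(), start=1):
        for match in DOUBLE_BRACE_NUMBER.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


def fix_suspect_parameters(wikitext):
    """Rewrite {{N}} as {{{N}}}; check the result by hand before saving."""
    return DOUBLE_BRACE_NUMBER.sub(r"{{{\1}}}", wikitext)
```

Running `find_suspect_parameters` on a template's source first, and only then applying `fix_suspect_parameters`, keeps deliberate uses of {{1}} from being rewritten by accident.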

Translation boxes' greyness does not fill entire 'slot'

This phenomenon occurs for me (Windows, Firefox and Opera) on some entries, e.g. [[jaguar]] and [[cougar]], but not others, e.g. [[iron]]. Does it occur for anyone else? - -sche (discuss) 00:10, 18 September 2012 (UTC)Reply

Yes, on all pages (it only appears if you edit or preview the page). DTLHS (talk) 00:12, 18 September 2012 (UTC)Reply

This appears to be due to a MediaWiki software change. {{trans-top}} contains the wikitext <div class="NavHead" align="left">, which used to result in the HTML <div class="NavHead" align="left"> (that is, it was basically left alone), but which now results in the HTML <div class="NavHead" style="float: left;">. Since the result of wikitext-handling is generally cached on the server-side until a page is edited, not all pages have yet been affected by the change. —Ruakh_TALK 02:55, 18 September 2012 (UTC)Reply

I've now fixed that problem by editing {{trans-top}} to use style="text-align: left" rather than align="left". (Note: the edit will take a while to clear the job queue.) However, we should keep our eyes open for other consequences of this software change. It's kind of a bizarre change, since I don't think that align="left" and style="float: left" are ever semantically equivalent, are they? —Ruakh_TALK 03:00, 18 September 2012 (UTC)Reply

{{trans-see}} is also affected by this. DTLHS (talk) 08:02, 18 September 2012 (UTC)Reply

I've now edited {{trans-see}}, as well as {{rel-top}} and {{der-top}}. Dunno what else to check. —Ruakh_TALK 12:22, 18 September 2012 (UTC)Reply

Is there common functionality between these templates that can be moved to a master template? DTLHS (talk) 16:44, 18 September 2012 (UTC)Reply

The fix also needs to be applied to {{rel-top3}}, {{rel-top4}}, and {{rel-top5}}. {{rel-top3}} in particular is now widely used for Derived Terms. See, for example, elbow. · (talk) 16:50, 18 September 2012 (UTC)Reply

Update: The software seems to have been fixed. The wikitext <div class="NavHead" align="left"> now results in the HTML <div class="NavHead" style="text-align: left;">. So, we don't need to modify any more templates, but we might want to make null-edits to existing affected pages (for the same reason as above). —Ruakh_TALK 19:21, 18 September 2012 (UTC)Reply
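
For reference, the manual template fix described in this thread (replacing `align="left"` with `style="text-align: left"`) can be sketched in Python as below. This is only an illustration, not what was actually run: the function name is made up, it handles only `left` and `right`, and it deliberately skips any `<div>` that already has a `style` attribute rather than trying to merge the two.

```python
import re

# Matches an opening <div ...> tag that carries align="left" or align="right".
DIV_WITH_ALIGN = re.compile(
    r'<div\b(?P<before>[^>]*?)\s+align="(?P<value>left|right)"(?P<after>[^>]*)>',
    re.IGNORECASE,
)


def replace_align_attribute(wikitext):
    """Swap align="X" on <div> tags for style="text-align: X"."""

    def rewrite(match):
        before, after = match.group("before"), match.group("after")
        if "style=" in (before + after).lower():
            # Merging into an existing style attribute is out of scope here.
            return match.group(0)
        value = match.group("value").lower()
        return f'<div{before} style="text-align: {value}"{after}>'

    return DIV_WITH_ALIGN.sub(rewrite, wikitext)
```

Applied to the {{trans-top}} line quoted above, this turns `<div class="NavHead" align="left">` into `<div class="NavHead" style="text-align: left">`.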

Template:homophonecat

Can this be updated so that the links to other categories go to the specified language? Category:Russian terms with homophones currently links to English categories (one a redlink!). - -sche (discuss) 02:44, 18 September 2012 (UTC)Reply

The problem was actually in {{nyms}}, which only worked for English. Mglovesfun (talk) 09:32, 21 September 2012 (UTC)Reply

Telugu etymology template

Can someone help me prepare an etymology template for the Telugu Wiktionary that works for Sanskrit and English to begin with? Later on we can expand it to other languages. Thank you. Rajasekhar1961 (talk) 05:52, 18 September 2012 (UTC)Reply

context used outside of definitions

Per User talk:Mglovesfun#patskanis, could someone come up with a list of all the {{context}} labels used in the main namespace outside of definitions? And yes, I really should learn Perl and do it myself. Mglovesfun (talk) 09:25, 21 September 2012 (UTC)Reply

Wiktionary:Todo/context outside of definitions. 926 instances of {{context}} occurring on a line not starting with "#" (I didn't look at the other 440 context templates) DTLHS (talk) 02:17, 22 September 2012 (UTC)Reply

The others aren't as likely to be abused because they come with formatting that makes them seem inappropriate. Specific abuses include use instead of {{sense}} under Synonyms and instead of {{a}} or {{qualifier}} or others under Translations and on inflection lines. It seems as if certain editors go through a period of using it, believing it to be preferred to hard formatting, before they learn of other ways of getting the same formatting results. DCDuring TALK 13:49, 22 September 2012 (UTC)Reply

Sense instead of a? I think you mean the opposite, right? Mglovesfun (talk) 20:50, 22 September 2012 (UTC)Reply

No, context instead of sense, and context instead of a or qualifier. Chuck Entz (talk) 21:37, 22 September 2012 (UTC)Reply

That list is so big I think it'd be good to divide it up into lots of 25 like Wiktionary:Todo/needed trans templates. Mglovesfun (talk) 10:18, 23 September 2012 (UTC)Reply

Also, as with many such lists it would be handy to have it sorted by the L2 section header that contained the offending item. I understand that there may be no convenient way to do that.

Any sectioning is handy to facilitate removing items. DCDuring TALK 14:20, 23 September 2012 (UTC)Reply
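
The criterion used for the list above ({{context}} on a line not starting with "#"), together with the two requests that followed (grouping by L2 header, and splitting into lots of 25), could be sketched in Python roughly as follows. This is a hypothetical illustration, not the script that generated Wiktionary:Todo/context outside of definitions; the function names are invented, and it assumes each page's wikitext is already available as a string (for example from a dump).

```python
import re

# An L2 (language) header has exactly two equals signs on each side.
L2_HEADER = re.compile(r"^==([^=].*?)==\s*$")


def context_outside_definitions(title, wikitext):
    """Yield (language, title, line) for {{context}} uses on non-definition lines."""
    language = None
    for line in wikitext.splitlines():
        header = L2_HEADER.match(line)
        if header:
            language = header.group(1).strip()
            continue
        # Definition lines (and their examples and quotations) start with "#".
        if "{{context|" in line and not line.startswith("#"):
            yield (language, title, line.strip())


def chunked(items, size=25):
    """Split a list into consecutive lots of `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

Collecting the tuples for all pages, sorting them on the language field, and passing the result through `chunked` would give sections of 25 that are easy to strike off as they are fixed.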

is there a wiktionary of 21st century slang?

hello wiktionary!

i thought i saw a wiktionary of modern slang, but i don't know where to find it. can you assist ? thank you.

kevin

I don't think so, but you might be interested in Category:English neologisms and Category:English internet slang. Equinox ◑ 15:19, 21 September 2012 (UTC)Reply

Urban Dictionary? - -sche (discuss) 15:32, 21 September 2012 (UTC)Reply

UD accepts anything, though, and most of the words are not in actual use, or only by a handful of people. Equinox ◑ 15:34, 21 September 2012 (UTC)Reply

"Mark as patrolled" often gives errors; sometimes "rollback" does, too.

Is anyone else experiencing the problem that "mark as patrolled" gives a WMF error, and doesn't actually mark the edit as patrolled? It happens to me about 25% of the time, I think. It happens especially often when the edit is several days old. Fortunately, the problem doesn't affect edits marked via the API instead of the UI, so it doesn't affect the M buttons on recent-changes and contributions-pages and so on, but even so, it's seriously affecting my ability to patrol. (It started several days ago.) —Ruakh_TALK 19:50, 23 September 2012 (UTC)Reply

Yeah, I get this error, but not (as you say) for the blue M button in Recent Changes. Equinox ◑ 20:13, 23 September 2012 (UTC)Reply

I never get any error messages, but I think it also never patrols the edit. I always use the M button. Mglovesfun (talk) 21:09, 23 September 2012 (UTC)Reply

O.K., I've created bugzilla:40481: "UI giving fatal error when trying to mark certain edits as patrolled. (en.wikt)" —Ruakh_TALK 18:31, 24 September 2012 (UTC)Reply

It's been happening to me today, mostly after a (successful) page deletion. SemperBlotto (talk) 18:50, 24 September 2012 (UTC)Reply

Tocharian script

Hello,

I just discovered that the West Tocharian and East Tocharian languages both have a scribe, and we don't actually use it in any entry at all. Here is a link to [the Tocharian scribe]. I don't know whether there is a font for the Tocharian scribe; if not, I might email Unicode to have it included. But first we need to find out: is there a font for the Tocharian scribe?

Actually I can't reply on this page, so I'll reply on your talk page.

Greetings HeliosX (talk) 16:34, 24 September 2012 (UTC)Reply

Tocharian is, as of yet, missing from Unicode, so there isn't any way to use it in entries I am afraid. -- Liliana • 17:55, 24 September 2012 (UTC)Reply

By "scribe" you mean "script", yeah? - -sche (discuss) 20:20, 24 September 2012 (UTC)Reply

Yes. I'm currently working on putting all the Tocharian symbols in a folder. If the author of the symbols allows me to use them, can someone make a font from those symbols (they're GIF files) that can be included in Unicode?

Greetings HeliosX (talk) 14:29, 25 September 2012 (UTC)Reply

Unicode doesn't worry about specific fonts; it just assigns code points to specific characters. So, if and when Tocharian script gets into Unicode, there will be a code point named TOCHARIAN LETTER KA or the like, but Unicode won't decide what exactly that character is to look like; that's up to font designers. If you want to design a font that includes Tocharian now, you can, but you would need to put it in the Private Use area until Tocharian gets added. I don't think Wiktionary is the place to look for people to make a font for you; we're lexicographers, not font designers. And Wiktionary still won't be able to use Tocharian letters until they're in Unicode proper, not just in the Private Use area of one font. Incidentally, a more complete set of Tocharian script characters can be found here. —An gr 14:53, 25 September 2012 (UTC)Reply

If he is serious about a proposal to Unicode (but it is hard to write one, be warned!), then he will need to submit a font. I think I'm the only one around here who has experience with fonts, but I am bad at drawing. -- Liliana • 15:36, 25 September 2012 (UTC)Reply

You don't need to draw. I don't need a font designer to do any actual drawing, but someone who can join the symbols I gather as GIF files into a font. First of all, though, we'd need permission from Jost Gippert to use the GIF files he made; I have already sent him an email. Nonetheless, I can draw the symbols myself if he refuses or doesn't answer, but I can't join them into a font.

Greetings HeliosX (talk) 16:53, 25 September 2012 (UTC)Reply

Surely the shapes of letters used in century-old manuscripts are in the public domain. —An gr 17:21, 25 September 2012 (UTC)Reply

The shapes are, yes, but not the result of someone drawing the shapes in their own particular way. —CodeCa t 13:21, 27 September 2012 (UTC)Reply

I'm currently writing a proposal, but is it necessary to have consulted experts on Tocharian? And Angr, are the symbols copyrighted given that he drew them?

Greetings HeliosX (talk) 13:19, 27 September 2012 (UTC)Reply
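
To illustrate the point made above about code points versus fonts, here is a small Python sketch of what a Private Use Area assignment amounts to. The letter names and their order are hypothetical (nothing here reflects an actual or proposed Unicode encoding of Tocharian); only the Private Use Area range U+E000 to U+F8FF and the behaviour of the standard `unicodedata` module are taken as given.

```python
import unicodedata

# Private Use Area in the Basic Multilingual Plane.
PUA_START, PUA_END = 0xE000, 0xF8FF

# Hypothetical letter names and ordering, for illustration only.
HYPOTHETICAL_LETTERS = ["KA", "KHA", "GA"]


def assign_private_use(names, start=PUA_START):
    """Map each name to a consecutive Private Use Area character."""
    if start < PUA_START or start + len(names) - 1 > PUA_END:
        raise ValueError("assignment falls outside the Private Use Area")
    return {name: chr(start + offset) for offset, name in enumerate(names)}


mapping = assign_private_use(HYPOTHETICAL_LETTERS)
for name, char in mapping.items():
    # Category "Co" means "private use"; such characters have no standard name,
    # so what they look like depends entirely on the font in use.
    print(name, f"U+{ord(char):04X}", unicodedata.category(char),
          unicodedata.name(char, "<no name>"))
```

Because these code points mean whatever a given font says they mean, text using them is unreadable to anyone without that exact font, which is why entries would have to wait for a real encoding.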

template:he-proper noun

Why does this not accept dwv? --WikiTiki89 (talk) 16:10, 27 September 2012 (UTC)Reply

Finno-Ugric as language template

Hello,

Is Finno-Ugric available as a language template, for example for use in {{attention|fiu}}?

Greetings HeliosX (talk) 18:20, 27 September 2012 (UTC)Reply

It's not a language, so no. What is that meant to call attention to? —CodeCa t 18:46, 27 September 2012 (UTC)Reply

Spam Page Titles

It looks like spammers have come up with a new strategy: moving high-traffic pages to names containing spam. It's already been suggested that we might have to ban moves by new accounts. I would like to suggest banning html from page titles. Is there any legitimate reason for html there, and, if not, can we implement such a ban? Chuck Entz (talk) 18:59, 28 September 2012 (UTC)Reply

You can't use HTML in page titles, because > and < cannot be used. And moves are already banned for new accounts, you need to wait four days before you can move pages. -- Liliana • 19:04, 28 September 2012 (UTC)Reply

I think Chuck means we should ban the string html, probably by local blacklist, with any pages that are supposed to contain it locally whitelisted. - -sche (discuss) 19:38, 28 September 2012 (UTC)Reply

Is four days enough for the new registered-user move restriction? Or should only whitelisted users be allowed to move pages? DCDuring TALK 21:05, 28 September 2012 (UTC)Reply

I like your idea. -- Liliana • 22:19, 28 September 2012 (UTC)Reply

Indeed, all of the accounts that took part in the move-spamming were created in January. Chuck Entz (talk) 00:34, 29 September 2012 (UTC)Reply

No, I actually misread the move log. Apparently the html is in the summary line. That can be dealt with by hiding the summary. We still need to come up with a way to address this, but that isn't it. Perhaps we should create a list of pages that shouldn't be moved, ordered by importance, and start protecting them admin-move-only. Chuck Entz (talk) 21:20, 28 September 2012 (UTC)Reply

To elaborate: there is absolutely no reason to move a page like write, since it's so widely attested with that spelling. It will no doubt change its content repeatedly and often, but there will always be an entry at that location. If a reason came up due to some technical consideration, an admin would probably be involved, anyway. The pages that are high-value targets for spammers are probably all the same in that respect, with the exception of some that might not meet CFI. We will probably also have to look back through the redirects resulting from rfd, rfv and rfm and see which ones need to be protected (a redirect can be easily hijacked by a simple page edit to the redirect page). Chuck Entz (talk) 22:27, 28 September 2012 (UTC)Reply

All 722 of the "1000 most basic English words" should probably be made immovable. - -sche (discuss) 00:01, 29 September 2012 (UTC)Reply

But they probably would include only a small number of the top one hundred words looked up, based on the highly suggestive count of actual page hits (see Category:Vulgarities by language) done some years ago. DCDuring TALK 02:12, 29 September 2012 (UTC)Reply

Hm, good point. Wikipedia has a tool that lets users check the number of hits entries get from outside links. I presume that tool is also available to us...? - -sche (discuss) 02:53, 29 September 2012 (UTC)Reply

"All 722 of the 1000 most basic English words" LOL. Mglovesfun (talk) 11:05, 30 September 2012 (UTC)Reply

Support limiting moves to whitelist: There's really no need for anybody else to move pages anyway. If a page is wrongly titled, it would most likely be fixed by a whitelister anyway. Purplebackpack89 ^{(Notes Taken) (Locker)} 21:34, 2 October 2012 (UTC)Reply

This just happened again. Support. —CodeCa t 13:42, 3 October 2012 (UTC)Reply

Support a whitelist. Forceeterna (talk • contribs • global account info • deleted contribs • nuke • abuse filter log • page moves • block • block log • active blocks), for example, created a sleeper account over 9 months ago and just started using it to make spammy page moves today. —An gr 13:50, 3 October 2012 (UTC)Reply

IPA /ɡ/

Can someone make a bot change all instances of the letter "g" to the proper IPA letter "ɡ" within {{IPA}} and {{IPAchar}}? --WikiTiki89 (talk) 22:20, 30 September 2012 (UTC)Reply

Are you sure that's right? I thought that IPA had abandoned the distinction between open-tailed and closed-tailed G, so we can just use ASCII g. No? —Ruakh_TALK 22:46, 30 September 2012 (UTC)Reply

I don't think it's a distinction, just that open-tailed is preferable. But regardless, we should use either one or the other for all our IPA, and I think it makes more sense to use the more correct form. --WikiTiki89 (talk) 22:59, 30 September 2012 (UTC)Reply

Does it depend on what font you use? There's no difference between the two letters on my monitor. BigDom (t • c) 08:15, 1 October 2012 (UTC)Reply

Yes. In some fonts, the two characters are identical; in others they're different. For me, they're identical in the font I use to view text normally, but they're different in the monospaced font I use in the edit box. Can you see the difference between g and ɡ? —An gr 09:43, 1 October 2012 (UTC)Reply

Ah thanks, I see the difference now. They're both the same in my monospaced font as well, you see. I suppose we might as well change all the IPA templates to open-tailed g, if only for consistency's sake. BigDom (t • c) 10:23, 1 October 2012 (UTC)Reply

If we decide to change the gs, we should also change [g̊] to [ɡ̊]. - -sche (discuss) 22:49, 30 September 2012 (UTC)Reply

If that's a combining character, then wouldn't that just be a regular instance of changing [g] to [ɡ]? --WikiTiki89 (talk) 22:59, 30 September 2012 (UTC)Reply

I hope so; I just wanted to be sure. - -sche (discuss) 23:04, 30 September 2012 (UTC)Reply

A bot could also change [ʤ] to [dʒ] and [ʦ] to [ts] and [ʧ] to [tʃ]. (Look into these characters, too.) Those ligatures are certainly obsolete. - -sche (discuss) 22:49, 30 September 2012 (UTC)Reply

Such a bot could also look for and correct uses of ' which should be ˈ, as here, and uses of : instead of ː (and who knows? maybe people even misuse , for ˌ). - -sche (discuss) 06:50, 12 October 2012 (UTC)Reply

A lot of people misuse /ɘ/ for /ə/. Does any language actually use /ɘ/? Because if not, the bot can do that too. Otherwise we could restrict it to languages that we know do not use /ɘ/. --WikiTiki89 (talk) 14:25, 12 October 2012 (UTC)Reply

AutoFormat (talk • contribs) used to do these changes, so I assume KassadBot (talk • contribs) does too. Does it? Mglovesfun (talk) 11:22, 13 October 2012 (UTC)Reply

Old Irish uses /ɘ/. I doubt anyone who adds pronunciations to Old Irish entries will mix up ɘ and ə. —An gr 10:16, 14 October 2012 (UTC)Reply
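
The replacements proposed in this thread could be prototyped along the following lines. This is a rough Python sketch, not what AutoFormat or KassadBot actually does: the function names are invented, it only handles {{IPA}} and {{IPAchar}} calls that contain no nested templates, and it assumes that any parameter containing "=" is a named parameter (such as lang=) to be left alone. The /ɘ/ to /ə/ change is left out on purpose, since Old Irish legitimately uses /ɘ/.

```python
import re

# Replacements proposed in the thread, applied in this order.
REPLACEMENTS = {
    "g": "\u0261",        # ASCII g -> ɡ (LATIN SMALL LETTER SCRIPT G)
    "\u02A4": "d\u0292",  # ʤ -> dʒ
    "\u02A6": "ts",       # ʦ -> ts
    "\u02A7": "t\u0283",  # ʧ -> tʃ
    "'": "\u02C8",        # ASCII apostrophe -> ˈ (primary stress)
    ":": "\u02D0",        # ASCII colon -> ː (length mark)
}

# An {{IPA|...}} or {{IPAchar|...}} call with no nested braces inside.
IPA_TEMPLATE = re.compile(r"\{\{(IPA|IPAchar)\|([^{}]*)\}\}")


def normalize_transcription(text):
    """Apply every replacement to one transcription string."""
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    return text


def normalize_ipa_templates(wikitext):
    """Normalize transcriptions inside IPA templates, leaving other text alone."""

    def rewrite(match):
        name, body = match.group(1), match.group(2)
        parts = body.split("|")
        # Named parameters such as lang=grc are left untouched.
        fixed = [p if "=" in p else normalize_transcription(p) for p in parts]
        return "{{" + name + "|" + "|".join(fixed) + "}}"

    return IPA_TEMPLATE.sub(rewrite, wikitext)
```

Since the g replacement works character by character, a g followed by a combining diacritic (as in [g̊]) is covered by the same rule, which matches the expectation voiced above.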
