Wiktionary:Grease pit/2012/September

From Wiktionary, the free dictionary


Autoformat bots

Currently we only have one autoformat bot, KassadBot (talk · contribs). It struggles a bit, both because it keeps breaking down and because of the sheer volume of pages that need formatting. Could anyone possibly run the same code with another bot? As long as it's the same code, we can run as many such bots as we like; they will never 'revert' each other. Mglovesfun (talk) 10:32, 1 September 2012 (UTC)

Are you intending to run it, or are you asking for volunteers? --Μετάknowledgediscuss/deeds 03:33, 2 September 2012 (UTC)Reply
Was it designed to be run on more than 1 machine at a time? DTLHS (talk) 03:46, 2 September 2012 (UTC)Reply
I wouldn't know how; I'm asking. User:DTLHS, I'm not sure what you mean. Mglovesfun (talk) 10:28, 2 September 2012 (UTC)
Well, would two bots running this code actually help each other, each editing the entries the other isn't? Or would they just get in each other's way, both trying to edit the same entries, and leaving the same entries untouched? —RuakhTALK 12:41, 2 September 2012 (UTC)Reply
Could one work from a list of, say, entries with the oldest most-recent edits? Would the other know not to bother to check those? Breadcrumbs? DCDuring TALK 14:27, 2 September 2012 (UTC)
Maybe we should dedicate one to working on [[water]]. No, really. What we would learn from making that one run might be useful for other giant entries. And having one skip water would keep that one running longer. DCDuring TALK 02:32, 5 September 2012 (UTC)Reply
Yes, KassadBot seems to die after editing water. I find a certain irony there. Mglovesfun (talk) 03:45, 5 September 2012 (UTC)Reply
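One way to address the question of whether two bots would get in each other's way is to give each bot a fixed, non-overlapping share of page titles. The following is only a sketch of that idea, not a description of how KassadBot works; the function names and the sample titles are assumptions made for illustration.

```python
import hashlib

def owner_of(title: str, bot_count: int) -> int:
    """Return the 0-based index of the bot responsible for `title`.

    A stable hash is used (not Python's built-in hash(), which varies
    between runs), so every bot computes the same answer independently.
    """
    digest = hashlib.sha1(title.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % bot_count

def my_share(titles, bot_index, bot_count):
    """Filter `titles` down to the ones this bot should edit."""
    return [t for t in titles if owner_of(t, bot_count) == bot_index]

# Hypothetical work list split between two bots.
titles = ["water", "iron", "jaguar", "cougar", "elbow"]
shares = [my_share(titles, i, 2) for i in range(2)]
```

With this kind of split no coordination or "breadcrumbs" are needed, but a bot that dies leaves its share untouched until it is restarted.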

Bantu noun class indications in translations

The {{t}} template supports a third parameter to indicate the gender, like {{t|ca|dona|f}}. The Bantu languages also have a system that could be considered 'genders' but they are called noun classes and given numbers rather than names. Would it be possible for the template to support those, as well? The gender templates could probably be named {{c1}}, {{c2}} and so on. —CodeCat 12:32, 2 September 2012 (UTC)Reply

I thought noun-classes in Bantu languages are mostly assigned pairs of numbers; as in, there's a separate noun-class-number for the singular as for the plural. So presumably we'd actually want {{c1-2}}, {{c3-4}}, and so on. No? —RuakhTALK 12:40, 2 September 2012 (UTC)Reply
That's true only if the noun is countable; an uncountable noun only has one class, and so does a plurale tantum. And even then, the plural class isn't always predictable. Usually the plural is formed with the next class (odd classes being singular, even being plural) but there are nouns in Zulu in class 11 that have a plural in class 10, and there are also some class 1/6 and 9/6 nouns (not to mention the noun iso which has a suppletive plural). It's likely that other Bantu languages have similar 'irregular' pairs, but each language probably has different exceptions, which may make it unfeasible to have templates for all classes (20 or more) and for all possible/existing combinations of odd and even classes as well. It is probably easier to write the templates so that {{c1|c2}} results in c1/c2 (rather than {{m|f}} which gives m or f). —CodeCat 13:01, 2 September 2012 (UTC)Reply
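The output CodeCat describes, where a singular class and an optional plural class are joined with a slash, can be modelled in a few lines. This is a plain Python illustration of the intended display only; the function name is hypothetical and no such template logic is confirmed by the discussion.

```python
def format_noun_class(singular, plural=None):
    """Render a Bantu noun class label such as 'c1' or 'c11/c10'.

    `plural` is optional because uncountable nouns and pluralia tantum
    belong to a single class only.
    """
    label = f"c{singular}"
    if plural is not None:
        label += f"/c{plural}"
    return label

# The Zulu case mentioned above: class 11 with a plural in class 10.
zulu_example = format_noun_class(11, 10)
```

Taking the two classes as separate arguments avoids needing one template per possible pair.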

Another bot idea

Low priority, but for anyone who might feel like doing it: how about automatically adding those plurals that have lazily not been created because the entry already exists in another language? There are tons. It would bump up the English word count a bit too! Equinox 14:29, 2 September 2012 (UTC)Reply

The bot would have to rely on the accuracy of the templates in the entries. We know it's not foolproof; wrong entries have been deleted before. Mglovesfun (talk) 18:44, 3 September 2012 (UTC)
Indeed. If there were some way to greenlink such entries, though, that'd be nice.—msh210 (talk) 19:48, 4 September 2012 (UTC)

Frequencies of Words - we need them!

Everyone knows of the indispensable use and immense importance of Wiktionary to language learners and researchers, but there is one critical feature we need: frequency of words. Language students need to know the first 5000 lexemes in a language to survive, and given that this is a very large number considering the time it takes per lexeme, it's vital to pick the right ones. Frequencies of words are easily found, and many of course are listed in Wiktionary (though admittedly I don't know where there are frequency lists of other units). I'm not a tech person so I don't know, but my question is: would it be possible, and if so how difficult, to match frequency data to all or as many of the articles in Wiktionary as possible, so that entries on a page can be sorted by frequency? Thanks! — This unsigned comment was added by 82.71.66.78 (talk) at 00:40, 3 September 2012 (UTC).

Frequency of words in what, though? Different words are used less or more depending on where you look. —CodeCat 01:05, 3 September 2012 (UTC)Reply
Many major languages have definitive corpora - I know that such exist for various dialects of English, for Esperanto, for Mandarin, for Latin, and other languages. I would welcome this addition, and I would be glad to help find corpora and get them approved if someone took up the technical end. --Μετάknowledgediscuss/deeds 01:10, 3 September 2012 (UTC)Reply
Although much more work is needed, see Category:Basic_word_lists_by_language. --BB12 (talk) 01:52, 3 September 2012 (UTC)Reply
Corpora are nice, but word frequency varies wildly depending on what segment of the population you're looking at, and what medium, etc. Corpora based on written sources already are skewed away from colloquial and slang, and there are so many specialized vocabularies that aren't represented in proportion to their importance. I think word frequency can be misleading as a guide to which words to learn. Chuck Entz (talk) 02:43, 3 September 2012 (UTC)Reply
Well, the frequency information would have a short disclaimer in the template. Moreover, lists not exceeding a few hundred will be almost identical for the written and spoken sublects of a single dialect in most languages, because various slang terms tend to be localized, and thus the spoken variant taken as a whole is more standardised than it might seem to be. --Μετάknowledgediscuss/deeds 03:07, 3 September 2012 (UTC)Reply
@Chuck Entz: Indeed. Also time-period: the 500th most common word one hundred years ago is not necessarily the same as the 500th most common word today. —RuakhTALK 03:27, 3 September 2012 (UTC)Reply
Here's a concrete example: the International Corpus of English, which is majority transcribed spoken English. COCA and BNC would be good for dialectal information. Obviously statistics can never be perfect. However, COCA is basically complete in terms of modern AmEng, and its limitations are discussed thoroughly on its WP page. Would that be acceptable (with a templated disclaimer)? --Μετάknowledgediscuss/deeds 03:34, 3 September 2012 (UTC)Reply
How does licensing work? Can one of us just take their corpus, compute word-frequency rankings, and upload a list of the top 5000 lexemes to a page in the Appendix namespace? Or do we need to get some sort of permission for that? —RuakhTALK 03:50, 3 September 2012 (UTC)
COCA lets one download certain lists for personal use. Given that our user pages are themselves subject to our wiki-licensing, I don't see how we could download them to Wiktionary or systematically use them. We could synthesize the various lists that we get access to, but we would probably lose a lot in the synthesis. DCDuring TALK 04:06, 3 September 2012 (UTC)
The COCA project folks are jerks about sharing their data. They claim to have a robot which scours the net to make sure that you are providing a link back to their site if you use their data. They presumably also own the copyright to every piece of text they used in their corpus, hypocrites! What do they think they are protecting?! If we do use a corpus in this way, for frequency markings, I would like to suggest that we not use any disclaimer. That is needlessly complicated. Instead we should simply be very straightforward and precise about exactly what data we are providing. Instead of saying that a word is common, and then explaining how we decided that and mumbling indecisively about the caveats of such a claim, we might simply state that a word ranks #420 on a particular frequency list and allow the user to investigate the list if they want to know more. For COCA we might abbreviate this as "COCA #420" and provide a complete sentence and a link in a footnote. We could also use categories to link members of the list, and have the articles listed in the category as "1 the", "2 be", "127 try". I believe a template can handle that. We can use COCA data but we could not include a copy of the actual list, because, like I said, they are jerks and hypocrites. Metal.lunchbox (talk) 05:14, 14 September 2012 (UTC)
Sorry, I meant use the category sort order with the rank from the frequency list. Metal.lunchbox (talk) 05:20, 14 September 2012 (UTC)Reply
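The rank label and the category sort order suggested above could look roughly like this. It is a sketch only: the function names and the padding width are assumptions, and the three ranks are simply the example figures quoted in the comment, not data taken from COCA.

```python
def rank_label(corpus: str, rank: int) -> str:
    """Short label for an entry, e.g. 'COCA #420'."""
    return f"{corpus} #{rank}"

def sort_key(rank: int, width: int = 5) -> str:
    """Zero-padded rank, so that a text sort orders ranks numerically.

    Without padding, '127' would sort before '2' as a string.
    """
    return f"{rank:0{width}d}"

# Example figures from the discussion.
entries = {"try": 127, "the": 1, "be": 2}
listing = [
    f"{rank} {word}"
    for word, rank in sorted(entries.items(), key=lambda kv: sort_key(kv[1]))
]
```

The padding matters because category sort keys are compared as text rather than as numbers.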

Weird

Suddenly I can’t access the Wiktionary page tête de nœud. Instead, I seem to be getting a Wikisaurus page, Wikisaurus:imbécile. In the tabbed languages, it lists French twice, but I can’t click either of them. If I click on the topmost edit button, I get a Wikisaurus page, but if I click on the EDIT tab at the very top of the page, then and only then can I open the page tête de nœud. But when I edit and save it, I still can’t see it; I only see the Wikisaurus page. —Stephen (Talk) 07:08, 3 September 2012 (UTC)

Someone used braces instead of square brackets so the Wikisaurus page was transcluded instead of linked to. Now fixed. --Yair rand (talk) 07:14, 3 September 2012 (UTC)Reply

A request I cannot meet

A message from my talk page:

Help with template

I am working on a template in my sandbox for use on another Wiktionary. Will you help me by changing the template in my sandbox (User:Trade/Template) so that it gives, for example, a plain "Rhymes: -ɪsən" instead of a linked "Rhymes: -ɪsən", i.e. removing the link? The sandbox has both the template and the related doc. Good day. --Trade (talk) 11:46, 3 September 2012 (UTC)

I'm sorry, but I don't understand anything about templates.--Makaokalani (talk) 11:50, 3 September 2012 (UTC)Reply

How to change the assisted translation adding tool?

I would like to add noun classes to the tool so that they can be added in languages that use them. Noun classes are numbered and often come in singular/plural pairs, so ideally there should be two textboxes to enter them in. Can someone help me with this? —CodeCat 19:38, 3 September 2012 (UTC)Reply

Also, transliterations aren't wikified, but I think they're supposed to be... --BB12 (talk) 21:00, 3 September 2012 (UTC)Reply
Well, only for languages like Chinese. But that would be a change to {{t}}. --Μετάknowledgediscuss/deeds 01:03, 4 September 2012 (UTC)Reply
They're also normally wikified for Gothic. Anyway, I've found out how to edit the tool and I've (hopefully) made it work now. —CodeCat 11:04, 4 September 2012 (UTC)Reply

Help with Telugu template

I’ve tried to make a "plural of" template at te:మూస:plural of that also drops the word into a Plurals category. I used <noinclude>[[]]</noinclude> around the category name, but when I look at a word that has the template, te:ఏనుగులు, the category does not appear. Is <noinclude> incorrectly formatted, or is it the wrong command? —Stephen (Talk) 09:37, 7 September 2012 (UTC)Reply

Wrong command. noinclude is for things that are only supposed to be on the template page, includeonly is for things only to be displayed when transcluded. --Yair rand (talk) 09:40, 7 September 2012 (UTC)Reply
That’s fixed it. Thanks. —Stephen (Talk) 09:45, 7 September 2012 (UTC)Reply
But now the te:మూస:plural of template has a different problem. When I use this template, for example, at te:మనుషులు, it will not accept the use of the # sign at the start of the line. I can put the # sign, but it is ignored. If I put the # sign followed by regular text, it works correctly, but when I place the template on the line, the # is ignored. If I add some regular text before the template, for example, "# a {{plural of| }}", the "# a" part works, but the template is wrapped to the next line. —Stephen (Talk) 06:47, 8 September 2012 (UTC)
I don't know much about templates, but that sounds like it goes to a new line before starting the text. That would mean that # followed by the template would be a # with nothing after it. Another problem: your template is setting the script for the whole page, not just its own contents. If I didn't know better, I would have thought that the page was at Telugu Wiktionary. Chuck Entz (talk) 07:12, 8 September 2012 (UTC)
Hmm, maybe. I’m not sure what that means or how to fix it. But yes, the page is at Telugu Wiktionary. I think it’s a simple template, using a span class (whatever that is), which is what our regular {{plural of}} template uses. But {{plural of}} allows the use of the # sign. My template te:మూస:plural of has something wrong with it, but I don’t know what. —Stephen (Talk) 07:20, 8 September 2012 (UTC)Reply
I fixed the problem. -- Liliana 07:29, 8 September 2012 (UTC)Reply
Wow, thanks. Such a little thing to cause such a big problem. —Stephen (Talk) 07:35, 8 September 2012 (UTC)Reply
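The difference between noinclude and includeonly explained earlier in this thread can be modelled very roughly as two views of the same template source. The sketch below is a deliberately simplified model and not the MediaWiki parser; the function names and the sample template text are invented for illustration.

```python
import re

def as_transcluded(wikitext: str) -> str:
    """What a page using the template receives.

    <noinclude> blocks are dropped; <includeonly> content is kept.
    """
    text = re.sub(r"<noinclude>.*?</noinclude>", "", wikitext, flags=re.S)
    return re.sub(r"</?includeonly>", "", text)

def as_viewed_on_template_page(wikitext: str) -> str:
    """What is shown on the template's own page.

    <includeonly> blocks are dropped; <noinclude> content is kept.
    """
    text = re.sub(r"<includeonly>.*?</includeonly>", "", wikitext, flags=re.S)
    return re.sub(r"</?noinclude>", "", text)

# Invented template source: the category belongs in <includeonly> so that
# it lands on entries using the template, not on the template page itself.
template = (
    "plural of {{{1}}}"
    "<includeonly>[[Category:Plurals]]</includeonly>"
    "<noinclude>Documentation goes here.</noinclude>"
)
```

Wrapping the category in noinclude has the opposite effect: it categorizes only the template page, which matches the symptom described at the start of the thread.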

User:Conrad.Irwin/editor.js

Can somebody please add a feature so that if somebody tries to add a translation that includes the characters (, ), [, or ], the translation will be rejected and the edit will fail? I've seen IPs try to add translit in parentheses, thus causing a redlink. Thanks! --Μετάknowledgediscuss/deeds 18:30, 8 September 2012 (UTC)Reply

I think it's probably better to detect and fix these cases ourselves, after the edit has been made. Rejecting the translation will probably not end up with the result we want: either they won't add the translation at all, or they'll work around the problem in some less-than-ideal way. —RuakhTALK 18:40, 8 September 2012 (UTC)Reply
There are some legitimate entries with commas in the title, that's the problem. Mglovesfun (talk) 19:35, 8 September 2012 (UTC)Reply
Maybe we should use a warning/reminder, like we do for entry names starting with caps. Chuck Entz (talk) 19:45, 8 September 2012 (UTC)Reply
That sounds way better. Mglovesfun (talk) 19:46, 8 September 2012 (UTC)Reply
A warning would also be great. Does anybody know how to do this (or should I try and possibly break it)? --Μετάknowledgediscuss/deeds 19:51, 8 September 2012 (UTC)Reply
Please don't try-and-possibly-break-it; because it's all client-side JavaScript, with client-side caching, you can't even properly test a change after making it unless you know what you're doing. —RuakhTALK 22:01, 8 September 2012 (UTC)Reply
I would much rather not try-and-possibly-break-it. (I would, of course, rather that someone knowledgeable (like you) do it.) But I think clearing my cache would suffice to let me test it, and that is in fact what I would do were I to TAPBI (again, I'm not really planning to). --Μετάknowledgediscuss/deeds 22:04, 8 September 2012 (UTC)Reply
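The warning approach suggested above, as opposed to rejecting the edit outright, might work along these lines. The real tool is client-side JavaScript; this Python sketch only illustrates the check, and the function name and message wording are assumptions.

```python
SUSPECT_CHARS = "()[]"

def translation_warning(term: str):
    """Return a warning string if `term` contains brackets, else None.

    The caller is expected to show the warning and still let the user
    proceed, since a few legitimate entry titles may contain such
    characters.
    """
    found = [ch for ch in SUSPECT_CHARS if ch in term]
    if not found:
        return None
    return (
        "This translation contains " + " ".join(found) + ". "
        "If that is a transliteration or a note, please enter it separately."
    )

# A clean term passes; a term with a transliteration in parentheses is flagged.
clean = translation_warning("вода")
flagged = translation_warning("вода (voda)")
```

Returning a message rather than raising an error keeps the decision with the contributor, which is the point of preferring a reminder over a hard failure.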

ķīmisks

For some reason I don't understand, the above link (which I entered using {{l|lv|ķīmisks}}) works fine here -- I click on it and go to the right page -- but it doesn't on the entry page ķīmija. If I scroll down to the "Derived terms" section of ķīmija and click on the link to ķīmisks, I'm taken to ķīmija instead of to ķīmisks. Does this happen to anyone else? If so, why? --Pereru (talk) 01:18, 9 September 2012 (UTC)Reply

You will probably facepalm at this: diff —CodeCat 01:24, 9 September 2012 (UTC)
Indeed. 0_0... I must have opened that page ten times, and I didn't notice the double bar. Reminds me of when our local Alliance française had 1000 t-shirts made with the words "La Tour Eifflel" on them before someone pointed out it should be "Eiffel", not "Eifflel"... Thanks! --Pereru (talk) 01:37, 9 September 2012 (UTC)Reply

Adding words to my watchlist hangs (and other bugs)

When I click the star button at the top left to add a page to my watchlist, the star just keeps spinning and I don't get a message saying I've added the page. It does actually add the page, though. It seems that it has started doing this since the recent software update. Is anyone else having this problem? —CodeCat 19:23, 9 September 2012 (UTC)

Actually yes. Mglovesfun (talk) 19:28, 9 September 2012 (UTC)Reply
Ditto. And just before this happened, the success popup was looking so nice... --Μετάknowledgediscuss/deeds 19:34, 9 September 2012 (UTC)Reply
Tritto? (Ditto thirded!) - -sche (discuss) 20:51, 9 September 2012 (UTC)Reply
I have noticed that when I click "unwatch" on a particular page, it seems to hang on the "unwatching..." caption and never produces the success message. I have not checked whether the unwatch actually succeeds. Equinox 20:44, 9 September 2012 (UTC)Reply
It does (for me at least). —Angr 20:49, 9 September 2012 (UTC)Reply
My impression is that the change is submitted to the database, but that the display isn't updated. Refreshing the page or closing it and reopening it will show that the action succeeded. Also, when I get the spinning icon, it's not really a "hang", because I can click links, use the search box, etc. Chuck Entz (talk) 20:54, 9 September 2012 (UTC)
Yeah, I don't know how the JavaScript/AJAX stuff actually works, but I didn't mean it hangs the browser: it doesn't. The thread/process/whatever that is reporting the progress of the unwatch operation just doesn't seem to proceed beyond "unwatching...". Equinox 21:13, 9 September 2012 (UTC)Reply
I think I noticed another problem. When I delete a page, I get a Wikimedia error page. But the page itself does get deleted. Maybe these two problems are related, and the page that is loaded 'behind the scenes' by the watchlist button also triggers this error, but we don't see it? —CodeCat 00:10, 11 September 2012 (UTC)Reply
No, the watchlist star problem is a javascript issue. The removal of $.parseHTML caused an error in the code for displaying the little box, so it breaks before it can stop the star spinning (I think). --Yair rand (talk) 01:05, 11 September 2012 (UTC)Reply
It works again. :) That is, the star stops spinning after a short time and informs me that the page has been added to my watchlist. - -sche (discuss) 21:05, 12 September 2012 (UTC)Reply

Extracting an IPA table from English Wiktionary

Hello folks,

I am working on a project that's related to teaching English to Chinese teachers and students. I have a use for a table of English words and their pronunciation transcribed in IPA. Does such a table exist here on Wiktionary? Is it possible to produce one? If not, any suggestion about where I might go for such a thing would be appreciated. Licensing doesn't really matter because it would be public domain information and not subject to copyright, so any place where you think I might be able to scrape the data would also be a helpful suggestion. I would also be posting a copy of such a table in CSV format on the project's website in case any other developers have a use for it. Any ideas that you think will help are welcome.

Metal.lunchbox (talk) 06:58, 10 September 2012 (UTC)Reply

Appendix:English pronunciation? Mglovesfun (talk) 08:52, 10 September 2012 (UTC)Reply
Thanks for the tip, but I'm looking for words, not phonemes. What I'm looking for is a machine-readable dictionary with English words matched with their pronunciation transcribed in IPA. So far I've found CMUdict, which uses arpabet. Translating this to IPA is not impossible but it means I have to accept certain inaccuracies, like not marking syllable stress the way the IPA prescribes. That would require teaching the machine to guess syllable boundaries, which I am not confident I can do accurately. I ask here, because I wonder if there is a way to extract pronunciation data from Wiktionary. Does anyone know? Metal.lunchbox (talk) 14:35, 10 September 2012 (UTC)Reply
Check http://dumps.wikimedia.org/backup-index.html for this site's most recent dump; get the pages-articles file; search it for ==English== followed by (without an intervening ==Anything== with only two equal signs on each side) {{IPA| and then, before any }, the specific IPA symbol you seek.—msh210 (talk) 16:33, 10 September 2012 (UTC)
Thank you. I'll look into approaching this task this way. Your suggestion might be the best option available. It's just so surprising that there isn't already a good table of such information available. So many applications could use IPA to describe English pronunciation, and despite the 15 years it took, Unicode is actually everywhere. Metal.lunchbox (talk) 03:57, 11 September 2012 (UTC)
I was able to extract a table of English words and their IPA transcriptions using the above tips, but there must be something wrong with my regular expression skills or the logic I'm using to analyze the dump. Out of 112 million lines of text I found a little over 300,000 which contain an IPA tag. Looking at only those in mainspace articles (not talk pages, templates, etc.) in the "English" pronunciation section and not labeled as representing a language other than English I found a little over 32,000 transcriptions representing 26,579 distinct English words. That means that only 26,579 English Wiktionary articles have proper IPA transcriptions. At first I thought this was wrong, but after looking at the dictionary and my data some more, I think that this is simply an area where Wiktionary can continue to be improved. So I added an IPA transcription for fixture. That's my first contribution to Wiktionary! Hooray for collaboration! Metal.lunchbox (talk) 04:47, 13 September 2012 (UTC)
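The search described in this thread (find the ==English== section, stop at the next level-2 heading, collect {{IPA|...}} arguments) can be sketched as follows. This is an illustration under stated assumptions, not the script that produced the figures above: it works on the wikitext of a single page held in a string, the function names are hypothetical, and the sample text is invented rather than copied from a real entry.

```python
import re

# A level-2 heading has exactly two equals signs on each side.
LEVEL2 = re.compile(r"^==([^=].*?)==\s*$", re.M)
# An {{IPA|...}} call with no nested braces inside it.
IPA = re.compile(r"\{\{IPA\|([^{}]*)\}\}")

def english_section(wikitext: str) -> str:
    """Return the text between ==English== and the next level-2 heading."""
    headings = list(LEVEL2.finditer(wikitext))
    for i, heading in enumerate(headings):
        if heading.group(1).strip() == "English":
            end = headings[i + 1].start() if i + 1 < len(headings) else len(wikitext)
            return wikitext[heading.end():end]
    return ""

def english_ipa(wikitext: str) -> list:
    """Collect positional {{IPA}} arguments from the English section."""
    found = []
    for match in IPA.finditer(english_section(wikitext)):
        params = match.group(1).split("|")
        named = dict(p.split("=", 1) for p in params if "=" in p)
        # Skip transcriptions explicitly labelled as another language.
        if named.get("lang", "en").strip() != "en":
            continue
        found.extend(p.strip() for p in params if "=" not in p and p.strip())
    return found

# Invented sample page text with an English and a Dutch section.
sample = """==English==
===Pronunciation===
* {{IPA|/ˈwɔːtə/|/ˈwɔtɚ/|lang=en}}

==Dutch==
===Pronunciation===
* {{IPA|/ˈʋaːtər/|lang=nl}}
"""
```

Calling english_ipa(sample) picks up only the two transcriptions in the English section. For the full dump, the same functions would be applied to the text of each mainspace page in turn.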

Showing usage of a word graphically

Hello,

What do you think of the idea that each English term gets a few little balks showing how much it is spoken?

  • For the 100 basic English terms I'd suggest 5 balks.
  • For the 1000 basic English terms I'd suggest 4 balks.
  • For the English terms with at least one normal meaning I'd suggest 3 balks.
  • For English terms with at least one archaic meaning and no normal meaning I'd suggest 2 balks.
  • For English terms with only obsolete meanings I'd suggest 1 balk.
I don't know yet which balk we should use for this, but first, is this possible?

Greetings HeliosX (talk) 05:13, 11 September 2012 (UTC)Reply

What sense of balk is this? I don't understand what you mean.—msh210 (talk) 05:31, 11 September 2012 (UTC)
As HeliosX is a native German speaker, I think the word he's going for is Balken (beam, bar) (cognate with defn. 2 of balk), used not only of wooden beams in German but also of little bars like on mobile phones or on progress bars in software applications. —Angr 12:52, 11 September 2012 (UTC)Reply
We could use this list from COCA of the top 60,000 lemmas in their corpus to provide the data for such a project. Some of the items included seem suspect and some might not meet CFI. They have some caveats about PoS as well.
A scheme of at least one positive mark for anything in their list and additional marks for higher frequency terms (up to three) seems adequate and feasible. I think our obsolete and rare tags provide enough warning on the other side. DCDuring TALK 23:20, 12 September 2012 (UTC)
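The five-level scheme proposed at the top of this thread can be written down as a small decision function. It is a sketch only: the function name, the `rank` argument (position in some frequency list) and the sense labels "normal", "archaic" and "obsolete" are assumptions chosen to mirror the proposal, and terms with no recognised label fall through to one bar.

```python
def balk_count(rank=None, sense_labels=()):
    """Return how many bars (1 to 5) a term would get.

    rank         -- position in a frequency list, or None if unlisted
    sense_labels -- one label per sense: 'normal', 'archaic' or 'obsolete'
    """
    if rank is not None and rank <= 100:
        return 5
    if rank is not None and rank <= 1000:
        return 4
    labels = set(sense_labels)
    if "normal" in labels:
        return 3
    if "archaic" in labels:
        return 2
    return 1

# A listed but not especially frequent term with one ordinary sense.
ordinary = balk_count(rank=5000, sense_labels=["normal", "archaic"])
```

The frequency checks come first so that a very common word keeps its high score even if some of its senses are archaic.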

Babel templates 4 knowledge levels of the Ethiopic script R missing

Who is willing 2 help? I was sent here 2 ask 4 help since I M no expert in designing templates and M hereby claiming the need 4 templates 4 Ethiopic: --IM Serious (talk) 10:41, 11 September 2012 (UTC)Reply

Aedini miscategorized

I don't understand why this Translingual term is categorized in Category:English terms derived from New Latin. AFAICT no other such term is so categorized. I think I have not templated this in any unique way. I could not find a recent change to the templates it transcluded. DCDuring TALK 19:35, 11 September 2012 (UTC)

Fixed; see Aedini?diff=18112881. —RuakhTALK 19:48, 11 September 2012 (UTC)Reply
I was afraid that it was a mistake I couldn't see. Thanks. DCDuring TALK 19:51, 11 September 2012 (UTC)Reply

Bug in Random Entry feature...

This is a new bug that was not occurring a few months ago or any time prior to that.

If you go to Random Entry, by language, and select English, it utilizes this link. For years, this has provided a reliable way to find random words. However, now ~90% of the results are words that start with the letters 'ab', another ~7% of the results are words that start with 'a' and then a letter other than 'b', and ~3% are words that don't begin with 'a' (but are not very random: they are often months, or the words 'raven', 'crow', 'raining cats and dogs', 'thesaurus', 'adjective', and related words).

Kingturtle (talk) 01:11, 12 September 2012 (UTC)Reply

See Wiktionary:Grease pit/2012/August#"Random in English" only returns words starting with "ab". --Μετάknowledgediscuss/deeds 01:18, 12 September 2012 (UTC)Reply
I really miss the "nearby words" feature. It showed me a lot of good stuff. What can we do about this? Equinox 01:35, 12 September 2012 (UTC)Reply
This was brought up on IRC and I asked a few times for an example of a nearby link but got no answer. I can't find them. Have they been removed/disabled because they don't work? In particular, I'm wondering if they use the toolserver at all, and how. (The randompage link does use the toolserver.) --Jeremyb (talk) 12:16, 12 September 2012 (UTC)
Re: "I can't find them": They're an optional feature; to turn them on, visit Wiktionary:Per-browser preferences, check "Use the preferences set on this page" and "Add links to previous and next pages", and click on the "Save settings" link.   Re: "I'm wondering if they use the toolserver at all and how": Yes, they use http://toolserver.org/~hippietrail/nearbypages.fcgi?langname=...&term=...&num=4&callback=wiktNearby.callback or http://toolserver.org/~hippietrail/nearbypages.fcgi?langname=...&term=...&num=4&seq=...&callback=wiktNearby.callback. (See User:Hippietrail/nearbypages.js.) —RuakhTALK 12:27, 12 September 2012 (UTC)Reply

watchlist deletions

Why doesn't the watchlist show when pages on your watchlist are deleted? Wikipedia's does. --WikiTiki89 (talk) 14:06, 13 September 2012 (UTC)Reply

It's bug 33591. --Yair rand (talk) 15:19, 13 September 2012 (UTC)Reply
That's from March. Is anyone actually working on fixing it? --WikiTiki89 (talk) 15:21, 13 September 2012 (UTC)Reply
Nope. The bug is currently assigned to "Nobody". --Yair rand (talk) 15:22, 13 September 2012 (UTC)Reply

IPv6 addresses

Hi there. Is User:2A02:2F02:C021:F012:0:0:4F76:F560 one of these new IP addresses, even though it looks like a User name?

If so (and it does indeed seem to be), do we have any advice on blocking them? I imagine that range-blocks would be a nightmare. SemperBlotto (talk) 07:53, 15 September 2012 (UTC)

In case someone should wonder how to check out whether an ID is an IP or a user name: do a regular search on it, and see where it takes you. 2A02:2F02:C021:F012:0:0:4F76:F561 is an IP (lands at a contributions page), 2A02:2F02:C021:Å012:0:0:4F76:F561 is not (lands at a regular search result page). That said, I don't know whether usernames that look like IPv6 actually are possible/permitted. Njardarlogar (talk) 15:29, 15 September 2012 (UTC)Reply
We don't allow it, though it's possible to slip it by the system's restrictions by using slightly different characters. When the new IPs first came out on Mediawiki, I permablocked just such a fake IP. As for how to tell: no need to search- IPs are bluelinks that go directly to the contributions, while non-IPs go to a user page and can be redlinks (there is at least one template that erroneously creates redlinks for IPs, though). Chuck Entz (talk) 15:53, 15 September 2012 (UTC)Reply
We could always just block users from registering names with more than, say, four : characters. I doubt that it will cause any problems for legitimate user names, and if it does, they could always try another name. —CodeCat 18:17, 16 September 2012 (UTC)Reply
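Both checks discussed in this thread, telling a real IPv6 address from a lookalike name and the proposed colon limit for new usernames, are easy to express with Python's standard `ipaddress` module. This is a sketch of the logic only, not of anything MediaWiki does; the function names and the default limit of four are taken from or modelled on the comments above.

```python
import ipaddress

def is_ip_username(name: str) -> bool:
    """True if `name` parses as a valid IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False

def too_many_colons(name: str, limit: int = 4) -> bool:
    """Proposed registration filter: reject names with more than `limit` colons."""
    return name.count(":") > limit

# The two examples from the discussion: a real address and a lookalike
# that swaps in a non-hexadecimal character.
real = is_ip_username("2A02:2F02:C021:F012:0:0:4F76:F560")
fake = is_ip_username("2A02:2F02:C021:Å012:0:0:4F76:F561")
```

The lookalike fails to parse as an address but would still be caught by the colon limit, which is why the two checks complement each other.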

Template:yi-personal pronouns

Er, stupid question, but what do I do with all the whitespace? Is there a graceful way to make it disappear (i.e., not by stretching out the plural sections)? --Μετάknowledgediscuss/deeds 07:26, 16 September 2012 (UTC)

Actually, I'd think stretching out the 3rd person plural section would be a graceful (and appropriate: it emphasises that the plural applies to all genders) way of filling it. Or do you just want to colour it? - -sche (discuss) 07:52, 16 September 2012 (UTC)
My dream was to combine it with {{yi-possessive pronouns}}, actually, but that looked even worse. I guess I'll just stretch it out for now, but if you get any ideas... --Μετάknowledgediscuss/deeds 17:25, 16 September 2012 (UTC)Reply
You could add in just the lemma form (like מײַן (mayn)) for each person and still keep {{yi-possessive pronouns}} as a separate template, to which, by the way, you need to add gender distinctions. Even though gender distinctions are lost when used before a noun, they are kept in other cases (e.g. אוי מאַמע מײַנע!, דאָס איז מײַנע.). --WikiTiki89 (talk) 10:50, 18 September 2012 (UTC)Reply
Will do. Please forgive my foolish mistake, because although I have heard Yiddish my whole life, I don't know a single person who is fluent, and I only started learning in earnest last week. --Μετάknowledgediscuss/deeds 23:54, 18 September 2012 (UTC)Reply
Well, my thought process was that I'd leave the template with lemma forms only, and use adjective declension templates for the pages themselves. Lo and behold, the adjective templates are a total mess, mainly composed of a very incomplete and rather idiosyncratic few written by Dick Laurent (talk · contribs). Currently, I'm trying to write templates for every possible declension as outlined by yiddishdictionary.com, but I'm having a bit of trouble. Can somebody please tell me what's wrong with {{yi-adj-final}} that's making the transclusion at מענלעך behave so strangely? --Μετάknowledgediscuss/deeds 01:10, 19 September 2012 (UTC)
Re: {{yi-adj-final}}: the main problem was that various instances of what should be {{{1}}} and {{{2}}} were instead {{1}} and {{2}}. (This was made less-than-obvious by the fact that {{1}} actually exists, and creates a link based on the current pagename.) I've fixed it now. —RuakhTALK 01:41, 19 September 2012 (UTC)Reply
That's funny. I noticed that problem myself, and I went through and fixed them (they were a negligent copy-paste error), but I obviously missed most of 'em, now that I re-examine the diffs. I thought that I was plateauing in terms of my ability to handle template syntax, but this thread seems to show that instead, I'm plateauing in my ability to handle simple, stupid problems that I theoretically know how to fix (and thinking up ridiculous reasons for PIBCAKs like this). Thank you! --Μετάknowledgediscuss/deeds 02:46, 19 September 2012 (UTC)Reply
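The mistake fixed in this thread, writing {{1}} (a template call) where {{{1}}} (a parameter) was meant, is easy to miss by eye but simple to search for. The following check is a sketch; the function name is hypothetical, and it only flags purely numeric double-brace calls, which are rarely intended inside a template.

```python
import re

# Two opening braces, digits, two closing braces, with no third brace
# on either side (which would make it a parameter reference).
SUSPECT_CALL = re.compile(r"(?<!\{)\{\{(\d+)\}\}(?!\})")

def suspicious_numeric_calls(wikitext: str) -> list:
    """Return the numbers used in {{N}} calls that are probably meant as {{{N}}}."""
    return SUSPECT_CALL.findall(wikitext)

# Invented template fragment: the first reference is correct, the second is not.
hits = suspicious_numeric_calls("{{{1}}}er {{2}}")
```

A check like this is most useful right after copy-pasting from another template, which is how the error arose here.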

Translation boxes' greyness does not fill entire 'slot'

This phenomenon occurs for me (Windows, Firefox and Opera) on some entries, e.g. [[jaguar]] and [[cougar]], but not others, e.g. [[iron]]. Does it occur for anyone else? - -sche (discuss) 00:10, 18 September 2012 (UTC)

Yes, on all pages (it only appears if you edit or preview the page). DTLHS (talk) 00:12, 18 September 2012 (UTC)Reply
This appears to be due to a MediaWiki software change. {{trans-top}} contains the wikitext <div class="NavHead" align="left">, which used to result in the HTML <div class="NavHead" align="left"> (that is, it was basically left alone), but which now results in the HTML <div class="NavHead" style="float: left;">. Since the result of wikitext-handling is generally cached on the server-side until a page is edited, not all pages have yet been affected by the change. —RuakhTALK 02:55, 18 September 2012 (UTC)Reply
I've now fixed that problem by editing {{trans-top}} to use style="text-align: left" rather than align="left". (Note: the edit will take a while to clear the job queue.) However, we should keep our eyes open for other consequences of this software change. It's kind of a bizarre change, since I don't think that align="left" and style="float: left" are ever semantically equivalent, are they? —RuakhTALK 03:00, 18 September 2012 (UTC)Reply
{{trans-see}} is also affected by this. DTLHS (talk) 08:02, 18 September 2012 (UTC)Reply
I've now edited {{trans-see}}, as well as {{rel-top}} and {{der-top}}. Dunno what else to check. —RuakhTALK 12:22, 18 September 2012 (UTC)Reply
Is there common functionality between these templates that can be moved to a master template? DTLHS (talk) 16:44, 18 September 2012 (UTC)Reply
The fix also needs to be applied to {{rel-top3}}, {{rel-top4}}, and {{rel-top5}}. Especially {{rel-top3}} is now widely used for Derived Terms. See, for example elbow. · (talk) 16:50, 18 September 2012 (UTC)Reply
  • Update: The software seems to have been fixed. The wikitext <div class="NavHead" align="left"> now results in the HTML <div class="NavHead" style="text-align: left;">. So, we don't need to modify any more templates, but we might want to make null-edits to existing affected pages (for the same reason as above). —RuakhTALK 19:21, 18 September 2012 (UTC)Reply
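Should a similar attribute change ever need to be applied across many templates, the manual edit described above (align="left" replaced by style="text-align: left") could be scripted. This is a rough sketch under stated assumptions: the function name `align_to_style` is hypothetical, and the code assumes the tag has no existing style attribute to merge with.

```python
import re

# Matches the legacy HTML attribute align="left|right|center".
ALIGN_ATTR = re.compile(r'\balign="(left|right|center)"')


def align_to_style(wikitext):
    """Rewrite align="..." as an inline text-align style.

    Assumption: the affected tag has no existing style attribute;
    merging with one is not handled here.
    """
    return ALIGN_ATTR.sub(
        lambda m: 'style="text-align: %s"' % m.group(1), wikitext
    )


print(align_to_style('<div class="NavHead" align="left">'))
```

Running the function on its own output leaves the text unchanged, so it is safe to apply more than once.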

Template:homophonecat

Can this be updated so that the links to other categories go to the specified language? Category:Russian terms with homophones currently links to English categories (one a redlink!). - -sche (discuss) 02:44, 18 September 2012 (UTC)Reply

The problem was actually in {{nyms}}, which only worked for English. Mglovesfun (talk) 09:32, 21 September 2012 (UTC)Reply

Telugu etymology template

Can someone help me prepare an etymology template for the Telugu Wiktionary that works for Sanskrit and English to begin with? Later on we can expand it to other languages. Thank you. Rajasekhar1961 (talk) 05:52, 18 September 2012 (UTC)Reply

context used outside of definitions

Per User talk:Mglovesfun#patskanis could someone come up with a list of all the {{context}} labels used in the main namespace outside of definitions. And yes, I really should learn Perl and do it myself. Mglovesfun (talk) 09:25, 21 September 2012 (UTC)Reply

Wiktionary:Todo/context outside of definitions. 926 instances of {{context}} occurring on a line not starting with "#" (I didn't look at the other 440 context templates) DTLHS (talk) 02:17, 22 September 2012 (UTC)Reply
The others aren't as likely to be abused because they come with formatting that makes them seem inappropriate. Specific abuses include use instead of {{sense}} under Synonyms and instead of {{a}} or {{qualifier}} or others under Translations and on inflection lines. It seems as if certain editors go through a period of using it, believing it to be preferred to hard formatting, before they learn of other ways of getting the same formatting results. DCDuring TALK 13:49, 22 September 2012 (UTC)Reply
Sense instead of a? I think you mean the opposite, right? Mglovesfun (talk) 20:50, 22 September 2012 (UTC)Reply
No, context instead of sense, and context instead of a or qualifier. Chuck Entz (talk) 21:37, 22 September 2012 (UTC)Reply
That list is so big I think it'd be good to divide it up into lots of 25 like Wiktionary:Todo/needed trans templates. Mglovesfun (talk) 10:18, 23 September 2012 (UTC)Reply
Also, as with many such lists it would be handy to have it sorted by the L2 section header that contained the offending item. I understand that there may be no convenient way to do that.
Any sectioning is handy to facilitate removing items. DCDuring TALK 14:20, 23 September 2012 (UTC)Reply
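For anyone wanting to regenerate such a list, here is a minimal sketch of the check described above: {{context}} on a line not starting with "#", with each hit tagged by the L2 language header it falls under. It is not the script that produced the Todo page. The function name, the sample entry, and the choice to match the literal string `{{context|` are all assumptions.

```python
import re

# An L2 header looks like ==English== (exactly two equals signs on each side).
L2_HEADER = re.compile(r"^==\s*([^=].*?)\s*==\s*$")


def context_outside_definitions(wikitext):
    """Return (language, line) pairs for {{context}} used off a definition line."""
    hits = []
    language = None
    for line in wikitext.splitlines():
        header = L2_HEADER.match(line)
        if header:
            language = header.group(1)
        elif "{{context|" in line and not line.startswith("#"):
            hits.append((language, line))
    return hits


# Hypothetical entry text for illustration.
sample = """==English==
===Noun===
{{en-noun}}
# {{context|slang}} A thing.
====Synonyms====
* {{context|slang}} [[thingy]]
==French==
{{context|informal}} {{fr-noun|m}}"""

for language, line in context_outside_definitions(sample):
    print(language, "|", line)
```

Sorting the resulting pairs by language would give the per-section grouping asked for above.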

is there a wiktionary of 21st century slang?

hello wiktionary!

i thought i saw a wiktionary of modern slang, but i don't know where to find it. can you assist ? thank you.

kevin

I don't think so, but you might be interested in Category:English neologisms and Category:English internet slang. Equinox 15:19, 21 September 2012 (UTC)Reply
Urban Dictionary? - -sche (discuss) 15:32, 21 September 2012 (UTC)Reply
UD accepts anything, though, and most of the words are not in actual use, or only by a handful of people. Equinox 15:34, 21 September 2012 (UTC)Reply

"Mark as patrolled" often gives errors; sometimes "rollback" does, too.

Is anyone else experiencing the problem that "mark as patrolled" gives a WMF error, and doesn't actually mark the edit as patrolled? It happens to me about 25% of the time, I think. It happens especially often when the edit is several days old. Fortunately, the problem doesn't affect edits marked via the API instead of the UI, so it doesn't affect the  M  buttons on recent-changes and contributions-pages and so on, but even so, it's seriously affecting my ability to patrol. (It started several days ago.) —RuakhTALK 19:50, 23 September 2012 (UTC)Reply

Yeah, I get this error, but not (as you say) for the blue M button in Recent Changes. Equinox 20:13, 23 September 2012 (UTC)Reply
I never get any error messages, but I think it also never patrols the edit. I always use the M button. Mglovesfun (talk) 21:09, 23 September 2012 (UTC)Reply
O.K., I've created bugzilla:40481: "UI giving fatal error when trying to mark certain edits as patrolled. (en.wikt)" —RuakhTALK 18:31, 24 September 2012 (UTC)Reply
It's been happening to me today, mostly after a (successful) page deletion. SemperBlotto (talk) 18:50, 24 September 2012 (UTC)Reply

Tocharian script

Hello,

I just discovered, the West Tocharian and the East Tocharian language both have a scribe and we actually didn't use it in any entry at all. Here is a link to [the Tocharian scribe]. I don't know if there is a font for the Tocharian scribe, if not I might email Unicode to include it, but firstly we need to find out, is there a font for the Tocharian scribe?

Actually I can't reply on this page, so I'll reply on your talk page.

Greetings HeliosX (talk) 16:34, 24 September 2012 (UTC)Reply

Tocharian is, as of yet, missing from Unicode, so there isn't any way to use it in entries I am afraid. -- Liliana 17:55, 24 September 2012 (UTC)Reply
By "scribe" you mean "script", yeah? - -sche (discuss) 20:20, 24 September 2012 (UTC)Reply

Yes. I'm currently working on putting all the Tocharian symbols in a folder. If the author of the symbols allows me to use them, can someone make a font from those symbols (they're GIF files) that can be included in Unicode?

Greetings HeliosX (talk) 14:29, 25 September 2012 (UTC)Reply

Unicode doesn't worry about specific fonts; it just assigns code points to specific characters. So, if and when Tocharian script gets into Unicode, there will be a code point named TOCHARIAN LETTER KA or the like, but Unicode won't decide what exactly that character is to look like; that's up to font designers. If you want to design a font that includes Tocharian now, you can, but you would need to put it in the Private Use area until Tocharian gets added. I don't think Wiktionary is the place to look for people to make a font for you; we're lexicographers, not font designers. And Wiktionary still won't be able to use Tocharian letters until they're in Unicode proper, not just in the Private Use area of one font. Incidentally, a more complete set of Tocharian script characters can be found here. —Angr 14:53, 25 September 2012 (UTC)Reply
If he is serious about a proposal to Unicode (but it is hard to write one, be warned!), then he will need to submit a font. I think I'm the only one around here who has experience with fonts, but I am bad at drawing. -- Liliana 15:36, 25 September 2012 (UTC)Reply
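To illustrate the point about the Private Use area: Unicode gives such code points the general category "Co", which Python's standard `unicodedata` module exposes. The sketch below only shows how to tell whether a character is a private-use one. The helper name `is_private_use` is made up, and U+E000 is used simply because it is the first code point of the BMP Private Use Area, not because any Tocharian font maps a letter there.

```python
import unicodedata


def is_private_use(ch):
    """True if the character belongs to a Unicode Private Use area (category Co)."""
    return unicodedata.category(ch) == "Co"


print(is_private_use("\ue000"))  # first code point of the BMP Private Use Area
print(is_private_use("k"))       # an ordinary, standardized letter
```

Text made of such characters has no standardized meaning, which is why it only displays as intended for readers who have the matching font installed.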

You don't need to draw; I don't need a font designer to do actual drawing, but someone who can join the symbols I gather as GIF files into a font. First of all, anyhow, we'd need to get permission from Jost Gippert to use the GIF files he made; I have already sent him an email. Nonetheless, I can draw the symbols if he refuses or doesn't answer, but I can't actually join them into a font.

Greetings HeliosX (talk) 16:53, 25 September 2012 (UTC)Reply

Surely the shapes of letters used in century-old manuscripts are in the public domain. —Angr 17:21, 25 September 2012 (UTC)Reply
The shapes are, yes, but not the result of someone drawing the shapes in their own particular way. —CodeCat 13:21, 27 September 2012 (UTC)Reply

I'm currently doing a proposal, but is it needed to have had conversation with experts of Tocharian? And Angr, are the symbols copyrighted though he drew them?

Greetings HeliosX (talk) 13:19, 27 September 2012 (UTC)Reply

template:he-proper noun

Why does this not accept dwv? --WikiTiki89 (talk) 16:10, 27 September 2012 (UTC)Reply

Finno-Ugric as language template

Hello,

Is Finno-Ugric available as a language template, for example so that it can be used in {{attention|fiu}}?

Greetings HeliosX (talk) 18:20, 27 September 2012 (UTC)Reply

It's not a language, so no. What is that meant to call attention to? —CodeCat 18:46, 27 September 2012 (UTC)Reply

Spam Page Titles

It looks like spammers have come up with a new strategy: moving high-traffic pages to names containing spam. It's already been suggested that we might have to ban moves by new accounts. I would like to suggest banning html from page titles. Is there any legitimate reason for html there, and, if not, can we implement such a ban? Chuck Entz (talk) 18:59, 28 September 2012 (UTC)Reply

You can't use HTML in page titles, because > and < cannot be used. And moves are already banned for new accounts, you need to wait four days before you can move pages. -- Liliana 19:04, 28 September 2012 (UTC)Reply
I think Chuck means we should ban the string html, probably by local blacklist, with any pages that are supposed to contain it locally whitelisted. - -sche (discuss) 19:38, 28 September 2012 (UTC)Reply
Is four days enough for the new registered-user move restriction? Or should only whitelisted users be allowed to move pages? DCDuring TALK 21:05, 28 September 2012 (UTC)Reply
I like your idea. -- Liliana 22:19, 28 September 2012 (UTC)Reply
Indeed, all of the accounts that took part in the move-spamming were created in January. Chuck Entz (talk) 00:34, 29 September 2012 (UTC)Reply
No, I actually misread the move log. Apparently the html is in the summary line. That can be dealt with by hiding the summary. We still need to come up with a way to address this, but that isn't it. Perhaps we should create a list of pages that shouldn't be moved, ordered by importance, and start protecting them admin-move-only. Chuck Entz (talk) 21:20, 28 September 2012 (UTC)Reply
To elaborate: there is absolutely no reason to move a page like write, since it's so widely attested with that spelling. It will no doubt change its content repeatedly and often, but there will always be an entry at that location. If a reason came up due to some technical consideration, an admin would probably be involved, anyway. The pages that are high-value targets for spammers are probably all the same in that respect, with the exception of some that might not meet CFI. We will probably also have to look back through the redirects resulting from rfd, rfv and rfm and see which ones need to be protected (a redirect can be easily hijacked by a simple page edit to the redirect page). Chuck Entz (talk) 22:27, 28 September 2012 (UTC)Reply
All 722 of the "1000 most basic English words" should probably be made immovable. - -sche (discuss) 00:01, 29 September 2012 (UTC)Reply
But they probably would include only a small number of the top one hundred words looked up, based on the highly suggestive count of actual page hits (see Category:Vulgarities by language) done some years ago. DCDuring TALK 02:12, 29 September 2012 (UTC)Reply
Hm, good point. Wikipedia has a tool that lets users check the number of hits entries get from outside links. I presume that tool is also available to us...? - -sche (discuss) 02:53, 29 September 2012 (UTC)Reply
"All 722 of the 1000 most basic English words" LOL. Mglovesfun (talk) 11:05, 30 September 2012 (UTC)Reply

IPA /ɡ/

Can someone make a bot change all instances of the letter "g" to the proper IPA letter "ɡ" within {{IPA}} and {{IPAchar}}? --WikiTiki89 (talk) 22:20, 30 September 2012 (UTC)Reply

Are you sure that's right? I thought that IPA had abandoned the distinction between open-tailed and closed-tailed G, so we can just use ASCII g. No? —RuakhTALK 22:46, 30 September 2012 (UTC)Reply
I don't think it's a distinction, just that open-tailed is preferable. But regardless, we should use one or the other consistently for all our IPA, and I think it makes more sense to use the more correct form. --WikiTiki89 (talk) 22:59, 30 September 2012 (UTC)Reply
Does it depend on what font you use? There's no difference between the two letters on my monitor. BigDom (tc) 08:15, 1 October 2012 (UTC)Reply
Yes. In some fonts, the two characters are identical; in others they're different. For me, they're identical in the font I use to view text normally, but they're different in the monospaced font I use in the edit box. Can you see the difference between g and ɡ? —Angr 09:43, 1 October 2012 (UTC)Reply
Ah thanks, I see the difference now. They're both the same on my monospaced font as well you see. I suppose we might as well change all the IPA templates to open-tailed g, if only for consistency's sake. BigDom (tc) 10:23, 1 October 2012 (UTC)Reply
If we decide to change the gs, we should also change [g̊] to [ɡ̊]. - -sche (discuss) 22:49, 30 September 2012 (UTC)Reply
If that's a combining character, then wouldn't that just be a regular instance of changing [g] to [ɡ]? --WikiTiki89 (talk) 22:59, 30 September 2012 (UTC)Reply
I hope so; I just wanted to be sure. - -sche (discuss) 23:04, 30 September 2012 (UTC)Reply
A bot could also change [ʤ] to [dʒ] and [ʦ] to [ts] and [ʧ] to [tʃ]. (Look into these characters, too.) Those ligatures are certainly obsolete. - -sche (discuss) 22:49, 30 September 2012 (UTC)Reply
Such a bot could also look for and correct uses of ' which should be ˈ, as here, and uses of : instead of ː (and who knows? maybe people even misuse , for ˌ). - -sche (discuss) 06:50, 12 October 2012 (UTC)Reply
A lot of people misuse /ɘ/ for /ə/. Does any language actually use /ɘ/? Because if not, the bot can do that too. Otherwise we could restrict it to languages that we know do not use /ɘ/. --WikiTiki89 (talk) 14:25, 12 October 2012 (UTC)Reply
AutoFormat (talkcontribs) used to do these changes, so I assume KassadBot (talkcontribs) does too. Does it? Mglovesfun (talk) 11:22, 13 October 2012 (UTC)Reply
Old Irish uses /ɘ/. I doubt anyone who adds pronunciations to Old Irish entries will mix up ɘ and ə. —Angr 10:16, 14 October 2012 (UTC)Reply
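The substitutions proposed in this thread could be sketched roughly as below. This is not the code of AutoFormat or KassadBot. The names `REPLACEMENTS`, `fix_transcription` and `fix_ipa_templates` are invented for illustration, and the sketch assumes that only unnamed parameters of {{IPA}} and {{IPAchar}} hold transcriptions, so that a named parameter such as `lang=` is left untouched. The /ɘ/ to /ə/ change is deliberately left out, since Old Irish really does use /ɘ/.

```python
import re

# Substitutions discussed above; others (' -> ˈ, : -> ː) could be added here.
REPLACEMENTS = {
    "g": "ɡ",   # ASCII g (U+0067) -> IPA open-tailed ɡ (U+0261)
    "ʤ": "dʒ",  # obsolete ligature
    "ʦ": "ts",  # obsolete ligature
    "ʧ": "tʃ",  # obsolete ligature
}

# An {{IPA|...}} or {{IPAchar|...}} call with no nested templates inside.
IPA_TEMPLATE = re.compile(r"\{\{(IPA|IPAchar)\|([^{}]*)\}\}")


def fix_transcription(text):
    """Apply every substitution to one transcription string."""
    for old, new in REPLACEMENTS.items():
        text = text.replace(old, new)
    return text


def fix_ipa_templates(wikitext):
    """Normalize transcriptions inside IPA templates, skipping named parameters."""
    def repl(match):
        name, body = match.group(1), match.group(2)
        params = [
            p if "=" in p else fix_transcription(p) for p in body.split("|")
        ]
        return "{{" + name + "|" + "|".join(params) + "}}"

    return IPA_TEMPLATE.sub(repl, wikitext)


print(fix_ipa_templates("{{IPA|/gʊd/|lang=en}} good"))
```

Because a combining diacritic follows its base letter as a separate code point, [g̊] is handled by the plain g rule with no special case, which matches the expectation voiced above.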