Announcements
Proposals
Bot approval requests
- See Wikisource:Bots for information about applying for a bot status
- See Wikisource:Bot requests if you require an existing bot to undertake a task
(See also #Thinking of an anti-linkrot bot.)
For non-scan backed works, sometimes the original webpage disappears and we lose the source. This task would automatically archive, at the Wayback Machine, the sources linked from new mainspace/talk pages, and add {{wml}}.
To avoid archiving vandalism, it would only do this on pages older than a week. (It won't search beyond the 2000th created page.)
It uses pywikibot on toolforge. Source's at User:Alien333/test#Link archiving.
The idea would be to run this daily. Test edits: [1] and [2]. — Alien333 08:59, 23 April 2025 (UTC)
- As nearly two weeks have passed without objections, I activated this task per WS:BOT. — Alien333 13:59, 6 May 2025 (UTC)
- The run is over. Before launching the cronjob I will change the code to prevent it from archiving links in mainspace works' content (there are few valid reasons for extlinks in works; but there are some). — Alien333 14:48, 6 May 2025 (UTC)
User:333Bot 2
(See also #Seeking feedback on bot task to tag untagged deletion nominations for details and discussion.)
Works proposed for deletion at WS:PD or WS:CV should be tagged accordingly. Occasionally, people forget to tag them. This task would locate these and tag them.
It uses pywikibot on toolforge. The code's at User:Alien333/test#Nomination_tagging. It would run daily. — Alien333 14:53, 6 May 2025 (UTC)
- WS:PD or WS:PD ? Aren't they the same ? -- Beardo (talk) 18:24, 6 May 2025 (UTC)
- Yeah, you're right. Got mixed up. Meant PD and CV. — Alien333 20:43, 6 May 2025 (UTC)
Repairs (and moves)
Designated for requests related to the repair of works (and scans of works) presented on Wikisource
See also Wikisource:Scan lab
This should be moved to Two Songs ("The Blaeberries") (see Two Songs), as that is a more comprehensible disambiguator. TE(æ)A,ea. (talk) 12:54, 2 April 2025 (UTC)
Done — Alien333 16:28, 12 April 2025 (UTC)
Replacement of Index:Doctor Grimshawe's Secret.djvu
The scan file here (a poor Google one) is missing multiple pages. There is a much better Library of Congress scan of the same edition already on Commons. Could someone change the index page so that it links to https://commons.wikimedia.org/wiki/File:Doctor_Grimshawe%27s_Secret_(1883).djvu instead of the current file? Once this is done, the existing transcriptions will need to be moved, as follows:—
- Index page name = Index:Doctor_Grimshawe's_Secret_(1883).djvu
- Page offset = -9 (i.e. text on /16 moves to /7)
- Pages to move = "14-42"
- Reason = "realigned pages"
Chrisguise (talk) 08:30, 3 April 2025 (UTC)
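The realignment requested above can be written out as an explicit list of page moves before anything is touched. What follows is only a rough sketch of that arithmetic, not the tool that was actually used: `plan_page_moves` is a hypothetical helper, and the `Page:<file>/<n>` title pattern is the usual Page-namespace convention assumed here.

```python
def plan_page_moves(src_file, dst_file, first, last, offset):
    """Return (source_title, target_title) pairs for shifting Page: subpages.

    Hypothetical helper for illustration: it only computes titles and
    performs no edits or moves on the wiki.
    """
    if offset == 0 and src_file == dst_file:
        return []
    pages = list(range(first, last + 1))
    # When moving within one file, a negative offset must start with the
    # lowest page (its target slot is already free); a positive offset must
    # start from the highest page for the same reason.
    if offset > 0:
        pages.reverse()
    moves = []
    for n in pages:
        target = n + offset
        if target < 1:
            raise ValueError(f"page {n} would land on invalid position {target}")
        moves.append((f"Page:{src_file}/{n}", f"Page:{dst_file}/{target}"))
    return moves


# Parameters taken from the request above: pages 14-42, offset -9.
moves = plan_page_moves(
    "Doctor Grimshawe's Secret.djvu",
    "Doctor Grimshawe's Secret (1883).djvu",
    14, 42, -9,
)
for source, target in moves[:3]:
    print(source, "->", target)
```

Listing the pairs first makes it easy to check the stated example (text on /16 ends up on /7) and to spot pages that have no counterpart in the new scan before any move is made.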
Done. Be careful, though, it's not exactly the same edition. I deleted what used to be /14 and /15, because they aren't in this file. — Alien333 16:47, 12 April 2025 (UTC)
- Thanks. According to A Bibliography of Nathaniel Hawthorne (1905), by Nina E. Browne, p.31, there was a large paper version of the book issued (limited to 250 copies). It says:
- DR. GRIMSHAWE'S SECRET, a Romance; edited with Preface and Notes by Julian Hawthorne. 13+368 pp. facsimile, D. Boston, James R. Osgood & Co. 1883.
- First edition.
- A special large paper edition, limited to 250 copies, was also issued, which had an extra title-page, and an etched frontispiece by E. H. Garrett.
- The original file was a scan of the large paper copy whereas the LoC one is of the standard sized book. However, a comparison of a few random pages indicates that the two versions were printed from the same type. Chrisguise (talk) 18:07, 12 April 2025 (UTC)
- Ok, good. Makes sense. — Alien333 18:14, 12 April 2025 (UTC)
The Moon Pool (All Story Weekly, 1918) and The Conquest of the Moon Pool (All Story Weekly, 1919)
The works are "sourced" but not "scan-backed". The titles reflect the original publication, not the scan they were transcribed from. So, please move:
- The Moon Pool (All Story Weekly, 1918) to The Moon Pool (1940)
- The Conquest of the Moon Pool (All Story Weekly, 1919) to The Conquest of the Moon Pool (1948)
--RaboKarbakian (talk) 10:54, 9 April 2025 (UTC)
- The Internet Archive document linked on the discussion page of The Moon Pool is dated September 1939 - so shouldn't that be moved to The Moon Pool (1939) ? -- Beardo (talk) 15:00, 12 April 2025 (UTC)
- It completed its last part (installment) in 1940. We use the latest year here. We probably use the latest year because of the public domain day, which would not apply in this case, but I chose it for consistency and because public domain days keep flying by and we chose the last year for those. A lot of magazines split their volumes over two different years.--RaboKarbakian (talk) 16:29, 12 April 2025 (UTC)
- Shouldn't they be moved within the source they're sourced from? i.e. Famous Fantastic Mysteries/Issue 1/The Moon Pool and Fantastic Novels/Volume 2/Issue 3/The Conquest of the Moon Pool? — Alien333 16:50, 12 April 2025 (UTC)
- @Alien333 - good point ! -- Beardo (talk) 19:23, 12 April 2025 (UTC)
- I found and then lost a text that was hidden like this when I made one of the two All-Stories. Found it again! All-Story Weekly/Volume 98/Number 3/Fires Rekindled. If Famous Fantastic Mysteries doesn't exist--I just found that first one by accident. Oh! The New York Sun! Please, only do that "future linking" if you also do something like what is happening at The Sun, where I have been tossing redirects to when I find them.
- And, of course, Famous Fantastic Mysteries does exist. I am moving this to a new discussion under "Other discussions"....--RaboKarbakian (talk) 09:34, 13 April 2025 (UTC)
- I was thinking the same thing. —Uzume (talk) 12:21, 13 April 2025 (UTC)
Please move Robur the Conqueror to Robur the Conqueror (Project Gutenberg).--RaboKarbakian (talk) 16:01, 19 April 2025 (UTC)
- (A side note: I saw you did it yourself; for cases like these do say that we have a different edition; I was reluctant to move because I did not know of the need to disambiguate.) — Alien333 13:52, 20 April 2025 (UTC)
Please move Master of the World to The Master of the World (Gutenberg)--RaboKarbakian (talk) 16:10, 19 April 2025 (UTC)
Done to The Master of the World (Project Gutenberg), as that's the usual titling scheme for these works. I left a temporary redirect at Master of the World. — Alien333 14:15, 20 April 2025 (UTC)
I want to use the Internet Archive scan instead of the Google Books one, but in doing so the pages are now misaligned because the latter scan has two extra pages in the front. So I wonder if every page starting with index page 7 could be shifted left 2 pages. Thanks, prospectprospekt (talk) 05:52, 20 April 2025 (UTC)
- Yes, it's possible. Just to be sure, this would also mean deletion of /5 and /6. Is that intended? Thanks. — Alien333 06:34, 20 April 2025 (UTC)
- Yes, that is intended. prospectprospekt (talk) 12:48, 20 April 2025 (UTC)
Done. — Alien333 14:33, 20 April 2025 (UTC)
Other discussions
Index lua issue
@CalendulaAsteraceae: All indexes I can find have "Lua error in Module:Proofreadpage_index_template at line 516: data for mw.loadData contains unsupported data type 'function'." now. I suggest we maybe revert at Module:Proofreadpage index template/config until we can sort it out. — Alien333 19:07, 11 March 2025 (UTC)
- (Note: it has been reverted and the issue is now fixed.) — Alien333 19:44, 11 March 2025 (UTC)
- Thanks. People may still encounter the issue for a while until everything is updated. It's showing up on multiple pages for me, but I find that I can clear the problem with a null edit. --EncycloPetey (talk) 19:58, 11 March 2025 (UTC)
- How long? I'm still getting it. IdiotSavant (talk) 02:24, 13 March 2025 (UTC)
- Try purging the page. —Justin (koavf)❤T☮C☺M☯ 02:29, 13 March 2025 (UTC)
- Is there going to be a way to clear the problem automatically ? Or will each index need to be done manually ? -- Beardo (talk) 13:18, 13 March 2025 (UTC)
- If we want, we could null-edit all indexes with a bot, but before undertaking mass site-wide actions I'd prefer waiting a week (so until the 18th) to see if it doesn't fix itself. — Alien333 17:42, 13 March 2025 (UTC)
- OK. Could a bot do a purge on all indexes ? -- Beardo (talk) 17:59, 13 March 2025 (UTC)
- A null-edit is I think about equivalent (for our purposes) to a hard purge. What I mean is that doing a null edit also has the effect of a purge. We could also just purge, if we want to. — Alien333 18:18, 13 March 2025 (UTC)
- I still see indexes with this error and the "What links here" tool often does not work. The Orphaned Pages listing is full of pages which are not actually orphaned. -- Beardo (talk) 16:56, 22 March 2025 (UTC)
- Ok, will try to patch something up to mass-purge things. — Alien333 07:51, 23 March 2025 (UTC)
- All wikipages do get "purged" eventually, it's just it can take quite a bit of time for indirect changes like this. You may want to check whether you can find some way to see the number of affected pages and watch that for a bit before firing up a bot (i.e. how big and whether and how fast it is decreasing). If you have to null-edit every single Index:-namespace page that's going to be a pretty big job (takes a long time and puts strain on the servers). Xover (talk) 10:26, 23 March 2025 (UTC)
- @Beardo: How often are you finding some that still have the issue? I've just fished through about two hundred of them (to try and get a good way of selecting them), and I haven't managed to find one that still has the error. — Alien333 14:39, 23 March 2025 (UTC)
- https://en.wikisource.org/w/index.php?title=Special:LonelyPages&limit=20&offset=1430 - has 3,567 Pages showing nothing links to them, and that only reaches partway through letter A. Selecting any, going to the index and doing a hard purge, and suddenly the pages find that they are linked. There must be many multiple of thousands of Pages affected. -- Beardo (talk) 21:08, 23 March 2025 (UTC)
- LonelyPages only gets updated once in a while, and was "last updated 07:38, March 22, 2025". — Alien333 07:04, 24 March 2025 (UTC)
- Generally updated every three days. Updated today. So now two weeks since the problem happened. -- Beardo (talk) 23:42, 25 March 2025 (UTC)
- Ah, indeed, it still lists them. It gives us a good means to know which indexes are affected: through api?action=query&list=querypage&qppage=Lonelypages&qplimit=500, and then by looping the continue.
- In the 5000 pages in cache, there are about 3600 Page:s, from 147 distinct indexes. If we null-edit them, and assume 150 new indexes every three days, that would make 50 null edits a day, so about two null edits an hour. Which should be mostly fine on server load. @Xover: What do you think? — Alien333 07:04, 26 March 2025 (UTC)
- It's been about a month and still on. Will try to patch up some code and report after. — Alien333 17:39, 11 April 2025 (UTC)
- Just to note that I've been going into Lonelypages at each update and null-editing the related indices. The current round didn't take long as there were a couple that had over 500 pages listed. For indices with only a few pages listed, the pages tend to be blanks that haven't been transcluded. Beeswaxcandle (talk) 19:12, 11 April 2025 (UTC)
- Oh great ! And you have also cleared out almost all of the orphaned author pages ! (A lot of those were showing as orphaned because of the lua issue.). -- Beardo (talk) 00:18, 16 April 2025 (UTC)
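The approach discussed in this thread (read the cached LonelyPages list through the API, loop on `continue`, then work out which indexes are affected) could be sketched roughly as below. This is an illustration under assumptions, not the code that was eventually run: the `fetch` callable is a hypothetical stand-in for whatever performs the HTTP request, the response layout (`query.querypage.results` with a `title` per row) follows the standard MediaWiki `list=querypage` output, and the `Page:<file>/<n>` to `Index:<file>` mapping assumes the usual ProofreadPage naming.

```python
from collections import defaultdict

# Query parameters quoted in the discussion, plus format=json.
API_PARAMS = {
    "action": "query",
    "list": "querypage",
    "qppage": "Lonelypages",
    "qplimit": "500",
    "format": "json",
}


def iter_lonely_titles(fetch):
    """Yield every title in the cached LonelyPages list.

    `fetch` (hypothetical) takes a dict of query parameters and returns the
    decoded JSON response; injecting it keeps the loop testable offline.
    """
    params = dict(API_PARAMS)
    while True:
        data = fetch(params)
        rows = data.get("query", {}).get("querypage", {}).get("results", [])
        for row in rows:
            yield row["title"]
        if "continue" not in data:
            break
        # Standard MediaWiki continuation: merge the returned values into
        # the parameters of the next request.
        params = {**API_PARAMS, **data["continue"]}


def affected_indexes(titles):
    """Group Page: titles by parent index (Page:X.djvu/3 -> Index:X.djvu)."""
    by_index = defaultdict(list)
    for title in titles:
        if title.startswith("Page:") and "/" in title:
            base = title[len("Page:"):].rsplit("/", 1)[0]
            by_index["Index:" + base].append(title)
    return dict(by_index)


def null_edits_per_hour(new_indexes, period_days):
    """Average hourly rate if `new_indexes` turn up every `period_days` days."""
    return new_indexes / period_days / 24
```

With the figures quoted above, `null_edits_per_hour(150, 3)` gives 50 edits a day spread over 24 hours, so about two an hour, which matches the estimate in the thread.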
We have (at least some of) all of the “horrid” novels, except one: The Orphan of the Rhine. I have just obtained scans of all four volumes, and (with Alien333’s help in splitting three of the volumes) they are now available at Author:Eleanor Sleath, if anyone would be interested in proofreading them. TE(æ)A,ea. (talk) 21:17, 24 March 2025 (UTC)
- @Alien333 Hi, I've done quite a lot on some of the 'Horrid' novels (currently working on 'Clermont' and 'The Italian', albeit slowly). Could you obtain volumes 1, 2 and 4 of 'Horrid Mysteries'? Currently only volume 3 is publicly available as a scan. If so, I think that would complete the set. I believe the volumes are in the nineteenth century equivalent of EEBO and ECCO. Thanks, Chrisguise (talk) 08:46, 3 April 2025 (UTC)
- I'm much better at manipulating scans than finding one. Hathi has V.1 limited-search only, if someone knows how to bypass their restrictions; can't help you further though. (I don't even know what EEBO and ECCO are.) — Alien333 09:50, 3 April 2025 (UTC)
- @TE(æ)A,ea. EEBO is Early English Books Online, a database of scans of (every?) book printed in English up to 1700. ECCO is Eighteenth Century Collections Online, a database which contains scans of books published between 1700 and 1800. I don't know how comprehensive it is. There's also one covering the nineteenth century. I have access to the first two (also most(?)/all(?) of the content of EEBO is on IA) but not the last one. Chrisguise (talk) 10:02, 3 April 2025 (UTC)
- @TE(æ)A,ea. The version on Hathi is a modern one. The version that's partially transcribed is the first edition. I don't have anything other than general access to Hathi. I've occasionally resorted to downloading individual page images and reconstructing books, which is a bit easier these days since they appear to have removed the restriction on page downloads. It used to be the case that you got 15-20 pages and then had to wait about half an hour to download the next batch. Chrisguise (talk) 10:10, 3 April 2025 (UTC)
- Chrisguise: My scans were actually from a collection of Gothic novels in the collection of the University of Virginia, which were saved in microfilm form c. 2002. I’ve had poor luck in finding EEBO stuff on IA; it’s great that you have access to the other two, though. When I was downloading a 170-odd page book the other day, I was only rate-limited once (and that might have been incidental), so it really is a big improvement. UVA does seem to have that reel of microfilm in their collection, so I’ll see if they’re willing to send it to me. (It’s in their off-site storage, though, so that might be annoying.) TE(æ)A,ea. (talk) 15:54, 3 April 2025 (UTC)
- @Alien333 @TE(æ)A,ea. There's something odd happening here. I have just created the index page for volume 1 using what I thought was a file called 'Orphan of the Rhine v1.pdf' on Commons. To assist setting up the page list, I downloaded a copy, which contains 272 single pages. However, when I saved the index page Index:Orphan of the Rhine v1.pdf, it is linked to a file - with the same name - on Wikisource, which consists of 140 double pages. I would suggest that the Wikisource version needs to be deleted. Chrisguise (talk) 06:00, 4 April 2025 (UTC)
Done. Probably there are a few others out there that should be deleted. — Alien333 06:07, 4 April 2025 (UTC)
- @Alien333 @TE(æ)A,ea. Thanks both. I've set up the index pages for the three files. There was a minor issue with the volume 1 file (two duplicate pages) which I've fixed. Chrisguise (talk) 22:43, 16 April 2025 (UTC)
- Chrisguise: I have scanned, and Alien333 has split, volumes 1, 2, and 4 of Horrid Mysteries.
Thinking of an anti-linkrot bot
We've often had the issue that works which are not scan-backed, but sourced to a web page, lose that source whenever said page goes offline.
I've been thinking of a bot that could possibly help remedy that. The workflow would be:
- When a page is created, if it's
- a mainspace page with an external link in the notes field
- a main talk page with an external link;
- then ask IA to archive the link.
- Then, a week later, if the page's still there and still has the link (to not do useless edits on vandalism/spam pages, which are prone to include external links), add (archived) after the link.
This would only be for new pages. It would also be useful to archive the still-functioning links of non-new pages. The issue with that would be detecting the "still-functioning" part. Does someone have an idea? And in general what do you think of this project? — Alien333 11:34, 6 April 2025 (UTC)
- I am not sure we should consider pushing all links in new pages under your criteria to be archived at the IA Wayback Machine. As you mentioned, we do have considerable vandalism/spam and blindly pushing that to be archived elsewhere seems a poor approach. That said, I see the merit of your idea. Why not wait the week (or some other prescribed period for new link survival) and then both archive and add the link together? I know IA Wayback Machine archived content is available immediately after successfully archiving, although it does take considerable time for them to add such to their indices so one often cannot immediately search and find it. —Uzume (talk) 22:45, 6 April 2025 (UTC)
- Thanks for the feedback! I thought about waiting for archiving, but the issue is that, a week later, we do not know if the link's actually working or not, and so we might be archiving dead links, which is not very good either. This brings us back to the "can we check if a link is dead" question. — Alien333 05:23, 7 April 2025 (UTC)
- Well frankly, I feel attempting to archive a dead link is better than attempting to archive blatant spam. The dead link will at least be mostly benign but the spam link will result in pollution of the archive. —Uzume (talk) 07:21, 7 April 2025 (UTC)
- From this perspective, if we don't mind archiving dead links, it indeed makes it simpler.
- We might even perhaps want to archive non-new pages, in that case? Though it's true that the concentration of broken links is higher in older pages. — Alien333 19:42, 8 April 2025 (UTC)
- Code's done: User:Alien333/test#Link_archiving.
- Test edits are [3] and [4]. Looks good to me. What do you think? — Alien333 20:07, 11 April 2025 (UTC)
So, the updated workflow is:
- among the new pages
- that are rootpages
- in main or talk namespace
- older than a week
- and contain one or more external links (not counting links to ws or archive.org), for each link:
- try to ask IA to archive that link
- try to get archival status
- if both of the above succeeded
- and there isn't a {{wml}} or raw archive.org link on the same line as the link
- then add a {{wml}}
— Alien333 13:07, 12 April 2025 (UTC)
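The selection part of the updated workflow can be illustrated with a small offline sketch. It is not the code at User:Alien333/test, and it leaves out the actual archive request and the edit that adds {{wml}}. Several details are assumptions made for the example: "links to ws" is read as any wikisource.org host, a rootpage is taken to be a title without a slash, and external links are found with a simple regular expression rather than a real wikitext parser. All function names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone
from urllib.parse import urlsplit

# Rough pattern for bare or bracketed external links in wikitext.
URL_RE = re.compile(r"https?://[^\s\[\]|<>\"{}]+")
# Assumption: "ws or archive.org" means these two domains and their subdomains.
SKIPPED_DOMAINS = ("wikisource.org", "archive.org")


def host_in(url, domain):
    """True if the URL's host is `domain` or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return host == domain or host.endswith("." + domain)


def is_candidate_page(title, created, now, min_age=timedelta(days=7)):
    """Rootpage (no subpage slash, an assumption) that is at least a week old."""
    return "/" not in title and now - created >= min_age


def links_to_archive(wikitext):
    """Return (line_number, url) pairs that still lack an archive marker."""
    found = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        urls = URL_RE.findall(line)
        # Skip the whole line if it already carries {{wml}} or a raw
        # archive.org link, mirroring the "same line" rule in the workflow.
        if "{{wml" in line.lower() or any(host_in(u, "archive.org") for u in urls):
            continue
        for url in urls:
            if not any(host_in(url, d) for d in SKIPPED_DOMAINS):
                found.append((number, url))
    return found


sample = "\n".join([
    "| notes = Source: [https://example.org/poem.html Example]",
    "Already handled: https://example.com/a {{wml}}",
    "See https://en.wikisource.org/wiki/Foo",
    "Plain https://example.net/b text",
])
print(links_to_archive(sample))
```

Keeping the selection logic free of network calls means it can be checked against sample wikitext before any request is sent to the archive or any edit is saved.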
Policy on source-based transcriptions and Commons redactions of de minimis fair use
Wikisource has a strong policy against fair use (see Project:Copyright § Fair use) and I cannot say I have considered this deeply but I think this is a good thing as it allows excerpts to be made without having to consider the copyright implications of de minimis fair use when such excerpts are taken out of context (but I am not here to consider weakening our fair use policy). That said, the aforementioned Wikisource fair use policy says nothing about how to handle Commons content allowed under the Commons:de minimis fair use policy. Wikisource mostly consumes Commons media content via source-based transcription (see Help:Proofread); predominately via DjVu and PDF formats.
Historically Wikisource seems to have varying methods of dealing with Commons media containing de minimis fair use content. Wikisource has things like: {{text removed}}, {{image removed}}/{{FI|file=removed}}/{{FIS|file=removed}}, etc. However, I have recently noticed there is a growing de facto policy to require such fair use content to be redacted in essentially censored versions of such media and that such fair use censored media should be hosted at Commons (often to the exclusion of the original unredacted versions). Since Wikisource clearly has the tools to redact such fair use content locally (either via the above mentioned templates in transcribed Page namespace pages or in locally hosted media where appropriate), I feel Wikisource should not pressure contributors to upload censored versions of media to Commons with de minimis fair use redacted.
Notes: It should be noted, not all derived works contain only de minimis fair use content that is acceptable at Commons and Commons itself may require censored media versions with clearly copyrighted sections of content be redacted. Also Commons admits that much of its actual de minimis fair use content is not clearly identified. Commons has a method to identify such via its c:template:de minimis and when Wikisource runs into such copyright issues I believe it to be a good policy to help Commons tag such media with this template in a sort of "best effort" approach (which in the future would in turn help us to identify such sticky content, letting us know when we need to employ local censorship redactions; perhaps Index pages could automatically have a warning when Commons media has such a template tag).
In any event, I would like Wikisource to adopt a solid policy on the handling of Commons acceptable de minimis fair use media content by expanding its fair use copyright policy to clarify its stance on such. I would prefer Wikisource local censorship over pushing for redacted versions at Commons, but one way or the other, I want to have our copyright policy expanded to clarify our handling of such. Thank you, —Uzume (talk) 22:05, 6 April 2025 (UTC)
- I'm not sure what you're asking, or where you think it is needed. You have said both "I am not here to consider weakening our fair use policy" and "I would like Wikisource to adopt a solid policy on the handling of Commons acceptable de minimis fair use media content by expanding its fair use copyright policy". Are you asking for explicit statement of current stance in some specific case or issue? Because Wikisource:Copyright policy has an entire section on "Fair use" that explicitly states the Wikisource stance and legal reasons for that stance. What more needs to be clarified? --EncycloPetey (talk) 23:51, 6 April 2025 (UTC)
- I want a clarification on how to handle de minimis fair use content at Commons. I personally would like to see our policy specify local censorship of such and thus not pressure contributors to create redacted versions at Commons but even if the decision is made in the opposite direction a clarification of that would be good. Thanks, —Uzume (talk) 00:12, 7 April 2025 (UTC)
- For the record, EncycloPetey, this is in relation to this document, which contains an image which had been deleted from Wikimedia Commons. TE(æ)A,ea. (talk) 02:45, 7 April 2025 (UTC)
- That is certainly one example, but hardly the only one. You might notice I tagged c:file:Mallory v. Norfolk Southern.pdf with c:template:de minimis at 1018190383, which adds it to c:category:de minimis. Another example includes the heated debate related to c:file:Campbell v. Acuff-Rose Music.pdf (which I also tagged de minimis) vs. c:file:Campbell v. Acuff-Rose Music (redacted).djvu at Wikisource:Copyright discussions/Archives/2022#Undelete Campbell v. Acuff-Rose Music and relevant page:s (which suggests there was some earlier deleted discussion related to it). Clearly there were some high tensions in that discussion and the last thing I want to do is respark that flame, but in reading that, it seems there was significant pressure to have a redacted version uploaded to Commons (which eventually did occur). The discussion did mention {{text removed}} but for some reason that was not the end result. I am sure there are plenty of other occurrences. The point is, Commons is the primary backing store for most of our Proofread-based transcription and yet Commons has a copyright policy (which allows de minimis fair use) that is not compatible with the Wikisource copyright policy (which outright bans fair use entirely). So to avoid future issues, a clear policy on how this is handled here seems appropriate as an expansion of the existing copyright policy. —Uzume (talk) 04:59, 7 April 2025 (UTC)
- Uzume: The earlier discussion is actually on the same page: § Campbell v. Acuff-Rose Music lyrical passages. The second discussion was created because of the favorable termination of Commons’ lengthy discussion as to the PDF. As to its closure, I won’t rehash the debate (and don’t want to here, for the record), but I will state my continued belief that the closing administrator acted incredibly inappropriately in that case. As to your no-redacted-copies argument, I think that gets the relationship backwards. Normally, files for use in indexes would be stored locally, and thus need to follow our local rule against fair use; it is only because of Foundation policy that the files have been moved to Commons. Because of this, the files are really local, in their use, even though they are stored globally. I certainly have never seen any negative feedback on Commons’ side for redacting our PDFs in line with our policy, even if a PDF intended for Commons’ use would not need the redactions. TE(æ)A,ea. (talk) 02:17, 8 April 2025 (UTC)
- @TE(æ)A,ea.: It is nice to know about the rest of the discussion but besides that being rather hot-tempered, it is not really directly relevant to this discussion (except perhaps as just another example). It sounds like you are vying for de minimis redacted media at Commons. Even if you believe "the relationship" is backwards, that does not mean it is implemented as such or seen as such by most contributors. It seems to me there is always negative feedback with regard to any type of censorship but I agree it does not seem to be a policy issue for Commons (but just look at any Commons DR with de minimis claims and there are those who vehemently argue against any such censorship). The unredacted versions in their entirety might be useful for other purposes like Wikipedia, etc. We have the tools to locally redact such things without requiring separate redacted versions of the backing media so I do not see the usefulness in creating and maintaining extra versions for such purposes. —Uzume (talk) 02:55, 8 April 2025 (UTC)
- you make a good point that the "de minimis" rationale at commons is idiosyncratic and inconsistent. for example c:File talk:Report On The Investigation Into Russian Interference In The 2016 Presidential Election.pdf. other government documents were deleted. I would go for a local exemption fair use of government documents with "non-free" citations, but there was not a consensus for that here. it would be more consistent and forthright. see also Wikisource:Scriptorium/Archives/2007-04#Public_domain_materials_with_limited_fair_use_items. --Slowking4 ‽ digitaleffie's ghost 00:52, 8 April 2025 (UTC)
- @Slowking4: Yes, on the one hand Commons claims "no fair use" but then on the other hand they have a policy supporting de minimis (which Commons claims is not a form of fair use but I do not see it that way; I see it as a very particular form of fair use) but then they waver around and only really defend that policy sometimes (but I do not really see that as a major issue for Wikisource). I do not recommend we adopt the Commons de minimis definition and instead consider it to be a form of fair use while we continue to ban all fair use (including the Commons de minimis policy) but I also recommend we document this difference of copyright policy between us and our major media backing store "sister" site and explicitly recommend we redact such parts with local templates (and not push for de minimis redacted media at Commons). We should also probably help them tag their media that contains such when we notice it. —Uzume (talk) 01:23, 8 April 2025 (UTC)
- My two cents:
- 1. First complication is that the definition of de minimis is quite vague. The examples given here are arguably not de minimis; we are not talking about something like a court opinion quoting a sentence from an email from one company employee to another, or something on that side of the spectrum. For example, arguing about whether the various quotes on this page should be redacted because maybe the various government officials or translators still own the copyright for small excerpts from their speeches: Page:Pentagon-Papers-Part IV. A. 5.djvu/49. Instead, most cases where people complain are full images, which they are generally complaining about specifically because they believe the images are not de minimis.
- 2. Second complication: "de minimis" might be in regard to the fact that they are in file space but then stripped out in main space. For example, when quoting a work that is part of an anthology, we might upload the tail of the previous work in the anthology because the transcluded work starts in the middle of the page. Assuming the text isn't de minimis (let's assume it is an entire copyrighted poem), we tend to say, well, it isn't transcluded so that's fine, even though we would agree that a transclusion of that poem would be deleted. MarkLSteadman (talk) 02:47, 8 April 2025 (UTC)
- @MarkLSteadman: Thanks for the feedback. I am not sure I fully understand your second point but as for the first, I believe it is a moot point. Commons has such a policy and some media contains copyrighted material under such a policy (be they small excerpts or entire works) provided Commons makes a determination that such contents is considered de minimis under their policy definitions. I agree they do not always seem to apply such decisions uniformly (but then neither does the US court systems). I think the bigger take away is whether we support de minimis. I feel this is a form of fair use and we should continue to ban fair use entirely. One of the nice uses of Wikisource is that reuse of its content is guaranteed to be freely usable and by its very definition de minimis means that derivative works focusing on the contained copyrighted material becomes a copyright violation. It is quite common to take pieces of something from Wikisource and use it in a different context. You can imagine how "Appendix A" depicted at Page:Campbell v. Acuff-Rose Music (redacted).djvu/26 might have been transcribed on a separate subpage since such a construction is not uncommon but under that construction, the entire subpage would focus on the copyrighted content and thereby effectively construct a copyright infringement. Treating de minimis as a specialized form of fair use where fair use is banned rectifies this issue. —Uzume (talk) 03:18, 8 April 2025 (UTC)
- The first point is about the pressure to redact which this policy is trying to solve. Almost every case where someone was pressured before, they would be pressured again. I don't think true de minimis is a concern based on the examples brought up. E.g. a whole song like Pretty Woman is not de minimis. So what are we talking about? MarkLSteadman (talk) 03:25, 8 April 2025 (UTC)
- @MarkLSteadman: Well that clearly depends on one's point of view. Typically "fair use" rules out large chunks of copyrighted works such as the inclusion of entire works like Pretty Woman but de minimis is legally a separate beast and Commons treats it as such. Commons asserts that de minimis is not fair use and has a policy accepting such while rejecting fair use. I want our copyright policy to ban de minimis in the same way we currently ban fair use (or at least deem it as fair use for Wikisource and thus be banned in the same vein). I am not sure what you mean by "true de minimis". Your opinion is that the whole song Pretty Woman is not de minimis but de minimis is subjective and always context dependent by its very definition and the inclusion of an entire work can still be considered de minimis (e.g., c:File:A Porsche 997 GT2 in front of Boutique de parfumerie Guerlain, 356 rue Saint-Honoré.jpg contains the entire The Dark Knight poster making c:File:The Dark Knight movie poster - censored copyright.jpg a copyright violation if it was not censored) as at least within the confines of c:File:Campbell v. Acuff-Rose Music.pdf, Commons did decide the entire Pretty Woman is de minimis (see c:Commons:Deletion requests/File:Campbell v. Acuff-Rose Music.pdf). Besides specifying our policy on de minimis (which I am pushing for banning), it would also be nice to note the difference in policy between here and Commons and specify recommended procedures for handling such differences. I personally would prefer to avoid recommending redaction via censored media and instead rely upon transcription templating but that can be decided here via consensus and I am certainly not the only voice in such a matter (but you know what my vote would be). —Uzume (talk) 21:09, 11 April 2025 (UTC)
- My point is that if we want to avoid pressuring people if they do de minimis but pressure them if they don't we end up in the same place where we are now because whether Pretty Woman is actually de minimis is irrelevant, just that the examples given are all arguable hence we will just argue about de minimis anyways. If you think we should be redacting "the great rejuvenation of the Chinese nation," "leading position", "the Chinese Dream" on Page:2023-MILITARY-AND-SECURITY-DEVELOPMENTS-INVOLVING-THE-PEOPLES-REPUBLIC-OF-CHINA.PDF/19 because they are copyright infringing Xi Jinping give that as an example. MarkLSteadman (talk) 21:20, 11 April 2025 (UTC)
- except for de minimis you are relying on the capricious rationales of commons functionaries, that provides no notice (or false comfort) to re-users. but if you have a local copy with a fair use warning, then WS controls the work, and provides guidance for re-users. --Slowking4 ‽ digitaleffie's ghost 13:33, 12 April 2025 (UTC)
- @MarkLSteadman: I really do not follow that rationale at all. How does one do de minimis? As far as I know it is a noun or adjective and I am unable to conceive of a way of doing it or not. I want to avoid pressuring people to redact de minimis via media censorship but I also want to encourage it to be banned here and thus it needs to be redacted in another way and I support the templating methods used during transcription. That said, I believe the more important issue here is to add de minimis to our copyright policy banning it as we do with fair use. If/When we clearly ban it, such material needs to be redacted in one way or another. Currently it falls outside our copyright policy and contributors are being pressured to redact it; however, it is not clear on the best/recommended way to accomplish such, so it becomes confusing for contributors since: A) it is not specified in our copyright policy and B) there is no specified recommended way to handle the required redactions. This lack of clarity leads to confusion then arguments and heated tempers, etc. —Uzume (talk) 01:58, 13 April 2025 (UTC)
Seeking feedback on bot task to tag untagged deletion nominations
It commonly happens that, when people nominate a work for deletion at WS:PD or WS:CV, they forget to tag it.
This makes more work for others, and pages can stay untagged for a long while (in one case more than a year) until someone notices it.
I've made a pywikibot script that can automatically find and fix instances of this (credit to @SnowyCinema for the idea of making one). More precisely, if:
- a section's title is of the form == [[Title]] == (including slight whitespace variations), and
- the page named "Title" exists, and
- it does not contain a deletion tag ({{delete}}/{{cv}}, including aliases), and
- the relevant section has not been closed (in which case it's normal for there not to be a tag anymore), and
- the deletion nomination is between one day old (to not tag stuff instantly) and one week old (it happens for some stuff to be untagged, e.g. when the content has been replaced), then:
it adds the relevant deletion tag to the relevant page.
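The conditions above can be sketched in plain Python. This is only an illustration of the checks as listed, not the code at User:Alien333/test: the function names, the two regexes, and the way page text, closure status and nomination time are passed in are all assumptions, and a real run would fetch them through pywikibot. The tag pattern only knows the two names mentioned above; the aliases are not covered.

```python
import re
from datetime import datetime, timedelta, timezone

# Level-2 header holding exactly one unpiped link, allowing stray whitespace.
# (Hypothetical pattern; the real bot may match headers differently.)
HEADER_RE = re.compile(r"^==\s*\[\[\s*([^\[\]|]+?)\s*\]\]\s*==\s*$")

# Only {{delete}} and {{cv}} are recognised here; aliases are left out.
DELETION_TAG_RE = re.compile(r"\{\{\s*(delete|cv)\s*[|}]", re.IGNORECASE)


def nominated_title(header_line):
    """Return the linked title if the header is of the form == [[Title]] ==."""
    match = HEADER_RE.match(header_line)
    return match.group(1) if match else None


def should_tag(header_line, page_text, section_closed, nominated_at, now):
    """Apply the listed conditions; page_text is None if the page is missing."""
    if nominated_title(header_line) is None:
        return False  # header is not a single link
    if page_text is None:
        return False  # the page named "Title" does not exist
    if DELETION_TAG_RE.search(page_text):
        return False  # already tagged
    if section_closed:
        return False  # closed discussions normally have no tag left
    age = now - nominated_at
    return timedelta(days=1) <= age <= timedelta(days=7)


if __name__ == "__main__":
    now = datetime(2025, 4, 8, tzinfo=timezone.utc)
    print(should_tag("== [[Some Work]] ==", "Untagged text", False,
                     now - timedelta(days=2), now))
```

Headers such as ===[[link]]=== or == Two Discussions == are deliberately rejected by this pattern, so nothing is guessed when the header is not a single link.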
The idea would be for this to run daily on toolforge, and for the edits to be made by User:333Bot. (I have the technical access necessary.)
User:Alien333/test#Nomination tagging contains the latest version of the code.
I have made a test run; the specifics (exact code version used (now outdated), input, output, logs) are at Special:PermaLink/14995363.
I've got three main questions:
- Are there objections to a bot like this running?
- Are there details (notably in the algorithm) that should be changed?
- Should this remind the nominating user to tag? (current code doesn't.) Often, when a user does the work this does (tagging untagged noms), they drop a message, but this has been criticised. And if the bot should remind users, would it be better to do so in a message or in the edit summary of the tag addition?
— Alien333 19:25, 8 April 2025 (UTC)
- One thing to consider is that the section title is not always a pagename, and sometimes there are multiple items being discussed. In the event that the bot cannot find a linked pagename, it would be helpful to have the bot add a post to the discussion, asking: Have the pages under discussion been tagged with {template}? --EncycloPetey (talk) 19:29, 8 April 2025 (UTC)
- I'd just made the bot not touch these, precisely because it's very hard to know which links in the nom are nominated pages. Adding a post to the discussion would be possible, if people agree. I think any interventions in discussions should be as concise and discreet as possible. E.g. for this specific case, perhaps something like "Are the nominated pages tagged as such? —333Bot (talk)" (with "tagged as such" linking to the top of the page, which gives the tagging instructions).
- Maybe it shouldn't be phrased as a question? As the bot doesn't really expect an answer. Perhaps more like a reminder? — Alien333 19:40, 8 April 2025 (UTC)
Support the invocation of this bot, wholeheartedly (and thanks for tackling the issue!). I agree with the decision not to try to find non-links in section headers, since it actually could be that, for example, a section header is called "Two Discussions" and contains deletion discussions of 2 related works under it; but then, say that Two Discussions was actually the title of a real work or disambiguation page and then it accidentally gets tagged. Just a false positive I thought of.
- But I will say that often L3s (===[[link]]=== rather than ==[[link]]==) link to works, so could those get added to the functionality? If it's not already there that is. Anyway, great work!!! SnowyCinema (talk) 19:54, 8 April 2025 (UTC)
- On possible false positives: what are the odds that someone would link, in the section title, something which has nothing to do with the works, but exists? At any rate, in the year-odd I've spent here, I've never seen a deletion discussion where the title contained a link, and only a link, and that link was not one of the nominated pages.
- On L3s: fairly straightforward, will implement. — Alien333 20:00, 8 April 2025 (UTC)
- @Alien333: Sorry, just to clarify, I meant if the section header was unlinked (e.g. ==Two Discussions==) and hypothetically the bot guessed where the work was despite there being no link (like ==Two Discussions== -> Go to Two Discussions -> Oops, that was a random disambig page.) And now I want to see if there's actually a work with that title on IA or Hathi... brb lol SnowyCinema (talk) 20:09, 8 April 2025 (UTC)
- The bot only follows linked section headers (section headers with one link and nothing else), which seems to me a reasonable assumption because for single noms it's almost always the case. — Alien333 20:12, 8 April 2025 (UTC)
Can someone fix the Wikidata link to Wikisource
At Q85430571 it points to an obituary for them, and not the Portal for them. Portal:Cornelia Augusta Betts RAN (talk) 00:10, 11 April 2025 (UTC)
Done — Alien333 06:01, 11 April 2025 (UTC)
- Thanks! --RAN (talk) 16:03, 11 April 2025 (UTC)
Wikidata and Sister Projects: An online community event
(Apologies for posting in English)
Hello everyone, I am excited to share news of an upcoming online event called Wikidata and Sister Projects, celebrating the different ways Wikidata can be used to support or enhance another Wikimedia project. The event takes place over 4 days between May 29 - June 1st, 2025.
We would like to invite speakers to present at this community event, to hear success stories, challenges, showcase tools or projects you may be working on, where Wikidata has been involved in Wikipedia, Commons, Wikisource and all other WM projects.
If you are interested in attending, please register here. If you would like to speak at the event, please fill out this Session Proposal template on the event talk page, where you can also ask any questions you may have.
I hope to see you at the event, in the audience or as a speaker, - MediaWiki message delivery (talk) 09:18, 11 April 2025 (UTC)
In 2006, Pathoschild moved this template to the mainspace, on the grounds that this is specific to a single work. But that was 19 years ago, and nowadays it's very common for work-specific templates to exist in the Template namespace, with the mainspace being a rather inappropriate place for it. I think we should put it back in the mainspace, and I think this should be uncontroversial, but since I don't work on EB1911 actively, I wanted to bring it up here first to make sure there weren't any technical considerations or objections before making the move? SnowyCinema (talk) 19:06, 11 April 2025 (UTC)
- I suppose you meant "we should put it back in the" templatespace, not mainspace? As the mainspace's where it is now. — Alien333 20:28, 11 April 2025 (UTC)
- Yes *** SnowyCinema (talk) 21:28, 11 April 2025 (UTC)
- I do not have an issue with its current placement but I also have no issue in moving it. Nineteen years is well beyond "edit warring": be bold and go for it. —Uzume (talk) 21:46, 11 April 2025 (UTC)
Magazines, Newspapers and other works with many volumes
I have recently made some Main space and mostly red link pages for magazines. I did this because I want to put links for first publications at those first publications. The New Review is an example of this.
Problems with this include that some of the magazines and certainly the newspapers have a lot of volumes.
Meanwhile, there are articles that are already living within these Volume/Number spaces. Like All-Story Weekly/Volume 98/Number 3/Fires Rekindled.
There is a solution for this in use at The New York Sun which also accepts redirects. The problem with this is that it divides by year and not by volume, and it requires that human editing not happen. I divided Radio News by both year and volume. By Year, it can be navigated through the sidebar and by volume because that is how it is. A few of the magazines are issued with its volumes also evenly divided by year, but not that many of them.
What I have done at The New Review is nice because if a source exists, it can be pointed at. What is going on at The New York Sun is nice because buried links are no longer buried.
I bring this up here and now because I am wanting to tear Famous Fantastic Mysteries out of its AuxTOC so I can paste a link or two; and maybe I could be doing something else that would be better to get that link there. Like, maybe a {{tl:Header volume}} that scoops up links like {{Header periodical}} but also accepts human edits (like above or below where it works) and is not dependent upon the year.
I also had an idea for a "future link" property at wikidata. Maybe the official name could be "Wikisource volume link". It is useless if the link will be "Book title/Chapter 3" but for "Amazing Stories/Volume 2/Number 5/War of the Worlds" or "Amazing Stories/Volume 2/Number 5/The War of the Worlds" or "Amazing Stories/Volume 02/Number 05/The War of the Worlds" as you can see, it would be very useful.--RaboKarbakian (talk) 10:11, 13 April 2025 (UTC)
- There seem to be a couple of issues raised here.
- Linking to other editions of serialised works within periodicals
- Following a discussion following your edit to The Strand Magazine, I understand you'd like it to become a norm to link to a generic edition of something like The Time Machine in any periodical which included The Time Machine in it, even if the work in the periodical is a different edition with, for example, different illustrations and different spelling. I don't in general agree with this. We do not currently *have* The New Review's version of The Time Machine, and this should be made clear if you do link to it from The New Review's contents page.
- As a different example, [issue 52 onwards] of The Strand Magazine serialises The Exploits of Brigadier Gerard. These, as far as I am concerned, should *definitely not* be just linked to the already existing work The Exploits of Brigadier Gerard. Instead, *once the Strand Magazine version of this work has been proofread and transcluded*, we can use a disambiguation page to direct readers to the different versions. The Strand Magazine/Volume 9/Issue 52 <--how to make a link to what should probably be a "Number"
- The New York Sun
- I'm not sure what you are trying to highlight by showing the page for The New York Sun. If you'd like an example of a newspaper with a lot more proofread articles then I'd be interested to know your thoughts on The New York Times.
- The New York Sun is an example here because all that needs to happen is that an article be written under the namespace and it shows up on that page, automatically, and it works by year. Personally, I dislike the limited functionality as it cannot be edited by hand. This (here at Scriptorium) was to be about functionality first. The New York Times has pretty good functionality; and maybe the magazines are going to be different because of that year as a divider issue.
- The 'look and feel' of periodicals
- I see from your recent contributions that you have been doing a lot of work recently altering the contents pages of moribund periodicals to match your preferred look and feel. More consistency can be a good idea, although I'm not sure I particularly like the style you are using which leads to pages that are not friendly to the end user. The point of the main space is to provide a view to *the reader* - someone who is looking to read material, and I don't see the point of filling up every contents page with a wall of red links or to content which is not on this site. Qq1122qq (talk) 16:05, 20 April 2025 (UTC)
- Hands down, in my opinion, The Strand Magazine looks great! It is beautiful. It is, in my experience, not functional. I want to note the Volume and Number that first publications appear. There are many who think that the first publication of these works was the first book and that is so often not the case in the years of works in which we work.
- Is there a discussion about "look and feel" of the Volumed works here that I missed that caused you to rework The Strand?
- I only added the navigation by year to the way almost all of the other magazines are here. I was just improving the functionality. They looked fine while being functional, to me; in my opinion. Functionality should be first.
- Navigation through the volumes by year is great! While The Strand Magazine has its volumes nicely tucked between year changes, many (many, many) magazines and journals don't. Putting the navigation into <h2> puts the year into the navigation that by default appears in the left "navigation" column. So, if I am looking for the volume that contains April 1904, I need only select 1904 from the "navigation column" and then find which volume contains the April number and I can also easily look at its djvu to see if it is actually in there (it could be an unverified paste error that me or someone else made). So, truly, the work does not have to be transcluded to be useful/functional here; especially if it has been uploaded already.
- Finding it online and pasting the link to the work? If the djvu is not present here, then verifications can be made online or acquired and uploaded and maybe just the one article the person is interested in proofread. It is how all of the other magazines here work and have been, at least in my experience, functional.
- Also, what the heck is the "feel" of it. Does the "feel" have anything to do with functionality? Do we worry of that here?--RaboKarbakian (talk) 16:53, 20 April 2025 (UTC)
- @RaboKarbakian
- There is definitely scope for improving the way that the index/contents pages of The Strand Magazine look. When I came back to work on it three years or so ago I tried to keep it looking like the work that had already been done (which at that stage was just a couple of issues of the first volume).
- I currently like the look of Weird Tales, which organises the material by year, and then shows all the issues within that. This would also make it easier to mix fully formatted tables of contents (which we have for the first 5 years), and links to the occasional articles which people have proofread from later issues. This seems to fit with your ideas, as well.
- One thing that indexing by year loses is the ability to download a whole issue of a periodical, but I don't know how many people use this facility (unlike PG, where downloading a whole 'issue' of something is the main way that people interact with their content).
- The way the Tables of Contents are formatted is how they actually do look in the printed Strand Magazines, but these were almost always removed when put into the bound volumes we use for proofreading, so there is a reason for the AuxTOC tables of contents to look the way they do. It does take some effort to make the TOCs look the way they do, though, including making sure that all of the authors and illustrators are correctly linked. One of the projects on my 'to do' list is to extend the current complete TOCs to Volume 20, which will cover the first 10 years of The Strand Magazine (and the late-Victorian, early Edwardian period which is what I'm most interested in anyway!).
- On other websites people have put quite a lot of effort into recording the TOCs of The Strand Magazine (and other periodicals). The main resource that I have used in the past is [The FictionMags Index], an incredible resource and one well worth a browse if you're interested in periodical literature. Qq1122qq (talk) 10:36, 23 April 2025 (UTC)
- I agree. It might be useful and even helpful if the main index for a multi-volume work like a periodical was organized by date but the articles should definitely be organized by volume and issue as they were labelled in the published issues. If the root main page index is date oriented it would probably be useful to also have a volume and issue overview somewhere as well. @Qq1122qq: I agree, Galactic Central is a really useful and amazing piece of work. I keep meaning to request a Wikidata property for their magazine identifiers (I do not think the FMI, GFI and other magazine genre index identifiers would be good though), e.g., Galactic Central magazine ID: ALLSTORYMAGAZINE1905, STRANDMAGAZINE1891, etc. —Uzume (talk) 19:55, 24 April 2025 (UTC)
- FYI I have now updated The Strand Magazine to the look I was talking about earlier. The main contents page is now year based, with complete contents to 1895, and all of the currently transcluded articles linked beyond that to the appropriate issue. This should provide a good framework to people to add any articles or stories that they transclude in the future. Qq1122qq (talk) 17:40, 26 April 2025 (UTC)
Duplicate ID's..
One of the major causes of Duplicate ID's seems to be a situation where Page:'s are transcluded, but on examining the Index: pages and the relevant lint error, the page numbering uses repeated "—" dashes or generic page-names such as "img" or "Plate" multiple times.
The use of "—" for blank pages isn't being challenged, as it was my understanding that entirely blank pages don't show page numbers on transclusion in any event.
Until another contributor pointed out some (re)introduced numbering errors, I had been attempting to update Index, in a good faith attempt to de-duplicate the names used. For images the approach had been to use "(<!--work numbering-->)" (typically a numeric or roman numeral sequence) or where there wasn't an internal numbering to use "(")" with the relevant facing page being used (often with reference to a list of illustrations provided by the work being transcribed).
For Front matter - The intended convention was to use roman numerals, based on any numbering in the work being transcribed (or, if there was a lack of relevant internal numbering, to treat the first Half-title as page "i"). (There are some works however where it would be possible to use conventional numbers, again based on the page numbering in the work concerned.)
I am however in the process of reverting many of my existing repairs that used this approach, as another contributor very tactfully pointed out that there hadn't been a full discussion about this, and hence the changes were too bold or novel to remain.
What do other contributors think? It would be a very good idea to de-duplicate Index pages as much as possible. ShakespeareFan00 (talk) 16:38, 13 April 2025 (UTC)
Comment We also have published works with duplicate page numbers, sometimes with two, three, or more sets of re-used page numbers. Some examples I can think of off the cuff are Index:Tragedies of Euripides (Way 1894) v1.djvu, which has four sets of roman numbered "front matter" because parts had been previously published, and Index:Shakespeare - First Folio Faithfully Reproduced, Methuen, 1910.djvu, where the entire numbering restarts for each major section of the volume. We also have multiple "Ad" / "Adv" pages. So what problem are we trying to solve, and why is it a problem? Is it simply that we have used "Img" for image pages, and now this is a problem for some reason? What is the issue, and why does it need solving? --EncycloPetey (talk) 16:48, 13 April 2025 (UTC)
- Are these repeated numberings likely to crop up in the same transcluded section? If not then the repeated numbering per section is not an issue, (and a red herring). ShakespeareFan00 (talk) 20:34, 13 April 2025 (UTC)
- The problem seems to be that MediaWiki doesn't like content having duplicated IDs within a single content page (possibly so that styles and classes can be unique). This means that when two HTML elements share the same ID, it is flagged as a Linter concern. This would be relatively easy to solve by de-duplicating the IDs in content. However, on English Wikisource there are additional complications in Mainspace, namely 1) that the page-numbering script, used when Page:s are transcluded, puts IDs on the floated-left page links (these are sometimes duplicated within page content), and 2) individual pages may contain duplicated IDs which aren't obvious until the transclusion stage.
- (Aside: There seems to be a Linter glitch, which means the Duplicate ID count and missing tag linter counts appear to share a counter, when they really shouldn't) 18:12, 13 April 2025 (UTC)
- Have we tried alerting the developers, to have them know that page IDs such as: -, —, _, Ad, Adv, and Img are expected to be duplicated and to not flag them? --EncycloPetey (talk) 18:30, 13 April 2025 (UTC)
- I'm not aware that anyone has. Feel free to raise a Phabricator ticket. (BTW I favour the (fp-xxx) approach for images, as it means Wikisource would gain the ability to link directly to images within a Mainspace page with very little effort :) )
- Anchors can do the same thing, and we're already using anchors. No need to invent an arcane symbolic system to label them. How often has linking to an image been an actual issue raised by anyone? --EncycloPetey (talk) 19:00, 13 April 2025 (UTC)
- I would agree in saying that the IDs that are repeated are not IDs people will be linking to (except maybe images). — Alien 3 3 3 20:21, 13 April 2025 (UTC)
- A lot of the false matches are due to the page numbering script (so I'm now not sure if filing a Phabricator ticket would do anything, other than asking for an option to turn off Duplicate ID detection for the output of that script specifically). So what do we do to remove the 'noise' to find genuine issues? ShakespeareFan00 (talk) 20:32, 13 April 2025 (UTC)
- Until someone with the skills and access knows that we're drowning in noise, the best we might do is try to craft some kind of local filter. But that is beyond my skill set. --EncycloPetey (talk) 20:36, 13 April 2025 (UTC)
- The approach I outlined above (which was objected to) does actually solve the problem. By making the pagelist entries as unique as possible on the Index page, no other changes would be needed: no new filters, changes to scripts, or tickets filed. It would however need other contributors to understand what the new approach was. (As you indicated, it's not as if the pagelist entries that would be updated actually have incoming links.) I am not entirely happy that "_" etc. has been used on 'non-blank' pages, but that's a different issue. ShakespeareFan00 (talk) 20:58, 13 April 2025 (UTC)
- The issue is not significant enough for me to accept a change to the long-established practice of labelling pages that are not in the numbered flow of pages with appropriate names, such as "img". For blank pages the label of a hyphen or dash of some kind is normative. Adding a prefix to a label just for the sake of making it unique is a form of user-annotation and misrepresents the work as published. Beeswaxcandle (talk) 08:07, 14 April 2025 (UTC)
- So you would rather have other contributors sift through obvious noise? Perhaps you have a way of modifying the relevant scripts or filters so they aren't seeing "noise" then? ShakespeareFan00 (talk) 08:18, 14 April 2025 (UTC)
- Index:The_Valley_of_Fear.pdf (and we may have some missing blank pages BTW!),
- Index:Stevenson New Arabian Nights (Scribner, 1895).djvu
- Index:Fivechildren.djvu
- Index:The Poison Belt - Conan Doyle, 1913.djvu
- Here it was possible to determine the "works" numbering, I am more than happy to undo the changes, but by identifying the numbering the output looks tidier. ShakespeareFan00 (talk) 10:15, 14 April 2025 (UTC)
- Wouldn't it be better to solve the problem at the ID generation stage, inside the ProofreadPage extension? As mentioned above, duplicate page numbers can be unavoidable in some cases, and it's better to fix the problem once and for all than keep making workarounds. (I guess that could use a Phabricator ticket.) Arcorann (talk) 03:01, 16 April 2025 (UTC)
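For anyone wanting to separate genuine duplicates from page-label noise locally, a minimal Python sketch of duplicate-ID detection follows. It only counts repeated `id` attributes in an HTML fragment. The sample markup is hypothetical and is not the actual output of the page numbering script or of ProofreadPage.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect every non-empty id attribute value seen in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def duplicate_ids(html):
    """Return a dict mapping each repeated id to the number of times it occurs."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(collector.ids)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical markup: two image pages both labelled "img" in the pagelist.
sample = (
    '<span id="img"></span><span id="12"></span>'
    '<span id="img"></span><span id="13"></span>'
)
print(duplicate_ids(sample))
```

Run against the rendered HTML of a transcluded page, a helper like this would list which labels repeat, which may help decide whether a given report is pagelist noise ("img", "Adv") or a genuine duplicate in page content.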
Font help
How do I replicate the font used for the top text on Page:Poems Plunkett.djvu/85? ToxicPea (talk) 21:38, 14 April 2025 (UTC)
- ToxicPea: You can use Template:Insular. TE(æ)A,ea. (talk) 23:05, 14 April 2025 (UTC)
Tech News: 2025-16
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Weekly highlight
- Later this week, the default thumbnail size will be increased from 220px to 250px. This changes how pages are shown in all wikis and has been requested by some communities for many years, but wasn't previously possible due to technical limitations. [5]
- File thumbnails are now stored in discrete sizes. If a page specifies a thumbnail size that's not among the standard sizes (20, 40, 60, 120, 250, 330, 500, 960), then MediaWiki will pick the closest larger thumbnail size but will tell the browser to downscale it to the requested size. In these cases, nothing will change visually but users might load slightly larger images. If it doesn't matter which thumbnail size is used in a page, please pick one of the standard sizes to avoid the extra in-browser down-scaling step. [6][7]
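The size selection described above can be sketched in Python as follows. The list of standard sizes is taken from the announcement. The function name is hypothetical, and the behaviour for requests above 960px is not stated in the announcement, so returning `None` there is an assumption.

```python
import bisect

# Standard thumbnail widths listed in the announcement, in ascending order.
STANDARD_SIZES = [20, 40, 60, 120, 250, 330, 500, 960]


def served_thumbnail_width(requested):
    """Return the smallest standard width that is >= the requested width.

    Returns None when the request exceeds the largest standard size;
    the announcement does not say what happens in that case.
    """
    index = bisect.bisect_left(STANDARD_SIZES, requested)
    if index == len(STANDARD_SIZES):
        return None
    return STANDARD_SIZES[index]


# A page asking for the old 220px default would be served the 250px file
# and the browser would downscale it.
print(served_thumbnail_width(220))
```

Choosing one of the standard widths directly avoids the downscaling step, since the requested and served widths are then the same.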
Updates for editors
- The Wikimedia Foundation are working on a system called Edge Uniques which will enable A/B testing, help protect against Distributed denial-of-service attacks (DDoS attacks), and make it easier to understand how many visitors the Wikimedia sites have. This is so that they can more efficiently build tools which help readers, and make it easier for readers to find what they are looking for.
- To improve security for users, a small percentage of logins will now require that the account owner input a one-time password emailed to their account. It is recommended that you check that the email address on your account is set correctly, and that it has been confirmed, and that you have an email set for this purpose. [8]
- "Are you interested in taking a short survey to improve tools used for reviewing or reverting edits on your Wiki?" This question will be asked at 7 wikis starting next week, on Recent Changes and Watchlist pages. The Moderator Tools team wants to know more about activities that involve looking at new edits made to your Wikimedia project, and determining whether they adhere to your project's policies.
- On April 15, the full Wikidata graph will no longer be supported on query.wikidata.org. After this date, scholarly articles will be available through query-scholarly.wikidata.org, while the rest of the data hosted on Wikidata will be available through the query.wikidata.org endpoint. This is part of the scheduled split of the Wikidata Graph, which was announced in September 2024. More information is available on Wikidata.
- The latest quarterly Wikimedia Apps Newsletter is now available. It covers updates, experiments, and improvements made to the Wikipedia mobile apps.
View all 30 community-submitted tasks that were resolved last week.
Updates for technical contributors
- The latest quarterly Technical Community Newsletter is now available. This edition includes: an invitation for tool maintainers to attend the Toolforge UI Community Feedback Session on April 15th; recent community metrics; and recent technical blog posts.
Detailed code updates later this week: MediaWiki
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 00:24, 15 April 2025 (UTC)
Template - {{Five hundred thousand strokes for freedom : a series of anti-slavery tracts, of which half a million are now first issued by the friends of the Negro TOC}}
I am not clear why this template exists. It seems to be just for the table of contents of that one book. Surely that can be included in the actual book? -- Beardo (talk) 23:38, 15 April 2025 (UTC)
Delete - I'm guessing that the editor intends to transclude the TOC on every subpage of the work, but that's not how we generally do things here. Since the editor is not an experienced enWS editor, perhaps it would be advisable to not only delete this template, but also help clean up the transcription project's other issues. —Beleg Âlt BT (talk) 16:49, 16 April 2025 (UTC)
- @OAnick pinging the original editor to participate in the discussion if they wish to do so —Beleg Âlt BT (talk) 16:49, 16 April 2025 (UTC)
Delete and subst:, etc. as needed. Looks beautiful, but no need to be a standalone piece of content. If necessary, just copy and paste it. —Justin (koavf)❤T☮C☺M☯ 16:48, 17 April 2025 (UTC)
Done: Replaced by Auxiliary ToC and deleted. However, the text also looks 1) poorly transcribed, resembling just raw OCR, and 2) abandoned, so if it is not improved in some reasonable time, it should be nominated at Proposed deletions too. --Jan Kameníček (talk) 18:46, 17 April 2025 (UTC)
- Another user had brought in a scan-backed copy and started, but they seem to have run out of steam. I will have a go at doing the preface, so that can be replaced with a scan-backed version, at least. -- Beardo (talk) 05:09, 18 April 2025 (UTC)
- I have moved it to a shorter page name. The punctuation didn't match the original anyway. -- Beardo (talk) 05:41, 18 April 2025 (UTC)
Work period start/end, floruit and living authors
If an author's date of death is unknown but the "floruit" property is filled in on Wikidata, the date appears in our header, and the author is not categorized as a "living author." However, if the Wikidata item includes more precise "work period (start)" and "work period (end)" values instead of "floruit," the dates do not appear in the header, and the author remains categorized as living. For example, Author:Mordach Mackenzie was active in the 18th century but is still listed as a living author. This issue can be resolved by adding the "floruit" property to Wikidata in addition to the "work period" values. However, that feels redundant and arguably undesirable, since the "work period" data is more precise than "floruit." Could something be done so that the "work period" data:
- Is displayed in the header, taking precedence over "floruit" if both are present, and
- Is used for categorization in the same way "floruit" currently is?
Jan Kameníček (talk) 15:58, 16 April 2025 (UTC)
Support, I've brought this up before and I hope it can be fixed this time —Beleg Âlt BT (talk) 16:41, 16 April 2025 (UTC)
- I am having a problem with "floruit" implying still living. Unknown is unknown; at least, I think this is true.--RaboKarbakian (talk) 19:17, 16 April 2025 (UTC)
- "Floruit" does not imply that. If a floruit date of 1860 is added (for example) then the author is not categorized as still living. The reason Mordach Mackenzie is auto-categorized as "still living" is because he does not have a floruit date assigned to him. --EncycloPetey (talk) 20:44, 16 April 2025 (UTC)
Support, At first I was thinking that the solution would involve adding start/end dates to "floruit" in Wikidata, but that's not where the problem is. As Jan suggests having the Author template pick up and use the work period labels would be advantageous. The one question I would have is how would it work if only "work period (start)" or "work period (end)" existed. I'd guess it could be shown as "1860–?" or "?–1924" ... —Tcr25 (talk) 21:06, 16 April 2025 (UTC)
So what would be the rule? work period (start) or (end) before 1900? — Alien 3 3 3 06:43, 17 April 2025 (UTC)
Ah, that's already defined. — Alien 3 3 3 07:12, 17 April 2025 (UTC)
- To all: I made this. Example output here. — Alien 3 3 3 07:20, 17 April 2025 (UTC)
- (There are complications I hadn't thought of, so in the end this isn't fully ready.) — Alien 3 3 3 07:36, 17 April 2025 (UTC)
- @Alien333: Thanks so much for taking care of this. Would it be possible to give the "work period" values priority over "floruit" if all are filled? Example: Author:William Duthie. --Jan Kameníček (talk) 08:33, 17 April 2025 (UTC)
- Ah, I did that, but then I had to move some code around and it doesn't anymore. gimme a sec. — Alien 3 3 3 08:35, 17 April 2025 (UTC)
- Do we want to always take priority, even if we have only half a work period? E.g. is (for example) fl. ?-1924 better than fl. 1910? — Alien 3 3 3 08:41, 17 April 2025 (UTC)
- I think so. --Jan Kameníček (talk) 08:46, 17 April 2025 (UTC)
Done. There we go. Including for when one of the two is missing. — Alien 3 3 3 08:34, 17 April 2025 (UTC)
- William Duthie does not seem affected, there is still (fl. 1860), instead of expected (fl. 1852–1870). --Jan Kameníček (talk) 08:46, 17 April 2025 (UTC)
- You misread the order of my comments: first I posted about being done, and after I saw your message about overriding. Still working on it. — Alien 3 3 3 08:53, 17 April 2025 (UTC)
- There you go. — Alien 3 3 3 09:03, 17 April 2025 (UTC)
- Great, thanks! I am sorry for the confusion. --Jan Kameníček (talk) 09:06, 17 April 2025 (UTC)
One more thought: Sometimes it may happen that only one of the birth/death dates is unknown + work period values are filled, which may lead to confusing outputs. I suggest the following:
data filled | current outcome | suggested outcome |
birth date=1430, death date=1480 | (1430–1480) | (1430–1480) |
birth and death unknown, work period (start)=1430, work period (end)=1480 | (fl. 1430–1480) | (fl. 1430–1480) |
birth unknown, work period (start)=1430, death=1480 | (fl. 1430 – 1480) | (fl. 1430–d. 1480) |
birth=1430, work period (end)=1480, death unknown | (1430 – fl. 1480) | (b. 1430–fl. 1480) |
It can be seen from the table that the current outcomes in the second and third row are confusing, which would be solved by the suggested change. The change suggested in the fourth row is not really necessary, it is just for the sake of consistency. --Jan Kameníček (talk) 11:25, 17 April 2025 (UTC)
- Well, long story, but the current outcome is actually, respectively: (1430–1480), (fl. 1430–1480), (–1480), and (1430–). The idea is to not mix fl dates and regular dates (we only use work period if we've got neither birth nor death). — Alien 3 3 3 11:29, 17 April 2025 (UTC)
- OK, I am not really against that, but the current results are those which I have written in the table, so the transclusion of the work period values should be disabled in such cases. --Jan Kameníček (talk) 11:34, 17 April 2025 (UTC)
- There's something you didn't specify in your data: in your tests, it's the "floruit" and not the "work period" that appeared. I just retried, and with b. 1000, fl. 1860, work period start 1852, work period end 1870, it's 1000-fl. 1860 that shows up, and not 1000-fl. 1870 (as your table implies). So it's floruit mixing with certain dates, not work period mixing with certain dates. It's another issue.
- That being said, adding b. and d. for birth and death when the other uses floruit shouldn't be too difficult, and I agree it's useful. — Alien 3 3 3 12:02, 17 April 2025 (UTC)
- Yes, you are right, only after you wrote it I realized that it is the floruit value and not work period values that interferes. --Jan Kameníček (talk) 12:17, 17 April 2025 (UTC)
Done (adding b. and d. to certain value when other is floruit.) — Alien 3 3 3 12:23, 17 April 2025 (UTC)
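For readers trying to follow the rules agreed in this thread, here is one possible reading of them as a Python sketch. It is not the code of the actual header module, which is not shown here. The function and parameter names are hypothetical, and the "?" placeholder for a missing half of the work period is an assumption based on the "?–1924" example discussed above.

```python
def format_dates(birth=None, death=None, floruit=None, wp_start=None, wp_end=None):
    """Build the parenthesised date string for an author header.

    Rules, as read from the discussion:
    - birth and death both known: plain range;
    - neither known: work period wins over floruit;
    - only one known: floruit fills the gap, with b./d. marking the known date;
    - only one known and no floruit: open-ended range.
    """
    if birth is not None and death is not None:
        return f"({birth}–{death})"

    if birth is None and death is None:
        if wp_start is not None or wp_end is not None:
            # Assumption: a missing half of the work period is shown as "?".
            start = wp_start if wp_start is not None else "?"
            end = wp_end if wp_end is not None else "?"
            return f"(fl. {start}–{end})"
        if floruit is not None:
            return f"(fl. {floruit})"
        return ""

    # Exactly one of birth/death is known from here on.
    if floruit is not None:
        if birth is not None:
            return f"(b. {birth}–fl. {floruit})"
        return f"(fl. {floruit}–d. {death})"
    if birth is not None:
        return f"({birth}–)"
    return f"(–{death})"


# The William Duthie case: work period takes priority over floruit.
print(format_dates(floruit=1860, wp_start=1852, wp_end=1870))
```

Note that work period values are deliberately ignored once a birth or death date is known, matching the comment that fl. dates and regular dates are not mixed except through floruit.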
Vote now on the revised UCoC Enforcement Guidelines and U4C Charter
The voting period for the revisions to the Universal Code of Conduct Enforcement Guidelines ("UCoC EG") and the UCoC's Coordinating Committee Charter is open now through the end of 1 May (UTC) (find in your time zone). Read the information on how to participate and read over the proposal before voting on the UCoC page on Meta-wiki.
The Universal Code of Conduct Coordinating Committee (U4C) is a global group dedicated to providing an equitable and consistent implementation of the UCoC. This annual review of the EG and Charter was planned and implemented by the U4C. Further information will be provided in the coming months about the review of the UCoC itself. For more information and the responsibilities of the U4C, you may review the U4C Charter.
Please share this message with members of your community so they can participate as well.
In cooperation with the U4C -- Keegan (WMF) (talk) 00:35, 17 April 2025 (UTC)
Embedded headline
See: The Des Moines Register/1910/Hahnen - Ransburg Wedding a Pretty Home Affair. Is there any way to mimic this imbedded headline? RAN (talk) 19:09, 18 April 2025 (UTC)
- I tried something by floating it left. What do you think of this? — Alien 3 3 3 19:33, 18 April 2025 (UTC)
- That does it, thanks. I would love if there was more of a gap between the headline and the text, but that will take some more experimentation. --RAN (talk) 21:24, 18 April 2025 (UTC)
- Simplified it a bit by using {{float left}} with |style=width: 0; to do the wrapping. -ei (talk) 22:10, 25 April 2025 (UTC)
This author page provides no license, and the only work listed is Secretary Rubio’s Meeting with Salvadoran President Nayib Bukele. That work appears to be about a meeting between Rubio and Bukele, but was the text actually written by them (like the work header claims)? Because it doesn't even contain any quotes of either, so I'm skeptical of this authorial attribution. Altogether, this is a confusing situation to me. In any case, especially given recent big-news political events surrounding Bukele, I imagine there's some kind of work somewhere by him that we could consider freely-licensed, but I'm just not sure what to do with the author page now. Any ideas? Pinging Jaredscribe who created the pages. SnowyCinema (talk) 22:26, 18 April 2025 (UTC)
- I don't see that Rubio or Bukele should be listed as authors of that item - they are subjects and I have moved it to "works about". I would think there must be some official works by him - though perhaps not in English. -- Beardo (talk) 02:23, 19 April 2025 (UTC)
- If there are no works by him, then he is not an Author and would not merit an Author page. --EncycloPetey (talk) 17:12, 20 April 2025 (UTC)
How do I proofread a png?
I uploaded File:Lewiston Evening Journal 7 Aug 1886.png; how do I go about transcribing it for Wikisource and bonus points if you help me out with the file creation/linking/etc - I don't mind typing it out by hand to transcribe it, but where to put it? Fundy Isles Historian - J (talk) 01:22, 20 April 2025 (UTC)
- There is no single method of storing and transcribing periodicals like newspapers here, but a common way to do it is something like Lewiston Evening Journal/1886/August/7/Twenty-two Deer Island Fishing Boats Seized. If the rest of the issue ends up being scanned and transcribed, it could be put into one long work at Lewiston Evening Journal/1886/August/7 or something, but if it's just this one story, that would be sufficient. —Justin (koavf)❤T☮C☺M☯ 01:29, 20 April 2025 (UTC)
- I hadn't even thought ahead to that point; I was thinking more along the lines of "Can I create an Index:XYZ.jpg or does it only work for pdf/djvu?" Fundy Isles Historian - J (talk) 01:33, 20 April 2025 (UTC)
- You can 1000% create Index:Example.svg, sure. I don't think the OCR is smart enough to read a PNG, but it could be! And the benefit of making an index for just a one-page PNG with minimal text is that at least there's a way to measure if it's validated or not. —Justin (koavf)❤T☮C☺M☯ 01:41, 20 April 2025 (UTC)
- (OCR does work for JPGs/PNGs.) — Alien 3 3 3 06:40, 20 April 2025 (UTC)
- The best OCR software is available to all users for free at https://ocr.wmcloud.org/ for JPGs, PNGs and GIFs. The OCR software used by Newspapers.com is terrible and makes many errors: it scrambles lines, omits periods, and can't handle columns of text. Also, to get the date to sort properly in the index, a news article title needs to be numerical with placeholder leading zeros: Lewiston Evening Journal/1886/August/7/Twenty-two Deer Island Fishing Boats Seized becomes Lewiston Evening Journal/1886/08/07/Twenty-two Deer Island Fishing Boats Seized. Some paper titles have just the year. See: The Bergen Record. Some have the year and month and day. See: The_New_York_Times/1934. This is generally determined by whoever started adding news articles first. Whether the newspaper name uses "The" is based on what Wikidata used: "The New York Times" versus "New York Times". --RAN (talk) 16:16, 3 May 2025 (UTC)
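The zero-padded title convention described above can be sketched in Python. The function name is hypothetical, and the sketch assumes titles of the form Paper/Year/Month/Day/Headline with an English month name or a month number. Anything that does not fit that shape is returned unchanged.

```python
# English month names mapped to their numbers (1-12).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}


def sortable_title(title):
    """Rewrite Paper/Year/Month/Day/Headline with zero-padded numeric month and day."""
    parts = title.split("/")
    if len(parts) < 5:
        return title  # year-only or year/month layouts are left alone
    paper, year, month, day = parts[:4]
    rest = parts[4:]  # the headline, which may itself contain "/"
    if month in MONTHS:
        month_number = MONTHS[month]
    elif month.isascii() and month.isdigit():
        month_number = int(month)
    else:
        return title
    if not (day.isascii() and day.isdigit()):
        return title
    return "/".join([paper, year, f"{month_number:02d}", f"{int(day):02d}", *rest])


print(sortable_title(
    "Lewiston Evening Journal/1886/August/7/Twenty-two Deer Island Fishing Boats Seized"
))
```

With padded numbers, plain string sorting puts 1886/08/07 before 1886/10/01, which month names and unpadded days would not.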
Anchors containing only numbers.
I recently created a tracking category, Category:Anchor_which_is_numeric, which is currently tracking about 1000 entries where an id= is purely digits.
This creates conflicts with the numbering used by the page numbering script, meaning that ideally such IDs should not be placed directly in Page: content, where they could create potential conflicts.
This is a low priority, but migration in specific Indexes (s1, p1, n1 IDs are approaches used elsewhere) would greatly assist and help improve things here, by slowly removing the source of conflicting IDs, which aids the adoption of the new parser.
(The other source of duplicate IDs is duplicated numbering in a pagelist, suggesting sections of a work might need to be in separate pages on transclusion, but that's a longer discussion.) ShakespeareFan00 (talk) 10:47, 20 April 2025 (UTC)
- Numeric anchors will only create a potential conflict if the numerical anchor is within the range of the number of pages. If a volume has 300 pages, but uses numerical anchors over 1000, then there is no conflict.
- But there is also the second half of the problem. If the anchors are changed, then any links to those anchors must also be changed, or we are simply breaking links. --EncycloPetey (talk) 17:10, 20 April 2025 (UTC)
- Wouldn't this be easier to solve by having the software that generates the page anchors produce something other than just pure numbers? Perhaps $p<pagenum> or something. —Uzume (talk) 20:10, 24 April 2025 (UTC)
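The range condition raised earlier in this thread (a numeric anchor only matters if it falls within the page numbers of the volume) can be sketched in Python as below. The function name is hypothetical, and the sketch assumes page anchors run from 1 to the page count. Works whose numbering restarts or uses roman numerals would need a different check.

```python
def conflicting_numeric_anchors(anchor_ids, page_count):
    """Return the anchors that are purely ASCII digits and fall within 1..page_count."""
    return [
        anchor
        for anchor in anchor_ids
        if anchor.isascii() and anchor.isdigit() and 1 <= int(anchor) <= page_count
    ]


# A 300-page volume: "12" and "300" could clash with page anchors,
# while "1001" and the prefixed "s1" could not.
print(conflicting_numeric_anchors(["12", "1001", "s1", "300"], 300))
```

A filter like this could narrow the tracking category to the entries where a rename, and the matching update of incoming links, is actually needed.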
A request for a link#identifier checker..
Currently it is possible to check for red-links (or use Special:WantedPages).
However, it is not possible to identify wikilinks of the form foo#bar where the page portion 'foo' is valid but the section identifier portion 'bar' is not.
It would be desirable to have a 'broken' links report, which generates a tabular report giving the page containing the link, the target page, and the identifier which has not been located in the target page.
Ideally this report should be focused on the Page and Main namespaces initially.
More complicated would be to determine, for a Main namespace page, which specific transcluded Page: was responsible for generating the link which could not be resolved. Desirable but not essential. ShakespeareFan00 (talk) 18:45, 20 April 2025 (UTC)
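The core of such a report can be sketched in Python, assuming the link list and the set of IDs per page have already been extracted by some other means. Both input structures and the function name are hypothetical. The sketch ignores MediaWiki's anchor normalisation (for example spaces versus underscores), which a real tool would have to handle.

```python
def broken_anchor_links(links, page_ids):
    """Report links whose target page exists but whose #identifier does not.

    links:    iterable of (source_page, target) pairs, target like "foo#bar"
    page_ids: dict mapping each existing page title to the set of IDs it contains
    Returns a list of (source_page, target_page, missing_identifier) rows.
    """
    report = []
    for source, target in links:
        page, separator, fragment = target.partition("#")
        if not separator or not fragment:
            continue  # plain page link, nothing to check
        ids = page_ids.get(page)
        if ids is None:
            continue  # red link: the page itself is missing, reported elsewhere
        if fragment not in ids:
            report.append((source, page, fragment))
    return report


# Hypothetical data: "foo" exists and contains only the ID "intro".
links = [("A", "foo#bar"), ("A", "foo#intro"), ("B", "missing#x"), ("B", "foo")]
page_ids = {"foo": {"intro"}}
print(broken_anchor_links(links, page_ids))
```

Each row of the result corresponds to one line of the requested table: the page containing the link, the target page, and the identifier that was not found.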
R. C. Bednar and Čapek
I have found some information that there is a translation of Karel Čapek's play The Robber (Loupežník), made by R. C. Bednar as their master's thesis in the State University of Iowa in 1931. A copy is allegedly available in the New York Public Library, so I am just asking in case somebody has easy access there. -- Jan Kameníček (talk) 21:34, 20 April 2025 (UTC)
Tech News: 2025-17
Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.
Updates for editors
- Wikifunctions is now integrated with Dagbani Wikipedia since April 15. It is the first project that will be able to call functions from Wikifunctions and integrate them in articles. A function is something that takes one or more inputs and transforms them into a desired output, such as adding up two numbers, converting miles into metres, calculating how much time has passed since an event, or declining a word into a case. Wikifunctions will allow users to do that through a simple call of a stable and global function, rather than via a local template. [9]
- A new type of lint error has been created: Empty headings (documentation). The Linter extension's purpose is to identify wikitext patterns that must or can be fixed in pages and provide some guidance about what the problems are with those patterns and how to fix them. [10]
View all 37 community-submitted tasks that were resolved last week.
Updates for technical contributors
- Following its publication on HuggingFace, the "Structured Contents" dataset, developed by Wikimedia Enterprise, is now also available on Kaggle. This Beta initiative is focused on making Wikimedia data more machine-readable for high-volume reusers. They are releasing this beta version in a location that open dataset communities already use, in order to seek feedback, to help improve the product for a future wider release. You can read more about the overall Structured Contents project, and about the first release that's freely usable.
- There is no new MediaWiki version this week.
Meetings and events
- The Editing and Machine Learning Teams invite interested volunteers to a video meeting to discuss Peacock check, which is the latest Edit check that will detect "peacock" or "overly-promotional" or "non-neutral" language whilst an editor is typing. Editors who work with newcomers, or help to fix this kind of writing, or are interested in how we use artificial intelligence in our projects are encouraged to attend. The meeting will be on April 28, 2025 at 18:00–19:00 UTC and hosted on Zoom.
Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.
MediaWiki message delivery 21:00, 21 April 2025 (UTC)
PDF/djvu text layer
- If a PDF on Commons contains a text layer, how can it be efficiently extracted and put into Wikisource (to save time on doing OCR)?
- How to check whether a Commons PDF contains a text layer?
- How to check several PDFs in a category to see which of them might have a text layer?
RoyZuo (talk) 17:44, 22 April 2025 (UTC)
- For (1), I'm not sure what you're asking. We want to have each page proofread from an Index. That way, each page can be verified side-by-side against the pages of the source scan. --EncycloPetey (talk) 17:46, 22 April 2025 (UTC)
- RoyZuo:
- An Index page starts the method by which we can view the text layer that is contained within the djvu or pdf. You can make this page by using [[Index:Name of file.ext]]. See File:Taming Liquid Hydrogen The Centaur Upper Stage Rocket.pdf which made Index:Taming Liquid Hydrogen The Centaur Upper Stage Rocket.pdf (something Petey is working on) by simply changing the "File:" to "Index:". You can click on any of the red page links there to see the text layer on the left and the page image on the right.
- Number 1 of my answer should show this.
- Check using the instructions in Number 1 and if you don't see the text layer or don't like it, put {{sdelete}} in the pages box of the form to have it deleted, although, it can be checked without saving. Simply click on any of the pages while "previewing" the Index and don't save the page you are peeking at either.
- Most of the files at the commons have the text layer; the OCR button is for when the occasional page of ocr is missing or if it is really bad ocr. djvu are preferred here so if there is a djvu in your options, please give this good consideration.--RaboKarbakian (talk) 19:42, 22 April 2025 (UTC)
- Thx a lot for the example! I didnt know if a text layer exists it's auto loaded. I've never worked with a file with text layer.😂 RoyZuo (talk) 20:48, 22 April 2025 (UTC)
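For the third question (checking several PDFs in a category), a rough Python sketch is below. It is only an illustration under stated assumptions: `category_files_url` and `has_text_layer` are hypothetical helper names, the 20-character threshold is arbitrary, and the per-page text is assumed to come from some PDF library (for example pypdf's `PdfReader(path).pages[i].extract_text()`), which is not shown here. The URL uses the standard Action API `list=categorymembers` query.

```python
from typing import Iterable
from urllib.parse import urlencode

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def category_files_url(category: str, limit: int = 50) -> str:
    """Build an Action API URL listing the files in a Commons category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmtype": "file",
        "cmlimit": limit,
        "format": "json",
    }
    return f"{COMMONS_API}?{urlencode(params)}"


def has_text_layer(page_texts: Iterable[str], min_chars: int = 20) -> bool:
    """Heuristic (an assumption): a file has a text layer if any page
    yields at least min_chars of non-blank extracted text."""
    return any(len(text.strip()) >= min_chars for text in page_texts)
```

Fetching the URL and downloading each file is left out. A file whose pages all come back empty would still need OCR on Wikisource.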
Database report request..
A list of links to IA scans which are linked to from Wikisource, but for which a scan has not yet been uploaded onto Wikimedia Commons (licensing permitting).
The goal is to have a list that can be batched into semi-automated IA-upload actions by a second tool.
Fae was doing this in 2020-21 to some extent, but the effort stalled when they left Commons. ShakespeareFan00 (talk) 10:52, 23 April 2025 (UTC)
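One possible shape for such a report, as a Python sketch only: the external links could be gathered from the Action API (`list=exturlusage` searching for `archive.org/details`), then reduced to IA identifiers and compared against what is already on Commons. The helper names `ia_identifier` and `scans_to_upload` are hypothetical, and how the set of already-uploaded identifiers is obtained is assumed, not shown. Licensing would still need a manual check.

```python
import re
from typing import Iterable, List, Optional, Set

# Matches https://archive.org/details/<identifier>[/...] (with or without www).
_IA_DETAILS = re.compile(
    r"^https?://(?:www\.)?archive\.org/details/([^/?#]+)", re.IGNORECASE
)


def ia_identifier(url: str) -> Optional[str]:
    """Return the IA item identifier from a details URL, or None."""
    match = _IA_DETAILS.match(url.strip())
    return match.group(1) if match else None


def scans_to_upload(linked_urls: Iterable[str],
                    already_on_commons: Set[str]) -> List[str]:
    """Identifiers linked from Wikisource that are not yet on Commons.

    already_on_commons is assumed to be supplied by the caller.
    """
    wanted = {ia_identifier(u) for u in linked_urls}
    wanted.discard(None)  # drop links that are not IA details pages
    return sorted(wanted - already_on_commons)
```

The sorted output is deduplicated, so it could be fed in batches to an upload tool.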
1935 US Newspaper?
Wanted to add (another) Lewiston Evening Journal article, but this one is 1935; anyone able to point me to the Commons license I'd use? Fundy Isles Historian - J (talk) 11:40, 24 April 2025 (UTC)
- It may not be free at all.
- It depends on renewals, but given that this newspaper ran until '89, and even then only merged with another newspaper, likelihood of renewal seems high to me. — Alien 3 3 3 13:01, 24 April 2025 (UTC)
- Looking at https://www.newspapers.com/image/828923887/ it seems that in 1935 the newspaper was being published without a copyright notice, so c:Template:PD-US-no notice should apply. --Jan Kameníček (talk) 14:35, 24 April 2025 (UTC)
- Thanks, got it up at Page:Lewiston Evening Journal 3 Jul 1935.png now; I appreciate just learning last week that we can Index: for jpg/PNG not just PDF - I assume that's the standard way to handle these things is with an Index? Fundy Isles Historian - J (talk) 14:59, 24 April 2025 (UTC)
- Yes. I have just standardized the way of its transclusion, see [11]. The previous way did not show the link to the page in the left margin. --Jan Kameníček (talk) 17:47, 24 April 2025 (UTC)
- Thanks, problem is I can never remember <page = ! setup=a page-list="11-3" from:26> or whatever, whereas it's easy to throw up ((, lol. Similarly, is there a reason when you populate with a header or subheader template it doesn't use subst to auto-fill name/author/etc?
- {{PD-US-not renewed}} would apply to pre-1964 articles. You can check the Wikidata entry wikidata:Q100306558 and if there is no "Online Books" entry for it, there were no renewals. We added "Online Books" for all newspapers and magazines that renewed. Some newspapers and magazines have "Online Books" entries that say "no renewal" so you have to click through to check. See for instance https://onlinebooks.library.upenn.edu/webbin/cinfo/time where you can see that Time magazine started renewing issues in 1934. --RAN (talk) 17:00, 3 May 2025 (UTC)
Help information on Index pages?
Hello all,
I am curious to hear what people think about putting key help information on index pages. Just four or five lines, mentioning in particular when to use nop, when to include/remove line breaks, how hyphenated words connect, and what to put in the header/footer, would be about all. I specifically feel this information is warranted on index pages (in addition to its current locations), thanks to the "new" experience had with correcting Index:The Best Continental Short Stories of 1923–1924.djvu. I am not sure what it is about Jan's MC nomination that led to so many new contributors helping with this index specifically (more to my mind than any index I can remember in the MC), but it makes me think that providing immediate key information to new users, without them having to navigate anywhere else, would be useful. Thoughts welcome.
Regards, TeysaKarlov (talk) 00:11, 26 April 2025 (UTC)
- Where would it be put? As in where on the screen? Wondering about the technical means.
- Perhaps it would be better to change MediaWiki:Newarticletext. This is what appears when someone edits in page namespace.
- I think a link to WS:SG is a good, succinct explanation of how stuff is generally done.
- Or perhaps Help:Beginner's guide to typography? As the other doesn't describe header/footers. — Alien 3 3 3 21:17, 26 April 2025 (UTC)
- What do you mean "putting key help information on index pages"? Do you mean making proofreading tips appear on Index pages, or creating some page about Index pages, or what? We already have a Help page about basic proofreading linked from the Main page section concerning the Monthly Challenge, and we include it in the {{welcome}}. --EncycloPetey (talk) 21:23, 26 April 2025 (UTC)
- @Alien333 For location on the screen, I was thinking either just underneath the "Transclusion Index not transcluded or unreviewed" text, or where a Table of Contents would appear (top right). While I agree that Wikisource:Style guide is a good explanation, it is still a great deal longer than what I had in mind. I was thinking just the bare minimum so that the pages would transclude correctly. Also, for those editors that don't want space on the index page taken up, then there could perhaps be a setting in preferences that disables the help text.
- @EncycloPetey "Do you mean making proofreading tips appear on Index pages" - yes, this is what I meant. My concern was that links to help pages aren't always "in your face" enough to ensure that they are seen.
- Regards, TeysaKarlov (talk) 22:13, 26 April 2025 (UTC)
- Adding basic instructions to every Index page adds visual noise for everyone who already knows the basics. And it does nothing to direct people to Index pages. We already have Help pages that we send to people, who do not read them, and the main Help page is already available on the left menu of every page on the site. I am opposed to adding general information to every Index page on the site. --EncycloPetey (talk) 22:52, 26 April 2025 (UTC)
- @EncycloPetey "Adding basic instructions to every Index page adds visual noise for everyone who already knows the basics." - As above, there could be a setting in preferences that disables the help text. Or, the help messages could be visible only to users with fewer than X edits. I feel like there would be reasonable solutions if this is your greatest concern. My main concern (at present) is whether you or other more experienced users see these new user edits as a problem (of sorts), and how best to help.
- And it does nothing to direct people to Index pages. - Again, my concern is what new users do once they start proofreading (from an Index), especially as they may have no idea that they are doing anything incorrectly, and so continue to proofread in such a manner (unless someone messages them on their talk page), or equally, if new users emulate other incorrect edits. And sure, I accept that if new users aren't all that concerned that they are doing things incorrectly, then all the help messages in the world won't make much difference.
- Regards, TeysaKarlov (talk) 00:30, 27 April 2025 (UTC)
- Let's be pragmatic here. Instead of adding the information to every Index (with or without an option to turn it off), create a template that can be dropped into the TOC field (or the beginning of the Pages field) and put it on works that are being done through the MC (and possibly PotM). As a part of indicating that the Index has been completed, the template is removed. RC patrolling should still be happening with the {{Welcome}} template put on every new (non-vandal) contributor's Talk: page, and gentle nudges on what we expect to reinforce the enWS ethos. Beeswaxcandle (talk) 04:42, 27 April 2025 (UTC)