Progress in Harry Potter

The biggie here is that the Norwegian version is on its way from Norway. A clerk there and I struggled with how to pay for it. No PayPal there and their IBAN made no sense to me and my credit union wanted $60 to wire money for a 30 dollar purchase, so she just gave up and mailed me the book anyway, trusting me to get the money to her. It went off today.
That will complete my set of the first Harry Potter minus Kweyol, which it hasn’t been translated into.
I finished reading the essays in my Urdu textbook and so will shift to other reading. So yesterday I went back to reading H.P. in all the other languages. I feel good about that. The only one that gives me real trouble is Greek. I’m just plowing ahead with it to see how long before I can get 80% of a page. That’s still 20 out of 100 words I won’t know but it’s better than nothing, considering I started with very little knowledge of Greek.
The Dutch is a bit rough but it’s so close to English, I can handle it. The Norwegian will be about the same.
So, onward and upward.

Dec. 26 It arrived! And I’ve started reading. I’ll catch up tonight to the others, then tomorrow start a new page in H.P. and read that page in all the versions, INCLUDING NORWEGIAN.

Feb. 13, 2018 update:

After reading Eric Hermann’s Acquisition Classroom lesson on vocabulary, I decided not to let this Harry Potter thing go to waste. So as I started Chapter 3, I counted the number of words in the selection (down to Dudley telling Harry the first day of school they put your head in the toilet). Then I counted the number of words I didn’t know. Therein lies a tale.

The first issue as explained in Eric’s memo is to define a word. If we look at the number of “words” for the same text, we have two issues to discuss: are there deletions or interpolations made by the translator and does the structure of the language generate a need for more or fewer words? For example, Russian, Latin and Urdu do not use articles, so right away you reduce drastically the number of words in those languages vs Greek, Dutch, Spanish, French and Italian. Norwegian does have articles but definite articles are stuck on the end of the word and thus not counted as separate words. The distortion built in from the beginning just based on the article question is obvious.
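To make the article problem concrete, here is a minimal sketch that counts "words" the simplest way, as whitespace-separated tokens. The helper name `count_words` and the three sample phrases (all meaning "the book") are my own illustrations, not taken from Eric's memo or from the Harry Potter text.

```python
def count_words(text):
    """Count whitespace-separated tokens, the simplest definition of a 'word'."""
    return len(text.split())

# The same noun phrase, "the book", in three languages (illustrative examples only).
samples = {
    "English": "the book",   # separate definite article
    "Norwegian": "boken",    # definite article suffixed to the noun
    "Latin": "liber",        # no article at all
}

for language, phrase in samples.items():
    print(language, count_words(phrase))
```

Under this definition English spends two words where Norwegian and Latin each spend one, so the totals diverge before any vocabulary question arises.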

Then we have other grammatical issues like the way highly inflected languages like Russian and Latin carry within their endings information conveyed by other languages via prepositions. Urdu is at the other extreme, with the postposition ka popping up almost every other word. Each ka gets counted as a separate word. That makes the following list understandable:

Language     Word count   Unknown words   % of total words
English      220          –
Spanish      233          1               0.4
Russian      224          1               0.4
French       241          3               1.2
Latin        176          1               0.5
Italian      229          7               3.0
Urdu         312          7               2.2
Dutch        194          10              5.2
Norwegian    194          10              5.2
Greek        242          25              10.0

This was from p. 31 in the English version, the original, and 220 was the word count for the English.
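The percentage column is simply unknown words divided by total words. A minimal sketch, using the counts from the table above; the function name `percent_unknown` is my own, and it rounds to one decimal, so a few results land a hair away from the posted figures (Greek comes out at 10.3 rather than 10.0, Latin at 0.6 rather than .5).

```python
# Counts copied from the Chapter 3 table: language -> (word count, unknown words).
CHAPTER_3 = {
    "Spanish": (233, 1),
    "Russian": (224, 1),
    "French": (241, 3),
    "Latin": (176, 1),
    "Italian": (229, 7),
    "Urdu": (312, 7),
    "Dutch": (194, 10),
    "Norwegian": (194, 10),
    "Greek": (242, 25),
}

def percent_unknown(total_words, unknown_words):
    """Return unknown words as a percentage of total words, rounded to one decimal."""
    return round(100 * unknown_words / total_words, 1)

for language, (total, unknown) in CHAPTER_3.items():
    print(f"{language:<10} {total:>4} {unknown:>3} {percent_unknown(total, unknown):>5.1f}")
```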

The two outliers are Latin and Urdu, although Dutch and Norwegian have to be explained. For Norwegian I think the bound definite articles go a long way in accounting for the lower number and for Dutch I think it is the compounding of words. Latin uses a lot of participles which reduce the need for separate words. Russian, while just as dense morphologically as Latin and using participles generously, nevertheless uses a fair number of “little words” that Latin does not, particles that lend emphasis or direction in the discourse. And as I’ve said, Urdu uses a great many particles, a huge number really, as well as grammatical devices like ka.

One note on Urdu: in even the best printing of Urdu text, the words tend to run together, not entirely but there is little separation, so a word count has to be done very slowly so as to note separate words. In my case, since my scanning of Urdu is not as facile as it is for Russian or Spanish, say, I have to count the words in each line and then add them up. It’s just cumbersome.

One other factor that may vary with the translator: Urdu and Russian translators throw in a lot of interpolations, as if they are explaining to the reader what is going on or even “improving” on Rowling’s work. The latter is prevalent in the Russian to the point of real irritation. That translation has been panned so I look forward to the next volumes whose translators have been favorably compared to the first one.

For comparison we can look at the text immediately following the one I selected and the count of unknown words is higher. The reason is clear: the text describes Dudley’s school uniform which required words like “coat tails”, “frock”, “boater”, “maroon”, “knickers”, etc. Some languages just use the English but others do have their own words for those items. Just those add another 5 words to the unknown count.

The whole purpose of this is to see if the unknown count for the bottom five languages (Italian, Urdu, Dutch, Norwegian and Greek) reduces significantly. That requires an exposition on how I determine an unknown word. Recall that I know what is supposed to be said on the page because I’ve already read the text in English, Spanish, Russian, and French (the French unknown count of 3 reflects the vagaries of texts; as I continue this, about once a week, that should vary for even the top languages. Sometimes I even see an English word I don’t know).

So when I see an unknown word, I may know what it means but if I had seen the word outside on its own, would I have known it? OTOH, just because I would not have doesn’t mean I count it as an unknown word if I am able to see the meaning of it based on its internal cognate status, e.g. Urdu wmmid is hope, and wmmidvar is a person expecting, so I may see wmmidvar and be able to determine its exact meaning from context and based on its cognate status with wmmid. Context alone may be enough to count a word as known even if it would be unknown outside of a context because we often grasp the meaning of a word from its context. Disentangling that from my prior knowledge of the text might seem tricky but I feel confident I am being honest in that. The figures themselves bear this out as we go from one unknown word for Spanish to 10 for Dutch and 25 for Greek; those figures reflect my greater knowledge of the one and lesser knowledge of the other.

By the end of the first Harry Potter book, H.P. and the Philosopher’s/Sorcerer’s Stone, I would hope the Greek is substantially reduced and the top five generally have seen a reduction.

A few comments and examples: in Dutch ‘strohoed’ clearly means ‘straw hat’, but knowing that the text had the reference to straw hats in it certainly helped. Same language: chocoladecake for English 2 words. Also: citroenwaterijsje for lemon water icey, made up of 4 elements: citrus, water, ice, and a diminutive suffix -je. And then there’s mensenvermorzelende – man-crushing. 7 syllables to English’s 3. But surely there is no need to count that as more than one word, nor to mark it as unknown AS ONE WORD despite not knowing either vermor or zelen.

May 18, 2018 Here are further chapters:

Chapter 4    Word Count   Unknown Words   % of Total Words
English      133          –
Spanish      116          2               1.7
Russian      117          3               2.6
French       156          5               3.2
Latin        98           3               3.0
Italian      123          16              13.0
Urdu         165          17              10.0
Dutch        125          12              9.6
Norwegian    117          20              17.0
Greek        168          29              17.0

Chapter 5    Word Count   Unknown Words   % of Total Words
English      153          –
Spanish      130          1               0.7
Russian      123          1               0.8
French       147          1               0.6
Latin        101          1               0.9
Italian      139          3               2.2
Urdu         216          6               2.7
Dutch        135          7               5.2
Norwegian    142          8               5.6
Greek        155          12              7.7

Chapter 6    Word Count   Unknown Words   % of Total Words
English      194          –
Spanish      202          –
Russian      203          –
French       197          1               0.5
Latin        152          –
Italian      205          4               2.0
Urdu         275          1               0.3
Dutch        197          7               3.5
Norwegian    208          17              8.2
Greek        224          18              8.0

As I have stated, there is a lot of repetition of words like Urdu’s ‘ka’ that get into the word count but lower the number of possible unknown words, so the low percentage for Urdu does not reflect reality, although this last chapter, 6, revealed only one word on the Urdu page I did not know. To balance that, other than outright English words used, there are few external cognates, i.e. words whose meanings are clear by comparing them to English cognates. I admit, I sometimes pat myself on the back for recognizing Dutch bezem as being cognate with English besom (look it up. Hint: what do witches ride?).

What I am hoping is that over the seventeen chapters, an overall pattern will emerge of fewer unknown words as I reach the end of the book. I can’t imagine ever getting to zero routinely even in Spanish and Russian where my vocabulary is fairly extensive; I still find words unfamiliar to me, but then I do also in English. What if Rowling had used besom instead of broom, would that be an unknown word for you?
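One way to watch for that pattern is to line up the percentages chapter by chapter. The sketch below does this for the three weakest languages, using the counts from the Chapter 3 to 6 tables; the names `percent_series` and `pooled_percent` are hypothetical helpers of my own, and four one-page samples are far too few to call anything a trend.

```python
# Counts copied from the tables for Chapters 3-6: (word count, unknown words).
SAMPLES = {
    "Dutch":     [(194, 10), (125, 12), (135, 7), (197, 7)],
    "Norwegian": [(194, 10), (117, 20), (142, 8), (208, 17)],
    "Greek":     [(242, 25), (168, 29), (155, 12), (224, 18)],
}

def percent_series(samples):
    """Return the unknown-word percentage for each chapter sample, in order."""
    return [round(100 * unknown / total, 1) for total, unknown in samples]

def pooled_percent(samples):
    """Return the unknown-word percentage over all samples combined."""
    total_words = sum(total for total, _ in samples)
    total_unknown = sum(unknown for _, unknown in samples)
    return round(100 * total_unknown / total_words, 1)

for language, samples in SAMPLES.items():
    print(language, percent_series(samples), pooled_percent(samples))
```

The pooled figure weights each page by its length, which smooths out a single page loaded with unusual vocabulary.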

Earlier I compared a later passage in Chapter 3 where Dudley’s school clothes were being described and about five unusual words were used. It is possible that one chapter’s first page will be loaded with such words and the next will use ordinary words, e.g. quill pen vs pen. I imagine that over 17 chapters only one or two will begin with a lot of unusual words. Seeing how rapidly I am progressing through this first book, I look forward to doing this again with the second H.P. book, H.P. and The Secret Room.

May 30, 2018 Here are a few lines from an interview with a translator:

Translation is a form of creative writing, it’s just creative writing within very strict parameters. Robert Frost once said that “writing free verse is like playing tennis with the net down.” Non-translation writing for me is like tennis with the net down as well. You can do anything, but do you want to? (Daisy Rockwell [granddaughter of Norman] in Scroll.In)

Here’s something from a draft I wrote some time ago, 4/5/18. I’ll put it in here:

I’ve read H.P. twice and am in Chapter 5 now reading it again. As I do so, several features of the narrative strike me. Literary critics break down novels and so on into parts that supposedly represent features of the world. This does not always work as when I attended a small college group talking with Aldous Huxley and some bright young man asked him what the Indian boy in Brave New World represented (maybe he asked that b/c we were in AZ). Huxley chuckled and said he just thought it would be fun.
However, to wax serious for this blog entry, I would point out how Moby Dick can be seen as the whale as penis and the ship as vagina (someone must have perceived that at some time), and other possible – what? analogies? The masts and sails of the ship, the seamen as semen, and so forth.
So I will launch my own interpretation of H.P. and it will rise or fall based entirely on my own estimation of it.
Harry is clearly any ten-year-old boy. His aunt and uncle are clearly the bug-a-boo bougie.

Aug. 24, 2018 I have not yet written up the figures for the next chapter, 7, but I will. But first I wanted to note something strange. To remind you, I’m reading the first H.P. in Russian, French, Latin, Italian, Urdu, Dutch, Norwegian and Greek. In the original, the sorting process gets down to 3 youths after Harry; one was Ron but the first was a Black boy, then a girl, Turpin, then Ron, then a girl, Zabini. That was 4 more, not three as Rowling has it. Even stranger, not a single one of the translations mentions the Black boy. He’s out. The Greek cut the process down to one sentence other than Ron’s selection, mentioning only the two girls. Just odd. At first I noticed it in the Urdu, which is fifth on my list (I’m not reading the Spanish), so I looked for it in the next 3 and then went back to the first 4 when I didn’t find him there – he’s nowhere except in the English version. I can see that the translators would be confused by the number 3 when 4 students are named. If anyone reads H.P. in any language and sees I am confused, let me know. My own hypersensitivity (see umpteen other blog entries) makes me wonder if the Europeans and one Asian translator were shy about mentioning a Black person. I recall the Black boy with the tarantula on the train and I’ll check that tomorrow. For today I double checked: there is no circumventing the oddity of Rowling writing 3 and then citing 4; I looked for something like “other than Ron”, but nothing. I’m sure the boy with the tarantula was part of the translations but I’ll check tomorrow.

Dec. 23, 2018 My Harry Potter project has fallen by the wayside due to the need to prepare lessons in French for my granddaughter (see category TPRS). A major change is occurring today, so I hope to pick it up again in January.




  1. 伟思礼 says:

    Do you ever open the books in two languages and read “in parallel”?

    1. Pat Barrett says:

      I read them one right after the other, about a page at a time, so it is similar to reading in tandem. What I’ve done just this morning is to copy out selected passages in several languages with room for interlinear notes. Following your suggestion, next time I will copy out the same passage from each book/language and make a comparison. But you get at a point which has fascinated me as I read these, i.e. the parallels among the languages. Admittedly, they are all Indo-European, but clearly Urdu and Russian and, to a lesser extent, Latin deviate from Western European patterns. I don’t have H.P. in Kweyol, my only non-I-E language.
      Today I had breakfast with two native Spanish speakers and I asked them about the construction in Sp: Tengo algo para comer, which parallels Compro los comestibles para sobrevivir. If you say that in French, it has to be j’ai quelque chose à manger, meaning something ‘to be eaten’, i.e. a passive sense. They both agreed that tener algo a comer does not work and tengo algo que comer means something a bit different.
      While most observers concentrate on matters like gender differences and different uses and frequency of tenses, I like these covert differences that even go by unnoticed b/c it’s all part of the accent. In English our variants run to I have something for to eat (poetic, archaic) and I have something which to eat (archaic). Also, a bit of a distortion, I have ought to eat and, if I am not mistaken, I have what to eat (archaic). A good, entertaining source for 18th century archaisms is Patrick O’Brian’s Jack Aubrey sea stories; lots of “We shall be lost without your sending aid soon.”
      So I’ve ordered an extensive French grammar. My most complete one, other than Le Bon Usage, gives “carriage” instead of “car” for voiture… (1901).
      The author is Granville. Any reviews? I rejected Modern French Grammar due to reviews on Amazon which judged it set at too low a level for those who already speak the language.
