Avoiding duplicate translation units with weird symbols
Thread poster: IanW (X)
IanW (X)
German to English
Nov 20, 2005

Dear all,

I have Trados 7.0.0.615, running on Windows XP.

I work together with three other translators for one particular client, all contributing to a common memory. I look after this memory, importing my colleagues’ translations and sending them updates of the memory containing the “final versions”. Recently, the memory doubled in size over the course of a couple of weeks – from 100,000 TUs to 200,000 TUs.

I had a look at the memory and saw that there were lots of duplicate entries where virtually everything was the same – the English translation, created by, changed by etc. – except that the German umlauts, “ß” and inverted commas had all been replaced by weird symbols. For example, there are two versions of the following German source text:
(1) Vorsicht, nichts für Anfänger
(2) Vorsicht, nichts für Anfänger

By searching and replacing an export TXT file and re-importing it into a fresh new memory, I was able to solve the problem and reduce the memory to half the size. However, I’d now like to ensure that this doesn’t happen again. I presume that it has something to do with one of my colleagues’ import or export settings, but can’t put my finger on it. Can anyone point me in the right direction?

Many thanks


Ian


 
Antonín Otáhal
Member (2005)
English to Czech
Charset? Nov 20, 2005

It may be the charset of the text export. For example, you can download jEdit (for free) from http://www.jedit.org/; it will let you see which charset the TXT file uses and/or change this setting.

For example, my Czech/English TM text exports are OK if they are in the Windows-1250 codepage; for German/English it will be something different.

Two remarks:

(1) I am not sure it is the real reason, but it looks like it

(2) I do not know if and how you can change such settings directly in Trados. (Would be glad to learn if anybody knows)
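
If you would rather check the charset with a script than with an editor, something like the following Python sketch could work. It is only a rough guess based on the byte-order mark and a trial decode; the function name `guess_charset` and the fallback codepage are my own assumptions, not anything Trados provides.

```python
import codecs

# Byte-order marks that identify common Unicode encodings.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def guess_charset(data: bytes, legacy: str = "cp1252") -> str:
    """Return a plausible encoding name for the raw bytes of an export file.

    `legacy` is the codepage assumed when the bytes are not valid Unicode
    (e.g. cp1252 for German, cp1250 for Czech). It is an assumption that
    the caller has to supply; it cannot be detected reliably.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return legacy
    return "utf-8"
```

To use it, read the export in binary mode, e.g. `guess_charset(open("export.txt", "rb").read())`, where `export.txt` is a placeholder file name.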

HTH
Antonin


 
Fernando Toledo
Spain
German to Spanish
Hi Ian Nov 20, 2005

I am experiencing the same situation now. Surely this can be solved with the proper import settings, but to avoid complications I now always export with filter settings.
As I explained in another forum:
_______________________
Created on from (>)
Created on till

OR

Changed on from (>)
Changed on till

OR

Last Used from(>)
Last Used till
________________________

Do not forget the boolean operation!

You can save these settings as a .wcs file, so next time you only have to modify the criteria.

Best regards

Fernando

[Edited at 2005-11-20 14:58]


 
Hynek Palatin
Czech Republic
English to Czech
Unicode Nov 20, 2005

It might be caused by the new TM format, which uses Unicode. A TM can be exported into TXT in two formats - 7.x or 2.x-6.x. Importing a TXT file in the Trados 7 format (containing Unicode) into Trados 6.5 (or older) will create garbage. The garbage can then be spread further by the exported TM. If any of you use Trados 6.5 or older, you should exchange the TM only in the old format.
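
To illustrate the mechanism, here is a small Python sketch of what happens when Unicode text is read with a legacy codepage. It assumes the export is written as UTF-8 and read as Windows-1252; the thread does not say which Unicode encoding Trados 7 actually writes, so treat both charset names as assumptions.

```python
def misread(text: str, written_as: str = "utf-8", read_as: str = "cp1252") -> str:
    """Simulate a tool that decodes an export with the wrong charset."""
    return text.encode(written_as).decode(read_as, errors="replace")

source = "Vorsicht, nichts für Anfänger"
garbled = misread(source)

print(garbled)            # Vorsicht, nichts fÃ¼r AnfÃ¤nger
print(garbled == source)  # False: a TM would treat this as a different source segment
```

Because the garbled string is no longer identical to the original, an import cannot match it against the existing unit, which would explain a second copy being stored.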

 
IanW (X)
German to English
TOPIC STARTER
Thanks Nov 20, 2005

Hynek Palatin wrote:

It might be caused by the new TM format, which uses Unicode. A TM can be exported into TXT in two formats - 7.x or 2.x-6.x. Importing a TXT file in the Trados 7 format (containing Unicode) into Trados 6.5 (or older) will create garbage. The garbage can then be spread further by the exported TM. If any of you use Trados 6.5 or older, you should exchange the TM only in the old format.


Thanks all for your help - I think Hynek has found the problem: I am the only one of the four of us using Trados 7; the others all have 6.5. Hynek, can you explain to me what I should do differently when exporting?

Many thanks to you all


Ian


 
Fernando Toledo
Spain
German to Spanish
Simply Nov 20, 2005

Ian Winick wrote:

Hynek Palatin wrote:

It might be caused by the new TM format, which uses Unicode. A TM can be exported into TXT in two formats - 7.x or 2.x-6.x. Importing a TXT file in the Trados 7 format (containing Unicode) into Trados 6.5 (or older) will create garbage. The garbage can then be spread further by the exported TM. If any of you use Trados 6.5 or older, you should exchange the TM only in the old format.


Thanks all for your help - I think Hynek has found the problem: I am the only one of the four of us using Trados 7; the others all have 6.5. Hynek, can you explain to me what I should do differently when exporting?

Many thanks to you all


Ian


In Export, select the Translator Workbench 2.x-6.x (*.txt) format instead of 7.x (the default) as the write format.

This may be the reason for the character problems, but I am not sure it has anything to do with the duplicates...


 
Hynek Palatin
Czech Republic
English to Czech
Export Nov 20, 2005

See above.

 
IanW (X)
German to English
TOPIC STARTER
Thanks Nov 21, 2005

Thanks guys - the duplicate problem is probably due to the fact that my memory exports were not overwriting the existing segments because they contained garbage. (The symbols, I mean - not the translations ...)
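
For anyone who ends up with the same mess, the cleanup could be scripted roughly as in the Python sketch below. It is not a Trados feature: it assumes the garbage came from UTF-8 text read as Windows-1252, and that the units have already been pulled out of the export as (source, target) pairs. The names `repair` and `dedupe` and the target text are placeholders.

```python
def repair(text: str, read_as: str = "cp1252", written_as: str = "utf-8") -> str:
    """Undo a wrong-charset read; return the text unchanged if that does not apply."""
    try:
        return text.encode(read_as).decode(written_as)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Clean text (e.g. a real "ü") does not survive the round trip, so keep it as is.
        return text

def dedupe(units):
    """Keep the first unit per repaired source text; units are (source, target) pairs."""
    seen = {}
    for source, target in units:
        seen.setdefault(repair(source), target)
    return list(seen.items())

units = [
    ("Vorsicht, nichts für Anfänger", "(English translation)"),
    ("Vorsicht, nichts fÃ¼r AnfÃ¤nger", "(English translation)"),
]
cleaned = dedupe(units)
print(len(cleaned))  # 1
```

A clean segment that happens to contain a sequence like "Ã¼" on purpose would be altered by `repair`, so it is worth checking the result before re-importing.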

Thanks all round


Ian


 

