MultiLingual Computing, Inc., Magazine
Featured Article
Thursday, September 2, 2010


Sharing Translation Database Information

Considerations for developing an ethical and viable exchange of data


SUZANNE TOPPING


The Translation Memory eXchange (TMX) format allows translation elements to flow easily in and out of various translation database tools. The goal of this standard is to simplify movement of data between computer-assisted translation packages.

Unfortunately, the format’s creators have not addressed the issue of how and when translation databases should be exchanged or whether the open exchange of these databases is ethical.

While the question of ownership has been discussed for some time, the issue recently became more concrete when a site for exchange was established. The TRADOS Exchange Server (not affiliated with TRADOS Corporation) was created in August 1999 and was publicized through an e-mail discussion group which focuses on Translator’s Workbench.

The issue has therefore moved from being an interesting theoretical discussion to a questionable practice: translators can share and are sharing translation databases.





To Trade or Not To Trade

Three groups are involved in the debate over translation database trading, and each has a unique perspective. The three groups are individual translators, localization agencies and localization customers.

The desire of many individual translators is to maximize productivity by expanding their collection of translation databases as quickly as possible. As a group, they are advocates of translation database exchange.

For example, one translator comments, “I’ve been using TRADOS for six months now and reckon that it will be another year before my translation databases are of a size that will make any noticeable contribution to my work rate. Agencies are in a position where they may well turn out 20 times the volume of a single freelance translator. It will take me 20 years to gain the depth of translation database that an agency can obtain in a year. Why shouldn’t freelancers band together to get some of these advantages?”

Agencies, on the other hand, tend to think that sharing translation databases would not be feasible, efficient or ethical.

David Pooley, development manager at SDL International, provides some insight into these issues. “I find it difficult to see how translation databases can be shared,” he says. “Each client has and will require in their translation a particular style when authoring documents. It is unlikely that there will be much leverage to be gained for clients from different vertical markets due to the disparate subject matter. Even within the same market this is unlikely to change. I don’t think that the client would be content to have their documentation and Help in a similar style and terminology to that of their competitors.”

Mark Steven from The Kudos Partnership Ltd. agrees with this outlook, stating, “The sharing of translation databases would lead to confusion and indecision. What one person uses, another may dislike. Quite apart from the fact that clients may not take too kindly to having exactly the same content in their manuals as their competition.”

The third group participating in the debate is composed of people who purchase localization services. Members of this group want to protect their intellectual property and their investments.

One localization customer from a large telecommunication company says, “Most clients are educated enough now that they want control over any translation database created for a translation/localization job. They would not want you to share it with someone else. Also, because we sometimes work on things that are not general knowledge yet, you run into nondisclosure problems.”

Regardless of which stance you take, the argument boils down to three fundamental issues: ethics, quality and viability.

Ethics

The laws which apply to translation database ownership and copyright are murky and vary from country to country. When it comes to ethics, however, two of the three groups think that exchanging databases is wrong. Translators disagree.

Some translators suggest that the lack of context for translation databases removes the issue of confidentiality: “It is not possible to export a coherent text from a translation database. Doesn’t this mean that a translation database is not much more than a terminology database, and that exchanging it with colleagues does not give someone else insight in the text as a whole?”

Many translators believe that since they paid for a translation database tool, they own everything they create from it. One translator states, “Translation databases only exist because I have gone to the expense of buying a computer program which creates translation databases and because I go to the trouble of using it.” Another translator agrees, commenting, “A translation database belongs to the person using it, be it the company, agency or the translator, since he/she/it has to take the final responsibility for the product delivered.”

The majority of people believe that if nondisclosure agreements (NDAs) or other confidentiality agreements are signed at any level between the customer and the individual translator, then translation databases created from those projects must not be shared without customer approval. This can be tricky depending on the work structure. Customers working with localization agencies may put an NDA in place with the agency itself, but may not think to ask about agreements with individual translators. And different agencies have varying policies regarding these agreements. Regardless of where in the hierarchy the agreement is created, it should be applied down through the structure so that confidentiality is respected at all levels, including translation databases. To best protect confidentiality, NDAs should be established for each relationship — between the customer and the agency and between the agency and all suppliers.

Some translators say that they will evaluate what content is confidential and then exchange only nonconfidential information. But how are they able to make this type of judgment? Each customer’s opinion about confidentiality differs, and it should be the customer’s right to make these determinations.

Since localization customers often develop new technology, they are concerned about protecting their intellectual property. This extends to terminology as well. It takes a significant investment to develop multilingual terminology for new technologies. Localization customers don’t want to give these terms away to their competitors.

Quality

The second major area of concern about translation database exchange relates to quality. Many translators use translation database tools to help ensure consistency. But what happens when translators apply translation databases created by people who translate with a different style?

Some translators respond that they would use the additional databases as reference materials, and that translation databases would not be applied without ensuring that quality was good. For example, a translator comments, “Any tool in the world can be misused, and a translation database is obviously a tool. The translator must be just wise enough to distinguish between a machine’s work and an intelligent being’s. If he/she can do that, he/she is qualified to use a translation database.”

Another translator stands out from the crowd of popular opinion by opposing the idea of sharing. Iwan Davies (Translutions), an independent translator, comments that “leveraging translation databases across clients, subject area or document type, is bound to lead to a degradation in quality as translators. And generally, quality is the only selling point we have.”

Several agency personnel agree with Davies. Addison Phillips, a globalization consultant with GlobalSight Corporation, states, “We’ve all seen the ‘train wreck’ that results from mixing terminology from separate product lines or terminology domains within a single customer, let alone in a global manner.”

A former employee of International Communications comments: “Global translation database sharing is an utopia. As with automatic translation, it doesn’t yet provide results which are up to our standard of quality.”

Viability

In addition to the problems of ethics and quality, many industry experts think that translation database exchange is simply not feasible. They don’t believe that the productivity gains, if any, will justify the amount of work required to clean up the databases so that they can be leveraged across clients.

As those who deal with translation databases know, a common or standard writing style for source materials is needed to get a decent level of matches.

David Eadington, Manager of Software Tools for Berlitz GlobalNET, states, “As a general rule with TRADOS, unless the tech writers on one project are using the same template for a subsequent/different project, there just won’t be much leveragability. We see this even on updates for the exact same product, if the client significantly rewrites the materials.”
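The effect Eadington describes can be illustrated with a rough sketch. It assumes that Python's standard-library difflib.SequenceMatcher is an acceptable stand-in for a fuzzy-match score; commercial tools such as TRADOS use their own matching algorithms, and the sample memory entry and the function names match_score and best_match are hypothetical.

```python
from difflib import SequenceMatcher


def match_score(new_segment: str, stored_segment: str) -> float:
    """Return a rough 0-100 similarity score between two source segments."""
    ratio = SequenceMatcher(None, new_segment.lower(), stored_segment.lower()).ratio()
    return round(ratio * 100, 1)


def best_match(new_segment: str, memory: dict) -> tuple:
    """Return the stored source segment closest to the new one, with its score."""
    scored = [(source, match_score(new_segment, source)) for source in memory]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical one-entry memory: source segment -> stored translation.
memory = {
    "Click the Save button to store your changes.":
        "Klicken Sie auf Speichern, um Ihre Änderungen zu sichern.",
}

# Same template, one word changed: scores close to the stored segment.
same_style = "Click the Print button to store your changes."
# Same meaning, rewritten in a different style: scores far lower.
rewritten = "Your changes are stored when you choose Save."

print(best_match(same_style, memory))
print(best_match(rewritten, memory))
```

Under these assumptions the rewritten sentence scores well below the one that follows the original template, even though both say much the same thing, which is why a shared writing style matters more than shared subject matter.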

George Spafford from Lionbridge Technologies has more to add: “Exchanging databases sounds good because reuse will go up; however, the total cost of ownership per word may increase due to an upwards shift in subsequent editing steps required to make the matched segment useable in the new context.”

Another problematic issue is how to actually manage the databases once they have been collected by a translator or vendor. Would they be compiled into a single huge, slow, unwieldy database, or would translators run projects through individual databases iteratively?

The conclusion of most agencies seems to be that translation database exchange sounds good in theory, but simply is not viable in practice. As one agency representative summed it up, “Translation databases are not new; we have been using them for quite some time and we know far too well their limits. If it is not cost/time/quality effective to share translation databases within the same company between projects, how can we expect global translation database exchange to work?”

Before Each Project Begins

Regardless of whether translation database exchange is ethical or viable, it is already happening.

If you are concerned about protecting your databases, the best thing to do is include translation database handling in contracts. At each project’s inception, written agreements should state who owns the databases, who pays for their creation and what the reuse policy will be. Terminology should also be addressed by stating who pays for and owns the glossaries, along with the reuse policy for them.

You should also make sure that confidentiality and copyright agreements explicitly cover translated text units, such as those stored in translation databases.

Managing the Exchange

Given that at least some translators will be sharing databases, a framework should be developed and followed to keep the exchange ethical and the process flowing as smoothly as possible.

One suggestion raised in discussions of this type of framework is that the translation exchange should be available only to agencies or individuals who contribute some minimum number of translations in a given period of time. One must give in order to receive.

Working from the theory that productivity improvements will be achieved, customers of participating agencies should get a discounted rate for leveraging from the collected database. In return, these customers would also allow some of their translations to be entered into the pool.

Some consideration should be given to the way the exchange site is structured. It might make sense to create categories by subject matter (for example, telecommunications, legal, finance or software), by technology type and so on. This would help translators focus on the translation databases that are most useful to them.

A Proposed Workflow

For those people planning to participate, what is the most effective way of getting usable translation databases into the exchange? What can be done to help increase the feasibility level?

The steps below outline a starting point for a standardized preparation procedure.

1. Customers indicate whether they will allow translation databases from their projects to be exchanged.
2. Customers identify protected phrases, terms or other items which must not be shared.
3. Translators/agencies remove protected items from the translation databases.
4. Translators/agencies convert brand names to <brand name>.
5. Optimally, translation databases would be converted to TMX format.
6. Translators/agencies submit prepared translation databases to the exchange site.

Before using downloaded memories, translators or agencies could prepare materials to be translated by replacing brand names with <brand name>. This would help ensure a higher match rate. A clean-up pass would be required at the end of the job to replace all <brand name> entries with the real names from the source materials.
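As a minimal sketch of the preparation steps above (removing protected items, converting brand names and exporting to TMX), the Python below shows one way they might be automated. Everything in it is an assumption made for illustration: the function names, the sample segments, the brand and protected-term lists, the tool name in the header and the choice of TMX version 1.4. It also assumes that brand names appear in the same order in the source and the translation, which a reviewer would need to check.

```python
import re
import xml.etree.ElementTree as ET

PLACEHOLDER = "<brand name>"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"  # serialized as xml:lang


def mask_brands(text, brands):
    """Return (masked_text, found); found lists brand names in order of appearance."""
    if not brands:
        return text, []
    # Longest names first so "Acme Router" is matched before "Acme".
    ordered = sorted(brands, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in ordered))
    return pattern.sub(PLACEHOLDER, text), pattern.findall(text)


def restore_brands(text, found):
    """Clean-up pass: put real names back, one placeholder at a time, in order."""
    for name in found:
        text = text.replace(PLACEHOLDER, name, 1)
    return text


def prepare_units(units, protected, brands):
    """Drop units containing protected terms, then mask brand names in the rest."""
    prepared = []
    for source, target in units:
        combined = f"{source}\n{target}".lower()
        if any(term.lower() in combined for term in protected):
            continue  # protected items are left out of the shared database
        prepared.append((mask_brands(source, brands)[0], mask_brands(target, brands)[0]))
    return prepared


def to_tmx(units, src_lang, tgt_lang):
    """Serialize (source, target) pairs as a small TMX document."""
    tmx = ET.Element("tmx", version="1.4")  # version chosen for illustration
    ET.SubElement(tmx, "header", {
        "creationtool": "exchange-prep-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "unknown",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in units:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# Hypothetical sample data.
units = [
    ("Start Acme Router.", "Starten Sie Acme Router."),
    ("Project Falcon ships in May.", "Projekt Falcon erscheint im Mai."),
]
shared = prepare_units(units, protected=["Project Falcon"], brands=["Acme Router"])
print(to_tmx(shared, "en", "de"))
```

The same mask_brands function could be run on new source material before applying a downloaded memory, with restore_brands serving as the clean-up pass at the end of the job. A simple substring check is used for protected terms here; a real procedure would probably need the customer to review what is removed.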

Lessons to Be Learned

The exchange of translation database information is a reality. As of early March 2000, 140 files had been downloaded from the TRADOS Exchange Server. Other similar sites probably exist or will soon be created.

What are the lessons to be learned?

If you are a localization customer, ask your vendors about their sharing policies and confirm your understanding in writing. If you don’t want your translation databases shared, find out how agencies ensure that their translation providers comply.

If you are a localization agency, create a sharing policy for translators who want to be providers for your company.

If you are a translator, be honorable about which translation databases you share, and careful about how you apply databases that you have not created.

The workflow suggested in this article is merely a starting point. If you have ideas for creating a standardized process and framework, please forward them to me.




Suzanne Topping is owner of Localization Unlimited, a consulting company. She can be reached at stopping@rochester.rr.com


This article reprinted from #33 Volume 11 Issue 5 of
MultiLingual Computing & Technology published by MultiLingual Computing, Inc., 319 North First Ave., Sandpoint, Idaho, USA, 208-263-8178, Fax: 208-263-6310.

July/August, 2000


 
     

 


webmaster@multilingual.com ©1998-2010, Copyright MultiLingual Computing, Inc. No duplication or reproduction without expressed written permission.