From ram@srv.net Sat Jan 5 20:16:03 2002 From: ram@srv.net (Rick Morneau) Date: Sat, 5 Jan 2002 13:16:03 -0700 Subject: [MT-List] Generation of English genitives Message-ID: <200201052016.g05KG3A17707@localhost.localdomain> Does anyone have any pointers to algorithms that can be used to determine whether to generate an apostrophe-s genitive or a genitive using "of"? I'm currently using an ad hoc approach that doesn't always work very well, but I'd really like to use something more principled. Any help would be greatly appreciated. Regards, Rick Morneau ram@srv.net ram@axxess.net From steven.krauwer@let.uu.nl Tue Jan 15 15:22:23 2002 From: steven.krauwer@let.uu.nl (Steven Krauwer) Date: Tue, 15 Jan 2002 16:22:23 +0100 (MET) Subject: [MT-List] CfP: MT Roadmap Workshop at TMI2002, Keihanna, Japan Message-ID: <200201151522.g0FFMNZ03799@sfinx.let.uu.nl> _________________________________________________________________ Call for Papers MT Roadmap Workshop (March 16) at TMI2002 (March 13-17) Keihanna (near Kyoto), Japan Organized by ELSNET Background: Since 2000, ELSNET (The European Network of Excellence in Human Language Technologies) has organised a series of workshops aimed at the creation of a broadly supported technological roadmap for various subfields of language and speech technology. A technology roadmap comprises an analysis of the present situation, a vision of where we want to be in e.g. ten years from now, and a number of intermediate milestones that would help in setting intermediate goals and in measuring our progress towards our goals. The function of the road map is not to impose anything on anyone, but rather to provide a broadly supported definition of a context in which to position the MT community's efforts, which would allow us to identify common priorities for joint activities in e.g. research, resources and training. 
ELSNET aims at (co-)organizing roadmap workshops at all major events in order to encourage continuous reflection on where we stand, where we want to go, and, most importantly, how we can get there.

Format:

It will be a one-day workshop consisting of three components:
* The first component will aim at giving a critical analysis of the present state of Machine Translation in the broadest sense.
* The second component will be dedicated to visions of the future.
* The third component will aim at identifying major research challenges and establishing intermediate milestones on our way towards our goals.

Each of the components could contain
* an invited speaker or panel,
* submitted papers (reviewed),
* and ample space for discussion.

The results of the discussions will be published in a report on ELSNET's website, which will serve as a starting point for a broad consultation of the MT community on the future directions of Machine Translation in the broadest sense.

Papers:

We invite papers that
* give critical analyses of the present state of the art in machine translation of written and spoken language,
* present visions of the future of machine translation, from a theoretical, a technological, or an application point of view, or
* identify major milestones and challenges (theoretical, practical and organisational) on our way towards the future.

Submission:

Abstracts must be submitted in English, and should be no more than 4 pages long and in single column format. Submissions should be sent electronically in plain text, MS Word or PDF to steven.krauwer@elsnet.org.
Important dates:
* Submission deadline: 13 February
* Notification: 22 February
* Final papers due: 5 March
* Workshop: 16 March

Proceedings:

Participants will receive proceedings, containing
* a summary of the results of the previous MT Roadmap workshop (at the MT Summit)
* summaries of the invited talks
* full versions of the submitted papers

Audience:

The primary audience consists of people with an analytical or future-oriented, programmatic interest, both from research and from industry.

Registration:

The workshop registration fee is 5000 yen (approx. 38 USD or 43 Euro). Registration details can be found on the TMI 2002 website (see below).

URLs:
Main conference: http://www.kecl.ntt.co.jp/events/tmi/
This workshop: http://www.elsnet.org/roadmap-tmi2002.html

Contact point:
Steven Krauwer (steven.krauwer@elsnet.org)
ELSNET / Utrecht University
Trans 10, 3512 JK Utrecht, NL
phone +31 30 253 6050
fax +31 30 253 6000

Core Programme Committee:
* Steven Krauwer
* Laurie Gerber

From steveri@microsoft.com Mon Jan 21 20:24:05 2002
From: steveri@microsoft.com (Steve Richardson)
Date: Mon, 21 Jan 2002 12:24:05 -0800
Subject: [MT-List] AMTA-2002 Call for Participation
Message-ID: <0FDD2891FCDF6E42891AEDE2198E5F7C0425D66F@red-msg-04.redmond.corp.microsoft.com>

This is a multi-part message in MIME format.

--------------InterScan_NT_MIME_Boundary
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C1A2B9.9294E1F8"

------_=_NextPart_001_01C1A2B9.9294E1F8
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

--- CALL FOR PARTICIPATION ---
The Association for Machine Translation in the Americas

AMTA-2002 Conference
Location: Tiburon, California
Dates: October 8-12, 2002

The Association for Machine Translation in the Americas (AMTA) is pleased to announce its fifth biennial conference, planned for October 8-12, 2002, in Tiburon (near San Francisco), California.

Conference theme: From Research to Real Users

Ever since the showdown between Empiricists and Rationalists a decade ago at TMI-92, MT researchers have hotly pursued promising paradigms for MT, including data-driven approaches (e.g., statistical, example-based) and hybrids that integrate these with more traditional rule-based components.

During the same period, commercial MT systems with standard transfer architectures have evolved along a parallel and almost unrelated track, increasing their coverage (primarily through manual update of their lexicons, we assume) and achieving much broader acceptance and usage, principally through the medium of the Internet. Web page translators have become commonplace; a number of online translation services have appeared, including in their offerings both raw and post-edited MT; and large corporations have been turning increasingly to MT to address the exigencies of global communication. Still, the output of the transfer-based systems employed in this expansion represents but a small drop in the ever-growing translation marketplace bucket.

Now, 10 years later, we wonder if this mounting variety of MT users is any better off, and if the promise of the research technologies is being realized to any measurable degree. In this regard, we pose the following questions:

Why aren't any current commercially available MT systems primarily data-driven?

Do any commercially available systems integrate (or plan to integrate) data-driven components?

Do data-driven systems have significant performance or quality issues?

Can such systems really provide better quality to users, or is their main advantage one of fast, facilitated customization?

If any new MT technology could provide such benefits (somewhat higher quality, or facilitated customization), would that be the key to more widespread use of MT, or are there yet other more relevant unresolved issues, such as system integration?

If better quality, customization, or system integration aren't the answer, then what is it that users really need from MT in order for it to be more useful to them?

We solicit participation on these and other topics related to the research, development, and use of MT in the form of original papers, demonstrations, workshops, tutorials, and panels. We invite all who are interested in MT to participate, including developers, researchers, end users, professional translators, managers, and marketing experts. We especially invite users to share their experiences, developers to describe their novel systems, managers and marketers to talk about what is happening in the marketplace, researchers to detail new capabilities or methods, and visionaries to describe the future as they see it. We also welcome and encourage participation by members of AMTA's sister organizations, AAMT in Asia and EAMT in Europe.

Details regarding the conference may be found on the AMTA Web site:
http://www.amtaweb.org/AMTA2002/

CONFERENCE ORGANIZERS
Elliott Macklovitch, General Chair
Stephen D. Richardson, Program Chair
Violetta Cavalli-Sforza, Local Arrangements Chair
Bob Frederking, Workshops and Tutorials
Laurie Gerber, Exhibits Coordinator

AMTA-2002: PAPER AND SYSTEM DESCRIPTION/DEMONSTRATION SUBMISSIONS.

Authors/system developers are invited to submit presentations in English in any of the following three categories:

1. Theoretical papers: Unpublished papers describing original work on all aspects of Machine Translation.
Preference will be given to papers that include concrete results and that address the theme of moving MT research technology (including, but not limited to, data-driven systems or components) into real use. Papers should not be longer than 10 pages, with a minimum font size of 11 pt.

2. User studies: Studies of users' experiences with implementing MT or testing its applicability to some task. Of particular interest are experiences deploying new or advanced MT technology in a production context. Users, managers, and sales/marketing professionals are especially welcome to submit. Studies should not be longer than 8 pages, with a minimum font size of 11 pt.

3. System descriptions with optional system demonstrations: Approx. 25 minutes will be allocated per system description/demo. Submissions should not be longer than 4 pages. The goal of system descriptions is to educate participants about the features and functionality of current and emerging MT systems. Sales presentations are not appropriate.
The following information should be provided in each system description:
- name and contact information of system builder
- system category (research, pre-market prototype, or commercially available)
- system characteristics (e.g., languages, domains, integration/networking features)

If a system demonstration is included, please provide the following information:
- hardware platform and operating system
- name and contact information of system operations specialist

First page: All submissions should include a separate title page with the following information:
- paper title,
- author(s)' name(s), address(es), telephone and fax numbers, email address(es),
- one-paragraph abstract,
- for theoretical papers: subject area keyword(s)
- for user studies: the words "User study"
- for system descriptions/demos: the words "System description/demo"

DEADLINES and SCHEDULE:
Submissions due at address below: April 15, 2002 (Monday)
Notification of acceptance: May 31, 2002 (Friday)
Final versions of papers due: July 15, 2002 (Monday)

Electronic submissions are strongly preferred. They should be sent to:
email address: steveri@microsoft.com
subject line: AMTA-2002 submission

in one of the following formats:
Microsoft Word (RTF format)
PostScript
ASCII plain text

Hardcopy submissions (please send four (4) copies):
AMTA-2002: Stephen D. Richardson
Microsoft Research
One Microsoft Way
Redmond, WA 98052
USA

------_=_NextPart_001_01C1A2B9.9294E1F8
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01C1A2B9.9294E1F8-- --------------InterScan_NT_MIME_Boundary-- From Kerstin.Sinautzki@bmw.de Thu Jan 24 13:33:12 2002 From: Kerstin.Sinautzki@bmw.de (Kerstin Sinautzki) Date: Thu, 24 Jan 2002 14:33:12 +0100 Subject: [MT-List] Cyrillic - Latin Message-ID: <3C500D18.8A3B8D34@bmw.de> Dies ist eine mehrteilige Nachricht im MIME-Format. --------------E4E242FC0B2E588574760DFF Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Dear all: Does anybody know of a tool that allows the 'translation' of Cyrillic characters to Latin characters via mouse click? I truly appreciate your input. Best regards Kerstin Sinautzki -- BMW Group - Service Translation (VS-41) 80788 Muenchen - Germany phone.: +49 (0)89 382 - 25831 fax: +49 (0)89 382 - 41675 email: Kerstin.Sinautzki@bmw.de --------------E4E242FC0B2E588574760DFF Content-Type: text/x-vcard; charset=us-ascii; name="kerstin.sinautzki.vcf" Content-Transfer-Encoding: Quoted-Printable Content-Disposition: attachment; filename="kerstin.sinautzki.vcf" Content-Description: Visitenkarte f?r Kerstin Sinautzki begin:vcard=20 n:Sinautzki;Kerstin tel;fax:+49 (0)89 382 41 675 tel;work:+49 (0)89 382 25 831 x-mozilla-html:FALSE org:BMW AG;VS-41 =DCbersetzung version:2.1 email;internet:Kerstin.Sinautzki@bmw.de adr;quoted-printable:;;VS-41=3D0D=3D0A;M=FCnchen;Bayern;80788;Germany end:vcard --------------E4E242FC0B2E588574760DFF-- From jack@kanji.org Fri Jan 25 04:43:53 2002 From: jack@kanji.org (Jack Halpern) Date: Thu, 24 Jan 2002 23:43:53 -0500 Subject: [MT-List] Cyrillic - Latin In-Reply-To: <3C500D18.8A3B8D34@bmw.de> References: <3C500D18.8A3B8D34@bmw.de> Message-ID: <200201250443.AA01594@mail.kanji.org> Greetings In message "[MT-List] Cyrillic - Latin", Kerstin Sinautzki wrote... >Dear all: > >Does anybody know of a tool that allows the 'translation' of Cyrillic >characters to Latin characters via mouse click? > >I truly appreciate your input. 
We have developed a transliteration/transcription tool in Perl that can in principle convert between any two scripts, even such difficult scripts as Arabic and Chinese. I recently added a Russian<>Latin module. But it does not work at a click -- one needs to configure batch files and do it on a file basis. We have the technology and know-how to transliterate or transcribe (far more difficult) between any two languages (see http://www.cjk.org for our activities).

>Best regards
>
>Kerstin Sinautzki
>
>--
>BMW Group - Service
>Translation (VS-41)
>80788 Muenchen - Germany
>
>phone.: +49 (0)89 382 - 25831
>fax: +49 (0)89 382 - 41675
>
>email: Kerstin.Sinautzki@bmw.de
>

Regards, Jack Halpern
President, The CJK Dictionary Institute, Inc.
http://www.cjk.org Phone: +81-48-473-3508

From olga.beregovaya@autodesk.com Thu Jan 24 19:28:23 2002
From: olga.beregovaya@autodesk.com (olga.beregovaya@autodesk.com)
Date: Thu, 24 Jan 2002 11:28:23 -0800
Subject: [MT-List] Cyrillic - Latin
Message-ID: <258E47267E4BD3118DCA00805FA72E080AE026EC@hqmsgsrf03.autodesk.com>

www.design.ru/free/decoder -- they should have a translit <-> cyrillic converter

cheers, Olga

-----Original Message-----
From: Kerstin Sinautzki [mailto:Kerstin.Sinautzki@bmw.de]
Sent: Thursday, January 24, 2002 5:33 AM
To: mt-list@eamt.org
Subject: [MT-List] Cyrillic - Latin

Dear all:

Does anybody know of a tool that allows the 'translation' of Cyrillic characters to Latin characters via mouse click?

I truly appreciate your input.
Best regards

Kerstin Sinautzki

--
BMW Group - Service
Translation (VS-41)
80788 Muenchen - Germany

phone.: +49 (0)89 382 - 25831
fax: +49 (0)89 382 - 41675

email: Kerstin.Sinautzki@bmw.de

From Vladimir Rykov Fri Jan 25 07:39:03 2002
From: Vladimir Rykov (Vladimir Rykov)
Date: Fri, 25 Jan 2002 10:39:03 +0300
Subject: [MT-List] Cyrillic - Latin
In-Reply-To: <3C500D18.8A3B8D34@bmw.de>
References: <3C500D18.8A3B8D34@bmw.de>
Message-ID: <71150332026.20020125103903@mail.ru>

Hello Kerstin,

Thursday, January 24, 2002, 4:33:12 PM, you wrote:

KS> Dear all:
KS> Does anybody know of a tool that allows the 'translation' of Cyrillic
KS> characters to Latin characters via mouse click?
KS> I truly appreciate your input.
KS> Best regards
KS> Kerstin Sinautzki
KS> --
KS> BMW Group - Service
KS> Translation (VS-41)
KS> 80788 Muenchen - Germany
KS> phone.: +49 (0)89 382 - 25831
KS> fax: +49 (0)89 382 - 41675
KS> email: Kerstin.Sinautzki@bmw.de

There is a free and unbelievable program, Punto Switcher, that does many seemingly impossible things automatically or with one click:
http://punto.ru/switcher/

--
Best regards,
Vladimir mailto:rykov2000@mail.ru

From Christian.Boitet@imag.fr Sun Jan 27 09:21:58 2002
From: Christian.Boitet@imag.fr (Christian Boitet)
Date: Sun, 27 Jan 2002 10:21:58 +0100
Subject: [MT-List] Cyrillic - Latin
In-Reply-To: <3C500D18.8A3B8D34@bmw.de>
References: <3C500D18.8A3B8D34@bmw.de>
Message-ID:

--============_-1199998246==_ma============
Content-Type: text/plain; charset="iso-8859-1" ; format="flowed"
Content-Transfer-Encoding: quoted-printable

Dear Mr Sinautzki, 17/1/02

At 14:33 +0100 24/01/02, Kerstin Sinautzki wrote:
>Dear all:
>
>Does anybody know of a tool that allows the 'translation' of Cyrillic
>characters to Latin characters via mouse click?

No. And you should specify your working environment because such a facility may exist or be generated quickly under some specific one (such as MASS by ISS-CRDL in Singapore under Unix-Linux).

Assuming one wants to be able to invert the transformation, one cannot simply "translate" one character by one character as there are more cyrillic characters than roman ones (btw, one shouldn't mix a character set such as roman or cyrillic with the subsets used for certain languages such as russian, bulgarian, etc., or latin which had truly no k, w, y, z and one character for u-v).

Here is the part of our transcription concerning Russian only. It is easy to write conversion programs for it. It uses only the "PL/I character set" (no low case, no accents, etc.) and is linguistically motivated as far as possible (e.g., H is used to encode palatalization, Y the jodized vowel with the opposite for E oborotnoe because E is so frequent, and : for the E under accent -- not the old yatq).

Another transcription using high and low case letters is immediately derived from this one.

Uppercase
*A *B *V *G *D *E *E: *ZH *Z *I *J *K *L *M *N *O *P *R *S *T *U *F *X *C *KH *SH *TH *W *YI *Q *YE *YU *YA

Low case
A B V G D E E: ZH Z I J K L M N O P R S T U F X C KH SH TH W YI Q YE YU YA

We have developed a specialized and simple language, LT, to write transcriptors, and written many transcriptors with it, including cyrillic-roman and back. We never developed a "click transcription" facility but might do it if people are interested.

This language was first implemented in Prolog-I (unavailable), then in another Prolog by Y.Lepage, and then (around 1991-93) in MCL (Macintosh Common Lisp) by M.Lafourcade. I think it still exists on some disk here but nobody uses it at this time. We or the author might "revive" it. The first and only available publication about it is by Y.Lepage in COLING-86, although M.Lafourcade's thesis contains a chapter on his (multidialect) implementation.

The TextEdit tool running under Mac OS X can open and store files with many menu-specified encodings.
But it cannot save, for instance, a document in Occidental Mac OS (Roman) containing a "=82" into Occidental Windows (Latin) because the last one does not contain "=82".

There is a Vietnamese PhD student here starting a thesis on related problems (how to multilingualize software at the most basic level, how to use multilingual formats such as the UNL format -- not UNL graphs -- to prepare multilingual message files, and how to incorporate the notions related to writing systems and not only character sets into NLP-related and then general software). If you are interested, some cooperation might be arranged.

>I truly appreciate your input.
>
>Best regards
>
>Kerstin Sinautzki
>
>--
>BMW Group - Service
>Translation (VS-41)
>80788 Muenchen - Germany
>
>phone.: +49 (0)89 382 - 25831
>fax: +49 (0)89 382 - 41675
>
>email: Kerstin.Sinautzki@bmw.de

Best regards,

CB
--
-------------------------------------------------------------------------
Christian Boitet
(Pr. Universite' Joseph Fourier) Tel: +33.4-7651-4355/4817
GETA, CLIPS, IMAG-campus, BP53 Fax: +33.4-7651-4405
385, rue de la Bibliothe`que Mel: Christian.Boitet@imag.fr
38041 Grenoble Cedex 9, France Mobile: +33-(0)6-6005-1969
http://www-clips.imag.fr/geta/christian.boitet
-------------------------------------------------------------------------
Serveurs de dictionnaires: projet SILFIDE (http://silfide.imag.fr) et plus particulièrement français-malais (http://www-clips.imag.fr/geta/services/fem/)
Projet C-STAR (http://www.c-star.org/) et projet europe'en Nespole (http://nespole.itc.it) de traduction de parole
Projet UNL de communication et recherche d'information multilingue sur le re'seau http://www.unl.ias.unu.edu ou http://www.unl.org,
Projet PAPILLON de construction coopérative d'une base lexicale multilingue et de construction de dictionnaires http://vulab.ias.unu.edu/papillon/

--============_-1199998246==_ma============
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding:
quoted-printable Re: [MT-List] Cyrillic - Latin
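
The Russian transcription table in the message above can be turned into a small conversion program, as the message suggests. The following is only a rough Python sketch, not the LT implementation mentioned there. It assumes that the 33 codes are listed in Russian alphabetical order (the Cyrillic column did not survive in this archive, so the pairing is inferred), and that "*" marks an uppercase letter. The names `encode` and `decode` are made up for this illustration.

```python
# Codes in the order given in the message.
# ASSUMPTION: they pair with the Russian alphabet in alphabetical order;
# the Cyrillic side of the table is not present in the archived text.
CYRILLIC = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
CODES = ("A B V G D E E: ZH Z I J K L M N O P R S T U F X C "
         "KH SH TH W YI Q YE YU YA").split()

TO_LATIN = dict(zip(CYRILLIC, CODES))
TO_CYRILLIC = {code: letter for letter, code in TO_LATIN.items()}


def encode(text):
    """Transcribe Russian text; '*' is prefixed to each uppercase letter."""
    out = []
    for ch in text:
        low = ch.lower()
        if low in TO_LATIN:
            out.append(("*" if ch != low else "") + TO_LATIN[low])
        else:
            out.append(ch)  # spaces, digits, punctuation pass through
    return "".join(out)


def decode(text):
    """Invert encode() by trying two-character codes before one-character ones."""
    out = []
    i = 0
    while i < len(text):
        upper = text[i] == "*"
        start = i + 1 if upper else i
        for width in (2, 1):
            letter = TO_CYRILLIC.get(text[start:start + width])
            if letter is not None:
                out.append(letter.upper() if upper else letter)
                i = start + width
                break
        else:
            out.append(text[i])  # not a code: copy through unchanged
            i += 1
    return "".join(out)


print(encode("Москва"))   # *MOSKVA
print(decode("*MOSKVA"))  # Москва
```

The round trip is only reliable for text whose non-Cyrillic part cannot be mistaken for a code: Latin capitals, a literal "*", or a colon directly after the letter coded E in the source would be misread by `decode`. A real tool would need an escape convention for those cases, which the message does not describe.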
--============_-1199998246==_ma============-- From Vladimir Rykov Mon Jan 28 07:09:04 2002 From: Vladimir Rykov (Vladimir Rykov) Date: Mon, 28 Jan 2002 10:09:04 +0300 Subject: Re[2]: [MT-List] Cyrillic - Latin In-Reply-To: References: <3C500D18.8A3B8D34@bmw.de> Message-ID: <31407733199.20020128100904@mail.ru> Hello Christian, Sunday, January 27, 2002, 12:21:58 PM, you wrote: CB> Dear Mr Sinautzki, 17/1/02 CB> At 14:33 +0100 24/01/02, Kerstin Sinautzki wrote: >>Dear all: >> >>Does anybody know of a tool that allows the 'translation' of Cyrillic >>characters to Latin characters via mouse click? CB> No. And you should specify your working environment because such a CB> facility may exist or be generated quickly under some specific one CB> (such as MASS by ISS-CRDL in Singapore under Unix-Linux). CB> Assuming one wants to be able to invert the transformation, one CB> cannot "translate" simply one character by one character as there are CB> more cyrillic characters then roman ones (btw, one shouldn't mix a CB> character set such as roman or cyrillic with the subsets used for CB> certain languages such as russian, bulgarian, etc., or latin which CB> had truly no k, w, y, z and one character for u-v). CB> Here is the part of our transcription concerning Russian only. It is CB> easy to write conversion programs for it. It uses only the "PL/I CB> character set" (no low case, no accents, etc.) and is linguistically CB> motivated as far as possible (e.g., H is used to encode CB> palatalization, Y the jodized vowel with the opposite for E oborotnoe CB> because E is so frequent, and : for the E under accent -- not the old CB> yatq). CB> Another transcription using high and low case letters is immediately CB> derived from this one. 
CB> Uppercase CB> *A CB> *B CB> *V CB> *G CB> *D CB> *E CB> *E: CB> *ZH CB> *Z CB> *I CB> *J CB> *K CB> *L CB> *M CB> *N CB> *O CB> *P CB> *R CB> *S CB> *T CB> *U CB> *F CB> *X CB> *C CB> *KH CB> *SH CB> *TH CB> *W CB> *YI CB> *Q CB> *YE CB> *YU CB> *YA CB> Low case CB> A CB> B CB> V CB> G CB> D CB> E CB> E: CB> ZH CB> Z CB> I CB> J CB> K CB> L CB> M CB> N CB> O CB> P CB> R CB> S CB> T CB> U CB> F CB> X CB> C CB> KH CB> SH CB> TH CB> W CB> YI CB> Q CB> YE CB> YU CB> YA CB> We have developed a specialized and simple language, LT, to write CB> transcriptors, and written many transcriptors with it, including CB> cyrillic-roman and back. We never developed a "click transcription" CB> facility but might do it if people are interested. CB> This language was first implemented in Prolog-I (unavailable), then CB> in another Prolog by Y.Lepage, and then (around 1991-93) in MCL CB> (Macintosh Common Lisp) by M.Lafourcade. I think it still exists on CB> some disk here but nobody uses it at this time. We or the author CB> might "revive" it. The first and only available publication about it CB> is by Y.Lepage in COLING-86, although M.Lafourcade's thesis contains CB> a chapter on his (multidialect) implementation. CB> The TextEdit tool running under Mac OS X can open and store files CB> with many menu-specified encodings. But it can not save, for CB> instance, a document in Occidental Mac OS (Roman) containing a "‚" CB> into Occidental Windows (Latin) because the last one does not contain CB> "‚". CB> There is a Vietnamese PhD student here starting a thesis on related CB> problems (how to multilingualize software at the most basic level, CB> how to use multilingual formats such as the UNL format -- not UNL CB> graphs -- to prepare multilingual message files, and how to CB> incorporate the notions related to writing systems and not only CB> character sets into NLP-related and then general software). CB> If you are interested, some cooperation might be arranged. 
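The remark above that conversion programs for this transcription are easy to write can be illustrated with a minimal Python sketch. It is not the LT language or any GETA tool. The Cyrillic column of the table did not survive in this archive, so the pairing used here is an assumption: the 33 codes are matched, in order, with the 33 letters of the Russian alphabet, which agrees with the hints in the text (E: for the accented E, YE for E oborotnoe, Q for the soft sign). The names `encode` and `decode` are hypothetical.

```python
# Assumption: the 33 codes of the table pair, in order, with the Russian
# alphabet. The Cyrillic column itself is missing from the archived message.
RUSSIAN = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"
CODES = ("A B V G D E E: ZH Z I J K L M N O P R S T U F X C "
         "KH SH TH W YI Q YE YU YA").split()
assert len(RUSSIAN) == len(CODES) == 33

TO_LATIN = dict(zip(RUSSIAN, CODES))
FROM_LATIN = {code: letter for letter, code in TO_LATIN.items()}
MAX_CODE_LEN = max(len(code) for code in CODES)


def encode(text: str) -> str:
    """Transcribe Russian text; '*' marks an uppercase letter, as in the table."""
    out = []
    for ch in text:
        low = ch.lower()
        if low in TO_LATIN:
            prefix = "*" if ch != low else ""
            out.append(prefix + TO_LATIN[low])
        else:
            out.append(ch)  # spaces, digits, punctuation pass through unchanged
    return "".join(out)


def decode(text: str) -> str:
    """Invert encode() by greedy longest-match over the code table."""
    out = []
    i = 0
    while i < len(text):
        upper = text[i] == "*"
        start = i + 1 if upper else i
        for size in range(MAX_CODE_LEN, 0, -1):
            chunk = text[start:start + size]
            if chunk in FROM_LATIN:
                letter = FROM_LATIN[chunk]
                out.append(letter.upper() if upper else letter)
                i = start + size
                break
        else:
            out.append(text[i])  # not a code: copy the character as-is
            i += 1
    return "".join(out)
```

Under that assumed pairing, `encode("Москва")` gives `*MOSKVA` and `decode` turns it back into the original. The sketch does not escape Latin letters, `*`, or a colon directly after a Russian "е" in the input, so the round trip is only guaranteed for plain Russian text without those.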
>>I truly appreciate your input. >> >>Best regards >> >>Kerstin Sinautzki >> >>-- >>BMW Group - Service >>Translation (VS-41) >>80788 Muenchen - Germany >> >>phone.: +49 (0)89 382 - 25831 >>fax: +49 (0)89 382 - 41675 >> >>email: Kerstin.Sinautzki@bmw.de CB> Best regards, CB> CB CB> -- CB> ------------------------------------------------------------------------- CB> Christian Boitet CB> (Pr. Universite' Joseph Fourier) Tel: +33.4-7651-4355/4817 CB> GETA, CLIPS, IMAG-campus, BP53 Fax: +33.4-7651-4405 CB> 385, rue de la Bibliothe`que Mel: Christian.Boitet@imag.fr CB> 38041 Grenoble Cedex 9, France Mobile: +33-(0)6-6005-1969 CB> http://www-clips.imag.fr/geta/christian.boitet CB> ------------------------------------------------------------------------- CB> Serveurs de dictionnaires: projet SILFIDE (http://silfide.imag.fr) et CB> plus particulièrement français-malais CB> (http://www-clips.imag.fr/geta/services/fem/) CB> Projet C-STAR (http://www.c-star.org/) et projet europe'en CB> Nespole (http://nespole.itc.it) de traduction de parole CB> Projet UNL de communication et recherche d'information multilingue sur le CB> re'seau http://www.unl.ias.unu.edu ou http://www.unl.org, CB> Projet PAPILLON de construction coopérative d'une base lexicale CB> multilingue et de construction de dictionnaires CB> http://vulab.ias.unu.edu/papillon/ Maybe my info would help you There is a good free program to switch between Rus/Lat keyboards - http://punto.ru/switcher/ The auto transliteration option is at every free e-mail portal in Runet - www.mail.ru, www.narod.ru etc -- Best regards, Vladimir Rykov mailto:rykov2000@mail.ru PhD in CL From teruko+@cs.cmu.edu Mon Jan 28 03:47:11 2002 From: teruko+@cs.cmu.edu (Teruko Mitamura) Date: Sun, 27 Jan 2002 22:47:11 -0500 Subject: [MT-List] TMI 2002 -- Call for Participation Message-ID: <6410.1012189631@kyoto.lti.cs.cmu.edu> --------------------------------- TMI 2002 - Call for Participation --------------------------------- The 9th Conference on 
Theoretical and Methodological Issues in Machine Translation
March 13 - 17, 2002
Keihanna, Japan
http://www.kecl.ntt.co.jp/events/tmi/

The ninth meeting of the TMI conference will be held March 13-17, 2002 near the historic cities of Nara and Kyoto in Japan. The workshops and tutorials will be held jointly with the Natural Language Processing Society, Japan.

On-line registration is now available. Please visit TMI 2002 registration page: http://www.kecl.ntt.co.jp/events/tmi/registration.html

Locations and Times:
--------------------
TMI-2002 Papers and Panels (March 13-15 (Wed-Fri), 2002)
http://sevilla.mt.cs.cmu.edu/TMI2002/prelimprog.html
NTT Communication Science Laboratories, NTT Keihanna building
2-4, Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan, 619-0237

Workshops/Tutorials (March 16-17 (Sat-Sun), 2002)
Workshop: MT Roadmap http://www.elsnet.org/roadmap-tmi2002.html
Tutorials: http://www.kecl.ntt.co.jp/events/tmi/tutorials.html
Keihanna Plaza, 1-7, Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan, 619-0237

TMI 2002 Officers:
------------------
Program Committee Chairs: Teruko Mitamura and Eric Nyberg, Carnegie Mellon University, USA
Local Arrangements: Francis Bond and Hiromi Nakaiwa, NTT Communication Science Laboratories, Kyoto, Japan
General Chair: Sergei Nirenburg, Computing Research Lab, NMSU, USA

Program Committee:
------------------
Teruko Mitamura & Eric Nyberg (co-chairs) Carnegie Mellon
Timothy Baldwin CSLI
Christian Boitet Université Joseph Fourier
Andrew Bredenkamp University of Essex
Lynn Carlson U.S.
Department of Defense Satoru Ikehara Tottori University Hitoshi Isahara CRL Japan Kevin Knight USC-ISI Satoshi Sato Kyoto University Harold Somers UMIST Koichi Takeda TRL-IBM Hideki Tanaka ATR From wiggjd@sbu.ac.uk Mon Jan 28 11:45:17 2002 From: wiggjd@sbu.ac.uk (David Wigg) Date: Mon, 28 Jan 2002 11:45:17 +0000 Subject: [MT-List] Cyrillic - Latin References: <3C500D18.8A3B8D34@bmw.de> Message-ID: <3C5539CD.76417D6B@sbu.ac.uk> Hello Christian, When I received your message most of the lines in the first block of text were overwritten on the screen and unreadable(they are not overwritten below). Was I the only one? It looks as though the overwriting data comes from the signature (which is missing from the end of the message). I am using Netscape 4.61. I have tried using a variety of different character sets without success. I am still trying to understand how to use different character sets on the internet and why things go wrong from time to time. Thanks. David Wigg MT on the Net Project The Natural Language Translation Specialist Group The British Computer Society. Christian Boitet wrote: > = > Dear Mr Sinautzki, 17/1/02 > = > At 14:33 +0100 24/01/02, Kerstin Sinautzki wrote: > = > > Dear all: > > > > Does anybody know of a tool that allows the 'translation' of > > Cyrillic > = > > characters to Latin characters via mouse click? > = > No. And you should specify your working environment because such a > facility may exist or be generated quickly under some specific one > (such as MASS by ISS-CRDL in Singapore under Unix-Linux). > = > Assuming one wants to be able to invert the transformation, one cannot > "translate" simply one character by one character as there are more > cyrillic characters then roman ones (btw, one shouldn't mix a > character set such as roman or cyrillic with the subsets used for > certain languages such as russian, bulgarian, etc., or latin which had > truly no k, w, y, z and one character for u-v). 
> = > Here is the part of our transcription concerning Russian only. It is > easy to write conversion programs for it. It uses only the "PL/I > character set" (no low case, no accents, etc.) and is linguistically > motivated as far as possible (e.g., H is used to encode > palatalization, Y the jodized vowel with the opposite for E oborotnoe > because E is so frequent, and : for the E under accent -- not the old > yatq). > = > Another transcription using high and low case letters is immediately > derived from this one. > = > Uppercase > = > *A > *B > *V > *G > *D > *E > *E: > *ZH > *Z > *I > *J > *K > *L > *M > *N > *O > *P > *R > *S > *T > *U > *F > *X > *C > *KH > *SH > *TH > *W > *YI > *Q > *YE > *YU > *YA > Low case > = > A > B > V > G > D > E > E: > ZH > Z > I > J > K > L > M > N > O > P > R > S > T > U > F > X > C > KH > SH > TH > W > YI > Q > YE > YU > YA > = > We have developed a specialized and simple language, LT, to write > transcriptors, and written many transcriptors with it, including > cyrillic-roman and back. We never developed a "click transcription" > facility but might do it if people are interested. > = > This language was first implemented in Prolog-I (unavailable), then in > another Prolog by Y.Lepage, and then (around 1991-93) in MCL > (Macintosh Common Lisp) by M.Lafourcade. I think it still exists on > some disk here but nobody uses it at this time. We or the author might > "revive" it. The first and only available publication about it is by > Y.Lepage in COLING-86, although M.Lafourcade's thesis contains a > chapter on his (multidialect) implementation. > = > The TextEdit tool running under Mac OS X can open and store files with > many menu-specified encodings. But it can not save, for instance, a > document in Occidental Mac OS (Roman) containing a "=82" into Occidenta= l > Windows (Latin) because the last one does not contain "=82". 
> = > There is a Vietnamese PhD student here starting a thesis on related > problems (how to multilingualize software at the most basic level, how > to use multilingual formats such as the UNL format -- not UNL graphs > -- to prepare multilingual message files, and how to incorporate the > notions related to writing systems and not only character sets into > NLP-related and then general software). > If you are interested, some cooperation might be arranged. > = > > I truly appreciate your input. > > > > Best regards > > > > Kerstin Sinautzki > > > > -- > > BMW Group - Service > > Translation (VS-41) > > 80788 Muenchen - Germany > > > > phone.: +49 (0)89 382 - 25831 > > fax: +49 (0)89 382 - 41675 > > > > email: Kerstin.Sinautzki@bmw.de > = > Best regards, > = > CB > -- > -----------------------------------------------------------------------= -- > Christian Boitet > (Pr. Universite' Joseph Fourier) Tel: +33.4-7651-4355/4817 > GETA, CLIPS, IMAG-campus, BP53 Fax: +33.4-7651-4405 > 385, rue de la Bibliothe`que Mel: > Christian.Boitet@imag.fr > 38041 Grenoble Cedex 9, France Mobile: +33-(0)6-6005-1969 > http://www-clips.imag.fr/geta/christian.boitet > -----------------------------------------------------------------------= -- > Serveurs de dictionnaires: projet SILFIDE (http://silfide.imag.fr) et > plus particuli=E8rement fran=E7ais-malais > (http://www-clips.imag.fr/geta/services/fem/) > Projet C-STAR (http://www.c-star.org/) et projet europe'en > Nespole (http://nespole.itc.it) de traduction de parole > Projet UNL de communication et recherche d'information multilingue sur > le > re'seau http://www.unl.ias.unu.edu ou http://www.unl.org, > Projet PAPILLON de construction coop=E9rative d'une base lexicale > multilingue et de construction de dictionnaires > http://vulab.ias.unu.edu/papillon/ From alina.koson@wanadoo.fr Thu Jan 24 15:22:07 2002 From: alina.koson@wanadoo.fr (Alina Koson) Date: Thu, 24 Jan 2002 16:22:07 +0100 Subject: [MT-List] TA, OAT Message-ID: 
<001801c1a4ea$e3725cf0$9b3c0950@pcathlon>

I am looking for machine translation tools, or translation aids, for Polish and French. Please contact me if you know of any.

Alina Koson

From lgerber@gerbersite.com Tue Jan 29 16:04:09 2002
From: lgerber@gerbersite.com (Laurie Gerber)
Date: Tue, 29 Jan 2002 08:04:09 -0800
Subject: [MT-List] Anyone attending ICON in India?
Message-ID: <3C56C7F9.17EEDB03@pacbell.net>

Dear MT-listers,

I am hoping you can help me to find and recruit a reporter on ICON (the Indian Conference On Natural language processing) which will be held March 1-3 in Chennai, India.

As editor of Machine Translation News International, I try to get coverage of as many conference events related to MT as possible. Conferences and activity in South Asia have largely escaped the notice of the international community, and I want to include more coverage of such regional events and activity. ICON sounded like a good event to cover, but I don't know anyone who will be attending the conference, and email to the organizers has not been answered.

Does anyone on this list plan to attend the conference, or could you put me in touch with someone who will?

Thanks!!
Laurie Gerber
Editor MTNI
http://www.eamt.org/mtni.html

From UBEFIT@aol.com Tue Jan 29 16:54:18 2002
From: UBEFIT@aol.com (UBEFIT@aol.com)
Date: Tue, 29 Jan 2002 11:54:18 EST
Subject: [MT-List] safe title or safe area
Message-ID: <160.7db3874.29882dba@aol.com>

Hi,
Does anyone know anything about:

safe title or safe area
and the difference between PAL and NTSC in defining the screen size of the safe area and the level of safe title? We are doing a subtitling project (English to Chinese) and a concern has been raised about this.

If you do not know, do you know where I may find out?

Sorry for the trouble.

Pam

From cglobal25@hotmail.com Fri Feb 1 18:45:03 2002
From: cglobal25@hotmail.com (george carter)
Date: Fri, 01 Feb 2002 13:45:03 -0500
Subject: [MT-List] Developing Translation Software
Message-ID:

I am interested in the inner workings of translation software, i.e. algorithms, wordlists and the general process of translation. Are there any ongoing projects I can be a part of or learn from? Websites?

Thanks!
John

From bithabachi@hcm.fpt.vn Sun Feb 3 03:07:59 2002
From: bithabachi@hcm.fpt.vn (Thanh Binh)
Date: Sun, 3 Feb 2002 10:07:59 +0700
Subject: [MT-List] Re: MT-List digest, Vol 1 #39 - 6 msgs
References: <20020128071408.3DB4453C48@pairlist.net>
Message-ID: <003c01c1ac5f$fd4ae8a0$1c46a2cb@thanhbinh>

Dear all!
Do you know of any good methods or documents about MT from English into Vietnamese?
Best regards
Nguyen Thanh Binh
e-mail: bithabachi@hcm.fpt.vn

From hatamin@ciyasoft.com Fri Feb 22 05:40:48 2002
From: hatamin@ciyasoft.com (Naquib U. Hatami)
Date: Fri, 22 Feb 2002 00:40:48 -0500
Subject: [MT-List] Re: MT-List digest, Vol 1 #39 - 6 msgs
In-Reply-To: <003c01c1ac5f$fd4ae8a0$1c46a2cb@thanhbinh>
Message-ID: <000001c1bb63$7bd18690$78fafea9@nedah>

Yes. Go to Ciyasoft.com and join; there is a good section about MT. You can download the PDF files.

Regards
Naquib

-----Original Message-----
From: mt-list-admin@eamt.org [mailto:mt-list-admin@eamt.org] On Behalf Of Thanh Binh
Sent: Saturday, February 02, 2002 10:08 PM
To: mt-list@eamt.org
Subject: [MT-List] Re: MT-List digest, Vol 1 #39 - 6 msgs

Dear all!
Do you know of any good methods or documents about MT from English into Vietnamese?
Best regards
Nguyen Thanh Binh
e-mail: bithabachi@hcm.fpt.vn

--
For MT-List info, see http://www.eamt.org/mt-list.html

From jack@kanji.org Sun Feb 3 20:23:39 2002
From: jack@kanji.org (Jack Halpern)
Date: Sun, 03 Feb 2002 15:23:39 -0500
Subject: [MT-List] Re: MT-List digest, Vol 1 #39 - 6 msgs
In-Reply-To: <003c01c1ac5f$fd4ae8a0$1c46a2cb@thanhbinh>
References: <003c01c1ac5f$fd4ae8a0$1c46a2cb@thanhbinh>
Message-ID: <200202032023.AA16067@mail.kanji.org>

Greetings

In message "[MT-List] Re: MT-List digest, Vol 1 #39 - 6 msgs", Thanh Binh wrote...
>Dear all !
>Do you know of any good methods or documents about MT from
>English into Vietnamese ?
>Best regards
>Nguyen Thanh Binh
>e-mail: bithabachi@hcm.fpt.vn

I suggest you contact a colleague of mine, Ngo Trung Viet (vietnt@altavista.net) (we may compile a Vietnamese dictionary together). He might know.

>
>--
> For MT-List info, see http://www.eamt.org/mt-list.html
>

Regards,
Jack Halpern
President, The CJK Dictionary Institute, Inc.
http://www.cjk.org Phone: +81-48-473-3508

From steven.krauwer@let.uu.nl Mon Feb 4 12:06:49 2002
From: steven.krauwer@let.uu.nl (Steven Krauwer)
Date: Mon, 4 Feb 2002 13:06:49 +0100 (MET)
Subject: [MT-List] CfP: MT Roadmap Workshop at TMI2002
Message-ID: <200202041206.g14C6nQ14804@sfinx.let.uu.nl>

R E M I N D E R
DEADLINE is Wednesday Feb 13!
Last chance to present your MT visions and challenges!!
_________________________________________________________________
Call for Papers
MT Roadmap Workshop (March 16) at TMI2002 (March 13-17)
Keihanna (near Kyoto), Japan
Organized by ELSNET

Background:
Since 2000, ELSNET (The European Network of Excellence in Human Language Technologies) has organised a series of workshops aimed at the creation of a broadly supported technological roadmap for various subfields of language and speech technology. A technology roadmap comprises an analysis of the present situation, a vision of where we want to be in e.g.
ten years from now, and a number of intermediate milestones that would help in setting intermediate goals and in measuring our progress towards our goals. The function of the road map is not to impose anything on anyone, but rather to provide a broadly supported definition of a context in which to position the MT community's efforts, which would allow us to identify common priorities for joint activities in e.g. research, resources and training. ELSNET aims at (co-)organizing roadmap workshops at all major events in order to encourage continuous reflection on where we stand, where we want to go, and, most importantly, how we can get there. Format: It will be a one day workshop consisting of three components: * The first component will aim at giving a critical analysis of the present state of Machine Translation in the broadest sense. * The second component will be dedicated to visions of the future. * The third component will aim at identifying major research challenges and establishing intermediate milestones on our way towards our goals. Each of the components could contain * an invited speaker or panel, * submitted papers (reviewed), * and ample space for discussion. The results of the discussions will be published in a report that will published on ELSNET's website, and that will serve as a starting point of a broad consultation of the MT community on the future directions of Machine Translation in the broadest sense. Papers: We invite papers that * give critical analyses of the present state of the art in machine translation of written and spoken language, * present visions of the future of machine translation, both from a theoretical, a technological or from an application point of view, or * identify major milestones and challenges (both theoretical, practical and organisational) on our way towards the future. Submission: Abstracts must be submitted in English, and should be no more than 4 pages long and in single column format. 
Submissions should be sent electronically in plain text, MS Word or PDF to steven.krauwer@elsnet.org. Important dates: * Submission deadline: 13 February * Notification: 22 February * Final papers due: 5 March * Workshop: 16 March Proceedings: Participants will receive proceedings, containing * a summary of the results of the previous MT Roadmap workshop (at the MT Summit) * summaries of the invited talks * full versions of the submitted papers Audience: The primary audience consists of people with an analytical or future oriented, programmatic interest, both from research and from industry. Registration: The workshop registration fee is 5000 yen (app. 38 USD or 43 Euro). Registration details can be found on the TMI 2002 website (see below). URLs: Main conference: http://www.kecl.ntt.co.jp/events/tmi/ This workshop: http://www.elsnet.org/roadmap-tmi2002.html Contact point: Steven Krauwer (steven.krauwer@elsnet.org) ELSNET / Utrecht University Trans 10, 3512 JK Utrecht, NL phone +31 30 253 6050 fax +31 30 253 6000 Core Programme Committee: * Steven Krauwer * Laurie Gerber * Elliott Macklovitch * Hans Uszkoreit * Susan Armstrong * Herman Caeyers * John Hutchins From WJHutchins@compuserve.com Wed Feb 6 10:42:33 2002 From: WJHutchins@compuserve.com (John Hutchins) Date: Wed, 6 Feb 2002 05:42:33 -0500 Subject: [MT-List] Human Language Technology Conference (HLT-2002) Call for Attendance Message-ID: <200202060542_MC3-F0E7-CF1D@compuserve.com> Details of a conference of interest to MT. 
-------------Forwarded Message-----------------
From: Priscilla Rasmussen, INTERNET:rasmusse@cs.rutgers.edu
To: [unknown], INTERNET:rasmusse@cs.rutgers.edu
Date: 05/02/102 21:08 PM
RE: Human Language Technology Conference (HLT-2002) Call for Attendance

FIRST CALL FOR ATTENDANCE
HLT 2002, Human Language Technology Conference
March 24-27, 2002
Catamaran Resort Hotel, San Diego, California
http://hlt2002.org

Human language technology (HLT) incorporates a broad spectrum of disciplines working towards two closely related goals: to enable computers to interact with humans using natural language capabilities, and to serve as useful adjuncts to humans in language understanding by providing services such as automatic translation, information retrieval and information extraction.

The HLT 2002 Conference, following the great success of HLT 2001, is a forum for researchers to present high-quality, very recent, cutting-edge work, to exchange ideas and to explore emerging new research directions. The Conference and Program Chairs have now received over 170 submissions from researchers in computer science, speech science, engineering, etc., who are exploring innovative methods for improving human language technology.

HLT 2002 will also include a special focus on Language Processing of Biological Data, which includes both Information Extraction of Biological Data and Language Modeling of Biological Data. This special focus, sponsored by NSF, will comprise back-to-back tutorial sessions at the opening of the conference and a paper session within the larger conference setting. Further information is available at the Conference web site, http://hlt2002.org.

The Conference will span four days, running from early Sunday afternoon through noon Wednesday. It will include peer-reviewed research presentations, posters, demonstrations, panel sessions and time for discussion. In order to encourage cutting-edge work, submissions were accepted in mid-January.
Those submissions are under review at the moment, and the full list of papers, posters, and demonstrations will not be known until February 11. A loose-leaf proceedings will be distributed at the conference, with a final proceedings distributed during July, 2002.

=>Space at the conference is limited by the conference facilities, so
=>we can only accept the first 330 registrants, with registration
=>officially beginning on or about Feb 15, after authors have been
=>informed of acceptances. To reserve a place now, please send email to
=>Priscilla Rasmussen, rasmusse@cs.rutgers.edu, including your name,
=>affiliation, and email address.

While the cost of the conference has not yet been finalized, we expect the registration fee to be in the range of $375-400, including meals during the conference.

CONFERENCE COMMITTEES

General chair: Mitch Marcus, University of Pennsylvania (USA)
Co-chair: David Yarowsky, Johns Hopkins University (USA)

Executive Program Committee:
James Allan, University of Massachusetts (USA)
Sadaoki Furui, Tokyo Institute of Technology (Japan)
Ralph Grishman, New York University (USA)
Donna Harman, NIST (USA)
Lynette Hirschman, MITRE (USA)
Eduard Hovy, ISI (USA)
Kevin Knight, ISI (USA)
Joseph Mariani, LIMSI-CNRS (France)
John Makhoul, BBN Technologies (USA)
Nelson Morgan, University of California at Berkeley (USA)
Mari Ostendorf, University of Washington (USA)
Hans Uszkoreit, Saarland University and DFKI (Germany)

Demonstration Co-chairs:
Clifford Weinstein, MIT Lincoln Laboratory (USA)
Bob Younger, SPAWAR Systems Center (USA)

Special Focus Committee:
Chair: Aravind Joshi, University of Pennsylvania (USA)
Co-chair: Lynette Hirschman, MITRE (USA)

CONFERENCE VENUE

The HLT Conference will be held at the Catamaran Resort Hotel in San Diego, California. The famous San Diego Zoo is the home of Hua Mei, the only baby giant panda to be born in the US.
Sea World is one of the area's better known attractions, where you can see the killer whale Shamu. San Diego also houses Balboa Park, the largest urban cultural park. You can stroll through the Gaslamp Quarter or through Old Town. Nearby La Jolla houses the Birch Aquarium, and Carlsbad houses Legoland. Heading south gets you to Tijuana, Mexico.

IMPORTANT DATES (all dates are in 2002)

**NOW** Reserve space to attend
(February 11 Authors informed of reviewing decisions)
February 15 Registration will officially begin
March 24-27 HLT 2002 Conference, San Diego
July 20 Proceedings published (target date)

FURTHER INFORMATION

Up-to-date information about the Conference and registration will be posted at http://hlt2002.org.

From cgdaniec@us.ibm.com Sat Feb 9 00:42:31 2002
From: cgdaniec@us.ibm.com (Claudia Gdaniec)
Date: Fri, 8 Feb 2002 19:42:31 -0500
Subject: [MT-List] MT evaluation and test suites
Message-ID:

Does anyone know whether there are documents with multiple reference translations available anywhere that could be used for MT evaluation purposes? (Any language)

Another question: Does anybody know of freely available test suites that could be used for MT training/testing/evaluation purposes? Again, any languages.
Claudia Gdaniec From Info@globalization.com Wed Feb 13 14:06:12 2002 From: Info@globalization.com (Info (Globalization)) Date: Wed, 13 Feb 2002 14:06:12 -0000 Subject: [MT-List] Freelance opportunity for Russian Computational Linguist Message-ID: <61ACB752C91DD311978300105A36DFFE010C9924@NT-MAIL3> To the members of the MT-List Mailing List Hello, An opening for a freelance Russian Computational Linguist has been posted today on our site, http://www.globalization.com. Please log onto the site for full details and an application form. Best regards, from The Globalization Team From H.Fulford@lboro.ac.uk Mon Feb 18 16:36:07 2002 From: H.Fulford@lboro.ac.uk (Heather Fulford) Date: Mon, 18 Feb 2002 16:36:07 +0000 Subject: [MT-List] PhD opportunity Message-ID: <3.0.6.32.20020218163607.0092c1a0@staff-mailin.lboro.ac.uk> EPSRC PhD RESEARCH STUDENTSHIP IN THE MANAGEMENT SCIENCE & INFORMATION SYSTEMS RESEARCH GROUP, BUSINESS SCHOOL, LOUGHBOROUGH UNIVERSITY PhD STUDENTSHIP for September 2002 Applications are invited for a three-year EPSRC research studentship award to commence in September 2002 in the Business School, Loughborough University. The successful applicant will work on a project investigating the adoption of language translation software by small translation businesses. Project summary: The demand for language translation services has increased significantly over the past decade, and to help meet that demand, software has been developed to support human translators, including machine translation systems, translation memory, and terminology management tools. This project comprises a study of the adoption of such software by UK translation businesses, focussing on the benefits the software affords, its impact on translators' working practice, and the strategies employed to integrate the software into a translator's workflow. Applicants should have a good honours degree or equivalent in a relevant area (e.g. 
translation studies, information systems, computational linguistics, or computer science). A Masters degree, or relevant research experience, would be an advantage.

For full details of the Business School PhD programme, please go to: http://www.lboro.ac.uk/departments/bs/resdoct.html
For an online application form, or to download a form, please go to: http://www.lboro.ac.uk/admin/central_admin/pg/forms.html

For additional information and advice about this PhD studentship, please contact:
Dr. Heather Fulford
The Business School
Loughborough University
Loughborough
Leicestershire LE11 3TU
UK
Tel. +44 (0)1509 222435
Fax +44 (0)1509 223960
E-mail h.fulford@lboro.ac.uk

The studentship is open to UK and other EU students, although non-UK students qualify on a fees-only basis. The studentship is available from September 2002.

Deadline for applications: 22 April 2002.
PLEASE MARK 'Translation Tools Project' ON THE APPLICATION

From WJHutchins@compuserve.com Tue Feb 19 12:01:23 2002
From: WJHutchins@compuserve.com (John Hutchins)
Date: Tue, 19 Feb 2002 07:01:23 -0500
Subject: [MT-List] Nous aimerions avoir votre opinion
Message-ID: <200202190701_MC3-F23E-664E@compuserve.com>

This is a MIME-encapsulated message
--9f0d3a4d-1581-409b-a4f5-e9b396561f35
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

Received today, and forwarded to the list.
John Hutchins 19 Feb

-------------Forwarded Message-----------------
From: "Isabelle de ROSE", INTERNET:isabellederose@free.fr
To: [unknown], INTERNET:info@eamt.org
Date: 19/02/102 09:23 PM
RE: Nous aimerions avoir votre opinion

Hello,

To mark the creation of its site www.about-translations.com/index.php, the translators' community is pleased to invite you to take part in discussions on the topics that matter to all of us.

For your information, and so that you can share your impressions and comments with the members already registered, we invite you to take part in a discussion, in English and in French, on Saturday 23 February at 15:00 GMT+1, and until the end of translation requests for the millennium.

Topics will include whether marketplaces should be free of charge, the compatibility of translation software, translation tests, and the move to the XP range.

The discussions will be held on our site www.about-translations.com/index.php. Follow main menu / Forum / Let's talk about....

The other forums already open are at your disposal and cover freelancing in the form of teleworking.

Representatives of the CAT (computer-assisted translation) software on the market have been invited, as well as well-known figures in the teleworking world. You can register now, free of charge, and see the topics that will be discussed.

We look forward to seeing you. See you very soon.
L'=E9quipe de About-Traductions ----------------------- Internet Header -------------------------------- Sender: isabellederose@free.fr Received: from naam.pair.com (naam.pair.com [209.68.1.237]) by siaag2af.compuserve.com (8.9.3/8.9.3/SUN-1.12) with SMTP id EAA26886 for ; Tue, 19 Feb 2002 04:23:11 -0500 (EST) Received: (qmail 88087 invoked by uid 3138); 19 Feb 2002 09:23:09 -0000 Delivered-To: cbrace-eamt:org-info@eamt.org Received: (qmail 88080 invoked from network); 19 Feb 2002 09:23:08 -0000 Received: from postfix3-2.free.fr (213.228.0.169) by naam.pair.com with SMTP; 19 Feb 2002 09:23:08 -0000 Received: from IsabelledeROSE (nas-cbv-5-146-185.dial.proxad.net [62.147.= 146.185]) by postfix3-2.free.fr (Postfix) with SMTP id B501D17EDF for ; Tue, 19 Feb 2002 10:23:06 +0100 (CET) Message-ID: <009b01c1b927$9f7b3700$b992933e@IsabelledeROSE> From: "Isabelle de ROSE" To: Subject: Nous aimerions avoir votre opinion Date: Tue, 19 Feb 2002 10:27:15 +0100 MIME-Version: 1.0 Content-Type: multipart/alternative; boundary=3D"----=3D_NextPart_000_0098_01C1B930.006E9360" X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4133.2400 Disposition-Notification-To: "Isabelle de ROSE" X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400 --9f0d3a4d-1581-409b-a4f5-e9b396561f35 Content-Type: application/octet-stream; name="UNTITLED.021" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="UNTITLED.021" PCFET0NUWVBFIEhUTUwgUFVCTElDICItLy9XM0MvL0RURCBIVE1MIDQuMCBUcmFuc2l0aW9uYWwv L0VOIj4NCjxIVE1MPjxIRUFEPg0KPE1FVEEgaHR0cC1lcXVpdj1Db250ZW50LVR5cGUgY29udGVu dD0idGV4dC9odG1sOyBjaGFyc2V0PWlzby04ODU5LTEiPg0KPE1FVEEgY29udGVudD0iTVNIVE1M IDUuNTAuNDEzNC4xMDAiIG5hbWU9R0VORVJBVE9SPg0KPFNUWUxFPjwvU1RZTEU+DQo8L0hFQUQ+ DQo8Qk9EWSBiZ0NvbG9yPSNmZmZmZmY+DQo8RElWPjxGT05UIGZhY2U9QXJpYWwgc2l6ZT0yPjxG T05UIGZhY2U9IlRpbWVzIE5ldyBSb21hbiIgDQpzaXplPTM+Qm9uam91cjxCUj48QlI+RGFucyBs ZSBjYWRyZSBkZSBsYSBjculhdGlvbiBkZSBzb24gc2l0ZTxCUj48L0ZPTlQ+PEEgDQpocmVmPSJo 
dHRwOi8vd3d3LmFib3V0LXRyYW5zbGF0aW9ucy5jb20vaW5kZXgucGhwIj48Rk9OVCBmYWNlPSJU aW1lcyBOZXcgUm9tYW4iIA0Kc2l6ZT0zPnd3dy5hYm91dC10cmFuc2xhdGlvbnMuY29tL2luZGV4 LnBocDwvRk9OVD48L0E+PEZPTlQgDQpmYWNlPSJUaW1lcyBOZXcgUm9tYW4iIHNpemU9Mz4gLCBs YSBjb21tdW5hdXTpIGRlcyB0cmFkdWN0ZXVycyBhIGxlPEJSPnBsYWlzaXIgDQpkZSB2b3VzIGlu dml0ZXIg4CBwYXJ0aWNpcGVyJm5ic3A7IGF1eCBkaXNjdXNzaW9ucyBzdXIgbGVzIHN1amV0cyBx dWk8QlI+bm91cyANCnRpZW5uZW50IHRvdXMg4CBjb3VyLjxCUj48QlI+UG91ciB2b3RyZSBpbmZv cm1hdGlvbiwgZXQgcG91ciBxdWUgdm91cyBwdWlzc2lleiANCnBhcnRhZ2VyIHZvcyBpbXByZXNz aW9uczxCUj5ldCB2b3MgY29tbWVudGFpcmVzIGF2ZWMgbGVzIG1lbWJyZXMgZOlq4CBpbnNjcml0 cywgDQpub3VzIHZvdXMgcHJvcG9zb25zIGRlPEJSPnBhcnRpY2lwZXIg4CB1bmUgZGlzY3Vzc2lv biwgZW4gYW5nbGFpcyBldCBlbiANCmZyYW7nYWlzLCBsZSBTYW1lZGkgMjMgZul2cmllcjxCUj5w cm9jaGFpbiwg4CAxNWggR01UICsxLiBldCBqdXNxdSfgIGxhIGZpbiBkZXMgDQpkZW1hbmRlcyBk ZSB0cmFkdWN0aW9uIHBvdXIgbGU8QlI+bWlsbOluYWlyZS48QlI+PEJSPlBhcm1pIGxlcyB0aOht ZXMgYWJvcmTpcywgDQpsYSBncmF0dWl06SBvdSBub24gZGVzIHBsYWNlcyBkZSBtYXJjaOksPEJS PmxlcyBjb21wYXRpYmlsaXTpcyBkZXMgbG9naWNpZWxzIGRlIA0KdHJhZHVjdGlvbiw8QlI+TGVz IHRlc3RzIGRlIHRyYWR1Y3Rpb24uPEJSPkxlIHBhc3NhZ2Ugc291cyBsYSBnYW1tZSANClhQLjxC Uj48QlI+TGVzIGRpc2N1c3Npb25zIHNlIHRpZW5kcm9udCBzdXIgbm90cmUgc2l0ZTxCUj48L0ZP TlQ+PEEgDQpocmVmPSJodHRwOi8vd3d3LmFib3V0LXRyYW5zbGF0aW9ucy5jb20vaW5kZXgucGhw Ij48Rk9OVCBmYWNlPSJUaW1lcyBOZXcgUm9tYW4iIA0Kc2l6ZT0zPnd3dy5hYm91dC10cmFuc2xh dGlvbnMuY29tL2luZGV4LnBocDwvRk9OVD48L0E+PEZPTlQgDQpmYWNlPSJUaW1lcyBOZXcgUm9t YW4iIHNpemU9Mz4uPEJSPlN1aXZleiBsZSBtYWluIG1lbnUgLyBGb3J1bSAvIExldCdzIHRhbGsg DQphYm91dC4uLi48QlI+TGVzIGF1dHJlcyBmb3J1bXMgZOlq4CBvdXZlcnRzIHNvbnQg4CB2b3Ry ZSBkaXNwb3NpdGlvbiBldCANCnJlZ3JvdXBlbnQgbGU8QlI+ZnJlZWxhbmNpbmcgc291cyBsYSBm b3JtZSBkZSB06WzpdHJhdmFpbDxCUj48QlI+T250IOl06SBpbnZpdOlzIA0KZGVzIHJlcHLpc2Vu dGFudHMgcG91ciBsZXMgbG9naWNpZWxzIGRlIFRBTyBkdSBtYXJjaOkgYWluc2k8QlI+cXVlIGRl cyBwZXJzb25uZXMgDQpjb25udWVzIGRhbnMgbGUgbW9uZGUgZHUgdOls6XRyYXZhaWwuIFZvdXMg 
cG91dmV6IHZvdXMgaW5zY3JpcmUgZOhzIG1haW50ZW5hbnQsIA0KZ3JhdHVpdGVtZW50LCBldCBk 6WNvdXZyaXIgbGVzIHN1amV0cyBxdWkgc2Vyb250IGFib3Jk6XMuPC9GT05UPjwvRk9OVD48L0RJ Vj4NCjxESVY+PEZPTlQgZmFjZT1BcmlhbCBzaXplPTI+PEZPTlQgZmFjZT0iVGltZXMgTmV3IFJv bWFuIiBzaXplPTM+PEJSPk5vdXMgdm91cyANCmF0dGVuZG9ucy4gQSB0cuhzIGJpZW509HQuPEJS Pkwn6XF1aXBlIGRlIA0KQWJvdXQtVHJhZHVjdGlvbnM8L0ZPTlQ+PEJSPjwvRElWPjwvRk9OVD48 L0JPRFk+PC9IVE1MPg0K --9f0d3a4d-1581-409b-a4f5-e9b396561f35-- From bond@cslab.kecl.ntt.co.jp Fri Feb 22 09:45:14 2002 From: bond@cslab.kecl.ntt.co.jp (Francis Bond) Date: Fri, 22 Feb 2002 18:45:14 +0900 (JST) Subject: [MT-List] MT evaluation and test suites In-Reply-To: (cgdaniec@us.ibm.com) References: Message-ID: <200202220945.SAA17284@fornost.icl.kecl.ntt.co.jp> G'day, Claudia> Does anyone know whether there are documents with multiple Claudia> reference translations available anywhere that could be used Claudia> for MT evaluation purposes? (Any language) We have a Japanese-English test set with multiple reference translations (3718 Japanese sentences) available on the web under: http://www.kecl.ntt.co.jp/icl/mtg/resources/index.html We actually have it translated into several other languages (Chinese, Korean, Malay and French) although they are not available on-line. Claudia> Another question: Does anybody know of freely available test Claudia> suites that could be used for MT training/testing/evaluation Claudia> purposes? Again, any languages. I believe JEIDA has a similar test set, but I don't know if it is freely available. -- Francis Bond NTT Communication Science Laboratories | Machine Translation Research Group Come to TMI-2002 in Kyoto, Japan: ! 
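For readers wondering why multiple reference translations matter for MT evaluation, here is a minimal sketch of one common use: scoring a candidate translation by clipped unigram precision against several references. The function name and the example sentences are hypothetical illustrations, not part of the NTT test set or of any tool mentioned on the list.

```python
from collections import Counter


def clipped_unigram_precision(candidate, references):
    """Fraction of candidate words that are matched by at least one reference.

    Each candidate word is credited at most as many times as it occurs in
    the single reference where it is most frequent ("clipping").
    Tokenisation is plain whitespace splitting, which is an assumption made
    only to keep the sketch short.
    """
    cand_counts = Counter(candidate.lower().split())
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    ref_counts = [Counter(ref.lower().split()) for ref in references]
    matched = 0
    for word, count in cand_counts.items():
        # Highest count of this word in any one reference.
        max_in_refs = max((rc[word] for rc in ref_counts), default=0)
        matched += min(count, max_in_refs)
    return matched / total


if __name__ == "__main__":
    # Hypothetical example sentences, invented for illustration only.
    refs = ["the cat is on the mat", "a cat sat on the mat"]
    print(clipped_unigram_precision("the cat sat on the mat", refs))
```

With more than one reference, a candidate is not penalised for choosing a wording that only one of the human translators happened to use, which is the main reason such test sets are useful.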
From wiggjd@sbu.ac.uk Thu Feb 7 10:54:58 2002 From: wiggjd@sbu.ac.uk (David Wigg) Date: Thu, 07 Feb 2002 10:54:58 +0000 Subject: [MT-List] Outlook Express and ISO characters Message-ID: <3C625D02.3D18B8A3@sbu.ac.uk> Accented and other ISO characters (160-255) Further to some messages in December I have been experimenting with how to send and receive ISO characters in the range 160-255 using Netscape (4.6) and Outlook Express (5.5) (under Windows95). I have been entering these characters holding down the Alt key whilst using the keypad to type a zero followed by the decimal value of the character (e.g. 'a grave' is 224 in ISO 8859-1). So far I think both systems work as expected using the Western European ISO 8859-1 character set using MIME encoding with Quoted printable set. However, when trying another ISO 8859 character set such as Cyrillic ISO 8859-5 Netscape works as expected but I cannot get Outlook Express to work as expected. First of all even when the send character set is changed to ISO 8859-5 when entering Alt/0224 I still get 'a grave' shown on the display and an 'a' without an accent can be received by Netscape or Outlook Express. As usual the help information is not helpful for my problems (I suppose if it were I would not be complaining now!). Can anyone tell me how I can use Outlook Express to show, send and receive foreign characters in the other ISO 8859 character sets? Thanks. David. 
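The behaviour described above can be illustrated with a small sketch: the same byte value (224) stands for different characters depending on which ISO 8859 part the receiver assumes, and quoted-printable only protects the byte in transit, not its interpretation. This is a hedged illustration using Python's standard codecs and the `quopri` module; the helper names are hypothetical, and it does not attempt to reproduce what Outlook Express or Netscape actually do internally.

```python
import quopri

# The byte entered as Alt+0224 in the message above.
RAW = bytes([224])


def decode_byte(raw, charset):
    """Interpret raw bytes under the given character set."""
    return raw.decode(charset)


def decode_quoted_printable(text, charset):
    """Undo quoted-printable transfer encoding, then apply the charset."""
    return quopri.decodestring(text.encode("ascii")).decode(charset)


if __name__ == "__main__":
    # Same byte, two different characters.
    print(decode_byte(RAW, "iso-8859-1"))  # Latin small a with grave
    print(decode_byte(RAW, "iso-8859-5"))  # Cyrillic small er
    # "=E0" is how byte 224 travels in a quoted-printable body.
    print(decode_quoted_printable("=E0", "iso-8859-1"))
    print(decode_quoted_printable("=E0", "iso-8859-5"))
```

The point of the sketch is that the charset named in the Content-Type header, not the keystroke, decides which character the recipient sees.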
From ref@cs.cmu.edu Tue Feb 19 21:24:53 2002 From: ref@cs.cmu.edu (Robert Frederking) Date: Tue, 19 Feb 2002 16:24:53 -0500 Subject: [MT-List] AMTA-2002: Call for Tutorial and Workshop Proposals Message-ID: <29613.1014153893@lti.cs.cmu.edu> Please circulate as widely as possible: --- CALL FOR TUTORIAL AND WORKSHOP PROPOSALS --- The Association for Machine Translation in the Americas AMTA-2002 Conference Tiburon, California (near San Francisco) October 8-12, 2002 Conference theme: FROM RESEARCH TO REAL USERS Ever since the showdown between Empiricists and Rationalists a decade ago at TMI-92, MT researchers have hotly pursued promising paradigms for MT, including data-driven approaches and hybrids that integrate these with more traditional rule-based components. During the same period, commercial MT systems with standard transfer architectures have evolved along a parallel and almost unrelated track, increasing their coverage and achieving much broader acceptance and usage. This raises a number of interesting questions (see the main conference Call For Participation), primarily concerned with why this disconnect exists, and whether it is going to change. TUTORIAL AND WORKSHOP PROPOSAL SUBMISSIONS Proposals for tutorials and workshops are now being solicited on these and other topics of direct interest and impact for MT researchers, developers, vendors or users of MT technologies. We welcome and encourage participation by members of AMTA's sister organizations, AAMT in Asia and EAMT in Europe, as well. Workshops will be held on Tuesday October 8th. Approximately 7 hours may be allocated per workshop. Tutorials will be held on Wednesday October 9th. Tutorials would typically last 3 hours, although other arrangements might be possible. Proposals should state the topic(s) to be addressed, the rationale for addressing it and the structure of the activities. Proposals should be in English and not longer than 4 pages. 
Please submit proposals as soon as possible to Bob Frederking at . Proposals must be submitted on or before Friday, April 12, 2002. For general conference information and further details as they become available, visit: http://www.amtaweb.org/AMTA2002/ CONFERENCE ORGANIZERS Elliott Macklovitch, General Chair Stephen D. Richardson, Program Chair Violetta Cavalli-Sforza, Local Arrangements Chair Bob Frederking, Workshops and Tutorials Laurie Gerber, Exhibits Coordinator -- Robert E. Frederking Email: ref@cs.cmu.edu Language Technologies Institute Telephone: +1-412-268-6656 Carnegie Mellon University FAX: +1-412-268-6298 5000 Forbes Avenue Pittsburgh, PA 15213 USA http://www.cs.cmu.edu/~ref/ From teruko+@cs.cmu.edu Wed Feb 20 21:49:19 2002 From: teruko+@cs.cmu.edu (Teruko Mitamura) Date: Wed, 20 Feb 2002 16:49:19 -0500 Subject: [MT-List] TMI 2002 - Call for Participation Message-ID: <2426.1014241759@kyoto.lti.cs.cmu.edu> --------------------------------- TMI 2002 - Call for Participation --------------------------------- The 9th Conference on Theoretical and Methodological Issues in Machine Translation March 13 - 17, 2002 Keihanna, Japan http://www.kecl.ntt.co.jp/events/tmi/ The ninth meeting of the TMI conference will be held March 13-17, 2002 near the historic cities of Nara and Kyoto in Japan. The workshops and tutorials will be held jointly with the Natural Language Processing Society, Japan. On-line registration is now available. 
Please visit TMI 2002 registration page: http://www.kecl.ntt.co.jp/events/tmi/registration.html Locations and Times: -------------------- TMI-2002 Sessions (March 13-15 (Wed-Fri), 2002) NTT Communication Science Laboratories, NTT Keihanna building 2-4, Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan, 619-0237 Workshops/Tutorials (March 16-17 (Sat-Sun), 2002) Workshop: MT Roadmap http://www.elsnet.org/roadmap-tmi2002.html Tutorials: http://www.kecl.ntt.co.jp/events/tmi/tutorials.html Keihanna Plaza, 1-7, Hikaridai, Seika-cho, Soraku-gun, Kyoto, Japan, 619-0237 --------------------------------------------------------------------------- TMI-2002 Schedule --------------------------------------------------------------------------- March 13 Wednesday 10:00 Registration 10:45 Opening Remarks 11:00 Invited Speaker: Mutsumi Imai, Keio University "Universal and language-specific constraints children use in building up the mental lexicon" 12:00 Two Experiments in Situated MT Jim Cowie, Sergei Nirenburg 12:30 Lunch 14:00 Stone Soup Translation Paul Davis, Chris Brew 14:30 Sentence Generation for Pattern-Based MT Koichi Takeda 15:00 Rapid Adaptive Development of Semantic Analysis Grammars Alicia Tribble, Alon Lavie, Lori Levin 15:30 Break 16:00 Pronominal Anaphora Resolution in the KANTOO Multilingual MT System Teruko Mitamura, Eric Nyberg, Enrique Torrejon, David Svoboda, Annelen Brunner, Kathryn Baker 16:30 An Experimental Multilingual Bi-directional Speech Translation System Tomohiro Konuma, Kenji Matsui, Yumi Wakita, Kenji Mizutani, Mitsuru Endo, Masashi Murata ------------------------------------------------------------------------ March 14 Thursday 9:30 Sign Language Synthesis using HPSG Ian Marshall, Eva Safar 10:00 Alternation-Based Lexicon Reconstruction Timothy Baldwin, Francis Bond 10:30 Corpus-Driven Splitting of Compound Words Ralf Brown 11:00 Break 11:30 Incremental Construction and Maintenance of Morphological Analyzers Based on Augmented Letter Transducers Alicia 
Garrido-Alenda, Mikel Forcada, Rafael Carrasco 12:00 A Method of Adding New Entries to a Valency Dictionary by Exploiting Existing Lexical Resources Sanae Fujita, Francis Bond 12:30 Lunch 14:00 Panel Discussion: New Applications for MT Moderator: Eric Nyberg 15:00 Extracting Semantic Classes and Morphosyntactic Features for English-Polish MT Barbara Gawronska, Bjorn Erlendsson, Hanna Duczak 15:30 Break 16:00 An Iterative Algorithm for Translation Acquisition of Adpositions Hiroshi Kanayama 16:30 Application of Translation Knowledge Acquired by Hierarchical Phrase Alignment Kenji Imamura ---------------------------------------------------------------------- March 15 Friday 9:30 Grammar for Ellipsis Resolution in Japanese Shigeko Nariyama 10:00 Machine Translation without a Bilingual Dictionary Jesse Pinkham, Martine Smets 10:30 Break 11:00 Correction of Errors in a Modality Corpus used for MT by using Machine Learning Method Masaki Murata, Masao Utiyama, Kiyotaka Uchimoto, Qing Ma, Hitoshi Isahara 11:30 Challenges in Automated Elicitation of a Controlled Bilingual Corpus Katharina Probst, Lori Levin 12:00 Lunch 13:30 Statistical MT Based on Hierarchical Phrase Alignment Taro Watanabe, Kenji Imamura, Eiichiro Sumita 14:00 Corpus-Assisted Expansion of Manual MT Knowledge Setsuo Yamada, Kenji Imamura, Kazuhide Yamamoto 15:00 Tour of ATR, CRL and NTT Laboratories (proposed) --------------------------------------------------------------------------- March 16 Saturday Workshop: Machine Translation Roadmap --------------------------------------------------------------------------- March 17 Sunday Tutorials: 10:00 Example-based Machine Translation Eiichiro Sumita (ATR) 12:00 Lunch 13:00 Statistical Machine Translation Kevin Knight (ISI/USC) 15:00 Break 15:15 Translation Memories Timothy Baldwin (CSLI, Stanford University) ---------------------------------------------------------------------------- TMI 2002 Officers: ------------------ Program Committee Chairs: Teruko 
Mitamura and Eric Nyberg, Carnegie Mellon University, USA Local Arrangements: Francis Bond and Hiromi Nakaiwa, NTT Communication Science Laboratories, Kyoto, Japan General Chair: Sergei Nirenburg, Computing Research Lab, NMSU, USA Program Committee: ------------------ Teruko Mitamura & Eric Nyberg (co-chairs) Carnegie Mellon Timothy Baldwin CSLI Christian Boitet Université Joseph Fourier Andrew Bredenkamp University of Essex Lynn Carlson U.S. Department of Defense Satoru Ikehara Tottori University Hitoshi Isahara CRL Japan Kevin Knight USC-ISI Satoshi Sato Kyoto University Harold Somers UMIST Koichi Takeda TRL-IBM Hideki Tanaka ATR From sarah@sarahnichols.com Fri Feb 22 17:45:38 2002 From: sarah@sarahnichols.com (Sarah Nichols) Date: Fri, 22 Feb 2002 17:45:38 -0000 Subject: [MT-List] Natural Language Engineering Message-ID: Natural Language Engineering Volume 7 - Issue 04 - December 2001 is now online. Visit http://journals.cambridge.org/journal_naturallanguageengineering for abstracts, table of contents and to browse an electronic sample copy. ISSN 1351-3249 Published four times a year by Cambridge University Press **Coming up in Volume 8, two special issues: Issue 2: June 2002: Robust Methods in Analysis of Natural Language Data Issue 3: Sept 2002: Word Sense Disambiguation Systems Sarah Nichols From DILM21@msn.com Sun Feb 24 07:32:36 2002 From: DILM21@msn.com (Arma Gedon2002) Date: Sat, 23 Feb 2002 23:32:36 -0800 Subject: [MT-List] co-operation on machine translation Message-ID: Hi, My name is Mohammad Hosin Mehraby Aghdam, I live in Iran (West Azarbaijan). I am a computer engineer with 15 years' experience.
Between 1987 and 2002 I designed 6 application packages (MT and other). I am now designing (3 versions designed so far) a large software package for translation from English to Turkish, Farsi and Arabic, and I am ready to extend it to other fields. My skills: 1- Fully proficient in morphology. 2- Fully proficient in semantics. 3- Fully proficient in analysis. 4- Modern mathematical methods for machine translation data structures. 5- New algorithms for parsing, pragmatics and other fields in translation. 6- Fully proficient in Delphi, very strong in C++, and professional in data structures & IT systems. Sincerely. My e-mail addresses: DILM21@msn.com & hosinagdam2002@msn.com & hosinagdam@msn.com

From Vladimir Rykov Tue Mar 5 13:17:09 2002 From: Vladimir Rykov (Vladimir Rykov) Date: Tue, 5 Mar 2002 16:17:09 +0300 Subject: [MT-List] Free Lance In-Reply-To: <3C625D02.3D18B8A3@sbu.ac.uk> References: <3C625D02.3D18B8A3@sbu.ac.uk> Message-ID: <1221924767.20020305161709@mail.ru> Russian Computational Linguist (PhD) could join a CL, MT, TM, IT, AI project as a freelance contractor (without dragging his body across the Universe) -- Best regards, Vladimir Rykov mailto:rykov2000@mail.ru PhD in Computational Linguistics, MIPT, MOSCOW http://rykov.narod.ru/ Engl. http://www.blkbox.com/~gigawatt/rykov.html Tel +7-903-749-19-99 From lucy_jing@yahoo.com Wed Mar 6 00:32:48 2002 From: lucy_jing@yahoo.com (lucy jing) Date: Tue, 5 Mar 2002 16:32:48 -0800 (PST) Subject: [MT-List] Re: Contents of MT-List digest Message-ID: <20020306003248.11148.qmail@web9608.mail.yahoo.com> From Info@globalization.com Thu Mar 14 10:32:40 2002 From: Info@globalization.com (Info (Globalization)) Date: Thu, 14 Mar 2002 10:32:40 -0000 Subject: [MT-List] Freelance opportunity for Dutch Computational Linguists Message-ID: <61ACB752C91DD311978300105A36DFFE010C996B@NT-MAIL3> To the members of the MT-List Mailing List Hello, An opening for freelance Dutch Computational Linguists has been posted today on our site, http://www.globalization.com. Please log onto the site for full details and an application form.
Best regards, from The Globalization Team From Andrei.Popescu-Belis@issco.unige.ch Fri Mar 15 10:23:07 2002 From: Andrei.Popescu-Belis@issco.unige.ch (Andrei Popescu-Belis) Date: Fri, 15 Mar 2002 11:23:07 +0100 Subject: [MT-List] PDF version of the Van Slype MTEval report Message-ID: <3C91CB8B.A2C5FE16@issco.unige.ch> Dear members of the MT-List, We would like to inform you that a digitized version of the Van Slype 1979 report on MT evaluation is now available at: http://www.issco.unige.ch/projects/isle/ewg.html (see the Working Group documents) This report contains valuable analyses of criteria for MT evaluation, and was produced for the European Community by synthesizing the contributions of a wide set of experts. The full reference is: Georges Van Slype (1979) - Critical Study of Methods for Evaluating the Quality of Machine Translation. Final Report, Bureau Marcel van Dijk / European Commission, Brussels. In the Evaluation Work Group of the ISLE project, we believe that this work is insufficiently known and used. Maghi King thus took the initiative to have the document digitized at ISSCO/TIM/ETI, University of Geneva, and then published on the ISLE/EWG website. Hoping that it may prove useful to the MT community, Andrei Popescu-Belis -- ISSCO/TIM/ETI, Université de Genève tél: (41 22) 705 86 81 40, bd du Pont d'Arve fax: (41 22) 705 86 89 1211 Genève 4 - Suisse http://www.issco.unige.ch/staff/andrei From White_John@prc.com Fri Mar 15 19:05:01 2002 From: White_John@prc.com (White John) Date: Fri, 15 Mar 2002 14:05:01 -0500 Subject: [MT-List] PDF version of the Van Slype MTEval report Message-ID: <3EC8E8BD927A5A4A86AFCC88E5CCED6C3B5ED9@MCL6.DCMETRO.ADROOT.PRC.COM> Andrei, This is great news. Van Slype is a rich resource for evaluation methods and rationales. This availability will allow a new generation of MT researchers to take advantage of it.
John White -----Original Message----- From: Andrei Popescu-Belis [mailto:Andrei.Popescu-Belis@issco.unige.ch] Sent: Friday, March 15, 2002 5:23 AM To: mt-list@eamt.org Subject: [MT-List] PDF version of the Van Slype MTEval report Dear members of the MT-List, We would like to inform you that a digitized version of the Van Slype 1979 report on MT evaluation is now available at: http://www.issco.unige.ch/projects/isle/ewg.html (see the Working Group documents) This report contains valuable analyses of criteria for MT evaluation, and was produced for the European Community by synthesizing the contributions of a wide set of experts. The full reference is: Georges Van Slype (1979) - Critical Study of Methods for Evaluating the Quality of Machine Translation. Final Report, Bureau Marcel van Dijk / European Commission, Brussels. In the Evaluation Work Group of the ISLE project, we believe that this work is insufficiently known and used. Maghi King thus took the initiative to have the document digitized at ISSCO/TIM/ETI, University of Geneva, and then published on the ISLE/EWG website.
Hoping that it may prove useful to the MT community, Andrei Popescu-Belis -- ISSCO/TIM/ETI, Université de Genève tél: (41 22) 705 86 81 40, bd du Pont d'Arve fax: (41 22) 705 86 89 1211 Genève 4 - Suisse http://www.issco.unige.ch/staff/andrei -- For MT-List info, see http://www.eamt.org/mt-list.html From Andrei.Popescu-Belis@issco.unige.ch Thu Mar 14 11:19:20 2002 From: Andrei.Popescu-Belis@issco.unige.ch (Andrei Popescu-Belis) Date: Thu, 14 Mar 2002 12:19:20 +0100 Subject: [MT-List] PDF version of the Van Slype MTEval report Message-ID: <3C908738.676866A2@issco.unige.ch> Dear members of the MT-List, We would like to inform you that a digitized version of the Van Slype 1979 report on MT evaluation is now available at: http://www.issco.unige.ch/projects/isle/ewg.html (see the Working Group documents) This report contains valuable analyses of criteria for MT evaluation, and was produced for the European Community by synthesizing the contributions of a wide set of experts. The full reference is: Georges Van Slype (1979) - Critical Study of Methods for Evaluating the Quality of Machine Translation. Final Report, Bureau Marcel van Dijk / European Commission, Brussels. In the Evaluation Work Group of the ISLE project, we believe that this work is insufficiently known and used. Maghi King thus took the initiative to have the document digitized at ISSCO/TIM/ETI, University of Geneva, and then published on the ISLE/EWG website.
Hoping that it may prove useful to the MT community, Andrei Popescu-Belis -- ISSCO/TIM/ETI, Université de Genève tél: (41 22) 705 86 81 40, bd du Pont d'Arve fax: (41 22) 705 86 89 1211 Genève 4 - Suisse http://www.issco.unige.ch/staff/andrei From Lori Levin Tue Mar 19 04:17:45 2002 From: Lori Levin (Lori Levin) Date: Mon, 18 Mar 2002 23:17:45 -0500 Subject: [MT-List] CFP for ESSLLI Speech Translation Workshop Message-ID: <27557.1016511465@alexis.boltz.cs.cmu.edu> ESSLLI-2002 Workshop on Recent Advances in Speech Translation Systems August 12-16, 2002 Trento, IT A workshop held as part of the 14th European Summer School in Logic, Language and Information ESSLLI-2002 Trento, Italy August 5-16, 2002 ** REVISED CALL FOR PAPERS ** ** NOTE EXTENDED SUBMISSION DEADLINE: APRIL 15, 2002 ** ORGANIZERS: Alon Lavie and Lori Levin (Carnegie Mellon University) Fabio Pianesi (ITC-irst) DESCRIPTION: Speech Translation research has made significant strides over the last decade, with several large scale research efforts (C-STAR, Verbmobil, SLT, NESPOLE! and others) significantly advancing the state-of-the-art. A wide variety of different approaches to MT has been pursued in the various research efforts, and in some cases these have been combined in multi-engine approaches. Nevertheless, current speech translation technology is still far away from broad commercial application, with a basic tradeoff between quality of translation and domain coverage. Some recent research has focused on issues of robustness and domain-portability and on enhancing the communication abilities using multi-modal interaction. The purpose of this workshop will be to present the current state-of-the-art of speech translation research and explore the current promising trends and developments. This workshop is intended to complement the ACL-02 workshop on Speech-to-Speech Translation, which will be held in conjunction with ACL-02 in Philadelphia about one month earlier. 
Our intention is to have deeper and more focussed presentations and discussions on several identified key topics and issues in current speech translation research. In support of this goal, the workshop program will be organized around daily theme topics, taking advantage of the ESSLLI workshop format of five daily sessions of 90 minutes each. Each daily theme will consist of one or two long (30 minute) presentations of research and/or position papers which explore the theme, followed by extensive time for discussion of the main issues related to the theme. Groups and researchers are encouraged to submit distinct papers to both workshops. Please note that the submission deadline to this workshop has been extended to April 15th, in order not to conflict with the deadline of the ACL-02 workshop. Some possible theme topics include: - Architecture and design considerations for ST systems - New approaches to ST systems and their components - Domain and language portability issues for ST systems - Improving communication robustness - Robust Speech Recognition for ST applications (i.e., dealing with noise, bandwidth and platform issues) - Integration of ST with alternative modalities for cross-lingual communication - Evaluation of Speech Translation - specific problems and approaches. - Moving from prototypes to real-world systems and applications (i.e., issues related to translation quality, user interfaces, speech translation on small devices, etc.) SUBMISSION: We invite both research and position paper submissions from all researchers in the area of speech translation and related topics. Submissions will be electronic, in PostScript, PDF or MS Word format. Submissions should not exceed 10 (A4 or letter) pages, typeset in 10-12 point, with at least 2.5 cm / 1 inch margins. All submissions will be reviewed by an international program committee. The accepted papers will be made available in a summer school reader.
A joint volume of expanded versions of a selection of papers from both the ACL-02 and ESSLLI-02 workshops is being planned. A joint editorial board will be established after the workshops to select candidate papers from those presented at the two workshops and to consider possible publishing venues. Submissions should be sent by Monday, April 15, 2002 to the following email address: alavie@cs.cmu.edu IMPORTANT DATES: Apr 15, 2002: Deadline for submissions May 03, 2002: Notification of acceptance May 31, 2002: Final version due Aug 12, 2002: Start of workshop PROGRAM COMMITTEE: Alon Lavie (Carnegie Mellon) Lori Levin (Carnegie Mellon) Fabio Pianesi (ITC-irst) Tanja Schultz (Carnegie Mellon) Steven Krauwer (OTS) Yuqing Gao (IBM) Satoshi Nakamura (ATR) Herve Blanchon (Universite Joseph Fourier) Marcello Federico (ITC-irst) FURTHER INFORMATION: To obtain further information about ESSLLI-2002 please visit http://www.esslli2002.it/ This workshop is held as part of the ESSLLI-2002 summer school. Therefore all workshop participants are required to register for ESSLLI-2002. Registration information will be announced in due time by the local organizers on the ESSLLI-2002 website. From Andrei.Popescu-Belis@issco.unige.ch Wed Mar 20 12:47:19 2002 From: Andrei.Popescu-Belis@issco.unige.ch (Andrei Popescu-Belis) Date: Wed, 20 Mar 2002 13:47:19 +0100 Subject: [MT-List] Workshop on MT Evaluation at LREC 2002 Message-ID: <3C9884D7.D0CCFA5E@issco.unige.ch> Dear members of the mt-list, Please find below the call for participation for the MT Evaluation workshop organized on May 27th, 2002 at the LREC 2002 Conference, Canary Islands. Please accept our apologies if you receive multiple copies of this announcement. 
Thank you, Andrei Popescu-Belis ISSCO/TIM/ETI, Université de Genève --------------------------------------------------------------- Machine Translation Evaluation: Human Evaluators Meet Automated Metrics 27 May 2002 A hands-on evaluation workshop at LREC 2002 (27 May - 2 June 2002) Las Palmas, Canary Islands Second call for interest and participation --------------------------------------------------------------- Important dates LREC 2002 advance registration deadline: March 29th, 2002 Please check the Conference's webpage at: http://www.lrec-conf.org/lrec2002/ Distribution of pre-workshop material: April 2002 Workshop: May 27th, 2002 09:00 to 13:00 morning session 14:30 to 18:30 afternoon session --------------------------------------------------------------- Preliminary Schedule Morning introduction and welcome background on workshop theme integration of evaluation exercises (start) Afternoon integration of evaluation exercises (continue) reports cross-evaluation analysis final wrap-up ------------------------------------------------------------- Background The Evaluation Working Group of the ISLE project has organised a series of workshops on MT evaluation. Each of these workshops has contained a practical component, where participants have been asked to carry out exercises involving MT evaluation. These workshops proved to be very illuminating, and have stimulated ongoing work in the area, much of which was reported at the latest workshop in the series, held at the MT Summit meeting in September 2001. Results from previous workshops can be consulted at http://www.issco.unige.ch/projects/isle/ewg.html, and the proceedings from the MT Summit in Santiago de Compostela can be requested from the organisers. The workshop at LREC 2002 will continue the series, and will consist primarily of hands-on exercises defined to investigate empirically a small number of metrics proposed for evaluation of MT systems and the potential relationships between them.
In an effort to develop a more systematic MT evaluation methodology, recent work in the EAGLES and ISLE projects, funded by the EU and NSF, has created a framework of characteristics in terms of which MT evaluations and systems, past and future, can be described and classified. The resulting taxonomy can be consulted at: http://issco-www.unige.ch/projects/isle/taxonomy2/. Previous workshops have led to critical analysis of measures drawn from the literature, and to the creation of new measures. Of the latter, several are aimed at eventual automation of the evaluation task and/or at finding relatively simple and inexpensive measures which correlate well with more complex measures that are hard to automate or expensive to implement. Given this background, the time has come to concentrate on systematizing the actual evaluation measures themselves. For any particular measure, one would like to know how accurate it is, how expensive and/or difficult to apply, how independent of other measures, etc. Very little of this type of information is available to date. This workshop will focus on these issues. The organizers will provide the participants in advance with the materials required to: - perform a small evaluation, using one or two measures - perform a cross-measure analysis of the resulting scores - create a general characterization of the measure's performance. The participants will then apply these measures to the data made available, and bring their results to the workshop in order to integrate them with other participants' results. The overall intention of the workshop is to discover, empirically, what kinds of characteristics are easily determinable, and how accurate they actually are. Only through a process of assessing the evaluations can we eventually arrive at a small but accurate set of measures that adequately cover the set of phenomena MT system evaluators, system developers, and potential MT users care about. 
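As an illustration of what a cross-measure analysis could look like in its simplest form, the sketch below computes the Pearson correlation between the scores that two evaluation measures assign to the same set of translations. This is only one possible way to compare measures and is not taken from the workshop materials; the function name and the score lists are hypothetical, invented purely for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical scores for five translated segments (made-up numbers):
    # a human adequacy judgement on a 1-5 scale and an automated score in [0, 1].
    human_adequacy = [4.0, 2.5, 3.0, 4.5, 1.5]
    automated_score = [0.62, 0.35, 0.41, 0.70, 0.20]
    print(round(pearson(human_adequacy, automated_score), 3))
```

A coefficient near 1 would suggest that the cheaper measure tracks the more expensive one on this sample; a value near 0 would suggest the two measures capture different things.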
It is our hope that participants will feel inspired to continue this process, so that the combined results can be assembled later, integrated into the framework, and become a valuable resource to anyone interested in MT evaluation. ------------------------------------------------------------- Organizing Committee Marianne Dabbadie EVALING, Paris, France Tony Hartley Centre for Translation Studies, University of Leeds, UK Eduard Hovy USC Information Sciences Institute, Marina del Rey, USA Margaret King ISSCO/TIM/ETI, University of Geneva, Switzerland Bente Maegaard Center for Sprogteknologi, Copenhagen, Denmark Sandra Manzi ISSCO/TIM/ETI, University of Geneva, Switzerland Keith J. Miller The MITRE Corporation, USA Widad Mustafa El Hadi Université Lille III - Charles de Gaulle, France Andrei Popescu-Belis ISSCO/TIM/ETI, University of Geneva, Switzerland Florence Reeder The MITRE Corporation, USA Michelle Vanni U.S. Department of Defense, USA ------------------------------------------------------------- Intention to participate: Participants wishing to receive preparatory data should send the following information to the contact person below: - name, address, email contact; - experience in MT evaluation; - languages known and level of comprehension (elementary, fair, good, near-native, native); Contact: Andrei Popescu-Belis Email: andrei.popescu-belis@issco.unige.ch Fax: (41 22) 705 86 89 Regular mail: ISSCO/TIM/ETI, University of Geneva 40, bd du Pont d'Arve CH-1211 Geneva 4 - SWITZERLAND Cost of the Workshop: LREC 2002 participants: 90 EURO Other participants: 140 EURO Registration forms are available on the LREC 2002 conference site: http://www.lrec-conf.org/lrec2002/ Main conference and workshop site: Palacio de Congresos, Las Palmas, Canary Islands --------------------------------------------------------------- From WJHutchins@compuserve.com Thu Mar 21 10:12:25 2002 From: WJHutchins@compuserve.com (John Hutchins) Date: Thu, 21 Mar 2002 05:12:25 -0500 Subject:
[MT-List] back issues of MTNI Message-ID: <200203210512_MC3-F6A0-225E@compuserve.com> I have in my possession a number of copies of back issues of MT News International (from 1992 to 2001). Please contact me if you would like to fill your gaps. With regards, John Hutchins (WJHutchins@compuserve.com) From Info@globalization.com Thu Mar 21 12:27:15 2002 From: Info@globalization.com (Info (Globalization)) Date: Thu, 21 Mar 2002 12:27:15 -0000 Subject: [MT-List] Survey on the usage of machine translation Message-ID: <61ACB752C91DD311978300105A36DFFE010C9973@NT-MAIL3> To the members of the MT-List We are currently hosting a survey on the usage of machine translation, accessible from our homepage, http://www.globalization.com. Indeed, the recent growth in the use of machine translation (MT) and the integration of MT technology into new tools and production workflows has created a new evolving market which is still difficult to define. It is our aim to use the data gathered through this survey to produce a rough mapping of MT usage and users today, and at the same time confirm the emergence of new trends in the application of MT. To this purpose, we have prepared a multiple-choice questionnaire which we would like to invite you to take 10 minutes to answer. We thank you in advance for your time with this. Best regards, from The Globalization Team PS: Our apologies if you are already a member of our site, as you will already have received this invitation.
From E B Sat Mar 23 12:42:30 2002 From: E B (E B) Date: 23 Mar 2002 13:42:30 +0100 Subject: [MT-List] Software announcement Message-ID: <952252096erich@itla.ch> To all professional translators on the MT-List: You are kindly invited to visit www.itla.ch/t4t_a.html to view a demo of a new glossary-driven pretranslation program. Thank you for your interest. Erich -- Erich Brandenberger ITLA P.O. Box 43 8702 Zollikon, Switzerland T +41 1 396 2010 F +41 1 391 3522 From postediting@hotmail.com Wed Mar 27 16:31:21 2002 From: postediting@hotmail.com (Jeff Allen) Date: Wed, 27 Mar 2002 16:31:21 Subject: [MT-List] Book review published on MT postediting book Message-ID: Dear MT-listers, A new book on MT postediting has been reviewed. The book: KRINGS Hans, edited by Geoffrey KOBY. 2001. Repairing Texts: Empirical Investigations of Machine Translation Post-Editing Processes. Translation Studies series. (Translated from German to English by Geoffrey Koby, Gregory Shreve, Katja Mischerikow and Sarah Litzer) Ohio: Kent State University Press. The book review: An electronic version of the book review is available in two parts at: http://www.translation.zone in the "Articles" section The electronic version is published by Multilingual Press. (note: make sure to copy the entire URL address to your web browser if the hyperlinks below are truncated) Part 1 http://www.translationzone.com/Scripts/WebObjects.dll/TZ.woa/2/wo/gRZe7ZhsIHTP29X2oSr7Vs1ey64/0.0.18.9.0.0.0.0.6.2.0 Part 2 http://www.translationzone.com/Scripts/WebObjects.dll/TZ.woa/2/wo/gRZe7ZhsIHTP29X2oSr7Vs1ey64/0.0.18.9.0.0.0.0.6.1.0 A hardcopy version of the book review is published in the recent issue of MultiLingual Computing & Technology magazine, #46 Volume 13 Issue 2.
www.multilingual.com Book order information: A short abstract of the book and order information is available at: http://bookmasters.com/ksu-press/ksu071.htm Toll free 1-800-247-6553 in the US and Canada Price: Cloth/US$55.00 Shipping: US$4.50 Best, Jeff Allen postediting@hotmail.com or jeff.allen@free.fr _________________________________________________________________ Send and receive Hotmail on your mobile device: http://mobile.msn.com From dewsbery@berlin.snafu.de Mon Mar 25 11:18:03 2002 From: dewsbery@berlin.snafu.de (Victor Dewsbery) Date: Mon, 25 Mar 2002 12:18:03 +0100 Subject: FW: [MT-List] Software announcement Message-ID: Hi Erich, Thanks for the URL. A couple of questions: 1. How does the program handle multiple entries in the glossary list (e.g. Kurs = price, course, policy)? 2. Does the program have any "fuzzy matching" capability? For example, if the glossary contains "geheim = secret", how does it handle the inflected forms "geheime, geheimes, geheimen, geheimer, geheimem"? 3. How does the program differ from the existing translation memory programs? I work with DejaVu, and as far as I can see, it can do more than your demo. Specifically, it handles not only glossary entries, but also recurring sentences and "fuzzy matching" (on the terminology and the sentence level), and in addition it can handle a wide range of file formats. There are several other translation memory programs, too; they differ in detail and refinement, and prices range from the "freebie" to the mega-investment level, but basically they all work on the same general principle. FWIW, - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Victor Dewsbery, B.A., BDÜ, MIL, Berlin www.dewsbery.de - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > To all professional translators on the MT-List: > You are kindly invited to visit www.itla.ch/t4t_a.html > to view a demo of a new glossary-driven > pretranslation program. Thank you for your interest.
> Erich > -- > Erich Brandenberger > ITLA > P.O. Box 43 > 8702 Zollikon, Switzerland From steveri@microsoft.com Tue Mar 26 08:11:10 2002 From: steveri@microsoft.com (Steve Richardson) Date: Tue, 26 Mar 2002 00:11:10 -0800 Subject: [MT-List] AMTA-2002 ***Updated submission guidelines*** - Call for Papers Message-ID: <0FDD2891FCDF6E42891AEDE2198E5F7C04D8D8CE@red-msg-04.redmond.corp.microsoft.com> --- CALL FOR PAPERS --- The Association for Machine Translation in the Americas *** SUBMISSION GUIDELINES HAVE CHANGED *** *** SEE UPDATED GUIDELINES BELOW *** AMTA-2002 Conference Location: Tiburon, California Dates: October 8-12, 2002 The Association for Machine Translation in the Americas (AMTA) is pleased to announce its fifth biennial conference, planned for October 8-12, 2002, in Tiburon (near San Francisco), California. CONFERENCE THEME: From Research to Real Users Ever since the showdown between Empiricists and Rationalists a decade ago at TMI-92, MT researchers have hotly pursued promising paradigms for MT, including data-driven approaches (e.g., statistical, example-based) and hybrids that integrate these with more traditional rule-based components. During the same period, commercial MT systems with standard transfer architectures have evolved along a parallel and almost unrelated track, increasing their coverage (primarily through manual update of their lexicons, we assume) and achieving much broader acceptance and usage, principally through the medium of the Internet.
Web page translators have become commonplace; a number of online translation services have appeared, including in their offerings both raw and post-edited MT; and large corporations have been turning increasingly to MT to address the exigencies of global communication. Still, the output of the transfer-based systems employed in this expansion represents but a small drop in the ever-growing translation marketplace bucket. Now, 10 years later, we wonder if this mounting variety of MT users is any better off, and if the promise of the research technologies is being realized to any measurable degree. In this regard, we pose the following questions: Why aren't any current commercially available MT systems primarily data-driven? Do any commercially available systems integrate (or plan to integrate) data-driven components? Do data-driven systems have significant performance or quality issues? Can such systems really provide better quality to users, or is their main advantage one of fast, facilitated customization? If any new MT technology could provide such benefits (somewhat higher quality, or facilitated customization), would that be the key to more widespread use of MT, or are there yet other more relevant unresolved issues, such as system integration? If better quality, customization, or system integration aren't the answer, then what is it that users really need from MT in order for it to be more useful to them? We solicit participation on these and other topics related to the research, development, and use of MT in the form of original papers, demonstrations, workshops, tutorials, and panels. We invite all who are interested in MT to participate, including developers, researchers, end users, professional translators, managers, and marketing experts.
We especially invite users to share their experiences, developers to describe their novel systems, managers and marketers to talk about what is happening in the marketplace, researchers to detail new capabilities or methods, and visionaries to describe the future as they see it. We also welcome and encourage participation by members of AMTA's sister organizations, AAMT in Asia and EAMT in Europe. INVITED SPEAKERS We are pleased to announce that invited speakers for the conference will include Yorick Wilks and Ken Church, both notable participants at TMI-92, and Jaap van der Meer, former president of ALPNET. We anticipate that the speakers will provide a sharp and stimulating focus on the theme of the conference. Further details regarding the conference, including a call for Tutorial and Workshop proposals, may be found on the AMTA Web site at: http://www.amtaweb.org/AMTA2002/ CONFERENCE ORGANIZERS Elliott Macklovitch, General Chair Stephen D. Richardson, Program Chair Violetta Cavalli-Sforza, Local Arrangements Chair Bob Frederking, Workshops and Tutorials Laurie Gerber, Exhibits Coordinator *** PLEASE NOTE THAT THE SUBMISSION GUIDELINES HAVE CHANGED *** *** UPDATED GUIDELINES ARE PROVIDED BELOW *** PAPER AND SYSTEM DESCRIPTION/DEMONSTRATION SUBMISSIONS GUIDELINES We are pleased to announce that the AMTA-2002 conference proceedings will be published in the Lecture Notes in Artificial Intelligence series by Springer-Verlag.
(LNCS/LNAI series home page is located at: http://www.springer.de/comp/lncs/index.html) It is therefore recommended that initial submissions to AMTA-2002 adhere as closely as possible to the formatting guidelines for authors located at: http://www.springer.de/comp/lncs/authors.html These guidelines will need to be strictly adhered to for the final versions of submissions that are accepted for publication in the proceedings. All submissions should be in English, and it is recommended that they be prepared using LaTeX2e or Microsoft Word, per instructions at the authors' web site given above (see site for details on using other text processing systems). Once prepared, they should be submitted electronically for review in one of the following three formats: PDF (recommended) PostScript Microsoft Word All submissions will be received and processed using the Conference Management Toolkit (CMT), located at: http://cmt.research.microsoft.com/AMTA2002. Authors should follow the instructions at the CMT web site to register, enter information about themselves and their paper, and upload a copy of their paper in one of the acceptable formats by the submission deadline. Any questions regarding submissions or the use of this web site should be directed in email to: AMTA2002@microsoft.com. Important SUBMISSION DEADLINES are as follows: Submissions uploaded at CMT web site: April 15, 2002 (Monday) Notification of acceptance: May 31, 2002 (Friday) Final versions of papers due: July 15, 2002 (Monday) At the CMT web site, authors will be asked to designate their submissions for one of the three conference tracks listed below. Again, initial submissions are expected to adhere as closely as possible to the guidelines found at http://www.springer.de/comp/lncs/authors.html. Information regarding submission length and additional requirements is also provided below. Conference tracks: 1.
Theoretical papers: Unpublished papers describing original work on all aspects of Machine Translation. Preference will be given to papers that include concrete results and that address the theme of moving MT research technology (including, but not limited to, data-driven systems or components) into real use. Papers may not be longer than 10 pages. 2. User studies: Studies of users' experiences with implementing MT or testing its applicability to some task. Of particular interest are experiences deploying new or advanced MT technology in a production context. Users, managers, and sales/marketing professionals are especially welcome to submit. Studies may not be longer than 8 pages. 3. System descriptions with optional system demonstrations: Approx. 25 minutes will be allocated per system description/demo. Descriptions may not be longer than 4 pages. The goal of system descriptions is to educate participants about the features and functionality of current and emerging MT systems. Sales presentations are not appropriate. The following additional information should be provided in each system description; - name and contact information of system builder - system category (research, pre-market prototype, or commercially available) - system characteristics (e.g., languages, domains, integration/networking features) If a system demonstration is included, please provide the following information: - hardware platform and operating system - name and contact information of system operations specialist
From melamed@cs.nyu.edu Sat Mar 30 01:32:16 2002 From: melamed@cs.nyu.edu (Dan Melamed) Date: Fri, 29 Mar 2002 20:32:16 -0500 (EST) Subject: [MT-List] job @ NYU: statistical MT, etc. Message-ID: <200203300132.g2U1WGn13885@dept.cs.nyu.edu> RESEARCH SCIENTIST / ENGINEER (PARALLEL TEXT PROCESSING) The Proteus Project at New York University is commencing an exciting new research program whose purpose is to dramatically improve the accuracy and availability of machine translation and related applications. We seek an ambitious research scientist/engineer to join our rapidly growing team. The primary responsibility of the position will be to help us fulfill our contractual R&D obligations. Ideally, our new colleague should also have the energy and creativity to improve on the state of the art. At a minimum, applicants should have excellent software engineering skills. We would prefer that they also have an empirically-oriented PhD in CS, Math, Physics, or a similar field. The position will be open until filled, and is expected to start in June or soon thereafter. To apply, please send the following items to the undersigned: * your CV * your salary expectations * names, phone #'s, and email addresses for at least 3 references * answers to the questionnaire at http://www.cs.nyu.edu/cs/projects/proteus/opp/staff/questionnaire.txt More information about the Proteus Project is on the web at http://www.cs.nyu.edu/cs/projects/proteus/. New York University is located in Greenwich Village, the intellectual and cultural epicenter of one of the most fun cities in the world. ----------------------------------------------------------------------- Prof. I.
Dan Melamed 719 Broadway Avenue #701 New York, NY, 10003 melamed@cs.nyu.edu From davidsgreenough@hotmail.com Fri Apr 12 13:22:29 2002 From: davidsgreenough@hotmail.com (David Greenough) Date: Fri, 12 Apr 2002 08:22:29 -0400 Subject: [MT-List] VOICEMETHODS, LLC Message-ID: For companies interested in, or actively engaged in: Speaker-independent voice-command and multi-lingual applications Serving global businesses, government and education markets Since forming VoiceMethods LLC, Ectaco, Inc., a $30 million manufacturer of hand-held electronic dictionaries and language learning software, has been busy working through the strategic, commercial and customer implications of rolling out UT-103, the first hand-held device based on the company's award-winning speech recognition/Universal Translator technology. In fact, the first manufacturing run of the device, 10,000 units, is sold out. Additionally, we are prototyping a series of custom, enterprise-level applications. For instance, we are developing hand-held, voice command, multi-lingual translators for U.S. Government military use, custom, multi-purpose phrase translation devices for Police Departments, versions of UT-103 for Windows and (downloadable) iPAQ, as well as exploring significant opportunities in Education, Travel, Social Services and a number of other potentially profitable industries and market segments. VoiceMethods' strategy, from the beginning, is to be the right partner in the New World economy, bringing first-class, globally distributed software and applications development solutions to customers in the US. With no sacrifice in quality or time, VoiceMethods can reduce both IT and R&D budgets through aggressive pricing made possible by our offshore development center.
We have been successful in offering FREE pre-planning analysis of our client's business and technology requirements and applying our ideas to developing voice enabled and multi-lingual solutions using existing and proven technology customized for your specific needs. In most cases, we have been able to deliver "proof of concept" prototype applications quickly and on a nearly risk-free basis. VoiceMethods is organized for today's technology demands. We employ highly skilled scientists, mathematicians, researchers, analysts and software engineers. We operate with a proven, experienced and well disciplined team, using development and implementation methodology without the unnecessary steps or cost. Our Executive team, Professional Services group, and Technical Architects are comprised of American staff based at our office in New York City with extensive experience in language translation and speech recognition software development. Currently, we are seeking strategic partners for development projects in all of the following market spaces: * Language learning and oral proficiency * Technical and Vocational Training * Entertainment * Travel * Enterprise applications (Internet) * Embedded technology (Cellular telephone, wireless applications, PDAs, geopositioning devices, Peer-to-peer network devices, Beaming technologies) * Law Enforcement applications * Public Safety applications * Military applications To schedule a FREE pre-planning analysis, please call today. David S. Greenough VoiceMethods LLC 212-535-8105 telephone 917-579-839 mobile davidsgreenough@hotmail.com _________________________________________________________________ Join the world’s largest e-mail service with MSN Hotmail.
http://www.hotmail.com From steven.krauwer@let.uu.nl Thu Apr 18 14:03:02 2002 From: steven.krauwer@let.uu.nl (Steven Krauwer) Date: Thu, 18 Apr 2002 15:03:02 +0200 (MEST) Subject: [MT-List] Directories of language and speech technology experts Message-ID: <200204181303.g3ID32Y21491@sfinx.let.uu.nl> ELSNET's Directory of Language and Speech Technology Experts and Organisations _________________________________________________________________ Find language and speech technology experts all over the world, -- and be found if you are an expert yourself! ELSNET has just taken over the responsibility for the former joint ELSNET / STN directory of experts in the field of language and speech processing and related areas. This directory is intended to provide direct access to the top experts in these fields. At this moment the directory includes 900 experts from 56 countries all over the world, both from the academic and from the industrial community. Connected to this directory we have a directory of language and speech technology organisations, containing over 2500 private and public companies, service providers, research labs, etc. At this moment you can browse the list by name (in alphabetic order) or by country, and the pages have been submitted to all major search engines. A keyword search facility is provided. Visit the experts directory at http://www.elsnet.org/experts.html and the organisations directory at http://www.elsnet.org/organisations.html and add your own and your organisation's profile if they are not already included, or check them if they are. Please note that your data will be publicly available on the web, but that we will not hand them over to third parties or use them for mass mailings that are not connected to the maintenance of the directory or the network as a whole. ELSNET is the European Network of Excellence in Human Language Technologies, established in 1991, and supported by the European Commission. 
__________________________________________________________________________ Steven Krauwer, ELSNET coordinator, UiL-OTS, Trans 10, 3512 JK Utrecht, NL phone: +31 30 253 6050, fax: +31 30 253 6000, email: s.krauwer@elsnet.org http://www.elsnet.org From steven.krauwer@let.uu.nl Thu Apr 18 15:07:55 2002 From: steven.krauwer@let.uu.nl (Steven Krauwer) Date: Thu, 18 Apr 2002 16:07:55 +0200 (MEST) Subject: [MT-List] [CfP] COLING2002 Roadmap Workshop Message-ID: <200204181407.g3IE7tt24047@sfinx.let.uu.nl> Second announcement and call for papers and other contributions SUBMISSION DEADLINE SUNDAY MAY 5 A Roadmap for Computational Linguistics Saturday, August 31 2002 Workshop in conjunction with COLING 2002 (August 24 - September 1, 2002, Taipei, Taiwan) Organized by ELSNET Context and objective ELSNET is the European Network of Excellence in Human Language Technologies, which was created in 1991, with a view to supporting and facilitating research, development and training in the field of language and speech technologies and related areas. The network is funded by the European Commission, but its scope is not limited to Europe. This workshop should be seen as a step in ELSNET's aim to build a roadmap for language and speech technology. It is one of a number of workshops of this type that have been and will be organised in order to arrive at a broadly supported roadmap for our field, which should help us identify major challenges, set research priorities and define common goals. At this workshop we will * confront the audience with the approach and the results of ELSNET's roadmapping exercise thus far; * invite participants to give their own presentation of what they see as the main longer term challenges and intermediate milestones in our field as well as their strategies to meet these challenges; * organise panel and discussion sessions aimed at reaching a consensus on what the main challenges and priorities are. 
As a special feature we will invite a number of rapporteurs to have a critical look at the papers presented at the various thematic sessions of the main conference, with a view to relating them to the main challenges and milestones: do we spot new challenges, new milestones, new strategies, new directions, regional differences, etc. All reports and summaries of discussions will be integrated in ELSNET's Roadmap, and given wide distribution via ELSNET's communication channels (website, newsletter, discussion forums, etc). Target audience A workshop of this type will be most appealing to people who are interested in developing longer term strategic views, e.g. senior scientists in charge of longer term research policies (both in academia and in industry). Researchers and developers who have specific views on what will happen or what should happen should also feel invited to attend and contribute, as should people who are responsible for the education of future generations of researchers and developers. Papers We invite papers that can contribute to the creation and further improvement of the roadmap for our field, such as (but not limited to): * visions of the future * identification of major challenges emanating from research results or from application needs * identification of major milestones on our way towards our goals and their interdependencies * comparative assessment of technologies and their impact on our future * ways to address the growing demand for training, especially (but not exclusively) across disciplines and aimed at professionals already working in the field * ways to address the increasing needs for novel resources, both for the commercially interesting languages and for other languages Other contributions We also invite proposals for other possible contributions to this workshop, such as (but not limited to): * Proposals for a panel session. 
Proposers should describe the specific topic of a one hour panel session, including names of potential panelists. The proposer of an accepted panel should be prepared to assist the ELSNET team in the organisation of the panel. * Proposals to act as a rapporteur for a specific subtheme of the main conference. Proposers should be willing to give a report of up to 15 minutes on the impact of the papers related to this subtheme presented at the main conference on the future development of our field. Proposals should identify the proposer's preferred subtheme, and should provide evidence that the proposer has sufficient professional experience to do this job (e.g. a CV, pointers to relevant publications, supporting letters from colleagues, etc.). Panel reports and subtheme reports will be given wide distribution after the workshop, but by their nature they cannot be published in the proceedings. In order to compensate panel organisers and rapporteurs for the fact that their efforts -however impactful they may turn out to be- will not result in standard academic publications, ELSNET will reimburse their workshop registration fee. Submission and calendar Abstracts for workshop papers, panel proposals or proposals to act as a rapporteur should not exceed four A4 pages, and should be sent electronically in plain ASCII text, in MS Word or in PDF format to Steven.Krauwer@elsnet.org by Sunday, May 05 2002. 
Deadline for Submissions: Sun 05 May 2002 Notification of Acceptance for papers, panels and rapporteurs: Fri 24 May 2002 Final Versions of Papers Due: Fri 28 June 2002 Workshop: Sat 31 August 2002 Registration and other information Registration details and other information will be published on the main conference website: http://www.coling2002.sinica.edu.tw/ The URL for this workshop is http://www.elsnet.org/roadmap-coling2002.html Workshop PC *composition to be confirmed* Contact Steven Krauwer (Chair), steven.krauwer@let.uu.nl ELSNET / Utrecht University http://www.elsnet.org Trans 10 phone: +31 30 253 6050 3512 JK UTRECHT, NL fax: +31 30 253 6000 From ling98@videotron.ca Mon Apr 22 23:56:51 2002 From: ling98@videotron.ca (Michael Blekhman) Date: Mon, 22 Apr 2002 18:56:51 -0400 Subject: [MT-List] Company presentation Message-ID: <005c01c1ea50$fdfa3100$baa9c818@videotron.ca> Lingvistica '98 Inc. A Canadian Software Developer: West European, Slavic, and Asian Languages Lingvistica '98 Inc. is based in Montreal, Canada, with a partner company in Toronto - VirtualWare Technologies. We are one of the most well known Canadian developers of linguistic software, with affiliations in 2 largest Ukrainian cities - Kyiv (Kiev) and Kharkiv (Kharkov). Our sister company, Lingvistica b.v., is based in Dongen, The Netherlands. Please visit our web sites at: www.ling98.com www.lingvistica.com http://www.allvirtualware.com Our software catalog can be downloaded from: www.ling98.com/Catalog.doc Our web mining lexicographic software is described in our most recent contributions that can be downloaded from: www.ling98.com/Web_robot_Paper.doc www.ling98.com/New_Words.doc Lingvistica '98 Inc. develops linguistic resources: lexical databases, dictionaries, lexicons as well as machine translation engines and operational MT systems. We also have experience in developing speech and text recognition systems as well as multimedia language teaching systems. 
We are presently developing or have developed linguistic products on the order of France Telecom, SYSTRAN, New Mexico State University, Language Engineering Company, and many other organizations throughout the world. Our latest developments are an Azerbaijani spell-checker and a larger German-French dictionary - the latter created jointly with the SCIPER company, our France-based partner. Earlier, we developed a series of proper name dictionaries for the NMSU CRL: English, Arabic, Chinese, Japanese, Persian, Russian, Spanish, and Turkish. Projects presently under way are: Polish<=>English translations software Azerbaijani grammar checker English<=>Russian translator workbench Turkish<=>English translation software German for Russians and Russian for English-speaking Learners tutorial software Our most ambitious project and, at the same time, a far-reaching goal is the Communication through the Computer project, performed presently by 2 teams of researchers, one based in Montreal, Canada, and the other - in Kiev, Ukraine. The project includes speech recognition and synthesis, "intelligent" language teaching and translation software, spelling and grammar checkers, multilingual Internet mining tools, etc. Regards, Dr. Michael S. Blekhman President, Lingvistica '98 Inc. Montreal, Canada. 
From steven.krauwer@let.uu.nl Thu Apr 25 14:18:12 2002 From: steven.krauwer@let.uu.nl (Steven Krauwer) Date: Thu, 25 Apr 2002 15:18:12 +0200 (MEST) Subject: [MT-List] Call for participation: Roadmap Workshop at LREC2002 Message-ID: <200204251318.g3PDICk01447@sfinx.let.uu.nl> PROGRAM and CALL FOR PARTICIPATION Towards a Roadmap for Multimodal Language Resources and Evaluation An ELSNET workshop at LREC 2002 Las Palmas, Canary Islands, Spain Sunday, June 2 2002 (14:30 - 20:00) Aim of the workshop: The aim of the proposed workshop is to bring together key players in the field of resources and evaluation in order to make a first step towards the creation of a broadly supported Roadmap for Language Resources, i.e. a broadly supported view on the longer, medium and shorter term needs and priorities. This activity should be seen in the context of ELSNET's other roadmapping activities (see http://www.elsnet.org/roadmap.html), which aim at developing a technological roadmap for the whole field of Human Language Technologies. The purpose of such roadmaps is to give the R&D community an instrument to identify opportunities for concertation of their activities and better exploitation of possible synergies between players all over the world. Scope of this workshop: As there is no standard model for roadmaps for resources and evaluation available, we will narrow the scope of this roadmapping workshop to a specific sub-area: Multimodal Language Resources and Evaluation. This will make our discussions more focused and concrete, and it will also allow us to exploit the fact that this workshop will take place the day after the workshop dedicated to Multimodal Resources and Evaluation of Multimodal Systems (MREMS) in general. 
Provisional program:

Start  End    Action      Title & Actor(s)
14:30  14:45  Opening     Introduction to this workshop (Steven Krauwer)
14:45  15:15  Talk        Summary of the MREMS Workshop (Mark Maybury)
15:15  15:40  Talk        Challenges and Important Aspects in Planning and Performing Evaluation Studies for Multimodal Dialogue Systems (Susanne Höllerer)
15:40  16:05  Talk        XML and multimodal corpus design: experiences with multi-layered stand-off annotations in the GeM corpus (John Bateman, Judy Delin, Renate Herschel)
16:05  16:30  Talk        Towards a roadmap for Human Language Technologies: Dutch-Flemish experience (Diana Binnenpoorte, Catia Cucchiarini, Elisabeth D'Halleweyn, Janienke Sturm and Folkert de Vriend)
16:30  17:00  Break
17:00  17:30  Talk        Introduction to the plenary exercises (Steven Krauwer)
17:30  18:30  Exercise    Identifying priorities (All)
18:30  19:30  Exercise    Putting them on a timeline (All)
19:30  20:00  Discussion  Where to go from here (All & Steven Krauwer)
20:00         Closing

Recommended reading (preferably before the workshop): * ELSNET's First Roadmap Report (http://utrecht.elsnet.org/roadmap/docs/rm-bernsen-v2.pdf), edited by Ole Bernsen * ELSNET's Second Roadmap Report (http://utrecht.elsnet.org/roadmap/docs/rm-eisele-v2.pdf), edited by Dorothee and Andreas Eisele Registration: The registration fee for the workshop is 90 EURO for conference participants and 140 EURO for others. The fee includes two coffee breaks and the proceedings of the workshop. 
URLs: * Workshop: http://www.elsnet.org/roadmap-lrec2002.html * Conference: http://www.lrec-conf.org Core Programme Committee: * Steven Krauwer (ELSNET / Utrecht University) * Hans Uszkoreit (DFKI Saarbruecken) * Antonio Zampolli (Univ of Pisa) * Joseph Mariani (LIMSI, Paris) * Ulrich Heid (IMS Stuttgart) * Khalid Choukri (ELDA Paris) * Mark Maybury (MITRE) Contact point: Steven Krauwer, ELSNET coordinator, UiL-OTS, Trans 10, 3512 JK Utrecht, NL phone: +31 30 253 6050, fax: +31 30 253 6000, email: s.krauwer@elsnet.org __________________________________________________________________________ Steven Krauwer, ELSNET coordinator, UiL-OTS, Trans 10, 3512 JK Utrecht, NL phone: +31 30 253 6050, fax: +31 30 253 6000, email: s.krauwer@elsnet.org http://www.elsnet.org From M_H_Mehraby@msn.com Tue Apr 30 23:16:27 2002 From: M_H_Mehraby@msn.com (Mohammad Hosin Mehraby Agdam) Date: Wed, 1 May 2002 02:46:27 +0430 Subject: [MT-List] How can i find (Cooperate) with a machine translation company? Message-ID: ------=_NextPart_001_0001_01C1F0BA.6410F220 Content-Type: text/plain; charset="windows-1256" Content-Transfer-Encoding: quoted-printable Dear Sears , Greetings. I'm Mohamad Hosein Mehrabi , a computer eng. from Tehran(IRAN) University, and 12 years experience and software designing on MT "1990-2002". I explain all known my MT activities; My request is : Please tell me (How can i cooperate with your company or teams), if may be. During 1994-2000, I designed and producted 3 version of translator from English into Farsi/Arabic. Now i'm completing a software with modern methods. As indicated during 12 years I solved many problems and finded some innovation ways for MT design(Only target). I wrote 2 books about "ambiguity (or indeterminacies & unsnarl) and Data structure for MT with high coefficient". I am full success on MT design for English into Turkish,Azari,, Persian,Arabic. 
I will can product a MT (English into Turkish/Azari/Arabic/Persia) with following terms: I need to · two lexicologist · one Typist · 5-6 months for one target language. I'll can explain all my findings in conference or meetings. My skills on Software and computational linguistics: -Algorithm designing with AI technics & Fuzzy logic -Exper in Pragmatic & Semantic Patterns -Data Structure analysis -Designing and producting management -Programming with C++, Delphi & Pascal My Phone: 0098 411 6580048 My E-Mails : M_H_Mehraby@msn.com hosinagdam2002@msn.com DILM21@msn.com Accept my best wishes, Sincerely, Mohamad Hosein Mehrabi. Get more from the Web. FREE MSN Explorer download : http://explorer.msn.com ------=_NextPart_001_0001_01C1F0BA.6410F220 Content-Type: text/html; charset="windows-1256" Content-Transfer-Encoding: quoted-printable

------=_NextPart_001_0001_01C1F0BA.6410F220-- From Andy.Way" SENIOR LECTURERS/ LECTURERS SCHOOL of COMPUTER APPLICATIONS DUBLIN CITY UNIVERSITY, DUBLIN, IRELAND The School of Computer Applications is seeking researchers to fill several positions at Senior Lecturer and Lecturer level. Researchers will have access to new and very generous funding programmes from the Irish Government, and will enjoy a stimulating and supportive environment for research. Appointees will initially be expected to spend much of their time developing a personal and/or collaborative research programme and will be eligible for immediate tenure. We welcome applications from researchers in any area of computing or quantitative methods, but especially in any of the following: Dependable Systems Modelling and Scientific Computing Digital Multimedia and Information Management Language and Intelligence (NLP and AI) Applicants should normally have completed or be about to complete a Ph.D. in any area of computing or quantitative methods, or have a good track record of leading research programmes in these fields in industry or academia. Successful overseas applicants will be given re-location assistance including help securing visas, work-permits, and accommodation. Initial enquiries can be made to Prof. Joseph Morris, Head of School, tel. ++ 353 1 700 8419, e-mail jmorris@computing.dcu.ie Application forms can be obtained from the Personnel Office, D.C.U., Glasnevin, Dublin 9, tel. +353 1 7045939, email personnel.applications@dcu.ie, or they can be downloaded from the School's web page at http://www.computing.dcu.ie. Applications should include a full CV and a description of the applicant's current and/or planned research direction. Lecturer to Senior Lecturer scale range euro 30,554 - euro 76,008 Appointment level and salary will be commensurate with experience, and will suitably recompense the exceptional candidate. Closing date for applications: 31st May 2002. 
DCU is an equal opportunities employer. From ling98@videotron.ca Fri May 17 15:30:51 2002 From: ling98@videotron.ca (Michael Blekhman) Date: Fri, 17 May 2002 10:30:51 -0400 Subject: [MT-List] International Journal for Translation: Call for Papers Message-ID: <006501c1fdaf$71f6fa40$baa9c818@videotron.ca> Dear colleagues, It's my great pleasure to invite you to take part in the 3rd (2003) issue on machine translation of the International Journal for Translation. This issue will include descriptions of operational MT and MAT systems. If you are interested, please E-mail your contributions to: ling98@canada.com Please send them as RTF or DOC files, in the Times New Roman font, 11 pts. The maximum size of a paper should be 20 pages. The deadline is December 20, 2002. Thank you very much in advance! Sincerely, Dr. Michael S. Blekhman, Editor, MT Issue of IJT. President, Lingvistica '98 Inc. President, Lingvistica b.v. From nadamides@aslib.com" Apologies if you have received more than one message. Translating and the Computer 24 - Conference and Exhibition Supported by: EAMT, IAMT, BCS and ITI 21-22 November 2002 CBI, London This conference is one of the few international events which focuses on the user aspects of translation software and as such has been particularly beneficial to a very wide audience including translators, business managers, researchers and language experts. Once again, this year the conference will address the latest developments in translation (and translation-related) software. It will address the needs of the following conference attendees: industry public administration agencies freelancers development This call for papers invites abstracts of papers to be presented at the conference. The papers (and the presentations) should focus on the user aspects of translation or translation-related software rather than on theoretical issues. Presentations accompanied by demonstrations are especially welcome. 
TOPICS The range of topics includes (but is not limited to) use of MT systems machine-aided translation and translation aids controlled languages and their use in MT speech translation terminology localisation multilingual document management/workflow case studies of technology-based solutions the Internet and translation aids/services the value of "free" versus "charging" services/sites on the Internet SUBMISSION GUIDELINES Authors are required to submit an abstract of a MINIMUM of 500 words of the paper they would like to present, together with an outline of the structure of the paper and short BIOGRAPHY. Abstracts should be sent by POST or EMAIL before 20th June 2002 to: Nicole Adamides, Conference Organiser Aslib, The Association for Information Management Staple Hall, Stone House Court, London, EC3A 7PB Tel: +44(0) 20 7903 0000 Fax: +44 (0) 20 7903 0011 Email: nadamides@aslib.com The abstracts will be considered by the Programme Chairs, namely: Daniel Grasmick, SAP; Professor Ruslan Mitkov, University of Wolverhampton; Chris Pyne, Lionbridge Technologies Deutschland and Olaf-Michael Stefanov, United Nations. The authors of abstracts will be notified of acceptance or rejection of their submissions by 1 August 2002. The full length versions of the accepted papers (authors will be provided with detailed camera-ready copy guidelines) will be included in the conference proceedings and must be submitted by 10th October 2002 NICOLE ADAMIDES, Training Aslib/IMI, Staple Hall, Stone House Court, London EC3A 7PB Tel: +44 (0)20 7903 0031 Fax: +44 (0)20 7903 0011 www.aslib.com Email: nadamides@aslib.com From bond@cslab.kecl.ntt.co.jp Tue May 28 10:09:22 2002 From: bond@cslab.kecl.ntt.co.jp (Francis Bond) Date: Tue, 28 May 2002 18:09:22 +0900 (JST) Subject: [MT-List] TMI-2002 proceedings. 
Message-ID: <200205280909.SAA07455@fornost.icl.kecl.ntt.co.jp> G'day, the TMI-2002 proceedings are now available on-line: http://www.eamt.org/archive/tmi2002/ Thanks to the EAMT for hosting them. -- Francis Bond Local Chair TMI-2002 NTT Communication Science Laboratories | Machine Translation Research Group From Francine.Braun-Chen@cec.eu.int Tue May 28 15:04:38 2002 From: Francine.Braun-Chen@cec.eu.int (Francine.Braun-Chen@cec.eu.int) Date: Tue, 28 May 2002 16:04:38 +0200 Subject: [MT-List] MT Evaluation History Message-ID: This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible. ------_=_NextPart_001_01C20650.9ADCB0E0 Content-Type: text/plain; charset="windows-1256" Hi, Here are four sites where you will find a lot of information on MT evaluations: http://www.isi.edu/natural-language/mteval/ http://issco-www.unige.ch/publications/evaluators-forum.html http://issco-www.unige.ch/projects/isle/mt-eval-whereis.html http://www.lrec-conf.org/lrec2002/ Enjoy ! Francine -----Original Message----- From: hannouna [mailto:hannouna@uruklink.net] Sent: Thursday, January 18, 2001 3:20 PM To: mt-list@eamt.org Subject: [MT-List] MT Evaluation History Dear Friends of MT - List , I am a Ph.D student carrying out a research regarding the "Evaluation of MT Systems that go from English into Arabic ". I am in need of certain information relevant to the : History of MT Evaluation , Theoretical background of MT Evaluation metrics , scales , and measurements and other issues . If you are interested to give me information on these aspects , please kindly send it to the following e-mail address : hannouna@uruklink.net .You can also contact me on this postal address : Miss.Yasmin H.Hannouna AL-Karkh Post Office P.O.Box . 28322 Baghdad - Iraq . I highly appreciate your interest and cooperation , with respect . Awaiting your kind reply . Best regards . Sincerely , Yasmin . 
------_=_NextPart_001_01C20650.9ADCB0E0 Content-Type: text/html; charset="windows-1256" Content-Transfer-Encoding: quoted-printable
------_=_NextPart_001_01C20650.9ADCB0E0-- From m.shuttleworth@ic.ac.uk Mon May 27 13:46:00 2002 From: m.shuttleworth@ic.ac.uk (Shuttleworth, Mark) Date: Mon, 27 May 2002 13:46:00 +0100 Subject: [MT-List] job opportunity Message-ID: This may possibly be of interest to someone out there... Best wishes Mark Shuttleworth Mr Mark Shuttleworth Senior Lecturer in Scientific, Technical and Medical Translation Humanities Programme Imperial College of Science, Technology and Medicine Mechanical Engineering Building Exhibition Road London SW7 2AZ Telephone +44 (0)20 7594 8774 Fax +44 (0)20 7594 8759 E-mail m.shuttleworth@ic.ac.uk Website http://www.hu.ic.ac.uk/translation IMPERIAL COLLEGE OF SCIENCE, TECHNOLOGY AND MEDICINE HUMANITIES PROGRAMME Part-time (0.5) Lecturer In Translation with Localisation Applications are invited for this new half-time post in the Humanities Programme (South Kensington Campus) to start in October 2002 or as soon as possible thereafter. The successful applicant will be expected to hold a PhD or Master's degree in translation studies or to possess the equivalent in experience in the translation industry, to be fully conversant with software and web localisation and to be in a position to contribute to the Humanities Programme's research in Translation Studies. He/she should be fluent in English and at least one other language. The post-holder will assume a share of the teaching on the MSc in Scientific, Technical and Medical Translation with Translation Technology, with possible input into the new Imperial College Translation Centre. Depending on experience, he/she will also be expected to supervise PhD students. Salary will be on the Lecturer Scale £28,602 - 32,537 (pro rata), plus £2,134 (pro rata) London Allowance, according to experience. Interviews will take place on 2 July 2002. Informal enquiries should be made in the first instance to Mr Mark Shuttleworth (e-mail m.shuttleworth@ic.ac.uk). 
Further particulars can be obtained from Claire Mawson (e-mail c.mawson@ic.ac.uk). Closing date for applications is 21st June 2002. The College is striving towards Equal Opportunities Many thanks Mark Shuttleworth From ajoscelyne@bootstrap.fr Thu May 30 10:12:46 2002 From: ajoscelyne@bootstrap.fr (Andrew Joscelyne) Date: Thu, 30 May 2002 11:12:46 +0200 Subject: [MT-List] Must-attend Language Technology event in September In-Reply-To: <1221924767.20020305161709@mail.ru> References: <3C625D02.3D18B8A3@sbu.ac.uk> <3C625D02.3D18B8A3@sbu.ac.uk> Message-ID: <5.1.0.14.0.20020530111139.0239eec0@pop.wanadoo.fr> --=====================_63154800==_.ALT Content-Type: text/plain; charset="iso-8859-1"; format=flowed Content-Transfer-Encoding: quoted-printable LangTech2002 - The New European Forum for Language Technology Dates: 26-27 Sept 2002 Location : - Berlin, Germany. Registration and information : http://www.lang-tech.org Attendees Language technology industrial developers, integrators, researchers, investors and media from Europe, USA and ROW. Audience Technology-aware seekers of speech/knowledge/multilingual solutions have a chance to meet the European and global language R&D and business community Purpose The conference, the first of what is hoped to be an annual event in Europe, aims to boost innovation in and knowledge about language and speech technologies by offering opportunities for CTOs, developers, exhibitors, sponsors, funding agencies, businesses seeking VC to meet, be seen and discover existing solutions and market opportunities, as well as emerging technology visions from leading experts. The organizing committee includes Hans Uszköreit, Bente Maegaard and Joseph Mariani with support from the European Commission. Sponsoring and exhibition There are visibility-boosting opportunities for exhibiting and sponsoring at LangTech2002. Details on the website. 
--=====================_63154800==_.ALT Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
--=====================_63154800==_.ALT-- From beliaev@mail.axon.ru Fri May 31 09:52:08 2002 From: beliaev@mail.axon.ru (Larisa N. Beliaeva) Date: Fri, 31 May 2002 12:52:08 +0400 Subject: [MT-List] e-mail Message-ID: <3CF739B7.C198D1D7@mail.axon.ru> Dear friends and colleagues! This is to unform you on changing my e-mail address which will be valid from May, 30. The old address is not valid anymore.. New address is as follows: belyaev@mail.wplus.net. Best regards, Larissa Beliaeva From Владимир Рыков" Message-ID: We are full of joy that had changed your address! Should we tell it Putin? -----Original Message----- From: "Larisa N. Beliaeva" To: "mt-list@eamt.org" Date: Fri, 31 May 2002 12:52:08 +0400 Subject: [MT-List] e-mail > > Dear friends and colleagues! > This is to unform you on changing my e-mail address which will be valid > from May, 30. > The old address is not valid anymore.. > New address is as follows: > belyaev@mail.wplus.net. > Best regards, > Larissa Beliaeva > > > > -- > For MT-List info, see http://www.eamt.org/mt-list.html > P bI K O B B. B. MOCKBA Vladimir Rykov, PhD in Computational Linguistics, MOSCOW http://rykov.narod.ru/ Engl. http://www.blkbox.com/~gigawatt/rykov.html Tel +7-903-749-19-99 From =?koi8-r?Q?=9FIAAEIEO_oUEI=3F?= Sat Jun 1 04:14:59 2002 From: =?koi8-r?Q?=9FIAAEIEO_oUEI=3F?= (Xiuming Huang) Date: Sat, 1 Jun 2002 04:14:59 +0100 Subject: [MT-List] e-mail Message-ID: What a bright idea! Don't forget Mr. George Bush. -----Original Message----- From: ????????|?a?|?=9C ?=9C?????a To: Larisa N. Beliaeva Cc: mt-list@eamt.org Sent: 02-5-31 10:53 Subject: Re: [MT-List] e-mail We are full of joy that had changed your address! Should we tell it Putin? -----Original Message----- From: "Larisa N. Beliaeva" To: "mt-list@eamt.org" Date: Fri, 31 May 2002 12:52:08 +0400 Subject: [MT-List] e-mail > > Dear friends and colleagues! > This is to unform you on changing my e-mail address which will be valid > from May, 30. 
> The old address is not valid anymore.. > New address is as follows: > belyaev@mail.wplus.net. > Best regards, > Larissa Beliaeva > > > > -- > For MT-List info, see http://www.eamt.org/mt-list.html > P bI K O B B. B. MOCKBA Vladimir Rykov, PhD in Computational Linguistics, MOSCOW http://rykov.narod.ru/ Engl. http://www.blkbox.com/~gigawatt/rykov.html Tel +7-903-749-19-99 -- For MT-List info, see http://www.eamt.org/mt-list.html ********************************************************************** This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the system manager. This footnote also confirms that this email message has been swept for the presence of computer viruses. ********************************************************************** From Владимир Рыков" Message-ID: It's OK with George and Vlad. I am happy our MT discussion becomes more and more fruitful. We are human beings - not robots (yet). As to my weekend - I spent it at(on) off-Moscow reservoir on our yachtclub sailing boat ( see http://parus-mfti.narod.ru - photo album - http://parus-mfti.narod.ru/foto.html - 1st pic is mine) training our boys. They were happy but told me that I sing too much with radio. On the way back Sunday afternoon our yacht rule (translate it through your MT engines) had broken and they saw that I can do smth more then singing - we came safely back. A pity that MT list women did not see what a tough guy I was (and am!). Vladimir -----Original Message----- From: "Tomas Bueno" To: "?IAAEIEO oUEI?" Date: Sun, 2 Jun 2002 17:12:21 -0300 Subject: RE: [MT-List] e-mail > > Your weekend must have been fantastic, if you feel like you have to take on > Ms Beliaeva like this. 
> > -----Original Message----- > From: mt-list-admin@eamt.org [mailto:mt-list-admin@eamt.org]On Behalf Of > Xiuming Huang > Sent: Saturday, June 01, 2002 12:15 AM > To: '?ia?eieo ou??? '; 'Larisa N. Beliaeva ' > Cc: 'mt-list@eamt.org ' > Subject: RE: [MT-List] e-mail > > > What a bright idea! Don't forget Mr. George Bush. > > -----Original Message----- > From: ????????|?a?|? ? ?????a > To: Larisa N. Beliaeva > Cc: mt-list@eamt.org > Sent: 02-5-31 10:53 > Subject: Re: [MT-List] e-mail > > > We are full of joy that had changed your address! > > Should we tell it Putin? > > > -----Original Message----- > From: "Larisa N. Beliaeva" > To: "mt-list@eamt.org" > Date: Fri, 31 May 2002 12:52:08 +0400 > Subject: [MT-List] e-mail > > > > > Dear friends and colleagues! > > This is to unform you on changing my e-mail address which will be > valid > > from May, 30. > > The old address is not valid anymore.. > > New address is as follows: > > belyaev@mail.wplus.net. > > Best regards, > > Larissa Beliaeva > > > > > > > > -- > > For MT-List info, see http://www.eamt.org/mt-list.html > > > > > P bI K O B B. B. MOCKBA > > Vladimir Rykov, PhD in Computational Linguistics, > MOSCOW > http://rykov.narod.ru/ > Engl. http://www.blkbox.com/~gigawatt/rykov.html > Tel +7-903-749-19-99 > > > -- > For MT-List info, see http://www.eamt.org/mt-list.html > > > ********************************************************************** > This email and any files transmitted with it are confidential and > intended solely for the use of the individual or entity to whom they > are addressed. If you have received this email in error please notify > the system manager. > > This footnote also confirms that this email message has been swept > for the presence of computer viruses. > > ********************************************************************** > > -- > For MT-List info, see http://www.eamt.org/mt-list.html > > --- > Incoming mail is certified Virus Free. 
> Checked by AVG anti-virus system (http://www.grisoft.com). > Version: 6.0.365 / Virus Database: 202 - Release Date: 24/05/02 > > --- > Outgoing mail is certified Virus Free. > Checked by AVG anti-virus system (http://www.grisoft.com). > Version: 6.0.365 / Virus Database: 202 - Release Date: 24/05/02 > > P bI K O B B. B. MOCKBA Vladimir Rykov, PhD in Computational Linguistics, MOSCOW http://rykov.narod.ru/ Engl. http://www.blkbox.com/~gigawatt/rykov.html Tel +7-903-749-19-99 From olyturralde@yahoo.com Tue Jun 18 02:08:39 2002 From: olyturralde@yahoo.com (=?iso-8859-1?q?Orfi=20Yturralde?=) Date: Tue, 18 Jun 2002 02:08:39 +0100 (BST) Subject: [MT-List] Thesis topic Message-ID: <20020618010839.62136.qmail@web9208.mail.yahoo.com> Hello! I am a Master of Science in Computer Science student and I'm currently looking for a good thesis topic in the field of MT. Developing a translation model that will work better that the 4 models is the topic I currently have in mind. I will gladly welcome other suggestions. Thanks for your support. Orfi __________________________________________________ Do You Yahoo!? Yahoo! - Official partner of 2002 FIFA World Cup http://fifaworldcup.yahoo.com From H.Fulford@lboro.ac.uk Tue Jun 18 17:41:45 2002 From: H.Fulford@lboro.ac.uk (Heather Fulford) Date: Tue, 18 Jun 2002 17:41:45 +0100 Subject: [MT-List] PHD RESEARCH OPPORTUNITY Message-ID: <3.0.6.32.20020618174145.00b35a10@staff-mailin.lboro.ac.uk> EPSRC PhD RESEARCH STUDENTSHIP IN THE MANAGEMENT SCIENCE & INFORMATION SYSTEMS RESEARCH GROUP, BUSINESS SCHOOL, LOUGHBOROUGH UNIVERSITY PhD STUDENTSHIP for September 2002 Applications are invited for a three-year EPSRC research studentship award to commence in September 2002 in the Business School, Loughborough University. The successful applicant will work on a project investigating the adoption of language translation software by small translation businesses. 
Project summary: The demand for language translation services has increased significantly over the past decade, and to help meet that demand, software has been developed to support human translators, including machine translation systems, translation memory, and terminology management tools. This project comprises a study of the adoption of such software by UK translation businesses, focussing on the benefits the software affords, its impact on translators' working practice, and the strategies employed to integrate the software into a translator's workflow. Applicants should have a good honours degree or equivalent in a relevant area (e.g. translation studies, information systems, computational linguistics, or computer science). A Masters degree, or relevant research experience, would be an advantage. For full details of the Business School PhD programme, please go to: http://www.lboro.ac.uk/departments/bs/resdoct.html For an online application form, or to download a form, please go to: http://www.lboro.ac.uk/admin/central_admin/pg/forms.html For additional information and advice about this PhD studentship, please contact: Dr. Heather Fulford The Business School Loughborough University Loughborough Leicestershire LE11 3TU UK Tel. +44 (0)1509 222435 Fax +44 (0)1509 223960 E-mail h.fulford@lboro.ac.uk The studentship is available from September 2002. Deadline for applications: 30 June 2002. PLEASE MARK 'Translation Tools Project' ON THE APPLICATION From macklovi@IRO.UMontreal.CA Wed Jun 19 16:58:40 2002 From: macklovi@IRO.UMontreal.CA (Elliott Macklovitch) Date: Wed, 19 Jun 2002 11:58:40 -0400 Subject: [MT-List] AMTA-2002: First Call for Participation Message-ID: <3D10AA30.D75A5BB9@IRO.UMontreal.CA> --------------01FFD4389CD7D2D5AA397383 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit AMTA-2002 Tiburon, California October 8-12, 2002 Online registration is now available. Register before July 31 and save! 
For more details, visit the AMTA Web site at : http://www.amtaweb.org ***************************************** Elliott Macklovitch Laboratoire RALI, Dept. d'informatique et de recherche operationnelle Universite de Montreal, C.P. 6128, succursale Centre-ville Montreal, Quebec, Canada H3C 3J7 tel: (514) 343-7535 fax: (514) 343-2496 http://www-rali.iro.umontreal.ca --------------01FFD4389CD7D2D5AA397383 Content-Type: text/html; charset=us-ascii Content-Transfer-Encoding: 7bit
--------------01FFD4389CD7D2D5AA397383--

From deborahc@microsoft.com Thu Jun 27 23:15:20 2002
From: deborahc@microsoft.com (Deborah Coughlin)
Date: Thu, 27 Jun 2002 15:15:20 -0700
Subject: [MT-List] Posting for Customization Strategies for MT Workshop
Message-ID: <9795C27BA2AECB4BA2A99CFD6068237B01C5DDB5@RED-MSG-10.redmond.corp.microsoft.com>

This is a multi-part message in MIME format.
--------------InterScan_NT_MIME_Boundary
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C21E28.201F1E61"

------_=_NextPart_001_01C21E28.201F1E61
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Customization Strategies for MT

Chairs: Jessie Pinkham, Deborah Coughlin, Bill Dolan

Machine translation systems must have customization capabilities in order to claim success as a commercial product or a research prototype. Minimally, these include the ability to add translations for new words and phrases, but may also include more sophisticated functionality such as adapting to new syntactic structures or writing styles, and may even be the means of acquiring all the system's translation knowledge (e.g., in statistical systems). We propose to bring MT developers and researchers together to discuss the customization capabilities of their systems, with an emphasis on using common data for the discussion.

This workshop is intended to cover all types of customization strategies, although preference in selection will be given to novel strategies. We encourage participation from MT developers of commercial systems, and researchers working on all types of MT systems (traditional transfer, interlingua, example-based, or statistical).

To make the discussion more interesting, we request that participants demonstrate the customization capabilities of their system using data freely available to everyone, such as Hansard data for French-English or other data available from ELRA or LDC. Microsoft has also agreed to make technical manual data available for several language pairs (English and any of these: French, Spanish, German, Japanese) for purposes of research related to this workshop. (For access to this data please send email to CustomWS@microsoft.com.)

We request that interested parties submit a two-page abstract with the following information:

* overview of your MT system
* description of customization capabilities
* comparison of this strategy to other known strategies
* data that will be used to test capability
* proposed evaluation to determine the effectiveness of the customization
* estimate of time that would be required to customize for the chosen domain, based on your sample run.

Dates:

Call for participation: June 8
Abstract submission deadline: July 7
Acceptance notification: July 22
Early registration for AMTA: July 31
Papers due: September 6
AMTA workshop: October 8

Instructions for submission:

All submissions should be in English, and it is recommended that they be submitted in one of the following three formats: PDF (preferred); PostScript; Microsoft Word. All submissions will be received and processed using the Conference Management Toolkit (CMT) located at http://cmt.research.microsoft.com/CustomWS.

Authors should follow the instructions at the CMT web site to register, enter information about themselves and their abstract, and upload a copy of their abstract in one of the acceptable formats by the submission deadline. Report any problems with the website to CustomWS@microsoft.com.

For information on obtaining Microsoft Data to participate in the workshop, please contact CustomWS@microsoft.com.

------_=_NextPart_001_01C21E28.201F1E61
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01C21E28.201F1E61-- --------------InterScan_NT_MIME_Boundary-- From Lori Levin Mon Jul 1 21:21:38 2002 From: Lori Levin (Lori Levin) Date: Mon, 01 Jul 2002 16:21:38 -0400 Subject: [MT-List] Interlingua Workshop: Call for Abstracts Message-ID: <16609.1025554898@alexis.boltz.cs.cmu.edu> CALL FOR ABSTRACTS Workshop on Interlingua Reliability Tuesday, October 8, 2002 at AMTA 2002 Tiburon, California, USA Association for Machine Translation in the Americas *** This workshop is on a tight time line, so submit right away *** **** Abstract of 500 to 1500 words by July 14 **** Interlingua representations are usually very rich and extremely knowledge intensive. One of the benefits of such a system is that meaning representation can be relatively uniform across multiple source languages. This richness, however, leads to wide potential variation among different producers (human or machine) of interlingual representations. Given an interlingua language specification and a specific document, how similar will the interlinguas for that document be if produced by different linguists from different sites? from the same site? the same linguist on different days? by different semantic analyzers/parsers? The goal of the workshop is to investigate issues of reliability of interlingua representation. 
Papers are invited on:

- inter-annotator agreement in manually producing interlingual representations
- discussions on whether interlingual representations need to be canonical, and, if so, if there is any hope that they ever will be; and whether observations about the semantic structure of languages can inform the design of such a canonical representation
- methodology or technology for ensuring common understanding of interlingual representations
- discussions of the impact of differences/errors in interlingual representations on the overall NLP system
- methodology or technology for ensuring reproducibility of interlingual representations
- measures of semantic similarity of interlingual expressions within an interlingual system
- measures and methods for inter-translatability of interlingual representations in different interlingual systems
- interaction of monolingual or parallel proposition bank/predicate-argument bank efforts (or other broad but shallow semantic resources) with interlingual reliability

The workshop will consist of paper presentations, as well as one or more experiments in inter-annotator agreement. Pre-registered participants will receive a specification of an interlingual language, plus some text to annotate in advance. The experiments during the workshop will involve discussions of annotation differences among the participants, measures of agreement, etc.

Instructions for submitting abstracts
-------------------------------------
Length: 500 to 1500 words
Submission deadline: July 14, 2002
Send to: Lori Levin, lsl@cs.cmu.edu
Format: One of the following: ascii, ps, psf, doc

Instructions for Participation in Interlingua Coding Experiment
---------------------------------------------------------------
If you want to participate in the interlingua coding experiments, but do not want to present a paper, send an email containing your name and institution to Lori Levin, lsl@cs.cmu.edu, by July 14, 2002.
Time Line
---------
Submission deadline: July 14
Notification: July 30
AMTA early registration deadline: July 31
Papers due: September 6
Workshop date: October 8

Program Committee
------------------
Bonnie Dorr         UMD
David Farwell       NMSU
Stephen Helmreich   NMSU
Lori Levin          CMU
Keith Miller        MITRE
Boyan Onyshkeyvch   DOD

The workshop is sponsored in part by the Special Interest Group on Interlinguas of the AMTA. For further information about this series of workshops see http://crl.nmsu.edu/Events/FWOI/index.html.

From malek.boualem@rd.francetelecom.com Tue Jul 2 18:29:20 2002
From: malek.boualem@rd.francetelecom.com (BOUALEM Malek FTRD/DMI/LAN)
Date: Tue, 2 Jul 2002 19:29:20 +0200
Subject: [MT-List] Cahier de tests pour traduction et resume
Message-ID: <8C19E3FBB6467846AFF97E366D34CB60147204@LANMHS20.rd.francetelecom.fr>

This is a multi-part message in MIME format.
------_=_NextPart_001_01C221EE.0020967C
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Hello,

Has anyone by any chance put together a test suite (or evaluation method) for testing and evaluating machine translation software and/or summarization software? Thanks in advance for any useful information on this subject.

English query:
Looking for evaluation methods for translation and/or summarization software.
Thank you in advance.

Malek Boualem
France Télécom R&D

------_=_NextPart_001_01C221EE.0020967C
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

------_=_NextPart_001_01C221EE.0020967C-- From Lori Levin Wed Jul 3 23:10:27 2002 From: Lori Levin (Lori Levin) Date: Wed, 03 Jul 2002 18:10:27 -0400 Subject: [MT-List] Interlingua Reliability -- Deadline Extended Message-ID: <18320.1025734227@alexis.boltz.cs.cmu.edu> CALL FOR ABSTRACTS Workshop on Interlingua Reliability Tuesday, October 8, 2002 ****** New submission deadline -- July 21 ******** *** This workshop is on a tight time line, so submit right away *** at AMTA 2002 Tiburon, California, USA Association for Machine Translation in the Americas Interlingua representations are usually very rich and extremely knowledge intensive. One of the benefits of such a system is that meaning representation can be relatively uniform across multiple source languages. This richness, however, leads to wide potential variation among different producers (human or machine) of interlingual representations. Given an interlingua language specification and a specific document, how similar will the interlinguas for that document be if produced by different linguists from different sites? from the same site? the same linguist on different days? by different semantic analyzers/parsers? The goal of the workshop is to investigate issues of reliability of interlingua representation. 
Papers are invited on:

- inter-annotator agreement in manually producing interlingual representations
- discussions on whether interlingual representations need to be canonical, and, if so, if there is any hope that they ever will be; and whether observations about the semantic structure of languages can inform the design of such a canonical representation
- methodology or technology for ensuring common understanding of interlingual representations
- discussions of the impact of differences/errors in interlingual representations on the overall NLP system
- methodology or technology for ensuring reproducibility of interlingual representations
- measures of semantic similarity of interlingual expressions within an interlingual system
- measures and methods for inter-translatability of interlingual representations in different interlingual systems
- interaction of monolingual or parallel proposition bank/predicate-argument bank efforts (or other broad but shallow semantic resources) with interlingual reliability

The workshop will consist of paper presentations, as well as one or more experiments in inter-annotator agreement. Pre-registered participants will receive a specification of an interlingual language, plus some text to annotate in advance. The experiments during the workshop will involve discussions of annotation differences among the participants, measures of agreement, etc.

Instructions for submitting abstracts
-------------------------------------
Length: 500 to 1500 words
Submission deadline: July 21, 2002
Send to: Lori Levin, lsl@cs.cmu.edu
Format: One of the following: ascii, ps, psf, doc

Instructions for Participation in Interlingua Coding Experiment
---------------------------------------------------------------
If you want to participate in the interlingua coding experiments, but do not want to present a paper, send an email containing your name and institution to Lori Levin, lsl@cs.cmu.edu, by July 14, 2002.
Time Line --------- Submission deadline: July 21 Notification: August 10 Papers due: September 6 Workshop date: October 8 Program Committee ------------------ Bonnie Dorr UMD David Farwell NMSU Stephen Helmreich NMSU Lori Levin CMU Keith Miller MITRE Boyan Onyshkeyvch DOD The workshop is sponsored in part by the Special Interest Group on Interlinguas of the AMTA. For further information about this series of workshops see http://crl.nmsu.edu/Events/FWOI/index.html. From WJHutchins@compuserve.com Fri Jul 5 10:55:14 2002 From: WJHutchins@compuserve.com (John Hutchins) Date: Fri, 5 Jul 2002 05:55:14 -0400 Subject: [MT-List] new update of Compendium Message-ID: <200207050555_MC3-1-546-3170@compuserve.com> The latest update (June 2002) of the "Compendium of translation software", which lists current MT systems, MT services, and translation tools of all kinds, is now available on the EAMT website: http://www.eamt.org/compendium.html As before, there are numerous changes, additions and deletions since the previous update (January 2002), e.g. the demise of Sail Labs and IBM TranslationManager/2, the various acquisitions by SDL and by Trados, and the addition of many online services. 
If the product you are looking for is not there, it may be very new (let me know and it will be added) or it may no longer be available, in which case it may be listed in an earlier edition of the "Compendium" (see http://ourworld.compuserve.com/homepages/WJHutchins/compendium.htm).

John Hutchins
5 July

From steven.krauwer@let.uu.nl Tue Jul 2 10:55:01 2002
From: steven.krauwer@let.uu.nl (Steven Krauwer)
Date: Tue, 2 Jul 2002 11:55:01 +0200 (MEST)
Subject: [MT-List] Call for Participation: Roadmap Workshop at COLING2002
Message-ID: <200207020955.g629t1G24843@sfinx.let.uu.nl>

CALL FOR PARTICIPATION

A Roadmap for Computational Linguistics
Saturday, August 31 2002, 09:00-17:30

Workshop in conjunction with COLING 2002
(August 24 - September 1, 2002, Taipei, Taiwan)

Organized by ELSNET

CONTEXT AND OBJECTIVE

ELSNET is the European Network of Excellence in Human Language Technologies, which was created in 1991, with a view to supporting and facilitating research, development and training in the field of language and speech technologies and related areas. The network is funded by the European Commission, but its scope is not limited to Europe.

This workshop should be seen as a step in ELSNET's aim to build a roadmap for language and speech technology. It is one of a number of workshops of this type that have been and will be organised in order to arrive at a broadly supported roadmap for our field, which should help us identify major challenges, set research priorities and define common goals.

At this workshop we will

* confront the audience with the approach and the results of ELSNET's roadmapping exercise thus far;
* invite participants to give their own presentation of what they see as the main longer term challenges and intermediate milestones in our field as well as their strategies to meet these challenges;
* organise a discussion session aimed at reaching a consensus on what the main challenges and priorities are.
As a special feature we are inviting a number of rapporteurs to have a critical look at the papers presented at the various thematic sessions of the main conference, with a view to relating them to the main challenges and milestones: do we spot new challenges, new milestones, new strategies, new directions, regional differences, etc.?

All reports and summaries of discussions will be integrated in ELSNET's Roadmap, and given wide distribution via ELSNET's communication channels (website, newsletter, discussion forums, etc).

** NB: the invitation to act as a rapporteur is still open!!! **
** check the workshop site for details (and your reward) **

PROVISIONAL PROGRAM

Please note that the selection of areas for the area reports is tentative, and will depend on the actual areas covered by the conference.

Time   Activity [ Speaker(s) ]
-----------------------------------------------------------------
09:00  Opening and Introduction [ Steven Krauwer (ELSNET) ]
09:20  The ELSNET Roadmap [ Hans Uszkoreit (DFKI) ]
09:50  Area report 1: Language Resources
10:10  Area report 2: Parsing
10:30  BREAK
11:00  Talk 1: Why NLP should move into IAS
       [ Victor Raskin, Sergei Nirenburg, Mikhail J. Atallah, Christian F. Hempelmann, Katrina E. Triezenberg ]
11:20  Area report 3: Generation
11:40  Area report 4: Morphology and Syntax
12:00  Area report 5: Semantics
12:20  LUNCH
13:30  Talk 2: MEANING: A Roadmap to Knowledge Technologies
       [ Eneko Agirre, German Rigau, Bernardo Magnini, Piek Vossen, John Carroll ]
13:50  Area report 6: Information Retrieval and Extraction
14:10  Area report 7: Machine Translation
14:30  Area report 8: Multimodality
14:50  Area report 9: Dialogue and Discourse
15:10  Area report 10: Asian Language Processing
15:30  BREAK
16:00  Discussion & work: What does the roadmap for computational linguistics look like?
       [ Steven Krauwer, Hans Uszkoreit ]
17:15  Summary [ Steven Krauwer ]
17:30  CLOSING

TARGET AUDIENCE

A workshop of this type will be most appealing to people who are interested in developing longer term strategic views, e.g. senior scientists in charge of longer term research policies (both in academia and in industry). Researchers and developers who have specific views on what will happen or what should happen should also feel invited to attend and contribute, as should people who are responsible for the education of future generations of researchers and developers.

REGISTRATION AND OTHER INFORMATION

Registration details and other information can be found on the main conference website: http://www.coling2002.sinica.edu.tw/

The URL for this workshop is http://www.elsnet.org/roadmap-coling2002.html

CONTACT

Steven Krauwer (Chair), steven.krauwer@let.uu.nl
ELSNET / Utrecht University    http://www.elsnet.org
Trans 10                       phone: +31 30 253 6050
3512 JK UTRECHT, NL            fax: +31 30 253 6000

From cb@lim.nl Thu Jul 4 18:30:22 2002
From: cb@lim.nl (Colin Brace)
Date: Thu, 4 Jul 2002 10:30:22 -0700 (PDT)
Subject: [MT-List] Upcoming events
Message-ID: <20020704173022.18978.qmail@web40307.mail.yahoo.com>

Hi all,

I have updated the listings of upcoming events on the EAMT web site. There are some half-dozen MT-related conferences in the coming year:

* LangTech2002, 26-27 Sept 2002, Berlin, Germany.
* AMTA 2002: From Research to Real Users, 8-12 October 2002, Tiburon, California.
* EAMT Workshop: Teaching Machine Translation, 14-15 November 2002, Manchester, UK.
* ASLIB Translating & the Computer 24, 21-22 November 2002, London, UK.
* EAMT Workshop, EACL Conference, April 12-17, 2003, Budapest, Hungary.
* EAMT-CLAW 2003: Controlled Language Translation, Joint Holding of the 7th International Workshop of the European Association of Machine Translation and the 4th International Workshop on Controlled Language Applications. 15th-17th May 2003, Dublin City University, Ireland.
See: http://www.eamt.org/events.html

This page has links to the respective web sites of the events. If anything should be added to or updated on this page, please let me know.

=====
Colin Brace
Amsterdam
http://www.lim.nl

From cgdaniec@us.ibm.com Fri Jul 12 18:03:48 2002
From: cgdaniec@us.ibm.com (Claudia Gdaniec)
Date: Fri, 12 Jul 2002 13:03:48 -0400
Subject: [MT-List] opportunity for computational linguist (Latin American Spanish)
Message-ID:

Opportunity to contribute significantly to improvement of English->Spanish machine translation quality:

IBM's T.J. Watson Research Center, LMT Group, has immediate openings for two highly skilled computational linguists who are native speakers of a Latin American variety of Spanish. They will join a multiyear project aimed at significantly raising the quality of machine translation of English to Spanish, working closely with the R&D team on IBM's premier translation product:

http://www-3.ibm.com/pvc/products/voice/translation_server.shtml

Successful candidates should have completed graduate work in computational linguistics or linguistics -- Master's or Ph.D. degree required, and Ph.D. preferred -- and should be able to express linguistic ideas and generalizations formally in a computational linguistic system. Experience with machine translation is desirable, but not required.

Interested candidates are urged to apply immediately by going to the URL

http://careers3.peopleclick.com/JobPosts/Client40_GLDTR/BU1/External/139-1398.htm?ShowReturn=Yes

and filling out the form. Another route to this form is to go to:

http://www.research.ibm.com/about/career.shtml

then click on "Employment with IBM Research -- United States", and then "Postdoc - Knowledge Systems".
From steveri@microsoft.com Mon Jul 15 23:44:28 2002 From: steveri@microsoft.com (Steve Richardson) Date: Mon, 15 Jul 2002 15:44:28 -0700 Subject: [MT-List] AMTA-2002 - Call for Participation - online registration now available Message-ID: <0FDD2891FCDF6E42891AEDE2198E5F7C05F1EB08@red-msg-04.redmond.corp.microsoft.com> --- CALL FOR PARTICIPATION --- --- ONLINE REGISTRATION NOW AVAILABLE! --- The Association for Machine Translation in the Americas AMTA-2002 Conference Location: Tiburon, California Dates: October 8-12, 2002 The Association for Machine Translation in the Americas (AMTA) is pleased to announce its fifth biennial conference, planned for October 8-12, 2002, in Tiburon (near San Francisco), California. Online registration is now available on the conference web site: http://www.amtaweb.org/AMTA2002/ Register at a discounted rate until August 11, 2002! A preliminary program, providing the schedule for tutorials, workshops, exhibits, accepted papers, panels, and invited speakers for the conference, is also now posted on the conference web site. We look forward to seeing you in Tiburon! CONFERENCE THEME: From Research to Real Users Ever since the showdown between Empiricists and Rationalists a decade ago at TMI-92, MT researchers have hotly pursued promising paradigms for MT, including data-driven approaches (e.g., statistical, example-based) and hybrids that integrate these with more traditional rule-based components. During the same period, commercial MT systems with standard transfer architectures have evolved along a parallel and almost unrelated track, increasing their coverage (primarily through manual update of their lexicons, we assume) and achieving much broader acceptance and usage, principally through the medium of the Internet. 
Web page translators have become commonplace; a number of online translation services have appeared, including in their offerings both raw and post-edited MT; and large corporations have been turning increasingly to MT to address the exigencies of global communication. Still, the output of the transfer-based systems employed in this expansion represents but a small drop in the ever-growing translation marketplace bucket. Now, 10 years later, we wonder if this mounting variety of MT users is any better off, and if the promise of the research technologies is being realized to any measurable degree. In this regard, we pose the following questions: Why aren't any current commercially available MT systems primarily data-driven? Do any commercially available systems integrate (or plan to integrate) data-driven components? Do data-driven systems have significant performance or quality issues? Can such systems really provide better quality to users, or is their main advantage one of fast, facilitated customization? If any new MT technology could provide such benefits (somewhat higher quality, or facilitated customization), would that be the key to more widespread use of MT, or are there yet other more relevant unresolved issues, such as system integration? If better quality, customization, or system integration aren't the answer, then what is it that users really need from MT in order for it to be more useful to them? INVITED SPEAKERS We are pleased to announce that invited speakers for the conference will include Yorick Wilks and Ken Church, both notable participants at TMI-92, and Jaap van der Meer, former CEO of ALPNET. We anticipate that the speakers will provide a sharp and stimulating focus on the theme of the conference. CONFERENCE ORGANIZERS Elliott Macklovitch, General Chair Stephen D. 
Richardson, Program Chair Violetta Cavalli-Sforza, Local Arrangements Chair Bob Frederking, Workshops and Tutorials Laurie Gerber, Exhibits Coordinator PROGRAM COMMITTEE Arendse Bernth (IBM Research) Christian Boitet (GETA, CLIPS, IMAG) Ralf Brown (LTI, CMU) Robert Cain (Foreign Broadcast Information Service) Michael Carl (RALI) Bill Dolan (Microsoft Research) Laurie Gerber (Language Technology Broker) Stephen Helmreich (CRL, NMSU) Eduard Hovy (ISI, USC) Pierre Isabelle (XRCE) Christine Kamprath (Caterpillar) Elliott Macklovitch (RALI) Bente Maegaard (CST) Michael McCord (IBM Research) Robert C. Moore (Microsoft Research) Hermann Ney (RWTH Aachen) Sergei Nirenburg (CRL, NMSU) Franz Och (RWTH Aachen) Joseph Pentheroudakis (Microsoft Research) Jessie Pinkham (Microsoft Research) Fred Popowich (Gavagai Technology Inc.) Florence Reeder (MITRE) Harold Somers (UMIST) Keh-Yih Su (Behavior Design Corp.) Eiichiro Sumita (ATR) Hans Uszkoreit (DFKI) Lucy Vanderwende (Microsoft Research) Hideo Watanabe (TRL, IBM) Andy Way (Dublin City Univ.) Eric Wehrli (Univ. of Geneva) John White (Northrop Grumman IT) Jin Yang (SYSTRAN) Ming Zhou (Microsoft Research) From malek.boualem@rd.francetelecom.com Fri Jul 26 11:13:10 2002 From: malek.boualem@rd.francetelecom.com (BOUALEM Malek FTRD/DMI/LAN) Date: Fri, 26 Jul 2002 12:13:10 +0200 Subject: [MT-List] URL of the METEO system Message-ID: <8C19E3FBB6467846AFF97E366D34CB6058A9B5@LANMHS20.rd.francetelecom.fr> Hi, Could someone give me the URL of the Canadian METEO translation system? Thank you in advance. Malek Boualem France Telecom R&D From steveri@microsoft.com Fri Jul 19 00:34:14 2002 From: steveri@microsoft.com (Steve Richardson) Date: Thu, 18 Jul 2002 16:34:14 -0700 Subject: [MT-List] AMTA workshop on Customization Strategies for MT: **submission deadline extended ** Message-ID: <0FDD2891FCDF6E42891AEDE2198E5F7C035E9171@red-msg-04.redmond.corp.microsoft.com> This is a multi-part message in MIME format. 
--------------InterScan_NT_MIME_Boundary Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C22EB3.A0926D5A" ------_=_NextPart_001_01C22EB3.A0926D5A Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Because the early registration date for the AMTA conference has been pushed back and fewer abstracts were submitted by the deadline than anticipated (and hoped for), the organizers of this workshop are extending the deadlines. They hope that the specific groups and individuals working with various MT systems whom they are inviting to participate will take the opportunity to submit an abstract and join in what promises to be a very enlightening workshop. We invite your participation--please review the Call below and consider submitting an abstract by July 24. Also, please forward this notice to anyone you feel may be interested in participating. Thanks, Steve Richardson, program chair, AMTA 2002 P.S. Online registration for AMTA 2002 is now available at www.amtaweb.org/AMTA2002. ******************************************************************************************** Call for Abstracts for the AMTA workshop on Customization Strategies for MT **New abstract submission deadline: July 24** Machine translation systems must have customization capabilities in order to claim success as a commercial product or a research prototype. 
Minimally, these include the ability to add translations for new words and phrases, but may also include more sophisticated functionality such as adapting to new syntactic structures or writing styles, and may even be the means of acquiring all the system's translation knowledge (e.g., in statistical systems). We propose to bring MT developers and researchers together to discuss the customization capabilities of their systems, with an emphasis on using common data for the discussion. This workshop is intended to cover all types of customization strategies, although preference in selection will be given to novel strategies. We encourage participation from MT developers of commercial systems, and researchers working on all types of MT systems (traditional transfer, interlingua, example-based, or statistical). To make the discussion more interesting, we would prefer (although it is not required) that participants demonstrate the customization capabilities of their system using data freely available to everyone, such as Hansard data for French-English or other data available from ELRA or LDC. Microsoft has also agreed to make technical manual data available for several language pairs (English and any of these: French, Spanish, German, Japanese) for purposes of research related to this workshop. (For access to this data please send email to CustomWS@microsoft.com.) We request that interested parties submit a two-page abstract with the following information: - overview of your MT system - description of customization capabilities - comparison of this strategy to other known strategies - data that will be used to test capability - proposed evaluation to determine the effectiveness of the customization - estimate of time that would be required to customize for the chosen domain, based on your sample run. 
Dates: Call for participation: June 8 Abstract submission deadline: July 24 Acceptance notification: Aug 7 Early registration for AMTA: Aug 11 Papers due: September 6 AMTA workshop: October 8 Instructions for submission: All submissions should be in English, and it is recommended that they be submitted in one of the following three formats: PDF (preferred); PostScript; Microsoft Word. All submissions will be received and processed using the Conference Management Toolkit (CMT) located at http://cmt.research.microsoft.com/CustomWS . Authors should follow the instructions at the CMT web site to register, enter information about themselves and their abstract, and upload a copy of their abstract in one of the acceptable formats by the submission deadline. Report any problems with the website to CustomWS@microsoft.com. For information on obtaining Microsoft Data to participate in the workshop, please contact CustomWS@microsoft.com. Organizers: Jessie Pinkham, Deborah Coughlin, Bill Dolan ------_=_NextPart_001_01C22EB3.A0926D5A Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Message

------_=_NextPart_001_01C22EB3.A0926D5A-- --------------InterScan_NT_MIME_Boundary-- From lsteyaert@telelingua.com Tue Jul 30 09:13:25 2002 From: lsteyaert@telelingua.com (Liesbet Steyaert) Date: Tue, 30 Jul 2002 10:13:25 +0200 Subject: [MT-List] Abbreviation lists (EN, FR, DE, NL, ES, IT) Message-ID: <000c01c237a0$fb23aef0$5300000a@translate.be> Dear list members, Does anyone know whether lists exist of (common) abbreviations containing a punctuation mark (full stop), e.g. 'Mr.', 'etc.', ... for English, for the following languages: English, French, German, Dutch, Spanish, Italian? I would like to find exhaustive lists of abbreviations containing a full stop in order to avoid confusing an abbreviation with a sentence boundary. Any suggestion would be greatly appreciated! Thank you very much, Liesbet Steyaert From Andy.Way" Preliminary Call For Papers Joint Conference combining the 7th International Workshop of the European Association for Machine Translation and the 4th Controlled Language Applications Workshop Main Conference theme: Controlled Language Translation Location: Dublin City University, Ireland Dates: 15th-17th May, 2003 Conference URL: http://www.eamt.org/eamt-claw03/ This document constitutes the preliminary call for papers for the 2003 joint conference of the European Association for Machine Translation (EAMT) and the Controlled Language Applications Workshop (CLAW). The main theme of the conference is controlled translation. It is envisaged that papers addressing this theme will be featured on the middle day of the conference, with the first day given over to more general papers on machine translation (MT), and the final day dedicated to other papers focussing more on controlled language issues. 
Over the years, there have been many conferences on MT, involving rule-based approaches, statistical and example-based approaches, hybrid and multi-engine approaches as well as those limited to particular sublanguage domains. In addition, there has been an increased level of interest in controlled languages, culminating in the series of Workshops on controlled language applications. These have given impetus to both monolingual and multilingual guidelines and applications using controlled language, for many different languages. Controlled languages are subsets of natural languages whose grammars and dictionaries have been restricted in order to reduce or eliminate both ambiguity and complexity. Traditionally, controlled languages fall into two major categories: those that improve readability for human readers, particularly non-native speakers, and those that improve computational processing of the text. It is often claimed that machine-oriented controlled language should be of particular benefit when it comes to the use of translation tools (including machine translation, translation memory, multilingual terminology tools etc.). Experience has shown that high quality MT systems can be designed for specialized domains (e.g. METEO). However, the area of controlled translation has remained relatively unaddressed. This is rather strange given its undoubted importance. Such examples that exist use rule-based MT (RBMT) systems to translate controlled language documentation, e.g. Caterpillar's CTE and CMU's KANT system, and General Motors' CASL and LantMark, etc. However, fine-tuning general systems designed for use with unrestricted texts to derive specific, restricted applications is complex and expensive. There are several examples of using Translation Memory (TM) tools in a controlled language workflow, yet these have been primarily for combining TM and MT tools. 
Very few attempts have been made where Example-based MT (EBMT) systems have been designed specifically for controlled language applications and use. This is even harder to fathom: using traditional RBMT systems leads to the well-known `knowledge acquisition bottleneck', which can be overcome by using corpus-based MT technology. Furthermore, the quality of EBMT (and Translation Memory) systems depends on the quality of the reference translations in the system database; the more these are controlled, the better the expected quality of translation output by the system. The primary aim of this unique conference, therefore, is to elicit papers on controlled translation, and provide a forum in which the problems may be outlined, possible solutions proposed, and in general to bring together developers, implementors, researchers and end-users from the publications, authoring, translation and localization fields to discuss how ideas from both the authoring and translation camps might be integrated in this common area. Some specific topics which might be addressed include: * What is controlled translation? * RBMT and controlled translation. * TM/EBMT and controlled translation. * Influence and interplay of controlled language upon both source-language parsing and target-language generation in an MT system. * Role of the lexicon in controlled translation. * Can we expect better controlled translations from a hybrid approach? Or from a multi-engine approach? * Towards a Roadmap for controlled translation - the way ahead? In addition, we welcome contributions on MT as well as on controlled language which do not address the main theme per se. 
Suitable example topics include, but are not restricted to, the following: Machine Translation * MT for the Web; * Practical MT systems; * Methodologies for MT; * Speech and dialogue translation; * Text and speech corpora for MT and knowledge extraction from corpora; * MT evaluation techniques and evaluation results; * MT postediting. Controlled Language * Examples of controlled languages: their definition, by whom, and intended usage; * Consequences for technical authors and implications for Natural Language Processing; * Practical experiences of teaching and using controlled languages; * Application of controlled languages in speech systems. Finally, intentions to present system demonstrations are particularly welcomed. Abstracts for demos must not exceed 400 words. Developers should outline the design of their system and provide sufficient details to allow the evaluation of its validity, quality, and relevance to controlled language. Pointers to web sites running the demo preview and/or screen camcorder video clips will also be helpful. Programme It is anticipated that papers which address the central theme of the joint conference, controlled translation, will be featured on the middle day of the three. The first day will be given over to papers focussing primarily on MT, and the third day will feature papers focussing more on controlled language issues. Papers will each be allowed 30 minutes, including questions. Invited Speakers We are pleased to announce that invited speakers for the conference will include Steven Krauwer, University of Utrecht and Coordinator of ELSNET, and Lou Cremers, Océ Technologies. We anticipate that the speakers will provide a sharp and stimulating focus on the theme of the conference. Attendance Fees and Registration Details of registration procedures, including registration fees, will be announced as soon as they become available. 
It is anticipated that participants will be able to register for the MT part, the joint session, and the CLAW day separately, or in various combinations. But we expect by far the best value option to be a package deal which allows attendance at all three days. In addition, there will be a discount for early registration, the deadline for which will be 31st March, 2003. Important Dates Draft papers due 29th November, 2002 Reviews due 31st January, 2003 Notification of acceptance 14th February, 2003 Camera-ready papers & pre-registration due 31st March, 2003 Submission Details Papers accepted for the conference will be published in a proceedings volume available to all attendees. Papers should describe unique work not published before. Papers that are being submitted to other conferences should include this information on the first page. Paper submissions should follow these conventions: * Maximum length is 4000 words * 8.5" x 11" page size * Single-column, single-spaced, 1" margins * 12 point font * Include title, authors, and contact info centered at the top of the first page * Include an abstract of about 100 words Electronic submission is strongly encouraged. We prefer PDF files, sent as email attachments. Electronic submissions should be sent to Eric Nyberg (ehn@cs.cmu.edu), with `Submission for EAMT-CLAW 2003' in the Subject line of the email. Papers for each of the three sessions will be reviewed separately. Please indicate which session your paper is to be reviewed under. Please note that papers will not be accepted (at the camera-ready copy stage) unless at least one of the authors has pre-registered for the conference. 
Organizing Committee The Organizing Committee consists of: * John Hutchins (WJHutchins@compuserve.com), on behalf of the EAMT, * Arendse Bernth (arendse@us.ibm.com), on behalf of CLAW, together with three local Organizers: * Dorothy Kenny (Dorothy.Kenny@dcu.ie) * Sharon O'Brien (Sharon.OBrien@dcu.ie) * Andy Way (away@computing.dcu.ie) Contact any of the above for more details. Programme Committee The programme committee will include, among others: Jeff Allen (Mycom France and MIT2, France), Arendse Bernth (IBM Watson Research, USA), Kurt Godden (Lockheed Martin, USA), John Hutchins (EAMT President, University of East Anglia, UK), Dorothy Kenny (Dublin City University, Ireland), Jaro Lajovic (Dept. of Intelligent Systems Institute, Ljubljana, Slovenia), Bente Maegaard (Center for Language Technology, Copenhagen, Denmark), Teruko Mitamura (Carnegie Mellon University, USA), Eric Nyberg (Carnegie Mellon University, USA), Sharon O'Brien (Dublin City University, Ireland), Ursula Reuther (IAI, University of Saarbrücken, Germany), Joerg Schütz (IAI, University of Saarbrücken, Germany), Harold Somers (UMIST, UK), Andy Way (Dublin City University, Ireland), Rick Wojcik (Boeing, USA) From nadamides@aslib.com" The 24th Translating and the Computer Conference will take place in London on 21-22 November, with a pre-conference seminar on XML taking place on 20 November 2002. 
Details of the conference, including the programme, exhibitors, pre-conference seminar and fees, can be found at: www.aslib.com/conferences Speakers include: Hans Uszkoreit, DFKI, Germany; Daniel Gervais, MultiCorpora R&D Inc., USA; Matthias Heyn, TRADOS, Germany; Karin Spalink, Sony Ericsson, USA; Andrew Bredenkamp, acrolinx GmbH, Germany; Lorna Joy, SCHÜCO International, UK; Yves Champollion, France; Gábor Prószéky, MorphoLogic, Hungary; Reinhard Schäler, University of Limerick, Ireland; Veronique Anne Sauron, University of Geneva, Switzerland; Monika Käser, CLS Corporate Language Services, Switzerland; Mike Roche, IBM Software Group, Ireland; Dan Dube, ISOGEN International, USA; Lee Gillam, University of Surrey, UK. I hope you will attend the conference and look forward to hearing from you. My apologies if you have received more than one copy of this message. Yours sincerely, NICOLE ADAMIDES, Training Aslib/IMI, Staple Hall, Stone House Court, London EC3A 7PB Tel: +44 (0)20 7903 0031 Fax: +44 (0)20 7903 0011 www.aslib.com Email: nadamides@aslib.com From joel.bourgeoys.1@agora.ulaval.ca Thu Aug 8 17:12:16 2002 From: joel.bourgeoys.1@agora.ulaval.ca (Joel Bourgeoys) Date: Thu, 8 Aug 2002 12:12:16 -0400 Subject: [MT-List] Master's thesis on Machine Translation of Simplified English Message-ID: <000701c23ef6$5d292650$f9c2cb84@LTAL> Hello, I have completed a master's thesis on machine translation of Simplified English. The thesis is written in French but here is an English abstract. ABSTRACT The main objective of the thesis is to demonstrate the pros and cons of using Simplified English for Machine Translation. We analyse the qualitative and quantitative differences between the MT of a technical text written with SE rules and the MT of a technical text written without those rules. We first verify if there are more translation errors in one text. Then, we verify if there are differences in the types of errors between the two texts. 
Finally, we offer suggestions for the modification of SE and/or the machine translation software to adapt them to the Machine Translation of Simplified English. If anyone would be interested in reading my thesis, please feel free to send me an e-mail and I will be glad to send it to you. It is a zipped PDF document of 1.75 megabytes. Thank you, Joël Bourgeoys ****************************************************** Laboratoire de Traitement Automatique des langues naturelles (L-TAL) Université Laval, Québec Tél: (418)656-2131 poste 8087 Fax: (418)656-7144 joel.bourgeoys.1@agora.ulaval.ca ****************************************************** From ling98@videotron.ca Tue Aug 13 21:01:26 2002 From: ling98@videotron.ca (Michael Blekhman) Date: Tue, 13 Aug 2002 16:01:26 -0400 Subject: [MT-List] Linguistic resources by Lingvistica '98 Inc. Message-ID: <025601c24304$34d994c0$6401a8c0@michaelb> Dr. Michael S. Blekhman President, Lingvistica '98 Inc. Montreal, Canada; President, Lingvistica b.v. Dongen, The Netherlands www.ling98.com www.lingvistica.com Tel: (514) 331-0172 ling98@canada.com Grammatical Dictionaries Lingvistica '98 Inc. and Lingvistica b.v. would like to inform you of our new project: Grammatical Dictionaries for European Languages. It is our goal to develop and supply to the international market representative grammatical dictionaries as text files to be used in various language engineering projects: machine translation, automatic abstracting and indexing, web mining, search engines, etc. Each grammatical dictionary includes words (no fewer than 100,000 for each language) and morphological features, such as POS, declension/conjugation models, etc. The first grammatical dictionary in this project developed by Lingvistica '98 is that for the Polish language. It includes almost 120,000 words with grammatical features (as well as pronunciations) attached to each word. 
Dictionaries for the following languages have also been developed and are currently being tested and proof-edited: German, Russian, Ukrainian. The following ones are under way: Dutch, English, French, Turkish, and Pushtu. We are also developing bi-directional dictionaries supplied as linguistic resources. We are working in close collaboration with LogoMedia, the well-known Boston-based MT developer, SCIPER, a French company specializing in developing linguistic resources, The CJK Dictionary Institute, Inc., based in Japan, Onyx Consulting and Computer Research Laboratory, New Mexico, USA. August 11, 2002. Montreal, Canada From cheemin@cs.usm.my Fri Aug 23 23:52:08 2002 From: cheemin@cs.usm.my (LEE) Date: Fri, 23 Aug 2002 15:52:08 -0700 (Pacific Daylight Time) Subject: [MT-List] bilingual text categorization Message-ID: <3D66BC98.000003.00664@lee_chee_min> --------------Boundary-00=_WIJBG6G0000000000000 Content-Type: Multipart/Alternative; boundary="------------Boundary-00=_WIJBBHK0000000000000" --------------Boundary-00=_WIJBBHK0000000000000 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Hello, Is there anyone conducting any research on bilingual text/topic/product categorization? What approaches have been used or proposed to be used? Thanks & regards --------------Boundary-00=_WIJBBHK0000000000000 Content-Type: Text/HTML; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
--------------Boundary-00=_WIJBBHK0000000000000-- --------------Boundary-00=_WIJBG6G0000000000000-- From hatamin@ciyasoft.com Sat Aug 24 05:47:49 2002 From: hatamin@ciyasoft.com (Naquib U. Hatami) Date: Sat, 24 Aug 2002 00:47:49 -0400 Subject: [MT-List] Pashto Machine Translation Message-ID: <002201c24b29$663532e0$acbc6444@Nedah> This is a multi-part message in MIME format. ------=_NextPart_000_0023_01C24B07.DF2392E0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Hi, We are in the process of developing the first Pashto-English-Pashto Machine Translation software. Does anyone know whether such databases have already been developed by anyone? Please respond to hatamin@ciyasoft.com Regards Naquib Hatami EVP Business Development Ciyasoft Coporation, Inc. (703) 899-6060 www.ciyasoft.com ------=_NextPart_000_0023_01C24B07.DF2392E0 Content-Type: text/html; charset="us-ascii" Content-Transfer-Encoding: quoted-printable

Hi,

We are in the process of developing the first = Pashto-English-Pashto Machine Translation software. Does anyone know if the databases are = developed by anyone previously? Please respond to hatamin@ciyasoft.com

 

 

Regards

Naquib Hatami

EVP Business = Development

Ciyasoft Coporation, = Inc.

(703) 899-6060

www.ciyasoft.com

 

 

From hatamin@ciyasoft.com Sat Aug 24 05:50:12 2002
From: hatamin@ciyasoft.com (Naquib U. Hatami)
Date: Sat, 24 Aug 2002 00:50:12 -0400
Subject: [MT-List] Our Farsi Databases
Message-ID: <002701c24b29$bbd7ce60$acbc6444@Nedah>

Hi,

Ciyasoft has developed the first English-Farsi-English Machine Translation software. Can anyone help us sell these products?

Regards

Naquib Hatami
EVP Business Development
Ciyasoft Corporation, Inc.
(703) 899-6060
www.ciyasoft.com

From WJHutchins@compuserve.com Fri Aug 23 13:31:13 2002
From: WJHutchins@compuserve.com (John Hutchins)
Date: Fri, 23 Aug 2002 08:31:13 -0400
Subject: [MT-List] back issues of MT News International
Message-ID: <200208230831_MC3-1-CA6-F5B4@compuserve.com>

MT News International nos. 1-18, 1992-1997

These issues are now available as PDF files on the EAMT website (http://www.eamt.org/mtni.html)

They contain a wealth of information about MT systems and developments during the 1990s, some of which may be difficult to find otherwise.

John Hutchins
23 Aug 2002

From cheemin@cs.usm.my Tue Aug 27 00:26:14 2002
From: cheemin@cs.usm.my (LEE)
Date: Mon, 26 Aug 2002 16:26:14 -0700 (Pacific Daylight Time)
Subject: [MT-List] Latent Semantic Indexing for product categorization
Message-ID: <3D6AB916.000003.01076@lee_chee_min>

Is Latent Semantic Indexing just for information retrieval?
Can it be used for product categorization?

Let's say I have an English query: how can I conduct product categorization if the documents are in Malay?

From olyturralde@yahoo.com Tue Aug 27 05:25:55 2002
From: olyturralde@yahoo.com (Orfi Yturralde)
Date: Tue, 27 Aug 2002 05:25:55 +0100 (BST)
Subject: [MT-List] research on unsupervised grammar acquisition system
Message-ID: <20020827042555.37549.qmail@web9207.mail.yahoo.com>

Hello!

I'm currently working on a thesis proposal entitled Unsupervised Grammar Acquisition System for Tagalog Language. Has anybody worked on the same topic, or is anybody presently doing a similar project? I would be very glad to have a copy of your work for my related literature. It would surely be a big help.

Thanks,
Orfi

From T.Peers@postgrad.umist.ac.uk Tue Aug 27 19:21:48 2002
From: T.Peers@postgrad.umist.ac.uk (Toby Peers)
Date: Tue, 27 Aug 2002 19:21:48 +0100
Subject: [MT-List] Partial Translation/Language Acquisition Project
Message-ID: <1030472508.3d6bc33ca0bf0@webmail1.umist.ac.uk>

Hi All

I'm currently finishing off my MSc dissertation in CALL. I've built a website which is supposed to generate comprehensible input for German learners of English. It does this by automatically mixing German and English into a 'hybrid text'. Does anyone think this system might be of use as a form of automatic partial MT, as I suggest at the site below?

http://lismore.ccl.umist.ac.uk/L5081/toby/XML_stuff/hybrid_texts/html/

Any comments or information on other attempts at the automatic generation of textual comprehensible input or partial MT would be most welcome (there are a few links to previous projects at the above site).

Thanks
Toby

From cheemin@cs.usm.my Wed Aug 28 21:44:06 2002
From: cheemin@cs.usm.my (LEE)
Date: Wed, 28 Aug 2002 13:44:06 -0700 (Pacific Daylight Time)
Subject: [MT-List] categorizing documents
Message-ID: <3D6D3616.000005.01320@lee_chee_min>

Is anyone doing any research on enriching the concepts of a category?
E.g. organizing mass information by categorizing documents according to their topics in a bilingual corpus.

From WJHutchins@compuserve.com Mon Sep 2 13:44:35 2002
From: WJHutchins@compuserve.com (John Hutchins)
Date: Mon, 2 Sep 2002 08:44:35 -0400
Subject: [MT-List] Multilingual Content Creation Survey Sponsorship
Message-ID: <200209020844_MC3-1-E02-6494@compuserve.com>

Please pass this notice to anyone who may be interested. Replies should be made to Michael Anobile

--------------------------------------------------------
RE: Multilingual Content Creation Survey Sponsorship

LISA and OSCAR are looking for help to promote an Industry Survey that will educate the international business community concerning the costs, markets, challenges and outlook associated with multilingual content creation and localization.

Multilingual content authoring and content localization (i.e., content creation) can be considered as "two sides of the same coin": high quality information, which is delivered in our customers' preferred language. Today, however, due to different technical environments, workflows and user preferences, content creation is not being efficiently managed.

The survey's goal is to analyze content creation in the context of global business requirements so that greater synergy between the processes, the technologies, the clients and their service partners will result. The survey addresses all aspects of content creation in terms of current and future practices. The results of this survey will include market and application data, content revenues and costs, user opinions and forecasts regarding workflow technology and trends.

The survey will run in September 2002, and selected results will be presented at the LISA Forum Europe 2002 in Heidelberg (see below for details).

The survey is being driven by OSCAR, LISA's standards group for 'Open Standards for Container/Content Allowing Re-use' and supported by the MultiLingual Technology Group at SAP AG, the title Sponsor for the LISA Forum Europe. The Survey will be distributed to over 15,000 companies. The final results will be available in November.

There are several ways to help support the Survey.

1. Sponsorship of USD $1,500.00 entitles you to a banner advert on the survey Web page, company recognition on all survey announcements, the raw data/answers to the survey, and your company logo listed as a sponsor in the final report/publication.
2. Promote the survey on your own Web site with a link to the questionnaire
3. Promote the survey to your partners or customers (e.g. in electronic mail, newsletters, etc.)
4. Participate in the survey

The pricing scheme for the survey's final report is as follows. When the survey is:

- completed and ordered same day @ USD $150.00
- completed by LISA member @ USD $225.00, company $275.00
- not completed by LISA member @ USD $295.00, company $375.00
- non-member completed survey @ USD $295.00, company $375.00
- public purchase @ USD $495.00, company $595.00

The survey and the proceeds collected will be used to:

- Sponsor OSCAR activities
- Feature standards sessions and education programs at LISA events
- Help LISA organize similar projects in collaboration with other standards organizations

Please let us know by August 30, 2002 if you agree to support the survey, and how.

A SAMPLE of the survey is available at http://www.lisa.org/interact/2002/mcc_survey.pdf

Please contact us if you have any questions or would like to discuss the project in more detail.

Thank you and best regards,

Daniel Grasmick, OSCAR Chair
Mike Anobile, LISA Director

============================================

About LISA
----------
The Localization Industry Standards Association (LISA) is the premier organization for the GILT (Globalization, Internationalization, Localization, and Translation) business communities. Consisting of over 200 leading IT companies, solutions providers, and an increasing number of vertical market corporations with an internationally focused strategy, LISA provides best practice, business guidelines, and multilingual communication standards for translation and localization workflow, and enterprise globalization. LISA initiatives, conferences, and training programs help companies implement cost-effective international business models.

Web Site: www.lisa.org

About the LISA Forum Europe 2002
--------------------------------
The topic of the LISA Forum Europe 2002 is 'Standards in Localization and Translation - Multilingual Content Creation, Workflow Management, Web-Services and your Company's ROI'. The Forum takes place November 4-7, 2002 at the Marriott Hotel in Heidelberg, Germany.

Web Site: http://www.lisa.org/events/2002europe/

============================================

From asirjean@langtech.co.uk Thu Sep 5 10:45:26 2002
From: asirjean@langtech.co.uk (Angelique Sirjean)
Date: Thu, 05 Sep 2002 10:45:26 +0100
Subject: [MT-List] Integrating TM and MT technologies with automated management facilities
Message-ID: <5.1.1.6.0.20020905104152.030cd628@192.168.0.1>

PRESS RELEASE
5th September 2002
London, UK

Integrating TM and MT technologies with automated management facilities

The Language Technology Centre, in conjunction with German software developer CAS Software as technology partner and two user organisations in Italy and Greece, has obtained funding from the eContent programme of the European Commission for a trial. The consortium is in the process of developing a web-standard-compliant interface and testing the integration of translation technology and customised machine translation within a well-defined domain and user environment. A first pilot was ready at the end of June and is currently being tested and linguistically fine-tuned. CAS Software and LTC as technology partners intend to release the outcome of the project as a product suitable for a variety of multilingual web communication environments.

ABOUT LTC
Language Technology Centre (LTC) was established in 1992 by Dr Adriane Rinsche with the objective of providing language technology solutions to a wide variety of potential application areas. LTC specialises in building multilingual websites, software localisation, consultancy in language technology, technical translation, and software development.

Press contact:
Dr Adriane Rinsche
The Language Technology Centre Ltd,
5-7 Kingston Hill,
Kingston,
Surrey KT2 7PW
UK
Phone: +44-20-8549-2359
Fax: +44-20-8974-6994
E-mail: info@langtech.co.uk
Web: http://www.langtech.co.uk

Angélique Sirjean - Marketing
The Language Technology Centre Ltd
Tel: +44 (0)20 8549 6267
Fax: +44 (0)20 8974 6994
5-7 Kingston Hill
Kingston upon Thames
Surrey - KT2 7PW - GB
http://www.langtech.co.uk

From jack@kanji.org Fri Sep 6 13:42:51 2002
From: jack@kanji.org (Jack Halpern)
Date: Fri, 06 Sep 2002 07:42:51 -0500
Subject: [MT-List] A question on terminology management
Message-ID: <200209061242.AA18463@mail.kanji.org>

Hello from Japan

I am Jack Halpern, the CEO of The CJK Dictionary Institute, which specializes in building large-scale CJK lexical databases, especially for proper nouns and technical terminology (see http://www.cjk.org).

I would like to investigate the merits of the various terminology management systems available, and wonder if there is a detailed website on this or if someone can list the main or recommended products, especially those that support Chinese.

Thanks for your help.

Regards, Jack Halpern
President, The CJK Dictionary Institute, Inc.
http://www.cjk.org
Phone: +81-48-473-3508

From Koen.Kerremans@ehb.be Fri Sep 6 09:32:38 2002
From: Koen.Kerremans@ehb.be (KERREMANS, Koen)
Date: Fri, 6 Sep 2002 10:32:38 +0200
Subject: [MT-List] textual material related to the field of financial forensics
Message-ID: <000201c2557f$f5704b60$fc09a9c0@studttk.ehb.be>

Hi all,

I am a participant in the FF POIROT project, which started only one week ago and aims at compiling an ontology for the financial forensics domain. I was wondering if someone knows of good parallel or comparative textual material (for Dutch, Italian, French and English) that I can use for the development of a multilingual terminological database.

Thanks,

Koen Kerremans

From Christian.Boitet@imag.fr Fri Sep 6 07:35:21 2002
From: Christian.Boitet@imag.fr (Christian Boitet)
Date: Fri, 6 Sep 2002 08:35:21 +0200
Subject: [MT-List] Latent Semantic Indexing for product categorization
In-Reply-To: <3D6AB916.000003.01076@lee_chee_min>
References: <3D6AB916.000003.01076@lee_chee_min>
Message-ID:

Hello! 6/9/02

At 16:26 -0700 26/08/02, LEE wrote:
>Is Latent Semantic Indexing just for information retrieval?
>Can it be used for product categorization?
>
>Let's say I have an English query: how can I conduct product
>categorization if the documents are in Malay?

Conceptual vectors may help. They can be used for cross-lingual thematic indexing. You need a specialized concept space CS (e.g. leaves of a thesaurus) in English. To get to the Malay, either adapt the same CS to Malay, or translate the salient Malay words of your documents into English.

Ref: articles by Mathieu Lafourcade at LREC-02, TALN-00, -01, -02, COLING-00, COLING-02 and their references.
mailto:Mathieu.Lafourcade@lirmm.fr

To get access to Malay monolingual dictionaries, mailto:zarin@cs.usm.MY (zaharin), Tang Enya Kong

Best,
Xan
--
-------------------------------------------------------------------------
Christian Boitet (Pr. Université Joseph Fourier)  Tel: +33.4-7651-4355/4817
GETA, CLIPS, IMAG-campus, BP53                    Fax: +33.4-7651-4405
385, rue de la Bibliothèque                       Email: Christian.Boitet@imag.fr
38041 Grenoble Cedex 9, France                    Mobile: +33-(0)6-6005-1969
http://www-clips.imag.fr/geta/christian.boitet
-------------------------------------------------------------------------
Dictionary servers: SILFIDE project (http://silfide.imag.fr), and more specifically French-Malay (http://www-clips.imag.fr/geta/services/fem/)
C-STAR project (http://www.c-star.org/) and the European Nespole project (http://nespole.itc.it) on speech translation
UNL project on multilingual communication and information retrieval over the network: http://www.unl.ias.unu.edu or http://www.unl.org
PAPILLON project for the cooperative construction of a multilingual lexical database and of dictionaries: http://www.papillon-dictionary.org/

From macklovi@IRO.UMontreal.CA Fri Sep 6 19:23:21 2002
From: macklovi@IRO.UMontreal.CA (Elliott Macklovitch)
Date: Fri, 06 Sep 2002 14:23:21 -0400
Subject: [MT-List] AMTA-2002: Call for Participation
Message-ID: <3D78F299.E6F99DC4@IRO.UMontreal.CA>

--- AMTA-2002 CONFERENCE ---

*** CALL FOR PARTICIPATION ***

The Association for Machine Translation in the Americas (AMTA) invites all those interested in translation automation to attend AMTA-2002, the Association's fifth biennial conference, which will be held in Tiburon, California (across the Bay from San Francisco) on October 8-12, 2002.

The theme of this year's conference is "From Research to Real Users". AMTA-2002 will feature invited talks by: Ken Church of AT&T Labs - Research; Yorick Wilks of the University of Sheffield; and Jaap van der Meer, former CEO of ALPNET.

The full program, as well as online registration forms, is available on the conference Web site: http://www.amtaweb.org/AMTA2002/

*** We encourage people to register now, particularly for the tutorial
*** program. (Tutorials with insufficient registration are subject to
*** cancellation PRIOR TO the conference.) We also encourage you to
*** reserve your accommodations asap, in order to take advantage of
*** the discounted rates at the conference venue.

We look forward to seeing you in Tiburon!

THE CONFERENCE ORGANIZERS

Elliott Macklovitch, General Chair
Stephen D. Richardson, Program Chair
Violetta Cavalli-Sforza, Local Arrangements Chair
Bob Frederking, Workshops and Tutorials
Laurie Gerber, Exhibits Coordinator

From cheemin@cs.usm.my Mon Sep 23 23:22:24 2002
From: cheemin@cs.usm.my (Chee Min)
Date: Mon, 23 Sep 2002 15:22:24 -0700 (Pacific Daylight Time)
Subject: [MT-List] logical form
Message-ID: <3D8F9420.000003.00248@lee-chee-min>

Hello,

Does anybody have a solution or suggestion on how to build the logical form developed by Microsoft Research Centre to perform grammar checking? It has been implemented inside the Microsoft Word 95 Spelling Checker.

Thank you.

From cheemin@cs.usm.my Wed Sep 25 20:00:08 2002
From: cheemin@cs.usm.my (Chee Min)
Date: Wed, 25 Sep 2002 12:00:08 -0700 (Pacific Daylight Time)
Subject: [MT-List] categorization
Message-ID: <3D9207B8.000005.00548@lee-chee-min>

Does anyone know how to categorize Malay and English texts/documents, or those of any equivalent language pair?

If so, please send me some references.

Thank you

From Koen.Kerremans@ehb.be Wed Sep 25 07:26:42 2002
From: Koen.Kerremans@ehb.be (KERREMANS, Koen)
Date: Wed, 25 Sep 2002 08:26:42 +0200
Subject: [MT-List] term extraction
Message-ID: <000901c2645c$845902f0$fc09a9c0@studttk.ehb.be>

Hello,

Does anyone know of good term extraction tools/methods? My purpose is to compare some of the existing methodologies to one another and to evaluate their performance on domain-specific texts. Good references or surveys of term extraction tools/methods are welcome as well.

Regards,

Koen Kerremans

From a.hartley@leeds.ac.uk Wed Sep 25 13:39:55 2002
From: a.hartley@leeds.ac.uk (Tony Hartley)
Date: Wed, 25 Sep 2002 13:39:55 +0100
Subject: [MT-List] Translation Post-Doc at Leeds UK
Message-ID: <013201c26490$a72a7140$79c00b81@SMLAH>

University of Leeds, UK

Postdoctoral Research Fellow in Translation Studies
Centre for Translation Studies
School of Modern Languages and Cultures

This full-time post is available for a fixed term of three years, from 1 January 2003 or as soon as possible thereafter.

With your demonstrable research potential in translation, interpreting or subtitling, you will make a key contribution to the expanding research activities of this recently created Centre. You will be enthusiastic about viewing translation (or subtitling or interpreting) as an inherently collaborative activity, often mediated by technology. The Centre has excellent interdisciplinary links with communication studies, computer science, education, linguistics and psychology, so you must have a sense of intellectual adventure and feel comfortable using computer tools for language analysis. You are able to publish in English and have excellent knowledge of at least one of the Centre's languages of research: Arabic, Bulgarian, Chinese, French, German, Greek, Italian, Japanese, Portuguese, Russian, Spanish. You will co-supervise PhD students.

Interviews will probably take place on 14 and 15 November 2002.

Salary: Research 1A (£17,626 - £26,491 p.a.)

Informal enquiries to Professor Tony Hartley, Centre for Translation Studies
email a.hartley@leeds.ac.uk
tel. +44 (0)113 343 3285

Further details are available from http://www.leeds.ac.uk/jobadverts/ > 'Academic Related', or contact Human Resources
tel.: +44 (0)113 343 5771

Job ref.: R35/13
Closing date: 21 October 2002

Hi,

I'm looking for translation dictionaries (preferably in electronic form), related to the financial and/or legal domain. Does someone happen to know whether these sources exist and where I can find them?

Thanks,

Koen Kerremans

From d@mondialsolutions.net Wed Oct 2 14:33:28 2002
From: d@mondialsolutions.net (Dieter H. Dreiser - mondial solutions)
Date: Wed, 2 Oct 2002 21:33:28 +0800
Subject: [MT-List] Aerospace translators required
In-Reply-To: <000201c26954$634a0580$fc09a9c0@studttk.ehb.be>
Message-ID:

Hi, we require translators for English into FIGS experienced in ATA 100 and ATA 200 compliance.
Please reply to ATA@mondialsolutions.net From gor@acm.org Tue Oct 1 16:57:42 2002 From: gor@acm.org (Gregor Erbach) Date: Tue, 1 Oct 2002 16:57:42 +0100 Subject: [MT-List] Re: [Corpora-List] looking for domain-specific translation dictionaries In-Reply-To: <000201c26954$634a0580$fc09a9c0@studttk.ehb.be> References: <000201c26954$634a0580$fc09a9c0@studttk.ehb.be> Message-ID: <1033487862.3d99c5f6ce597@www2.dfki.de> Quoting "KERREMANS, Koen" : > I'm looking for translation dictionaries (preferably in electronic form), > related to the financial and/or legal domain. Does someone happen to know > whether these sources exist and where I can find them? For a German resource in the finance/business domain, try Mr. Honey's Business English Dictionary (German/English). For multilingual dictionaries, the EU resources Eurodicautom (used by the commission's translation service) and Euterpe (European parliament terminlogy database) should contain a lot of financial/legal terms as well. It may be difficult to obtain the data; we used them for a EU-funded multilingual search engine project. regards, Gregor Erbach ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Dr. Gregor Erbach http://purl.org/net/gregor/ Saarland University http://www.uni-sb.de/ Computational Linguistics Dept. http://www.coli.uni-sb.de/ Project COLLATE http://collate.dfki.de/ Tel. +49 (681) 302-5354 mailto:gor@acm.org From Koen.Kerremans@ehb.be Fri Oct 4 08:36:46 2002 From: Koen.Kerremans@ehb.be (KERREMANS, Koen) Date: Fri, 4 Oct 2002 09:36:46 +0200 Subject: [MT-List] term extraction info In-Reply-To: Message-ID: <000001c26b78$cbe8bbb0$fc09a9c0@studttk.ehb.be> Hello, These are the references I got in answer to my question concerning "term extraction" (cf. see below). Each reference is preceded by the name of the person who gave me the information. Feel free to add more info to this list. Regards, Koen Kerremans 1. Books: -Jerome Richalot: Pearson, J. (1998). Terms in context. 
John Benjamins Publishing Company. 2. Articles: -Piklu Gupta: Heid, U. (1999). "Extracting terminologically relevant collocations from German technical texts". [search via Google] -Klaus Fleischmann: L'homme, Benali, Bertrand, Laudique. (1996). "Definition of an evaluation grid for term extraction software". In Terminology 3:2. Benjamins Publishing Co. -Chantal Enguehard: Enguehard, C., Pantéra, L., "Automatic Natural Acquisition of a Terminology", Journal of quantitative linguistics, vol.2, n°1, pp.27-32, 1995. -Chantal Enguehard: C. Enguehard, B. Daille, E. Morin, “Tools for Terminology Processing”, The Indo-European Conference on Multilingual Communications Technologies (IEMCT), R. K. Arora, M. Kulkarni, H. Darbari (editors), Tata McGraw-Hill, pp.218-229, Pune, India, June 2002. 3. Websites: -Johan Haller: http://www.iai.uni-sb.de/de/pub.html -John Kohl: http://www.xplanation.com [xplanation has a term-extraction tool that is part of the MT system that this company uses. It is pretty good at identifying noun phrases. They are located in Leuven. They also have controlled-English software] -Ross Smith: http://www.mkms.xerox.com [XEROX have a terminology management program called XTS which contains an extraction function] -François Rousselot: http://www-ensais.u-strasbg.fr/liia/LIIA_Products_Installers/install.htm [this tool is based on repeated segments; the program includes a small amount of English documentation] -Scheiden: http://www.biomath.jussieu.fr/ATALA/outil/ [section "Extraction de termes"] 4. Notes: -Sabine Kirchmeier-Andersen (http://www.id.cbs.dk/medarbej/ska/sabine_da.shtml) recommends Word Smith Tools and Quirk, both of which can use an LGP corpus to automatically identify frequent LSP candidates. She thinks the latest articles by Beatrice Daille et al. about term extraction describe the most efficient methodology.
-Antal van den Bosch (http://ilk.kub.nl/~antalb/) experimented with a memory-based shallow parser, after which he extracted terms using the statistical tf*idf method > -----Original Message----- > Hello, > > Does anyone know of good term extraction tools/methods? My purpose is to > compare some of the existing methodologies to one another and to evaluate > their performances on domain-specific texts. Good references or surveys of > term extraction tools/methods are welcome as well. > > Regards, > > Koen Kerremans > From Koen.Kerremans@ehb.be Fri Oct 4 08:44:44 2002 From: Koen.Kerremans@ehb.be (KERREMANS, Koen) Date: Fri, 4 Oct 2002 09:44:44 +0200 Subject: [MT-List] term extraction info (2) Message-ID: <000101c26b79$e85f4dd0$fc09a9c0@studttk.ehb.be> !!I forgot some references. This is the new (complete) version of responses!! Hello, These are the references I got in answer to my question concerning "term extraction" (see below). Each reference is preceded by the name of the person who gave me the information. Feel free to add more info to this list. Regards, Koen Kerremans 1. Books: -Jerome Richalot: Pearson, J. (1998). Terms in context. John Benjamins Publishing Company. -Jorge Vivaldi: Bourigault, D.; Jacquemin C. and M.-C. L'Homme (eds.) Recent Advances in Computational Terminology. John Benjamins Publishing Company. Amsterdam. 2. Articles: -Piklu Gupta: Heid, U. (1999). "Extracting terminologically relevant collocations from German technical texts". [search via Google] -Klaus Fleischmann: L'homme, Benali, Bertrand, Laudique. (1996). "Definition of an evaluation grid for term extraction software". In Terminology 3:2. Benjamins Publishing Co. -Chantal Enguehard: Enguehard, C., Pantéra, L., "Automatic Natural Acquisition of a Terminology", Journal of quantitative linguistics, vol.2, n°1, pp.27-32, 1995. -Chantal Enguehard: C. Enguehard, B. Daille, E.
Morin, “Tools for Terminology Processing”, The Indo-European Conference on Multilingual Communications Technologies (IEMCT), R. K. Arora, M. Kulkarni, H. Darbari (editors), Tata McGraw-Hill, pp.218-229, Pune, India, June 2002. -Jorge Vivaldi: Vivaldi, J. and H. Rodríguez (2000) "Improving term extraction by combining different techniques". Ananiadou S. and D. Maynard (eds.) in the proceedings of Workshop on Computational terminology for medical and biological Applications (NLP2000). Patras, 4 June. pp. 61-68. -Jorge Vivaldi: Terminology, Vol. 7, no. 1. John Benjamins. pp. 31-47. John Benjamins Publishing Company, Amsterdam. -Vivaldi J. and H. Rodríguez (2002) Medical Term Extraction using EWN ontology. Proceedings of "Terminology and Knowledge Engineering 2002" (TKE'02). Nancy. 3. Websites: -Johan Haller: http://www.iai.uni-sb.de/de/pub.html -John Kohl: http://www.xplanation.com [xplanation has a term-extraction tool that is part of the MT system that this company uses. It is pretty good at identifying noun phrases. They are located in Leuven. They also have controlled-English software] -Ross Smith: http://www.mkms.xerox.com [XEROX have a terminology management program called XTS which contains an extraction function] -François Rousselot: http://www-ensais.u-strasbg.fr/liia/LIIA_Products_Installers/install.htm [this tool is based on repeated segments; the program includes a small amount of English documentation] -Scheiden: http://www.biomath.jussieu.fr/ATALA/outil/ [section "Extraction de termes"] -Nicholas Hernandez: http://www.limsi.fr/Individu/jacquemi/ [Fastr is a parser for term and variant recognition. Fastr takes as input a corpus and a list of terms and outputs the indexed corpus in which terms and variants are recognized] 4. Notes: -Sabine Kirchmeier-Andersen (http://www.id.cbs.dk/medarbej/ska/sabine_da.shtml) recommends Word Smith Tools and Quirk, both of which can use an LGP corpus to automatically identify frequent LSP candidates.
She thinks the latest articles by Beatrice Daille et al. about term extraction describe the most efficient methodology. -Antal van den Bosch (http://ilk.kub.nl/~antalb/) experimented with a memory-based shallow parser, after which he extracted terms using the statistical tf*idf method > -----Original Message----- > Hello, > > Does anyone know of good term extraction tools/methods? My purpose is to > compare some of the existing methodologies to one another and to evaluate > their performances on domain-specific texts. Good references or surveys of > term extraction tools/methods are welcome as well. > > Regards, > > Koen Kerremans > From Koen.Kerremans@ehb.be Fri Oct 4 09:01:06 2002 From: Koen.Kerremans@ehb.be (KERREMANS, Koen) Date: Fri, 4 Oct 2002 10:01:06 +0200 Subject: [MT-List] term extraction info (3) In-Reply-To: <000101c26b79$e85f4dd0$fc09a9c0@studttk.ehb.be> Message-ID: <000001c26b7c$31c3afa0$fc09a9c0@studttk.ehb.be> Several people also recommended the Trados software programme ExtraTerm. More info can be found at: http://www.trados.com Regards, Koen -----Original Message----- From: mt-list-admin@eamt.org [mailto:mt-list-admin@eamt.org] On Behalf Of KERREMANS, Koen Sent: Friday 4 October 2002 9:45 To: mt-list@eamt.org; corpora@hd.uib.no Subject: [MT-List] term extraction info (2) !!I forgot some references. This is the new (complete) version of responses!! Hello, These are the references I got in answer to my question concerning "term extraction" (cf. see below). Each reference is preceded by the name of the person who gave me the information. Feel free to add more info to this list. Regards, Koen Kerremans 1. Books: -Jerome Richalot: Pearson, J. (1998). Terms in context. Benjamins, John Publishing Company. -Jorge Vivaldi: Bourigault, D.; Jacquemin C. y M.-C. L'Homme (eds.) Recent Advances in Computational Terminology. John Benjamins Publishing Company. Amsterdam. 2. Articles: -Piklu Gupta: Heid, U. (1999).
"Extracting terminologically relevant collocations from German technical texts". [search via Google] -Klaus Fleischmann: L'homme, Benali, Bertrand, Laudique. (1996). "Definition of an evaluation grid for term extraction software". In Terminology 3:2. Benjamins Publishing Co. -Chantal Enguehard: Enguehard, C., Pantéra, L., "Automatic Natural Acquisition of a Terminology", Journal of quantitative linguistics, vol.2, n°1, pp.27-32, 1995. -Chantal Enguehard: C. Enguehard, B. Daille, E. Morin, “Tools for Terminology Processing”, The Indo-European Conference on Multilingual Communications Technologies (IEMCT), R. K. Arora, M. Kulkarni, H. Darbari (editors), Tata McGraw-Hill, pp.218-229, Pune, India, June 2002. -Jorge Vivaldi: Vivaldi, J. y H. Rodríguez (2000) "Improving term extraction by combining different techniques". Ananiadou S. y D. Maynard (eds.) in the proceedings of Workshop on Computational terminology for medical and biological Applications (NLP2000). Patras, 4 de junio. Pags. 61-68. -Jorge Vivaldi: Terminology, Vol. 7, num.1. John Benjamins. Pag. 31-47. John Benjamins Publishing Company, Amsterdam. -Vivaldi J. and H. Rodríguez (2002) Medical Term Extraction using EWN ontology. Proceedings of "Terminology and Knowledge Engineering 2002" (TKE'02). Nancy. 3. Websites: -Johan Haller: http://www.iai.uni-sb.de/de/pub.html -John Kohl: http://www.xplanation.com [xplanation has a term-extraction tool that is part of the MT system that this company uses. It is pretty good at identifying noun phrases. They are located in Leuven. 
They also have controlled-English software] -Ross Smith: http://www.mkms.xerox.com [XEROX have a terminology management program called XTS which contains an extraction function] -François Rousselot: http://www-ensais.u-strasbg.fr/liia/LIIA_Products_Installers/install.htm [this tool is based on repeated segments: there is a small english documentation in the program] -Scheiden: http://www.biomath.jussieu.fr/ATALA/outil/ [section "Extraction de termes"] -Nicholas Hernandez: http://www.limsi.fr/Individu/jacquemi/ [Fastr is a parser for term and variant recognition. Fastr take as input a corpus and a list of terms and ouputs the indexed corpus in which terms and variants are recognized] 4. Notes: -Sabine Kirchmeier-Andersen (http://www.id.cbs.dk/medarbej/ska/sabine_da.shtml) recommends Word Smith Tools and Quirk who both can use a LGP korpus in order to identify automatically frequent LSP candidates. She thinks the latest articles by Beatrice Daille et al. about term extraction describe the most efficient methodology. -Antal van den Bosch (http://ilk.kub.nl/~antalb/) did some experimenting with a memory-based shallow parser after which he extracted terms using the tf*idf method in statistics > -----Original Message----- > Hello, > > Does anyone know of good term extraction tools/methods? My purpose is to > compare some of the existing methodologies to one another and to evaluate > their performances on domain-specific texts. Good references or surveys of > term extraction tools/methods are welcome as well. 
> > Regards, > > Koen Kerremans > -- For MT-List info, see http://www.eamt.org/mt-list.html From bermanv@zahav.net.il Fri Oct 4 10:02:34 2002 From: bermanv@zahav.net.il (Vadim Berman) Date: Fri, 4 Oct 2002 11:02:34 +0200 Subject: [MT-List] Partner wanted Message-ID: <000001c26b89$dcb77f60$0200a8c0@mshome.net> Hello all, I work on an innovatory machine translation engine which includes features not found in most commercial applications (the features are listed later). Currently I have a working prototype and a small demo database. I am looking for a company or an institution to provide me with administrative and marketing support; also, while it is possible for me to construct the real-world database by myself for some languages, it would be beneficial to employ professional linguists for this task. The distinctive features of my engine are: 1. Innovatory lexical data organization which allows: * Faster and more nuance-tolerant word entry * Style sensitivity and control - ability to reshape the result so it will bear the same meaning but in different style. For example, the user may write a phrase using common words in French, and the program would translate it into official style in English.
Also, style statistics contribute to accurate ambiguity resolution * Ability to extract text topics, which again helps to resolve ambiguities by using so-called "flexible domains" * The model and entities are grammar-independent and suitable for any language without the need to reprogram or even recompile * Interlingua approach 2. Descriptive mini-language for collocations, rules and idioms which allows: * Effective handling of "lexical holes" * Another means to resolve ambiguities * Fast entry with a GUI tool, no need for direct linguist-programmer interaction * Override-driven rule / collocation / idiom selection: that is, the more specific rule / collocation / idiom is selected as the active one when two or more rules conflict. * Unknown-word heuristics and even translation, in some cases 3. Scalability and small footprint More info available on request. Best regards, Vadim Berman From olyturralde@yahoo.com Mon Oct 7 11:06:49 2002 From: olyturralde@yahoo.com (=?iso-8859-1?q?Orfi=20Yturralde?=) Date: Mon, 7 Oct 2002 11:06:49 +0100 (BST) Subject: [MT-List] morphological rule acquisition Message-ID: <20021007100649.73655.qmail@web40511.mail.yahoo.com> Hello! I'm currently working on my thesis proposal entitled Automatic Acquisition of Morphological Rules for Tagalog. I would like to hear from anybody working in the same area who could contribute to my related literature. Thank you very much. Orfi From R.Mitkov@wlv.ac.uk Fri Oct 4 10:25:20 2002 From: R.Mitkov@wlv.ac.uk (Ruslan Mitkov) Date: Fri, 04 Oct 2002 10:25:20 +0100 Subject: [MT-List] Re: [Corpora-List] term extraction info (2) In-Reply-To: <000101c26b79$e85f4dd0$fc09a9c0@studttk.ehb.be> Message-ID: <4.1.20021004102253.00bb7b60@mail.wlv.ac.uk> >1. Books: > >-Jerome Richalot: Pearson, J. (1998). Terms in context.
Benjamins, John >Publishing Company. >-Jorge Vivaldi: Bourigault, D.; Jacquemin C. y M.-C. L'Homme (eds.) Recent >Advances in Computational Terminology. John Benjamins Publishing Company. >Amsterdam. This one should not be omitted: Jacquemin, C. 2001. Spotting and discovering terms through NLP. MIT Press. From macklovi@IRO.UMontreal.CA Tue Oct 8 18:27:56 2002 From: macklovi@IRO.UMontreal.CA (via the vacation program) Date: Tue, 8 Oct 2002 13:27:56 -0400 Subject: [MT-List] away from the office | absent du bureau Message-ID: <200210081727.g98HRufF004894@mercure.iro.umontreal.ca> Je serai absent du bureau jusqu'au 15 octobre 2002. Du 8 au 12 octobre, je serai à Tiburon CA où j'assisterai à la conférence AMTA-2002. Si vous devez me rejoindre d'urgence, vous pourriez m'écrire à "E_Macklovitch@hotmail.com". Autrement, je répondrai à votre message dès mon retour. I'll be away from the office until October 15, 2002. From October 8 to October 12, I'll be attending the AMTA-2002 Conference in Tiburon CA. If you need to reach me urgently, you can email me at "E_Macklovitch@hotmail.com". Otherwise, I'll reply to your message as soon as I get back. Elliott Macklovitch Coordonnateur du laboratoire RALI From WJHutchins@compuserve.com Wed Oct 16 11:32:12 2002 From: WJHutchins@compuserve.com (John Hutchins) Date: Wed, 16 Oct 2002 06:32:12 -0400 Subject: [MT-List] Online machine translation collaboartor sought Message-ID: <200210160632_MC3-1-15F0-3253@compuserve.com> RE: Online machine translation collaborator sought AlphaGalileo, the European online research news service (www.alphagalileo.org), is looking for a collaborator to assist in the creation of an online machine translation option within the site. We operate in all European languages, but mainly English, French, Finnish, German, Greek, Portuguese and Swedish. Please get in touch if this is of interest.
Peter Green Director AlphaGalileo: the Internet press centre for European research in science and the arts http://www.alphagalileo.org peter.green@alphagalileo.org Voice: +44 (0) 1793 514276 Fax: +44 (0)870 052 4429 Mobile: +44 (0) 7866 727141 From ling98@videotron.ca Thu Oct 17 01:57:15 2002 From: ling98@videotron.ca (Michael Blekhman) Date: Wed, 16 Oct 2002 20:57:15 -0400 Subject: [MT-List] Call for contributions Message-ID: <002f01c27578$22e12160$6400a8c0@michaelb> Dear colleagues, As a guest editor of the International Journal for Translation, I would like to invite you to contribute to the third (2003) issue of the journal covering practical MT issues. The deadline is January 15, 2003. If you are interested, please feel free to e-mail me your papers, up to 20 pages each, in MS RTF or *.DOC format. The papers should be in English, preferably in the Times New Roman font, 11 pt. Sincerely, Michael Blekhman Lingvistica '98 Inc. Montreal, Canada Lingvistica b.v. Dongen, The Netherlands ling98@canada.com www.ling98.com Tel: 514 331 0172 From amp@apptek.com Mon Oct 21 19:25:18 2002 From: amp@apptek.com (Masoud Pirnazar) Date: Mon, 21 Oct 2002 14:25:18 -0400 Subject: [MT-List] RE: MT-List digest, Vol 1 #55 - 8 msgs In-Reply-To: <20021018230010.67E415383B@pairlist.net> Message-ID: Vadim, I'd like to get a little more info about your system. Also, what language pairs are you experimenting with, and how big a vocabulary/database have you built? Do you do any deep parsing? Or just collocations? We have a "translation memory" tool, which we've been tweaking for a while, that lets the end-user put in phrases and sentences, and will try to match pieces of a new sentence to what's in the database. For a language like Arabic, you really have to do morphology before you can do much matching. A "part of speech" tagger approach helps in the matching and disambiguating, too. It's all statistical, not too "deep meaning" oriented--i.e.
get 95% of the translations through ok, don't worry about the 5% misunderstandings. --Masoud Pirnazar From: "Vadim Berman" To: Date: Fri, 4 Oct 2002 11:02:34 +0200 Subject: [MT-List] Partner wanted Hello all, I work on an innovatory machine translation engine which includes features not found in most commercial applications (the features are listed later). Currently I have working prototype and a small demo database. I am looking for a company or an institution to provide me adminstrative and marketing support; also, while it is possible for me to construct the real-world database by myself for some languages, it would be beneficial to employ professional linguists for this task. The distinctive features of my engine are: 1. Innovatory lexical data organization which allows: * Faster and more nuance-tolerant word entry * Style sensitivity and control - ability to reshape the result so it will bear the same meaning but in different style. For example, the user may write a phrase using common words in French, and the program would translate it into official style in English. Also, style statistic information contributes to accurate ambiguities resolution * Ability to extract text topics, and, again, aiding to resolve ambiguities by using so-called "flexible domains" * Model and entities are grammar-indepent and is suitable for any language without the need to reprogram or even recompile * Interlingua approach 2. Descriptive mini-language for collocations, rules and idioms which allows: * Effective handling of "lexical holes" * Another means to resolve ambiguities * Fast entry with a GUI tool, no need for direct linguist - prorammer interaction * Override-driven rule / collocation / idiom selection: that is, more specific rule / collocation / idiom is selected to be the active one when two or more rules conflict. * Unknown word heuristics & even translation, is some cases 3. Scalability and small footprint More info available on request. 
Best regards, Vadim Berman From Koen.Kerremans@ehb.be Tue Oct 22 09:14:10 2002 From: Koen.Kerremans@ehb.be (KERREMANS, Koen) Date: Tue, 22 Oct 2002 10:14:10 +0200 Subject: [MT-List] alternative methods/approaches in automatic term extraction Message-ID: <000001c279a3$00ae5390$fc09a9c0@studttk.ehb.be> Hi, The possibilities of automatic term extraction are usually explored using statistical information, linguistic information or a combination of both. Does anyone know of "alternative methods/approaches", i.e. methods/approaches that make use of other types of information (besides statistical or linguistic information)? Kind regards, Koen Kerremans From Andy.Way" ***Final Call For Papers*** Joint Conference combining the 7th International Workshop of the European Association for Machine Translation and the 4th Controlled Language Applications Workshop Main Conference theme: Controlled Language Translation Location: Dublin City University, Ireland Dates: 15th-17th May, 2003 Conference URL: http://www.eamt.org/eamt-claw03/ Invited Speakers: Steven Krauwer, University of Utrecht and Coordinator of ELSNET; Lou Cremers, Océ Technologies Over the years, there have been many conferences on MT, involving rule-based approaches, statistical and example-based approaches, hybrid and multi-engine approaches as well as those limited to particular sublanguage domains. In addition, there has been an increased level of interest in controlled languages, culminating in the series of Workshops on controlled language applications. These have given impetus to both monolingual and multilingual guidelines and applications using controlled language, for many different languages. Controlled languages are subsets of natural languages whose grammars and dictionaries have been restricted in order to reduce or eliminate both ambiguity and complexity.
Traditionally, controlled languages fall into two major categories: those that improve readability for human readers, particularly non-native speakers, and those that improve computational processing of the text. It is often claimed that machine-oriented controlled language should be of particular benefit when it comes to the use of translation tools (including machine translation, translation memory, multilingual terminology tools etc.). Experience has shown that high quality MT systems can be designed for specialized domains (e.g. METEO). However, the area of controlled translation has remained relatively unaddressed. This is rather strange given its undoubted importance. Such examples that exist use rule-based MT (RBMT) systems to translate controlled language documentation, e.g. Caterpillar's CTE and CMU's KANT system, and General Motors CASL and LantMark, etc. However, fine-tuning general systems designed for use with unrestricted texts to derive specific, restricted applications is complex and expensive. The primary aim of this unique conference, therefore, is to elicit papers on controlled translation, and provide a forum in which the problems may be outlined, possible solutions proposed, and in general to bring together developers, implementors, researchers and end-users from the publications, authoring, translation and localization fields to discuss how ideas from both the authoring and translation camps might be integrated in this common area. Some specific topics which might be addressed include: * What is controlled translation? * RBMT and controlled translation. * TM/EBMT and controlled translation. * Influence and interplay of controlled language upon both source-language parsing and target-language generation in an MT system. * Role of the lexicon in controlled translation. * Can we expect better controlled translations from a hybrid approach? Or from a multi-engine approach? * Towards a Roadmap for controlled translation - the way ahead?
In addition, we welcome contributions on MT as well as on controlled language which do not address the main theme per se. Please consult the conference URL (http://www.eamt.org/eamt-claw03/) for some suggestions. Important Dates Draft papers due 29th November, 2002; Reviews due 31st January, 2003; Notification of acceptance 14th February, 2003; Camera-ready papers & pre-registration due 31st March, 2003 Submission Details Papers accepted for the conference will be published in a proceedings volume available to all attendees. Papers should describe unique work not published before. Papers that are being submitted to other conferences should include this information on the first page. Paper submissions should follow these conventions: * Maximum length is 4000 words * 8.5" x 11" page size * Single-column, single-spaced, 1" margins * 12 point font * Include title, authors, and contact info centered at the top of the first page * Include an abstract of about 100 words Electronic submission is strongly encouraged. We prefer PDF files, sent as EMail attachments. Electronic submissions should be sent to Eric Nyberg (ehn@cs.cmu.edu), with `Submission for EAMT-CLAW 2003' in the Subject line of the email. Other Information Please consult the conference website at: http://www.eamt.org/eamt-claw03/ or mail Andy Way (away@computing.dcu.ie). From rmk@cse.iitk.ac.in Fri Oct 25 14:31:36 2002 From: rmk@cse.iitk.ac.in (R M K Sinha) Date: Fri, 25 Oct 2002 19:01:36 +0530 (IST) Subject: [MT-List] English to Hindi machine translation Message-ID: Dear colleagues, We have web-enabled a preliminary version of our English to Hindi Machine Aided Translation System. The address is: http://anglahindi.iitk.ac.in Please use the system and provide us with your feedback. Thanks, Dr. R.M.K.
Sinha, Professor, Computer Science & Engineering and Electrical Engineering, Indian Institute of Technology, Kanpur 208016 India | Phone: +91-512-597174 (Office), +91-512-598254/591650 (Residence) | Fax: +91-512-590725 or +91-512-590260 | E-mail: rmk@cse.iitk.ac.in From aligak@yahoo.com Mon Oct 28 02:30:51 2002 From: aligak@yahoo.com (Ellias Kaker) Date: Sun, 27 Oct 2002 18:30:51 -0800 (PST) Subject: [MT-List] Farsi Machine Translation Message-ID: <20021028023051.29876.qmail@web13203.mail.yahoo.com> Hi, I am looking for a good commercial Farsi to English Machine Translation system (bidirectional). This will be used for UN document translation. Thank You Ellias From WJHutchins@compuserve.com Tue Oct 29 10:25:59 2002 From: WJHutchins@compuserve.com (John Hutchins) Date: Tue, 29 Oct 2002 05:25:59 -0500 Subject: [MT-List] update of Compendium Message-ID: <200210290526_MC3-1-17E7-9892@compuserve.com> The latest update (October 2002) of the "Compendium of translation software", which lists current MT systems, MT services, and translation tools of all kinds, is now available on the EAMT website: http://www.eamt.org/compendium.html As before, there are numerous changes, additions and deletions since the previous update (June 2002). If the product you are looking for is not there, it may be very new (let me know and it will be added).
Or perhaps it may no longer be available, and you may find it in an earlier edition of the "Compendium" (see http://ourworld.compuserve.com/homepages/WJHutchins/compendium.htm) John Hutchins 29 October From WJHutchins@compuserve.com Tue Oct 29 14:37:01 2002 From: WJHutchins@compuserve.com (John Hutchins) Date: Tue, 29 Oct 2002 09:37:01 -0500 Subject: [MT-List] Farsi Machine Translation Message-ID: <200210290937_MC3-1-1806-62E@compuserve.com> Dear Ellias Kaker, As far as I know there is only the following:
CiyaTran
Company: Ciyasoft Corporation
Category: MT system
Dictionaries: 85 technical dictionaries
Requirements: 256MB RAM
Translation speed: 400 pages/min.
Price: unknown
Source: http://www.ciyasoft.com
This is from English to Farsi only. The company produces systems also from English to Arabic, Dari and Pashto. However, you will see from the "Compendium of translation software" (www.eamt.org/compendium.htm) that there are a number of companies producing electronic dictionaries for English/Farsi. Best wishes, John Hutchins ========================================= > Hi, I am looking for a good commercial Farsi to English Machine Translation system(bi directional). This will be used on UN document translation. Thank You Ellias< From nikkhou@elda.fr Tue Oct 29 14:58:14 2002 From: nikkhou@elda.fr (Mahtab Nikkhou) Date: Tue, 29 Oct 2002 15:58:14 +0100 Subject: [MT-List] Farsi Machine Translation In-Reply-To: <200210290937_MC3-1-1806-62E@compuserve.com> Message-ID: <4.3.2.7.2.20021029155551.01643ee8@pop3.norton.antivirus> Dear Ellias, You may also have a look at Apptek's website at the following link: http://www.apptek.com. They work on Persian-English MT systems.
Best wishes

Mahtab Nikkhou
ELDA

At 09:37 29/10/2002 -0500, John Hutchins wrote:
>Dear Ellias Kaker,
>
>As far as I know there is only the following:
>
>CiyaTran
>Company: Ciyasoft Corporation
>Category: MT system
>Dictionaries: 85 technical dictionaries
>Requirements: 256MB RAM
>Translation speed: 400 pages/min.
>Price: unknown
>Source: http://www.ciyasoft.com
>
>This is from English to Farsi only. The company produces systems also
>from English to Arabic, Dari and Pashto.
>
>However, you will see from the "Compendium of translation software"
>(www.eamt.org/compendium.htm) that there are a number
>of companies producing electronic dictionaries for English/Farsi.
>
>Best wishes,
>John Hutchins
>
>=========================================
>
>
>Hi,
>I am looking for a good commercial Farsi to English
>Machine Translation system (bidirectional). This will
>be used on UN document translation.
>
>
>Thank You
>Ellias<
>
>
>--
> For MT-List info, see http://www.eamt.org/mt-list.html

*************************************************************************
Mahtab Nikkhou                    mailto:nikkhou@elda.fr
Language Resources Project Manager
ELDA - The Evaluation and Language resources Distribution Agency
55-57, rue Brillat-Savarin, 75013 Paris - France
Tel.:+33 1 43 13 33 33 - Fax:+33 1 43 13 33 30
WWW: http://www.elda.fr
LREC News: http://www.lrec-conf.org
LangTech2002 Overview: http://www.lang-tech.org
Subscribe to our Euromap Language Technologies newsletter:
English version: http://www.hltcentral.org/newsletter
French version: http://www.elda.fr/fr/proj/euromap/newsletter.html
*************************************************************************

From nadamides@aslib.com

Translating & the Computer 24 Conference, 21-22 November 2002 in London.
Speakers include:

Keynote Speaker: Hans Uszkoreit, DFKI, Germany
Daniel Gervais, MultiCorpora R&D Inc., USA
Matthias Heyn, TRADOS, Germany
Karin Spalink, Sony Ericsson, USA
Andrew Bredenkamp, acrolinx GmbH, Germany
Christine Thielen, SAP AG, Germany
Lorna Joy, SCHÜCO International, UK
Yves Champollion, France
Gábor Prószéky, MorphoLogic, Hungary
Reinhard Schäler, University of Limerick, Ireland
Veronique Anne Sauron, University of Geneva, Switzerland
Monika Käser, CLS Corporate Language Services, Switzerland
Mike Roche, IBM Software Group, Ireland
Dan Dube, ISOGEN International, USA
Lee Gillam, University of Surrey, UK

The event is supported by BCS - Natural Language Group, EAMT, IAMT, IoL and ITI. The full programme can be found at www.aslib.com/conferences.

We are pleased to announce that there is a pre-conference seminar: XML - How can it benefit your business? This seminar is aimed at anyone who is interested in finding out what XML is, its uses, and business reasons for implementing it. People from roles as diverse as directors, project managers, subject matter experts, and programmers will find the seminar relevant. Details of the seminar can be found on the Aslib web site.

I look forward to receiving your booking form soon!

NICOLE ADAMIDES, Training
Aslib/IMI, Temple Chambers, 3-7 Temple Avenue, London, EC4Y 0HP
Tel: +44 (0)20 7583 8900
Fax: +44 (0)20 7583 8401
www.aslib.com
Email: nadamides@aslib.com

From nikolay@npp.cit.bg Mon Nov 4 13:39:56 2002
From: nikolay@npp.cit.bg (Nikolay Ivanov)
Date: Mon, 4 Nov 2002 14:39:56 +0100
Subject: [MT-List] East European languages
Message-ID: <00c801c28407$a9eb41c0$4a0510ac@nmarkova>

Dear members of the list,

Nikolay here from Bulgaria. MT in my country is actually at a very early stage, though two commercial products, Bultra (www.bultra.com) and WebTrance (http://tran.skycode.com), for English-Bulgarian machine translation are available on the market.
Do any of you here take any interest in MT of English into Eastern European languages, especially Bulgarian, Serbian and Romanian? Do any of you work on similar projects?

Thanx indeed in advance,

Nikolay Ivanov, PR specialist at Kozloduy NPP (www.kznpp.org)

From rcm@sasaska.net Wed Nov 13 11:49:17 2002
From: rcm@sasaska.net (Rafael Cordones Marcos)
Date: Wed, 13 Nov 2002 12:49:17 +0100
Subject: [MT-List] Crosslingual or Translingual?
Message-ID: <20021113124917.41084221.rcm@sasaska.net>

Hi,

This is a very general question on terminology. What's the difference between "cross-lingual" and "trans-lingual"?

I am preparing my M.Sc. thesis proposal, which has the following tentative title: "TraMaLi: A Translingual Mailing List Manager". It is a mailing list manager that will allow participants to use their native language when sending e-mails to the list and receiving e-mails from it. The software will use external machine translation services to perform the translation and forward the translation to each user according to their preferences.

My questions are:

 - Is the title correct?
 - Bonus ;-) question: do you know of any existing software that already does this? I can send a pointer to the proposal when I have it ready.

Thanks for your time!

Best regards,

Rafa

--
Rafael Cordones Marcos
mailto:rcm@sasaska.net
http://sasaska.net

From Christian.Boitet@imag.fr Thu Nov 14 08:33:53 2002
From: Christian.Boitet@imag.fr (Christian Boitet)
Date: Thu, 14 Nov 2002 09:33:53 +0100
Subject: [MT-List] Crosslingual or Translingual? -- for an e-mail system
In-Reply-To: <20021113124917.41084221.rcm@sasaska.net>
References: <20021113124917.41084221.rcm@sasaska.net>
Message-ID:

Hello! 14/11/02

At 12:49 +0100 13/11/02, Rafael Cordones Marcos wrote:
>Hi,
>
>This is a very general question on terminology. What's the difference
>between "cross-lingual" and "trans-lingual"?

I would say they are almost synonymous.
If you desperately want to separate the meanings, the following can help.
The prefix "cross" comes from "crux" in Latin. It implies the notion that at least 2 lines are crossing each other, the 2 being equal.

The prefix "trans-" also comes from Latin and implies a passage through a limit, that is, only 2 lines crossing each other, with one passive and the other active.

In current English usage, both can mean a passage through something: in English, you say "cross the street", while in French and German you don't say "croiser la rue" or "die Straße kreuzen", but "traverser..." and "...durchqueren".

Hence, we might say that
- translingual can mean "from one language to another"
- crosslingual can mean "between all languages considered"

In other words, a translingual system for languages L1...Ln would not have to offer a direct passage from any Li to any Lj, whereas a crosslingual system would.

>I am preparing my M.Sc. Thesis proposal which has the following
>tentative title "TraMaLi: A Translingual Mailing List Manager". It is a
>mailing list manager that will allow participants to use their native
>language when sending and receiving e-mails to the list. The software
>will use external machine translation services to perform the
>translation and forward the translation to each user according to their
>preferences.
>
>My questions are:
>
> - Is the title correct?
> - Bonus ;-) question: do you know of any existing software that
>   already does this? I can send a pointer to the proposal when I
>   have it ready.

There is a project called UNL-mail by the UNL foundation (UNDL). I don't know the details, but I presume it concentrates on the MT part of your design (format for multilingual e-mails including UNL graphs, manipulations on the mail server to send files to distant "enconverters" and "deconverters" and package/manage the results, etc.). Maybe you could get in contact with H. Uchida on this subject.

>Thanks for your time!
>
>Best regards,
>
>Rafa
>
>--
>Rafael Cordones Marcos
>mailto:rcm@sasaska.net
>http://sasaska.net

Best regards,

CB

--
-------------------------------------------------------------------------
Christian Boitet (Pr. Université Joseph Fourier)   Tel: +33.4-7651-4355/4817
GETA, CLIPS, IMAG-campus, BP53                      Fax: +33.4-7651-4405
385, rue de la Bibliothèque                         E-mail: Christian.Boitet@imag.fr
38041 Grenoble Cedex 9, France                      Mobile: +33-(0)6-6005-1969
http://www-clips.imag.fr/geta/christian.boitet
-------------------------------------------------------------------------
Dictionary servers: SILFIDE project (http://silfide.imag.fr)
  and more particularly French-Malay (http://www-clips.imag.fr/geta/services/fem/)
C-STAR project (http://www.c-star.org/) and European Nespole project
  (http://nespole.itc.it) on speech translation
UNL project on multilingual communication and information retrieval on the network:
  http://www.unl.ias.unu.edu or http://www.undl.org
PAPILLON project on the cooperative construction of a multilingual lexical database
  and the construction of dictionaries: http://www.papillon-dictionary.org/

From Christian.Boitet@imag.fr Tue Nov 19 09:54:27 2002
From: Christian.Boitet@imag.fr (Christian Boitet)
Date: Tue, 19 Nov 2002 10:54:27 +0100
Subject: Re[2]: [MT-List] Crosslingual or Translingual? -- for an e-mail system
In-Reply-To: <63312457.20021117114257@timesup.org>
References: <20021113124917.41084221.rcm@sasaska.net> <63312457.20021117114257@timesup.org>
Message-ID:

Hello Alex, 19/11/02

At 11:42 +0100 17/11/02, alex barth wrote:
>Thursday, November 14, 2002, 9:33:53 AM, Christian.Boitet@imag.fr wrote:
>
>> Hello! 14/11/02
>
>> At 12:49 +0100 13/11/02, Rafael Cordones Marcos wrote:
>>>Hi,
>>>
>>>This is a very general question on terminology. What's the difference
>>>between "cross-lingual" and "trans-lingual"?
>
>> I would say they are almost synonymous.
>> If you want desperately to separate the meanings, the following can help.
>
>> The prefix "cross" comes from "crux" in Latin. It implies the notion
>> that at least 2 lines are crossing each other, the 2 being equal.
>
>> The prefix "trans-" also comes from Latin and implies a passage
>> through a limit, that is, only 2 lines crossing each other, with one
>> passive and the other active.
>
>> In current English usage, both can mean a passage through something:
>> in English, you say "cross the street", while in French and German
>> you don't say "croiser la rue" or "die Straße kreuzen", but
>> "traverser..." and "...durchqueren".
>
>die Straße überqueren!

Interesting and correct. The "Deutsch-Franzoesisch" dictionary by Gisela Liebold and Harald Liebold (VEB Verlag, Leipzig, 1986) does give (p. 98) "durchqueren -> traverser" and (p. 409) "ueberqueren -> traverser"... but "durchqueren" really means "traverser" in the sense of "parcourir" (das Land durchqueren, i.e. to travel across the country).

In any case, my idea was to point out that French and German don't use "croix-" or "kreuz-" as a possible prefix for this meaning. However, the fact that German has 2 possibilities and French only 1 is interesting per se.

>
>> Hence, we might say that
>> - translingual can mean "from one language to another"
>> - crosslingual can mean "between all languages considered"
>
>> In other words, a translingual system for languages L1...Ln would not
>> have to offer a direct passage from any Li to any Lj, a crosslingual
>> system would.

Would you agree with that?

Best,

CB

--
-------------------------------------------------------------------------
Christian Boitet (Pr.
Université Joseph Fourier)                          Tel: +33.4-7651-4355/4817
GETA, CLIPS, IMAG-campus, BP53                      Fax: +33.4-7651-4405
385, rue de la Bibliothèque                         E-mail: Christian.Boitet@imag.fr
38041 Grenoble Cedex 9, France                      Mobile: +33-(0)6-6005-1969
http://www-clips.imag.fr/geta/christian.boitet
-------------------------------------------------------------------------
Dictionary servers: SILFIDE project (http://silfide.imag.fr)
  and more particularly French-Malay (http://www-clips.imag.fr/geta/services/fem/)
C-STAR project (http://www.c-star.org/) and European Nespole project
  (http://nespole.itc.it) on speech translation
UNL project on multilingual communication and information retrieval on the network:
  http://www.unl.ias.unu.edu or http://www.undl.org
PAPILLON project on the cooperative construction of a multilingual lexical database
  and the construction of dictionaries: http://www.papillon-dictionary.org/

From thomas.fallgatter@csfs.com Mon Nov 25 13:30:38 2002
From: thomas.fallgatter@csfs.com (Fallgatter Thomas (KIPL 8))
Date: Mon, 25 Nov 2002 14:30:38 +0100
Subject: [MT-List] Online dictionary English-Thai
Message-ID:

Dear colleagues

Does anybody know of an online English-Thai dictionary on the web?

Thank you

Thomas Fallgatter
Credit Suisse Financial Services
Zurich

From Robert@Zakon.org Mon Nov 25 14:01:16 2002
From: Robert@Zakon.org (Robert H'obbes' Zakon)
Date: Mon, 25 Nov 2002 09:01:16 -0500
Subject: [MT-List] Online dictionary English-Thai
In-Reply-To:
Message-ID:

Search google.com for
  english thai dictionary

You should see a few hits.

- Robert H'obbes' Zakon, www.Zakon.org

From softex@hn.vnn.vn Wed Nov 27 02:27:48 2002
From: softex@hn.vnn.vn (Hung, Le Khanh)
Date: Wed, 27 Nov 2002 09:27:48 +0700
Subject: [MT-List] Grammar
Message-ID: <003101c295d6$26c415c0$5919a8c0@HungSoftex>

Dear colleagues

Does anybody know of any full English Generative Grammar (of any formalism) and medium-sized English corpus available on the web?

Thank you

Le Khanh Hung
National Research Center for Technological Progress
Ha Noi, Viet Nam

From bond@cslab.kecl.ntt.co.jp Wed Nov 27 05:45:52 2002
From: bond@cslab.kecl.ntt.co.jp (bond@cslab.kecl.ntt.co.jp)
Date: Wed, 27 Nov 2002 14:45:52 +0900 (JST)
Subject: [MT-List] Grammar
In-Reply-To: <003101c295d6$26c415c0$5919a8c0@HungSoftex>
References: <003101c295d6$26c415c0$5919a8c0@HungSoftex>
Message-ID: <20021127054552.4E58391C0@pop.cslab.kecl.ntt.co.jp>

> Does anybody know any online full English Generative Grammar (of any
> formalism) and Medium sized English Corpus on the web?

There is a medium size English Generative Grammar (HPSG) available at:
http://www-csli.stanford.edu/~aac/lkb.html

A small amount of text parsed with this grammar is available at:
http://lingo.stanford.edu/ftp/redwoods/
It is almost undocumented (but see http://lingo.stanford.edu/redwoods/ for a description).

There is a huge collection of raw English text available at:
http://gutenberg.net/

--
Francis Bond
NTT Communication Science Laboratories | Machine Translation Research Group

From WJHutchins@compuserve.com Fri Dec 6 10:32:33 2002
From: WJHutchins@compuserve.com (John Hutchins)
Date: Fri, 6 Dec 2002 05:32:33 -0500
Subject: [MT-List] EAMT/CLAW conference May 2003: call for papers
Message-ID: <200212060532_MC3-1-1E3F-9C71@compuserve.com>

****Extended Deadline for Submission: Jan 10th 2003****

Joint Conference combining the 7th International Workshop of the European Association for Machine Translation and the 4th Controlled Language Applications Workshop

Main Conference theme: Controlled Language Translation
Location: Dublin City University, Ireland
Dates: 15th-17th May, 2003
Conference URL: http://www.eamt.org/eamt-claw03/

Invited Speakers:
Steven Krauwer, University of Utrecht and Coordinator of ELSNET
Lou Cremers, Océ Technologies

Over the years, there have been many conferences on MT, involving rule-based approaches, statistical and example-based approaches, hybrid and multi-engine approaches as well as those limited to
particular sublanguage domains. In addition, there has been an increased level of interest in controlled languages, culminating in the series of Workshops on controlled language applications. These have given impetus to both monolingual and multilingual guidelines and applications using controlled language, for many different languages.

Controlled languages are subsets of natural languages whose grammars and dictionaries have been restricted in order to reduce or eliminate both ambiguity and complexity. Traditionally, controlled languages fall into two major categories: those that improve readability for human readers, particularly non-native speakers, and those that improve computational processing of the text. It is often claimed that machine-oriented controlled language should be of particular benefit when it comes to the use of translation tools (including machine translation, translation memory, multilingual terminology tools etc.). Experience has shown that high quality MT systems can be designed for specialized domains (e.g. METEO).

However, the area of controlled translation has remained relatively unaddressed. This is rather strange given its undoubted importance. Such examples that exist use rule-based MT (RBMT) systems to translate controlled language documentation, e.g. Caterpillar's CTE and CMU's KANT system, and General Motors CASL and LantMark, etc. However, fine-tuning general systems designed for use with unrestricted texts to derive specific, restricted applications is complex and expensive.
The primary aim of this unique conference, therefore, is to elicit papers on controlled translation, and provide a forum in which the problems may be outlined, possible solutions proposed, and in general to bring together developers, implementors, researchers and end-users from the publications, authoring, translation and localization fields to discuss how ideas from both the authoring and translation camps might be integrated in this common area. Some specific topics which might be addressed include:

* What is controlled translation?
* RBMT and controlled translation.
* TM/EBMT and controlled translation.
* Influence and interplay of controlled language upon both source-language parsing and target-language generation in an MT system.
* Role of the lexicon in controlled translation.
* Can we expect better controlled translations from a hybrid approach? Or from a multi-engine approach?
* Towards a Roadmap for controlled translation - the way ahead?

In addition, we welcome contributions on MT as well as on controlled language which do not address the main theme per se. Please consult the conference URL (http://www.eamt.org/eamt-claw03/) for some suggestions.

Important Dates/Prizes for 'Best Papers'

Owing to a large number of requests for an extension to the original deadline of Nov 29th (and with apologies, and thanks, to those who have submitted so far), we have come up with the following, new schedule:

Paper Submissions: Jan 10, 2003 (extended)
Reviews due: Feb 14, 2003
Notification of Acceptance: Feb 28, 2003
Camera Ready Copy: Mar 31, 2003

Note that the programme committee will select a set of up to 4 `best papers' (best MT, best Controlled Language, best Controlled Translation, Best Student Submission) for whom registration fees will be waived.

Submission Details

Papers accepted for the conference will be published in a proceedings volume available to all attendees. Papers should describe unique work not published before.
Papers that are being submitted to other conferences should include this information on the first page. Paper submissions should follow these conventions:

* Maximum length is 4000 words
* 8.5" x 11" page size
* Single-column, single-spaced, 1" margins
* 12 point font
* Include title, authors, and contact info centered at the top of the first page
* Include an abstract of about 100 words

Electronic submission is strongly encouraged. We prefer PDF files, sent as EMail attachments. Electronic submissions should be sent to Eric Nyberg (ehn@cs.cmu.edu), with `Submission for EAMT-CLAW 2003' in the Subject line of the email.

Other Information

Please consult the conference website at: http://www.eamt.org/eamt-claw03/ or mail Andy Way (away@computing.dcu.ie).

From mtix@IRO.UMontreal.CA Mon Dec 16 18:02:31 2002
From: mtix@IRO.UMontreal.CA (MT Summit IX)
Date: Mon, 16 Dec 2002 13:02:31 -0500
Subject: [MT-List] MT Summit IX
Message-ID:

MACHINE TRANSLATION SUMMIT IX
September 23-28, 2003
New Orleans, USA
http://www.mt-summit.org

CALL FOR PAPERS

The ninth Machine Translation Summit, organized by the International Association for Machine Translation (IAMT) and hosted by the Association for Machine Translation in the Americas (AMTA), will be held in New Orleans, Louisiana, from 23 to 28 September 2003. MT Summit IX will feature a comprehensive programme that will include research papers, reports on users' experiences, discussions of policy issues, invited talks, panels, exhibits, tutorials, and workshops. We define machine translation in the broadest possible sense, to include not just fully automatic MT but tools for translation support and multilingual text processing as well. We invite all those with an interest in translation automation (researchers, developers, translation service providers, users, or managers) to participate in the conference.

MT Summit IX hereby invites original submissions on all aspects of machine and machine-aided translation. The submissions must be in English and fall into one of three categories:

* research (or theoretical) papers: maximum length 8 pages;
* user's studies (including manager's experiences): maximum length 8 pages;
* system presentations (with optional demos): maximum length 4 pages.

Electronic submissions are strongly preferred, in PDF or MS-Word format only please.

As the reviewing process will be blind, authors of the first two categories (research and user studies) are requested to keep their papers anonymous. This means that these submissions should NOT include the author's name; rather, papers should be identified only by their title.
In addition to the electronic file containing their paper, authors must also submit a separate cover page on which the name and affiliation of the author(s) do appear, along with the title of the paper and the category of the submission. The two electronic files should be attached to an email and sent to the following address: mtix@iro.umontreal.ca

Note that the requirement of anonymity does not apply to system presentations. However, those submitting system descriptions with demos are asked to specify on their cover page the type of equipment they will require for their demonstration.

We will acknowledge receipt of all papers received before the deadline and issue a submission number to each author. Please refer to that number in all subsequent correspondence. Anyone who is unable to make an electronic submission is asked to contact the Program Chair.

Guidelines and style files for the preparation of the final camera-ready copy will be made available to the authors of accepted submissions in due time.

IMPORTANT DATES:

- May 11, 2003 : Deadline for the submission of papers
- June 30, 2003 : Notification of acceptance sent to authors
- July 31, 2003 : Camera-ready copy due
- September 23, 2003 : Conference opens

PROPOSALS FOR PANELS AND SPECIAL SESSIONS:

Proposals are also invited for panel and/or special sessions on issues of general importance to machine and machine-aided translation, which should be the subject of public debate at MT Summit IX. Please send your proposals to the Program Chair and include a description of the session theme, a justification of its importance, and the names of some suggested speakers or panellists.
CONFERENCE WEBSITE:

Please consult the Conference Website for the latest information on all aspects of the Summit: http://www.mt-summit.org

MT SUMMIT IX ORGANIZING COMMITTEE

General Chair:
Eduard Hovy
USC / Information Sciences Institute
hovy@isi.edu

Program Chair:
Elliott Macklovitch
RALI / Université de Montréal
mtix@iro.umontreal.ca

Local Arrangements Chair:
Florence Reeder
Mitre Corporation
freeder@mitre.org

Exhibit and Tutorial Co-chairs:
Laurie Gerber
gerbl@pacbell.net
Keith Miller
keith@mitre.org

Conference Webmaster:
Jin Yang
SYSTRAN Software, Inc.
webmaster@amtaweb.org