Layer

Name Wordlists of languages of Indonesia and the Philippines
Description Recordings and transcription of a large wordlist (c. 1,100 items) for many languages of Indonesia and the Philippines. Additional textual data was collected for a subset of languages. Made as part of the OCSEAN (OCeanic and South East Asian Navigators; MSCA-RISE-2019, Project Number 873207) project. The initial workshop was held in Uppsala, Sweden from 20 June to 15 July 2022.
Type Media
Content Warning
Contributor Mufeng
Entries 8
Allow ANPS? No
Added to System 2023-11-05 00:52:20
Updated in System 2024-03-28 12:02:48
Subject linguistics, language, PARADISEC
Creator
Publisher Owen Edwards
Contact admin@paradisec.org.au
Citation
DOI
Source URL https://catalog.paradisec.org.au/repository/OCSEAN
Linkback https://catalog.paradisec.org.au/repository/OCSEAN
Date From
Date To
Image
Latitude From
Longitude From
Latitude To
Longitude To
Language
License Open (subject to agreeing to PDSC access conditions)
Usage Rights Open (subject to agreeing to PDSC access conditions)
Date Created (externally)

Amanatun wordlist

Type
Other

Details

Latitude
11.2255
Longitude
102.164
Start Date
2022-07-16
End Date
2022-07-16

Description

A wordlist of 1228 items in the Amanatun variety/dialect of Uab Meto. The list was recorded with Alfred in three sessions: evening 04/07/2022, evening 05/07/2022, and afternoon 16/07/2022. Due to the 20-minute limit of the recording device, there are eight recordings in total:
1. 1-220 (recorded 04/07/2022, by I Made Netra)
2. 221-233 (recorded 04/07/2022, by I Made Netra)
3. 234-394 (recorded 05/07/2022, by I Made Netra)
4. 395-439 (recorded 16/07/2022, by Owen Edwards)
5. 440-695 (recorded 16/07/2022, by Owen Edwards)
6. 696-884 (recorded 16/07/2022, by Owen Edwards)
7. 885-1076 (recorded 16/07/2022, by Owen Edwards)
8. 1077-1228 (recorded 16/07/2022, by Owen Edwards)
OCSEAN-AOZ_20220715-WORDLIST.wav contains all eight recordings concatenated into a single file.
OCSEAN-AOZ_20220715-WORDLIST_TEXTGRID.txt contains a TextGrid file made in Praat. It has two tiers: the top tier with the number of the item, and the bottom tier with a broad phonetic transcription made by Owen Edwards. Different responses to the same prompt are differentiated alphabetically (e.g. 1a, 1b).
OCSEAN-AOZ_20220715-WORDLIST.txt is a text file which contains the cleaned-up version of all the Uab Meto data collected, with the orthographic transcriptions made by Alfred Snae and broad phonetic transcriptions made by Owen Edwards.
In Owen Edwards's transcriptions, three phonetic heights are distinguished among the mid vowels: mid-low [ɛ] and [ɔ], mid-high [e] and [o], and slightly higher [ɪ] and [ʊ]. All these mid vowels are transcribed by Alfred Snae (the consultant) as <e> and <o>. In final syllables, mid-high [e o] and slightly higher [ɪ ʊ] are historically high vowels which have lowered after a (historically) penultimate mid-low vowel. In penultimate syllables, mid-high [e o] are mid vowels which have raised before a (historically) high vowel; e.g. *okiʔ *[ʔɔkiʔ] ‘wave’ > [ʔokeʔ], *mepu *[mɛpu] > [mepo]. All vowel-initial words begin with a glottal stop. Long vowels analysable as sequences of two identical vowels are transcribed with two vowel symbols. Stress regularly falls on the penultimate vowel of a word.
OCSEAN-AOZ_20220715-WORDLIST.pdf contains the first 393 items as filled in by hand by I Made Netra in IPA.
OCSEAN-AOZ_20220715-WORDLIST_TYPED.pdf contains a typed-up version of the wordlist, made by Alfred Snae in orthographic transcription. Meto items are given in brackets after the Indonesian prompt.
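To find which of the eight recordings holds a given wordlist item, the item ranges in the description can be turned into a small lookup. The following is a minimal Python sketch under that assumption; the names `RECORDINGS` and `recording_for_item` are hypothetical and are not part of the deposited materials.

```python
from bisect import bisect_right

# (first item, last item, recording date, recorded by), copied from the
# description of this item. The structure itself is only an illustration.
RECORDINGS = [
    (1, 220, "04/07/2022", "I Made Netra"),
    (221, 233, "04/07/2022", "I Made Netra"),
    (234, 394, "05/07/2022", "I Made Netra"),
    (395, 439, "16/07/2022", "Owen Edwards"),
    (440, 695, "16/07/2022", "Owen Edwards"),
    (696, 884, "16/07/2022", "Owen Edwards"),
    (885, 1076, "16/07/2022", "Owen Edwards"),
    (1077, 1228, "16/07/2022", "Owen Edwards"),
]
STARTS = [first for first, _, _, _ in RECORDINGS]


def recording_for_item(item: int) -> int:
    """Return the 1-based recording number that contains the given item."""
    if not 1 <= item <= RECORDINGS[-1][1]:
        raise ValueError(f"item {item} is outside the wordlist")
    # The ranges are contiguous, so the number of range starts that are
    # less than or equal to the item is the recording number.
    return bisect_right(STARTS, item)
```

For example, `recording_for_item(221)` returns 2, and `RECORDINGS[recording_for_item(500) - 1]` gives the date and collector for the recording that contains item 500.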

Extended Data

ID
OCSEAN-AOZ_20220704
Languages
Uab Meto - aoz
Countries
Indonesia - ID
Publisher
Owen Edwards
Contact
admin@paradisec.org.au
License
Open (subject to agreeing to PDSC access conditions)
Rights
Open (subject to agreeing to PDSC access conditions)

Sources

TLCMap ID
tc440b
Linkback
https://catalog.paradisec.org.au/repository/OCSEAN/AOZ_20220704
Source
https://catalog.paradisec.org.au/repository/OCSEAN/AOZ_20220704
Created At
2023-11-05 00:52:20
Updated At
2023-11-17 15:41:18

Ba'a Wordlist

Type
Other

Details

Latitude
11.2255
Longitude
102.164
Start Date
2022-07-08
End Date
2022-07-08

Description

A wordlist of 1228 items recorded in the Ba'a variety from Oelunggu village, Rote. The wordlist was recorded in 23 sessions, roughly corresponding to semantic domains. Metadata about the speaker is recorded on the first recording.
1. 1-45; The physical world
2. 46-85; The physical world
3. 86-197; People, Animals
4. 198-249; Animals
5. 250-274; Animals
6. 275-407; The Body
7. 408-440; The Body
8. 441-554; Food and drink, Clothing and grooming, The house
9. 555-669; Agriculture and vegetation
10. 670-722; Basic actions
11. 723-785; Motion
12. 786-810; Possession
13. 811-835; Spatial relations
14. 838-884; Spatial relations
15. 885-932; Quantity
16. 933-970; Time
17. 971-1032; Sense perception
18. 1033-1076; Emotions and values
19. 1077-1125; Cognition
20. 1126-1156; Speech and language
21. 1157-1175; Social and political relations
22. 1176-1200; Warfare and hunting
23. 1201-1228; Law/Religion and belief
Each recording is accompanied by a video (.mp4), an audio file extracted from the video (.wav), and a backup recording made with a separate device. Items 1-45 and 670-722 also have accompanying time-aligned ELAN transcriptions.
LLG_20220708-WORDLIST contains a scan of the wordlist written by James Ngginak in orthography. Numbers correspond to the item numbers in the wordlist.
LLG_20220708-WORDLIST_1TO147_670TO722.pdf contains a scan of items 1-147 and 670-722 transcribed by Zuvyati Tlonaen.
Name of speaker: James Ngginak
Age: 34
Education: Magister
Occupation: teacher
Languages: Ba'a Oelunggu, Indonesian, Kupang Malay, English
Place of birth: Rote Ba'a Oelunggu
Father's and mother's place of birth: Ba'a Oelunggu
Father's and mother's languages: Ba'a Oelunggu
Collector: Zuviyati Tlonaen
Place of recording: Uppsala University
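Because the sessions are described by item ranges, a short check can confirm whether the ranges cover the whole wordlist. The sketch below is illustrative only: `SESSIONS` is copied from the 23 ranges in the description, and `uncovered_items` is a hypothetical helper name.

```python
# (first item, last item) for each of the 23 sessions, as listed in the
# description of this item.
SESSIONS = [
    (1, 45), (46, 85), (86, 197), (198, 249), (250, 274), (275, 407),
    (408, 440), (441, 554), (555, 669), (670, 722), (723, 785), (786, 810),
    (811, 835), (838, 884), (885, 932), (933, 970), (971, 1032),
    (1033, 1076), (1077, 1125), (1126, 1156), (1157, 1175), (1176, 1200),
    (1201, 1228),
]


def uncovered_items(sessions, total):
    """Return the item numbers from 1 to total that fall in no session range."""
    covered = set()
    for first, last in sessions:
        covered.update(range(first, last + 1))
    return [item for item in range(1, total + 1) if item not in covered]


print(uncovered_items(SESSIONS, 1228))  # [836, 837]
```

With the ranges as listed, items 836 and 837 fall between sessions 13 and 14. The record does not say which session, if any, contains them.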

Extended Data

ID
OCSEAN-LLG_20220703
Languages
Lole - llg
Countries
Indonesia - ID
Publisher
Owen Edwards
Contact
admin@paradisec.org.au
License
Open (subject to agreeing to PDSC access conditions)
Rights
Open (subject to agreeing to PDSC access conditions)

Sources

TLCMap ID
tc440c
Linkback
https://catalog.paradisec.org.au/repository/OCSEAN/LLG_20220703
Source
https://catalog.paradisec.org.au/repository/OCSEAN/LLG_20220703
Created At
2023-11-05 00:52:20
Updated At
2023-11-17 15:41:18

Dela wordlist

Type
Other

Details

Latitude
11.2255
Longitude
102.164
Start Date
2022-07-08
End Date
2022-07-08

Description

A wordlist of 999 items for Dela, a language of Indonesia. The recordings are divided into five sections based on the item numbers. Each section includes an MP4, a WAV, a backup WAV, and an EAF file. The transcription is based on Dela orthography, except that ɓ and ɗ in all positions are written as b' and d'. Made as part of the OCSEAN (OCeanic and South East Asian Navigators; MSCA-RISE-2019, Project Number 873207) project. The initial workshop was held in Uppsala, Sweden from 20 June to 15 July 2022.

Extended Data

ID
OCSEAN-ROW_20220708
Languages
Dela-Oenale - row
Countries
Indonesia - ID
Publisher
Owen Edwards
Contact
admin@paradisec.org.au
License
Open (subject to agreeing to PDSC access conditions)
Rights
Open (subject to agreeing to PDSC access conditions)

Sources

TLCMap ID
tc440d
Linkback
https://catalog.paradisec.org.au/repository/OCSEAN/ROW_20220708
Source
https://catalog.paradisec.org.au/repository/OCSEAN/ROW_20220708
Created At
2023-11-05 00:52:20
Updated At
2023-11-17 15:41:18

Abui wordlist

Type
Other

Details

Latitude
-8.294
Longitude
124.588
Start Date
2022-07-15
End Date
2022-07-15

Description

A wordlist of 445 items in Abui. The list was recorded in five sessions, identified by the wordlist item range at the end of each file name. The first four sessions were recorded on 03/07/2022.
1. items 1-50
2. items 51-100
3. items 101-150
4. items 151-200
5. items 201-300
AOZ_20220715-WORDLIST.pdf contains items 1-445 as filled in by hand by Cindy Copas. This contains an orthographic transcription. Long vowels are not differentiated from short vowels, and the uvular plosive /q/ is not differentiated from the velar plosive /k/.
AOZ_20220715-WORDLIST.txt contains a typed-up version of items 1-300. The transcription is phonemic, following standard IPA practice.
The .eaf files contain time-aligned phonemic transcriptions for use in ELAN.

Extended Data

ID
OCSEAN-ABZ_20220715
Languages
Abui - abz
Countries
Indonesia - ID
Publisher
Owen Edwards
Contact
admin@paradisec.org.au
License
Open (subject to agreeing to PDSC access conditions)
Rights
Open (subject to agreeing to PDSC access conditions)

Sources

TLCMap ID
tc440e
Linkback
https://catalog.paradisec.org.au/repository/OCSEAN/ABZ_20220715
Source
https://catalog.paradisec.org.au/repository/OCSEAN/ABZ_20220715
Created At
2023-11-05 00:52:20
Updated At
2023-11-17 15:41:18

Kupang Malay wordlist

Type
Other

Details

Latitude
3.90215
Longitude
117.2812
Start Date
2022-07-05
End Date
2022-07-05

Description

A 147-item wordlist in Kupang Malay. The list was filled in by June Jacob in one day at Uppsala University, Sweden.

Extended Data

ID
OCSEAN-MKN_20220705
Languages
Malay, Kupang - mkn
Countries
Indonesia - ID
Publisher
Owen Edwards
Contact
admin@paradisec.org.au
License
Open (subject to agreeing to PDSC access conditions)
Rights
Open (subject to agreeing to PDSC access conditions)

Sources

TLCMap ID
tc440f
Linkback
https://catalog.paradisec.org.au/repository/OCSEAN/MKN_20220705
Source
https://catalog.paradisec.org.au/repository/OCSEAN/MKN_20220705
Created At
2023-11-05 00:52:20
Updated At
2023-11-17 15:41:18

Balinese wordlist

Type
Other

Details

Latitude
3.90215
Longitude
117.2812
Start Date
2022-07-31
End Date
2022-07-31

Description

A 1,228-item wordlist in Balinese. The first six recordings were organised by semantic domain; the final four were not.
1. 1-85; The physical world (consultant I Ketut Artawa, recorded 03/07/2022)
2. 86-147; People (consultant I Ketut Artawa, recorded 05/07/2022)
3. 148-249; Animals (consultant I Ketut Artawa, recorded 05/07/2022)
4. 250-439; The Body (consultant Ni Luh Nyoman Seri Malini, recorded 14/07/2022)
5. 440-457; Food and drink (consultant Ni Luh Nyoman Seri Malini, recorded 22/07/2022)
6. 458-501; Food and drink (consultant Ni Luh Nyoman Seri Malini, recorded 14/07/2022, no video recording)
7. 502-751 (consultant Ni Luh Nyoman Seri Malini, recorded 17/08/2022)
8. 752-991 (consultant Ni Luh Nyoman Seri Malini, recorded 17/08/2022)
9. 992-1122 (consultant Ni Luh Nyoman Seri Malini, recorded 17/08/2022)
10. 1123-1228 (consultant Ni Luh Nyoman Seri Malini, recorded 17/08/2022)
OCSEAN-BAN_20220714-WORDLIST.pdf contains a handwritten filled-in version of the entire wordlist with words transcribed in IPA, except (sometimes looks like ) represents a palatal glide [j]. Both and represent a voiced palatal affricate/stop [ɟ]. represents a voiceless palatal affricate/stop.

Extended Data

ID
OCSEAN-BAN_20220817
Languages
Bali - ban
Countries
Indonesia - ID
Publisher
Owen Edwards
Contact
admin@paradisec.org.au
License
Open (subject to agreeing to PDSC access conditions)
Rights
Open (subject to agreeing to PDSC access conditions)

Sources

TLCMap ID
tc4410
Linkback
https://catalog.paradisec.org.au/repository/OCSEAN/BAN_20220817
Source
https://catalog.paradisec.org.au/repository/OCSEAN/BAN_20220817
Created At
2023-11-05 00:52:20
Updated At
2023-11-17 15:41:18

Balinese text

Type
Other

Details

Latitude
3.90215
Longitude
117.2812
Start Date
2022-07-06
End Date
2022-07-06

Description

A short text in Balinese. I Gusti Ngurah Parthama introduces himself and explains what he is doing.

Extended Data

ID
OCSEAN-BAN_20220706
Languages
Bali - ban
Countries
Indonesia - ID
Publisher
Owen Edwards
Contact
admin@paradisec.org.au
License
Open (subject to agreeing to PDSC access conditions)
Rights
Open (subject to agreeing to PDSC access conditions)

Sources

TLCMap ID
tc4411
Linkback
https://catalog.paradisec.org.au/repository/OCSEAN/BAN_20220706
Source
https://catalog.paradisec.org.au/repository/OCSEAN/BAN_20220706
Created At
2023-11-05 00:52:20
Updated At
2023-11-17 15:41:18

Enggano Wordlist

Type
Other

Details

Latitude
3.90215
Longitude
117.2812
Start Date
2022-07-12
End Date
2022-07-12

Description

A wordlist of 1228 items in Enggano. The wordlist was recorded in 21 sessions, according to semantic domains, as outlined below. All items were recorded except items 86 to 147, for which the recording is missing.
1. 1-85; The physical world (recorded 05/07/2022)
2. 86-147; People (recording missing)
3. 148-249; Animals
4. 250-439; The Body
5. 440-501; Food and drink
6. 502-521; Clothing and grooming
7. 522-554; The house
8. 555-669; Agriculture and vegetation
9. 670-722; Basic actions
10. 723-785; Motion
11. 786-810; Possession
12. 811-884; Spatial relations
13. 885-932; Quantity
14. 933-970; Time
15. 971-1032; Sense perception
16. 1033-1075; Emotions and values
17. 1076-1125; Cognition
18. 1126-1156; Speech and language
19. 1157-1175; Social and political relations
20. 1176-1200; Warfare and hunting
21. 1201-1228; Law/Religion and belief
OCSEAN-ENO_20220712-WORDLIST.pdf contains a scan of the transcription. Transcriptions in black pen were probably made by Engga Sangian Zakaria and are in orthography. This follows Indonesian orthography with the following exceptions:
< ' > = [ʔ] (apostrophe represents glottal stop)
< ė > = [ə] (e with dot represents mid-central vowel)
< u̇ > = [ɨ] (u with dot represents high-central vowel)
< ˉ > = [ ~ ] (macron represents nasalized vowel)
both and < j > appear to represent [ j ] (palatal glide)
Words which were given as identical in Indonesian and Enggano are transcribed with standard Indonesian orthography. Transcription in blue pen appears to be by someone else, and uses a mix of IPA and orthographic transcription. Items 1-981 have been transcribed and are present in the PDF and text document.
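The orthographic exceptions listed above could be applied mechanically when comparing the orthographic and IPA transcriptions. The following Python sketch shows one way to do so. It assumes that the dotted vowels and the macron are encoded as Unicode combining marks (or their precomposed equivalents) in the text file, which the record does not state. The function name `apply_exceptions` is hypothetical, and everything outside the listed exceptions is left in Indonesian orthography rather than converted to IPA.

```python
import unicodedata

# Exceptions to Indonesian orthography named in the description.
# Keys are in NFD form (base letter followed by combining mark).
# The specific code points are an assumption about the file's encoding.
EXCEPTIONS = {
    "e\u0307": "ə",       # e with dot above -> mid-central vowel
    "u\u0307": "ɨ",       # u with dot above -> high-central vowel
    "\u0304": "\u0303",   # combining macron -> combining tilde (nasalised)
    "'": "ʔ",             # apostrophe -> glottal stop
    "\u2019": "ʔ",        # typographic apostrophe, assumed equivalent
}


def apply_exceptions(word: str) -> str:
    """Rewrite only the listed orthographic exceptions as IPA symbols."""
    text = unicodedata.normalize("NFD", word)
    # Longer keys first, so a dotted vowel is replaced as one unit.
    for key in sorted(EXCEPTIONS, key=len, reverse=True):
        text = text.replace(key, EXCEPTIONS[key])
    return unicodedata.normalize("NFC", text)
```

The test strings used with this sketch should be treated as invented character sequences, not attested Enggano words. A vowel that carries both a dot and a macron is only handled if the dot is encoded first.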

Extended Data

ID
OCSEAN-ENO_20220712
Languages
Enggano - eno
Countries
Indonesia - ID
Publisher
Owen Edwards
Contact
admin@paradisec.org.au
License
Open (subject to agreeing to PDSC access conditions)
Rights
Open (subject to agreeing to PDSC access conditions)

Sources

TLCMap ID
tc4412
Linkback
https://catalog.paradisec.org.au/repository/OCSEAN/ENO_20220712
Source
https://catalog.paradisec.org.au/repository/OCSEAN/ENO_20220712
Created At
2023-11-05 00:52:20
Updated At
2023-11-17 15:41:18