Publications by Giampiero Salvi

Peer reviewed

Articles

[1]
G. Saponaro, "Beyond the Self: Using Grounded Affordances to Interpret and Describe Others’ Actions," IEEE Transactions on Cognitive and Developmental Systems, 2019.
[2]
A. Selamtzis et al., "Effect of vowel context in cepstral and entropy analysis of pathological voices," Biomedical Signal Processing and Control, vol. 47, pp. 350-357, 2019.
[3]
S. Strömbergsson, G. Salvi and D. House, "Acoustic and perceptual evaluation of category goodness of /t/ and /k/ in typical and misarticulated children's speech," Journal of the Acoustical Society of America, vol. 137, no. 6, pp. 3422-3435, 2015.
[4]
C. Koniaris, G. Salvi and O. Engwall, "On mispronunciation analysis of individual foreign speakers using auditory periphery models," Speech Communication, vol. 55, no. 5, pp. 691-706, 2013.
[5]
D. Neiberg, G. Salvi and J. Gustafson, "Semi-supervised methods for exploring the acoustics of simple productive feedback," Speech Communication, vol. 55, no. 3, pp. 451-469, 2013.
[6]
G. Salvi et al., "Language Bootstrapping: Learning Word Meanings from Perception-Action Association," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 42, no. 3, pp. 660-671, 2012.
[7]
G. Salvi et al., "SynFace: Speech-Driven Facial Animation for Virtual Speech-Reading Support," EURASIP Journal on Audio, Speech, and Music Processing, vol. 2009, Article ID 191940, 2009.
[8]
G. Salvi, "Dynamic behaviour of connectionist speech recognition with strong latency constraints," Speech Communication, vol. 48, no. 7, pp. 802-818, 2006.
[9]
G. Salvi, "Segment boundary detection via class entropy measurements in connectionist phoneme recognition," Speech Communication, vol. 48, no. 12, pp. 1666-1676, 2006.
[10]
C. Siciliano et al., "Intelligibility of an ASR-controlled synthetic talking face," Journal of the Acoustical Society of America, vol. 115, no. 5, p. 2428, 2004.
[11]
G. Salvi, "Developing acoustic models for automatic speech recognition in Swedish," The European Student Journal of Language and Speech, vol. 1, 1999.

Conference papers

[12]
A. Castellana et al., "Cepstral and entropy analyses in vowels excerpted from continuous speech of dysphonic and control speakers," in Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech 2017, 2017, pp. 1814-1818.
[13]
G. Saponaro et al., "Interactive Robot Learning of Gestures, Language and Affordances," in Grounding Language Understanding, 2017.
[14]
A. Fahlström Myrman and G. Salvi, "Partitioning of Posteriorgrams using Siamese Models for Unsupervised Acoustic Modelling," in Grounding Language Understanding, 2017.
[15]
A. Kumar Dhaka and G. Salvi, "Sparse Autoencoder Based Semi-Supervised Learning for Phone Classification with Limited Annotations," in Grounding Language Understanding, 2017.
[16]
K. Stefanov, J. Beskow and G. Salvi, "Vision-based Active Speaker Detection in Multiparty Interaction," in Grounding Language Understanding, 2017.
[17]
G. Salvi, "An Analysis of Shallow and Deep Representations of Speech Based on Unsupervised Classification of Isolated Words," in Recent Advances in Nonlinear Speech Processing, 2016, pp. 151-157.
[18]
J. Lopes et al., "Detecting Repetitions in Spoken Dialogue Systems Using Phonetic Distances," in INTERSPEECH 2015, 2015, pp. 1805-1809.
[19]
A. Pieropan et al., "A dataset of human manipulation actions," in ICRA 2014 Workshop on Autonomous Grasping and Manipulation: An Open Challenge, 2014.
[20]
A. Pieropan et al., "Audio-Visual Classification and Detection of Human Manipulation Actions," in 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2014), 2014, pp. 3045-3052.
[21]
N. Vanhainen and G. Salvi, "Free Acoustic and Language Models for Large Vocabulary Continuous Speech Recognition in Swedish," in Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), 2014.
[22]
S. Strömbergsson, G. Salvi and D. House, "Gradient evaluation of /k/-likeness in typical and misarticulated child speech," in Proceedings of ICPLA 2014, 2014.
[23]
N. Vanhainen and G. Salvi, "Pattern Discovery in Continuous Speech Using Block Diagonal Infinite HMM," in 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014), Florence, Italy, May 4-9, 2014, pp. 3719-3723.
[24]
G. Salvi and N. Vanhainen, "The WaveSurfer Automatic Speech Recognition Plugin," in Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), 2014, pp. 3067-3071.
[25]
C. Oertel and G. Salvi, "A Gaze-based Method for Relating Group Involvement to Individual Engagement in Multimodal Multiparty Dialogue," in ICMI 2013 - Proceedings of the 2013 ACM International Conference on Multimodal Interaction, 2013, pp. 99-106.
[26]
G. Salvi, "Biologically Inspired Methods for Automatic Speech Understanding," in Biologically Inspired Cognitive Architectures 2012, 2013, pp. 283-286.
[27]
G. Saponaro, G. Salvi and A. Bernardino, "Robot anticipation of human intentions through continuous gesture recognition," in Proceedings of the 2013 International Conference on Collaboration Technologies and Systems, CTS 2013, 2013, pp. 218-225.
[28]
C. Oertel et al., "The KTH Games Corpora: How to Catch a Werewolf," in IVA 2013 Workshop Multimodal Corpora: Beyond Audio and Video (MMC 2013), 2013.
[29]
C. Koniaris, O. Engwall and G. Salvi, "Auditory and Dynamic Modeling Paradigms to Detect L2 Mispronunciations," in 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, Vol 1, 2012, pp. 898-901.
[30]
C. Koniaris, O. Engwall and G. Salvi, "On the Benefit of Using Auditory Modeling for Diagnostic Evaluation of Pronunciations," in International Symposium on Automatic Detection of Errors in Pronunciation Training (IS ADEPT), Stockholm, Sweden, June 6-8, 2012, pp. 59-64.
[31]
N. Vanhainen and G. Salvi, "Word Discovery with Beta Process Factor Analysis," in 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, Vol 1, 2012, pp. 798-801.
[32]
G. Salvi et al., "Analisi Gerarchica degli Inviluppi Spettrali Differenziali di una Voce Emotiva" [Hierarchical Analysis of Differential Spectral Envelopes of an Emotional Voice], in Proceedings of the 7th AISV Conference, Contesto comunicativo e variabilità nella produzione e percezione della lingua (AISV), Lecce, Italy, January 26-28, 2011.
[33]
G. Ananthakrishnan and G. Salvi, "Using Imitation to learn Infant-Adult Acoustic Mappings," in 12th Annual Conference of the International Speech Communication Association 2011 (INTERSPEECH 2011), 2011, pp. 772-775.
[34]
G. Salvi et al., "Cluster Analysis of Differential Spectral Envelopes on Emotional Speech," in 11th Annual Conference of the International Speech Communication Association 2010 (INTERSPEECH 2010), 2010, pp. 322-325.
[35]
V. Krunic et al., "Affordance based word-to-meaning association," in 2009 IEEE International Conference on Robotics and Automation (ICRA), 2009, pp. 4138-4143.
[36]
J. Beskow, G. Salvi and S. Al Moubayed, "SynFace: Verbal and Non-verbal Face Animation from Audio," in Proceedings of the International Conference on Auditory-Visual Speech Processing (AVSP'09), 2009.
[37]
S. Al Moubayed et al., "Virtual Speech Reading Support for Hard of Hearing in a Domestic Multi-Media Setting," in INTERSPEECH 2009: 10th Annual Conference of the International Speech Communication Association 2009, 2009, pp. 1443-1446.
[38]
V. Krunic et al., "Associating word descriptions to learned manipulation task models," in IEEE/RSJ International Conference on Intelligent RObots and Systems (IROS), 2008.
[39]
J. Beskow et al., "Hearing at Home: Communication support in home environments for hearing impaired persons," in INTERSPEECH 2008: 9th Annual Conference of the International Speech Communication Association 2008, 2008, pp. 2203-2206.
[40]
E. Agelfors et al., "User evaluation of the SYNFACE talking head telephone," in Computers Helping People With Special Needs, Proceedings, 2006, pp. 579-586.
[41]
G. Salvi, "Advances in regional accent clustering in Swedish," in Proceedings of European Conference on Speech Communication and Technology (Eurospeech), 2005, pp. 2841-2844.
[42]
G. Salvi, "Ecological language acquisition via incremental model-based clustering," in Proceedings of European Conference on Speech Communication and Technology (Eurospeech), 2005, pp. 1181-1184.
[43]
G. Salvi, "Segment boundaries in low latency phonetic recognition," in Nonlinear Analyses and Algorithms for Speech Processing, 2005, pp. 267-276.
[44]
J. Beskow et al., "SYNFACE: A talking head telephone for the hearing-impaired," in Computers Helping People With Special Needs, Proceedings, 2004, pp. 1178-1185.
[45]
K.-E. Spens et al., "SYNFACE, a talking head telephone for the hearing impaired," in IFHOH 7th World Congress for the Hard of Hearing, Helsinki, Finland, July 4-9, 2004.
[46]
G. Salvi, "Accent clustering in Swedish using the Bhattacharyya distance," in Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS), Barcelona, Spain, 2003, pp. 1149-1152.
[47]
I. Karlsson, A. Faulkner and G. Salvi, "SYNFACE: a talking face telephone," in Proceedings of EUROSPEECH 2003, 2003, pp. 1297-1300.
[48]
G. Salvi, "Truncation error and dynamics in very low latency phonetic recognition," in Proceedings of Non Linear Speech Processing (NOLISP), 2003.
[49]
G. Salvi, "Using accent information in ASR models for Swedish," in Proceedings of INTERSPEECH 2003, 2003, pp. 2677-2680.
[50]
F. T. Johansen et al., "The COST 249 SpeechDat multilingual reference recogniser," in Proceedings of the International Conference on Language Resources and Evaluation (LREC), 2000.
[51]
F. T. Johansen et al., "The COST 249 SpeechDat multilingual reference recogniser," in Proceedings of the XLDB Workshop on Very Large Telephone Speech Databases, 2000.
[52]
B. Lindberg et al., "A noise robust multilingual reference recogniser based on SpeechDat(II)," in Proceedings of the International Conference on Spoken Language Processing (ICSLP), 2000.
[53]
E. Agelfors et al., "A synthetic face as a lip-reading support for hearing impaired telephone users - problems and positive results," in European Audiology in 1999: Proceedings of the 4th European Conference in Audiology, Oulu, Finland, June 6-10, 1999.
[54]
E. Agelfors et al., "Synthetic visual speech driven from auditory speech," in Proceedings of Audio-Visual Speech Processing (AVSP'99), 1999.

Non-peer reviewed

Articles

[55]
G. Salvi and S. Al Moubayed, "Spoken Language Identification using Frame Based Entropy Measures," TMH-QPSR, vol. 51, no. 1, pp. 69-72, 2011.
[56]
T. Öhman and G. Salvi, "Using HMMs and ANNs for mapping acoustic to visual speech," TMH-QPSR, vol. 40, no. 1-2, pp. 45-50, 1999.

Conference papers

[57]
S. Al Moubayed et al., "Studies on Using the SynFace Talking Head for the Hearing Impaired," in Proceedings of Fonetik '09: The XXIIth Swedish Phonetics Conference, June 10-12, 2009, pp. 140-143.
[58]
B. Lindblom et al., "(Re)use of place features in voiced stop systems: Role of phonetic constraints," in Proceedings of Fonetik 2008, 2008, pp. 5-8.
[59]
S. Al Moubayed, J. Beskow and G. Salvi, "SynFace Phone Recognizer for Swedish Wideband and Narrowband Speech," in Proceedings of the Second Swedish Language Technology Conference (SLTC), 2008, pp. 3-6.

Chapters in books

[60]
B. Lindblom et al., "Sound systems are shaped by their users: The recombination of phonetic substance," in Where Do Phonological Features Come From?: Cognitive, physical and developmental bases of distinctive speech categories, G. N. Clements and R. Ridouane, Eds. John Benjamins Publishing Company, 2011, pp. 67-97.

Theses

[61]
G. Salvi, "Mining Speech Sounds: Machine Learning Methods for Automatic Speech Recognition and Analysis," Doctoral thesis, Stockholm: KTH, TRITA-CSC-A 2006:12, 2006.
Latest sync with DiVA: 2019-05-26 00:21:44