Transliterating non-ASCII characters

The Scopus bibliographic data are Unicode encoded and can contain non-ASCII characters, whereas in the data from WoS the author names are transformed into ASCII. How can we make the two data sets compatible?

I first prepared a test file, examples.txt (UTF-8 with BOM):

AU Garcia-Calvo, T
RI García-Calvo, Tomas/AAN-6825-2021; OLIVA, DAVID SANCHEZ/L-1698-2014;

AU Bazina, AM
   Pericic, TP
   Mihanovic, F
RI Peričić, Tina Poklepović/G-8402-2017; Mihanović, Frane/E-3337-2017
RI Chmura, Paweł/U-6645-2019; Struzik, Artur/AAC-2669-2021; Popowczak,
AU  - Marinović, M.
AU  - Kjær, M.
AU  - Renström, P.A.F.H.
AU  - Ibáñez, S.J.
AU  - Kristjánsdóttir, H.
Cvetković, D., Doob, M., Sachs, H., (1995) Spectra of Graphs: Theory and Application, pp. 18-20. , Johann Ambrosius Barth, Heidelberg, 3rd edn; 
Заболотский Александр Викторович <azabolotskii@hse.ru>
الدبلوم التنفيذي | مهارات الذكاء الاصطناعي وعلم البيانات
Cui, Chunfang; Tong, Zhongliang 干燥新技术及应用 /Gan zao xin ji shu ji ying yong [Di 1 ban. ed.]
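
Before converting anything, it can help to see which lines actually contain non-ASCII characters. The following is only a minimal sketch using the standard library; the function name non_ascii_lines and the short sample string are illustrative and not part of the original test file handling.

```python
def non_ascii_lines(text):
    """Return (line_number, line) pairs for lines that contain non-ASCII characters."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if not line.isascii()  # str.isascii() requires Python 3.7 or newer
    ]


# Three lines taken from the test file above (hypothetical sample)
sample = "AU Garcia-Calvo, T\nRI García-Calvo, Tomas/AAN-6825-2021\nAU  - Kjær, M."

for number, line in non_ascii_lines(sample):
    print(number, line)
```

Only the lines reported here need transliteration; pure ASCII lines pass through unchanged.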

After some searching on Google I found a solution in the article Transliterating non-ASCII characters with Python, which uses the unidecode package:

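import os
from unidecode import unidecode   # third-party package: pip install Unidecode

wdir = "C:/Users/vlado/work2/mark/ascii"
os.chdir(wdir)

# "utf-8-sig" removes the BOM at the start of the file while decoding
with open("examples.txt", "r", encoding="utf-8-sig") as infile:
    data = infile.read()

# replace each non-ASCII character with an ASCII approximation
a = unidecode(data)

print(data)   # original text
print(a)      # transliterated text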

We get the following transliteration:

AU Garcia-Calvo, T
RI Garcia-Calvo, Tomas/AAN-6825-2021; OLIVA, DAVID SANCHEZ/L-1698-2014;

AU Bazina, AM
   Pericic, TP
   Mihanovic, F
RI Pericic, Tina Poklepovic/G-8402-2017; Mihanovic, Frane/E-3337-2017
RI Chmura, Pawel/U-6645-2019; Struzik, Artur/AAC-2669-2021; Popowczak,
AU  - Marinovic, M.
AU  - Kjaer, M.
AU  - Renstrom, P.A.F.H.
AU  - Ibanez, S.J.
AU  - Kristjansdottir, H.
Cvetkovic, D., Doob, M., Sachs, H., (1995) Spectra of Graphs: Theory and Application, pp. 18-20. , Johann Ambrosius Barth, Heidelberg, 3rd edn; 
Zabolotskii Aleksandr Viktorovich <azabolotskii@hse.ru>
ldblwm ltnfydhy | mhrt ldhk lSTn`y w`lm lbynt
Cui, Chunfang; Tong, Zhongliang Gan Zao Xin Ji Zhu Ji Ying Yong  /Gan zao xin ji shu ji ying yong [Di 1 ban. ed.]
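
For comparison, here is a rough sketch of a standard-library-only alternative that does not use unidecode. It decomposes each character with unicodedata (NFKD) and drops the combining marks. The function name strip_diacritics is an illustrative assumption. This approach handles accented Latin letters such as those in García or Peričić, but it leaves characters without a decomposition (for example ł and æ) unchanged and does not transliterate Cyrillic, Arabic, or Chinese text, so it is not a full replacement for unidecode.

```python
import unicodedata


def strip_diacritics(text):
    """Remove combining marks (accents, carons, ...) after NFKD decomposition."""
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() is non-zero for combining marks
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Peričić, Tina Poklepović"))
print(strip_diacritics("Chmura, Paweł"))   # ł has no decomposition and is kept
print(strip_diacritics("Kjær, M."))        # æ has no decomposition and is kept
```

The remaining non-ASCII letters show why a table-based transliteration such as unidecode is the more practical choice for matching Scopus names with WoS names.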

URLs

  1. Chris J. Lu, Allen C. Browne, Divita Guy: Using Lexical Tools to Convert Unicode Characters to ASCII