Menggunakan Scannerdalam beberapa kasus adalah cara yang sangat mudah untuk mengekstraksi angka dari string. Dan itu hampir sama kuatnya dengan NumberFormatterketika memecahkan kode dan menangani berbagai format angka dan lokal. Ini dapat mengekstraksi angka dan mata uang dengan pemisah desimal dan grup yang berbeda.
import Foundation
// The code below includes manual fix for whitespaces (for French case)
let strings = ["en_US": "My salary is $9,999.99",
"fr_FR": "Mon salaire est 9 999,99€",
"de_DE": "Mein Gehalt ist 9999,99€",
"en_GB": "My salary is £9,999.99" ]
// Just for referce
let allPossibleDecimalSeparators = Set(Locale.availableIdentifiers.compactMap({ Locale(identifier: $0).decimalSeparator}))
print(allPossibleDecimalSeparators)
for str in strings {
let locale = Locale(identifier: str.key)
let valStr = str.value.filter{!($0.isWhitespace || $0 == Character(locale.groupingSeparator ?? ""))}
print("Value String", valStr)
let sc = Scanner(string: valStr)
// we could do this more reliably with `filter` as well
sc.charactersToBeSkipped = CharacterSet.decimalDigits.inverted
sc.locale = locale
print("Locale \(locale.identifier) grouping separator: |\(locale.groupingSeparator ?? "")| . Decimal separator: \(locale.decimalSeparator ?? "")")
while !(sc.isAtEnd) {
if let val = sc.scanDouble() {
print(val)
}
}
}
Namun, ada masalah dengan pemisah yang bisa dipahami sebagai pembatas kata.
// This doesn't work. `Scanner` just ignores grouping separators because scanner tends to seek for multiple values
// It just refuses to ignore spaces or commas for example.
let strings = ["en_US": "$9,999.99", "fr_FR": "9999,99€", "de_DE": "9999,99€", "en_GB": "£9,999.99" ]
for str in strings {
let locale = Locale(identifier: str.key)
let sc = Scanner(string: str.value)
sc.charactersToBeSkipped = CharacterSet.decimalDigits.inverted.union(CharacterSet(charactersIn: locale.groupingSeparator ?? ""))
sc.locale = locale
print("Locale \(locale.identifier) grouping separator: \(locale.groupingSeparator ?? "") . Decimal separator: \(locale.decimalSeparator ?? "")")
while !(sc.isAtEnd) {
if let val = sc.scanDouble() {
print(val)
}
}
}
// sc.scanDouble(representation: Scanner.NumberRepresentation) could help if there were .currency case
Tidak ada masalah untuk mendeteksi lokal secara otomatis. Perhatikan bahwa pengelompokanSeparator dalam bahasa Prancis dalam string "Mon salaire est 9 999,99 €" bukan spasi, meskipun itu dapat membuat persis seperti ruang (di sini tidak). Itulah mengapa kode di bawah ini berfungsi dengan baik tanpa !$0.isWhitespacekarakter yang difilter.
let stringsArr = ["My salary is $9,999.99",
"Mon salaire est 9 999,99€",
"Mein Gehalt ist 9.999,99€",
"My salary is £9,999.99" ]
let tagger = NSLinguisticTagger(tagSchemes: [.language], options: Int(NSLinguisticTagger.Options.init().rawValue))
for str in stringsArr {
tagger.string = str
let locale = Locale(identifier: tagger.dominantLanguage ?? "en")
let valStr = str.filter{!($0 == Character(locale.groupingSeparator ?? ""))}
print("Value String", valStr)
let sc = Scanner(string: valStr)
// we could do this more reliably with `filter` as well
sc.charactersToBeSkipped = CharacterSet.decimalDigits.inverted
sc.locale = locale
print("Locale \(locale.identifier) grouping separator: |\(locale.groupingSeparator ?? "")| . Decimal separator: \(locale.decimalSeparator ?? "")")
while !(sc.isAtEnd) {
if let val = sc.scanDouble() {
print(val)
}
}
}
// Also will fail if groupingSeparator == decimalSeparator (but don't think it's possible)