The Rule of Thumb for Title Capitalization


30

According to this site, a general rule recommended by the U.S. Government Printing Office Style Manual is:

Capitalize all words in titles of publications and documents, except a, an, the, at, by, for, in, of, on, to, up, and, as, but, or, and nor.

This might not be true, as I am unable to find such a recommendation in the Style Manual, but let's use this rule anyway.


The Challenge

Given an input string consisting of lower case words delimited by spaces, output the capitalization of the string according to the following rules:

  • The first and last word is capitalized.
  • All other words are capitalized, except a, an, the, at, by, for, in, of, on, to, up, and, as, but, or, and nor.

The input string will contain at least one word, and each word contains at least one letter and only characters from a to z.

This is a code golf challenge, so try to use as few bytes as possible in the language of your choice. You may write a full program or a function to accomplish the task.

Testcases

"the rule of thumb for title capitalization" -> "The Rule of Thumb for Title Capitalization"
"programming puzzles and code golf" -> "Programming Puzzles and Code Golf"
"the many uses of the letter a" -> "The Many Uses of the Letter A"
"title" -> "Title"
"and and and" -> "And and And"
"a an and as at but by for in nor of on or the to up" -> "A an and as at but by for in nor of on or the to Up"
"on computable numbers with an application to the entscheidungsproblem" -> "On Computable Numbers With an Application to the Entscheidungsproblem"
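
For readers who want something to check golfed answers against, here is a minimal ungolfed sketch of the two rules in Python. The names `SMALL_WORDS` and `title_capitalize` are made up for this illustration and do not come from any answer.

```python
# Words that stay lowercase unless they are the first or last word.
SMALL_WORDS = set("a an the at by for in of on to up and as but or nor".split())


def title_capitalize(text):
    """Capitalize a space-delimited lowercase string per the challenge rules."""
    words = text.split(" ")
    last = len(words) - 1
    result = []
    for index, word in enumerate(words):
        # Only inner words (neither first nor last) may stay lowercase.
        keep_lower = 0 < index < last and word in SMALL_WORDS
        result.append(word if keep_lower else word.capitalize())
    return " ".join(result)


print(title_capitalize("the many uses of the letter a"))
```

The first/last check comes before the list lookup, which is what makes "and and and" come out as "And and And" rather than all lowercase.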

1
Should the first/last words be capitalized even if they are in the exclusion list? Your examples say yes, but the spec only says to capitalize unless the word is in the list, and says nothing about the first/last word. Note that the two possibilities are very different: one is a simple filter, and the other requires special behavior in (literal) edge cases.
CAD97

3
@CAD97 The rules for capitalization are the two bullet points, not the quote. The first bullet point says "The first and last word is capitalized." and the second says "All other words are capitalized, except ...", which means the first and last words are always capitalized.
Laikoni

I missed that, somehow. Still, thanks for clarifying.
CAD97

I'm not sure it's really necessary to specify that each word contains at least one letter. :)
David Conrad

Answers:


11

Python 2, 118 bytes

Look ma, no regex!

for w in`input()`.split():print[w.title(),w][`w`in"'a'an'and'as'at'the'by'but'for'nor'in'of'on'or'to'up'"].strip("'"),

Input must be wrapped in quotes. Output has a trailing space and no trailing newline (I assume that's okay). Verify all test cases on Ideone.

Explanation

Let's take the input a or an as our example.

Using Python 2's `x` shortcut for repr, we wrap the input in single quotes: 'a or an'. Then we split on whitespace and iterate over the words.

Inside the loop, we take the repr again. For the first and last words, this gives "'a" and "an'". For the other words, it gives 'or'. We want to avoid capitalizing a word if it fits that latter pattern and is in the list of short words. So we can represent the word list as the string "'a'an'...'up'" and know that the repr of any short word will be a substring.

`w` in "..." gives a boolean value, which we can treat as 0 or 1 for the purpose of indexing into the list [w.title(), w]. In short, we title-case the word if it is at the beginning, at the end, or not in the list of short words. Otherwise, we leave it alone. Luckily, title() still works as expected with input like 'a.

Finally, we strip the single quotes from the word and print it with a trailing space.
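
The repr trick described above can be followed step by step in an ungolfed Python 3 port. This is only a sketch: `SHORT_WORDS` and `title_via_repr` are names invented here, and it assumes the input contains only a to z and spaces, so that repr() always picks single quotes for the whole string.

```python
# The short words, each wrapped in single quotes and run together.
SHORT_WORDS = "'a'an'and'as'at'the'by'but'for'nor'in'of'on'or'to'up'"


def title_via_repr(text):
    # repr() wraps the whole input in single quotes, so after splitting
    # only the first and last words carry a stray quote character.
    words = repr(text).split()
    result = []
    for word in words:
        # An inner word has a repr that is quoted on both sides, e.g. 'or',
        # which is a substring of SHORT_WORDS exactly when it is a short word.
        # A first or last word already contains a quote, so its repr uses
        # double quotes and can never be found in SHORT_WORDS.
        is_inner_short = repr(word) in SHORT_WORDS
        chosen = [word.title(), word][is_inner_short]
        result.append(chosen.strip("'"))
    return " ".join(result)


print(title_via_repr("a or an"))
```

Unlike the golfed version, this joins the words instead of printing them one by one, so there is no trailing space.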


8

05AB1E, 68 61 bytes

Saved 7 bytes thanks to Adnan

™ð¡Dg<UvyN__NXQ_“a€¤€€€›€‹€‡€†€‚€‰€„€¾€ƒ€œ€³€—š¯“#™yå&&il})ðý

Try it online!

Explanation

“a€¤€€€›€‹€‡€†€‚€‰€„€¾€ƒ€œ€³€—š¯“ is a dictionary string that decodes to a an the at by for in of on to up and as but or nor.

™                          # title case input string
ð¡                         # split on spaces
Dg<U                       # store index of last word in X

vy                         # for each word
  N__                      # is it not first index?
     NXQ_                  # is it not last index
         “...“             # the compressed string 
              #            # split on spaces
               ™           # convert to title case
                yå         # is current word in this list?
                  &&       # and the 3 previous conditions together
                    il     # if all are true, convert to lower case
                      }    # end loop
)ðý                        # wrap stack in list and join by spaces
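
For anyone who cannot read 05AB1E, the same control flow can be sketched in Python: title-case everything first, then lowercase a word again only when all three conditions hold. The names `SMALL_TITLED` and `title_by_index` are invented for this illustration.

```python
# The exclusion list, title-cased so it can be compared against
# the already title-cased input words.
SMALL_TITLED = "a an the at by for in of on to up and as but or nor".title().split()


def title_by_index(text):
    words = text.title().split(" ")   # title case input, split on spaces
    last_index = len(words) - 1       # index of the last word
    for n, word in enumerate(words):
        # not first index, not last index, and the word is in the list
        if n != 0 and n != last_index and word in SMALL_TITLED:
            words[n] = word.lower()
    return " ".join(words)


print(title_by_index("programming puzzles and code golf"))
```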

2
It never ceases to amaze me what you manage to achieve with a short string of totally unrecognizable characters. Looks like it works fine :) +1
ElPedro

Bah! I'm so close, and I can't find a way to shave off a character.
mbomb007

@mbomb007: Better hurry before Jelly, MATL or some other language that can apply functions to indices comes along and beats this :) I seem to remember a language with compressed regex as well, but I can't remember what it was called. This is long enough that it may still be golfable.
Emigna

1
For 62 bytes :)
Adnan

@Adnan: I started like that, but only with the 3-char words (which ended up longer); I didn't consider taking the 2-char words as well... Using a instead of €… saves an additional byte too, if you lead off with it :) Thanks!
Emigna

7

GNU sed, 81 74 73 bytes

Includes +1 for -r

s/\b./\u&/g
:;s/.(And?|A[st]?|The|By|But|[FN]or|In|O[fnr]|To|Up) /\L&/;t

The first line capitalizes the first letter of every word. The second switches all the required words back to lowercase, looping until no more substitutions are made.

Try it online!
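
The two sed lines translate fairly directly into Python's re module, which may help show why the loop is needed. This is a sketch rather than a byte-for-byte port; `SMALL` and `title_like_sed` are names invented here.

```python
import re

# Same alternation as the sed script: one leading character,
# a capitalized small word, and a trailing space.
SMALL = re.compile(r".(And?|A[st]?|The|By|But|[FN]or|In|O[fnr]|To|Up) ")


def title_like_sed(text):
    # Line 1 of the script: uppercase the character after every word boundary.
    text = re.sub(r"\b.", lambda m: m.group().upper(), text)
    # Line 2 of the script: lowercase one match at a time and jump back
    # to the label until no substitution happens any more.
    while True:
        text, count = SMALL.subn(lambda m: m.group().lower(), text, count=1)
        if not count:
            return text


print(title_like_sed("a an and as at but by for in nor of on or the to up"))
```

The leading `.` keeps the first word out of reach and the trailing space keeps the last word out of reach. Because each match consumes the space that the next small word would need in front of it, a single global pass would skip every second word in a run of small words, hence the repetition.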


6

Retina, 69 66 bytes

Capitalize the first letter of every word, then change the selected words to lowercase if they are not the first or last word. There is a space at the end of the last line.

T`l`L`\b.
+T`L`l` (And?|A[st]?|The|By|But|[FN]or|In|O[fnr]|To|Up) 

Try it online

This also works with a . instead of the first space.

There are a lot of regexes with the same length, but I can't find a way to trim it any further...


(This approach is also 69 bytes in Pip, but I can't use the + trick to shorten it.)
DLosc

@DLosc Thanks. Idk why I didn't see that. I was close.
mbomb007

3

JavaScript (ES6), 141 138 135 133 bytes

Saved 3 bytes thanks to mbomb007

s=>s.replace(/(\w+)( ?)/g,(a,w,n,i)=>i&&n&&/^(a[nst]?|the|by|in|of|on|to|up|and|but|[fn]?or)$/.exec(w)?a:a[0].toUpperCase()+a.slice(1))



3

Jelly, 58 bytes

“Ð/ṃƇ¬þṄẊƙ€,⁽ṙƬ®OṪJ"ɦ3×kf3Ṙç%ġu’b26ịØaṣ”z
e€¢¬T;2Ḷ¤
ḲŒtǦK

TryItOnline! or run all tests

How?

A compressed string with spaces separating the words would be 47 bytes; splitting it costs 1 byte, for 48 bytes.

Two unseparated compressed strings of the words of length 2 and 3 (with an 'a' on the end of one) would be 40 bytes, plus 2 to split each and 1 to join them, for 45 bytes.

One base 250 number as described below is 32 bytes, then 3 to convert to base 26, 3 to index into the lowercase alphabet and 3 to split it on the unused character, 'z', for 41 bytes.

So, the lookup for the words not to capitalise:
“Ð/ṃƇ¬þṄẊƙ€,⁽ṙƬ®OṪJ"ɦ3×kf3Ṙç%ġu’
was formed like so:

Take those words and join them with a separator:
s="a an the at by for in of on to up and as but or nor"

Next, label 'a' as 1, 'b' as 2, and so on, with the separator as 0:

alpha = ' abcdefghijklmnopqrstuvwxyz'
x = [alpha.index(v) for v in s]
x
[1,0,1,14,0,20,8,5,0,1,20,0,2,25,0,6,15,18,0,9,14,0,15,6,0,15,14,0,20,15,0,21,16,0,1,14,4,0,1,19,0,2,21,20,0,15,18,0,14,15,18]

Convert this into a base 26 number (the last letter used is 'y', so 25 letter values plus one digit for the separator fit). The Python code for this is:
n=sum(v*26**i for i,v in enumerate(x[::-1]))

Convert that into a base 250 number (using a list for the digits):

b=[]
while n:
    n,d = divmod(n,250)
    b=[d]+b
b
[16,48,220,145,8,32,202,209,162,13,45,142,244,153,9,80,207,75,35,161,52,18,108,103,52,205,24,38,237,118]

Look up the characters at those indexes in Jelly's codepage:

codepage = '''¡¢£¤¥¦©¬®µ½¿€ÆÇÐÑ×ØŒÞßæçðıȷñ÷øœþ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQR TUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~¶°¹²³⁴⁵⁶⁷⁸⁹⁺⁻⁼⁽⁾ƁƇƊƑƓƘⱮƝƤƬƲȤɓƈɗƒɠɦƙɱɲƥʠɼʂƭʋȥẠḄḌẸḤỊḲḶṂṆỌṚṢṬỤṾẈỴẒȦḂĊḊĖḞĠḢİĿṀṄȮṖṘṠṪẆẊẎŻạḅḍẹḥịḳḷṃṇọṛṣṭụṿẉỵẓȧḃċḋėḟġḣŀṁṅȯṗṙṡṫẇẋẏż«»‘’“”'''
r=''.join(codepage[i-1] for i in b)
r
'Ð/ṃƇ¬þṄẊƙ€,⁽ṙƬ®OṪJ"ɦ3×kf3Ṙç%ġu'

(note: since the actual implementation is bijective, if b had any 0 digits one would need to carry down first)
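
To see that the encoding really is reversible, the steps above can be wrapped up together with the inverse direction in plain Python. This sketch works on lists of digits only and skips the codepage lookup and the bijective-numeration detail. `encode` and `decode` are names made up for this illustration, not Jelly built-ins.

```python
import string

WORDS = "a an the at by for in of on to up and as but or nor"


def encode(words, base=250):
    """Letters -> base 26 number -> list of base-250 digits."""
    alphabet = " " + string.ascii_lowercase   # separator is 0, 'a' is 1, ...
    n = 0
    for ch in words:
        n = n * 26 + alphabet.index(ch)       # only valid while no 'z' occurs
    digits = []
    while n:
        n, d = divmod(n, base)
        digits.insert(0, d)
    return digits


def decode(digits, base=250):
    """Base-250 digits -> base 26 digits -> letters -> split on 'z'."""
    n = 0
    for d in digits:
        n = n * base + d
    base26 = []
    while n:
        n, d = divmod(n, 26)
        base26.insert(0, d)
    # 1-based, wrap-around indexing into the alphabet: 1 gives 'a' and
    # 0 wraps to 'z', which is why 'z' can serve as the separator.
    letters = "".join(string.ascii_lowercase[(d - 1) % 26] for d in base26)
    return letters.split("z")


print(decode(encode(WORDS)))
```

The round trip only works because the word list starts with a non-zero digit ('a') and never uses 'z'. A leading separator would be lost as a leading zero.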

The rest:

ḲŒtǦK - Main link: title string
Ḳ      - split on spaces
    ¦  - apply to indexes
   Ç   -     given by calling the last link (1) as a monad (with the split title string)
 Œt    -     title case (first letter of each (only) word to upper case)
     K - join on spaces

e€¢¬T;2Ḷ¤ - Link 1, find indexes to capitalise: split title string
e€        - is an element of, for €ach
  ¢       - the result of calling the last link (2) as a nilad
   ¬      - logical not
    T     - get the truthy indexes (indexes of words that are not in the list)
     ;    - concatenate with
        ¤ - nilad followed by link(s) as a nilad
      2Ḷ  - range(2) -> [0,1]
                (we always want to capitalise the first index, 1, and the last index, 0)

“Ð/ṃƇ¬þṄẊƙ€,⁽ṙƬ®OṪJ"ɦ3×kf3Ṙç%ġu’b26ịØaṣ”z - Link 2, make the word list: no arguments
“Ð/ṃƇ¬þṄẊƙ€,⁽ṙƬ®OṪJ"ɦ3×kf3Ṙç%ġu’          - the base 250 number
                                b26       - convert to base 26
                                   ị      - index into
                                    Øa    - lowercase alphabet
                                      ṣ   - split on
                                       ”z - literal 'z' (the separator 0 indexes into `z`)

2

PHP, 158 Bytes

10 Bytes saved by @Titus

foreach($w=explode(" ",$argv[1])as$k=>$v)echo" "[!$k],$k&&$k+1<count($w)&&preg_match("#^(a[snt]?|and|[fn]or|up|by|but|the|to|in|o[rnf])$#",$v)?$v:ucfirst($v);

PHP previous version, 174 Bytes

foreach($w=explode(" ",$argv[1])as$k=>$v)$k&&$k+1<count($w)&&in_array($v,[a,an,the,at,by,"for",in,of,on,to,up,"and","as",but,"or",nor])?:$w[$k]=ucfirst($v);echo join(" ",$w);

Echoing in the loop saves 10 bytes: foreach(...)echo" "[!$k],(condition)?$v:ucfirst($v);
Titus

2

TI-Basic, 295 + 59 + 148 = 502 bytes

Now you can capitalize on your calculator. Great for school :)

Main Program, 295 bytes

Basically, the trick to matching whole words, so that not every A becomes a, is to enclose them in spaces, for example replacing " A " with " a ". This also automatically keeps the first and last words capitalized, because they do not have a space on both sides and so will not match any of the words. (Genius, right? And super long, because lowercase letters are two bytes each...)

"("+Ans+")→Str1
"@A ~ a@An ~ an@The ~ the@At ~ at@By ~ by@For ~ for@In ~ in@Of ~ of@On ~ on@To ~ to@Up ~ up@And ~ and@As ~ as@But ~ but@Or ~ or@Nor ~ nor@→Str2
For(I,2,length(Ans
If "@"=sub(Str2,I-1,1
Then
" "+Str1+"~"+sub(Str2,I,inString(Str2,"@",I)-I)+" "
prgmQ
Ans→Str1
End
End

Subprogram (prgmQ), 59 bytes:

Ans→Str9
inString(Ans,"~
sub(Str9,Ans,length(Str9)-Ans+1→Str8
Str9
prgmR
Repeat Str9=Ans+Str8
Ans+Str8→Str9
prgmR
End

Subprogram (prgmR), 148 bytes:

Ans→Str0
inString(Ans,"~→Z
inString(Str0,"~",Ans+1→Y
inString(sub(Str0,1,Z-1),sub(Str0,Z+1,Ans-Z-1→X
sub(Str0,1,-1+inString(Str0,"~
If X
sub(Str0,1,X-1)+sub(Str0,Y+1,length(Str0)-Y)+sub(Str0,X+length(sub(Str0,Z+1,Y-Z-1)),Z-X-length(sub(Str0,Z+1,Y-Z-1

P.S. ~ represents token 0x81 and @ represents token 0x7F; learn more here.
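
The space-padding idea is easier to see without the calculator's string plumbing. The Python sketch below follows the same plan under the assumption that a plain substring replace is all that is available. `SMALL` and `title_by_padding` are names made up for this illustration.

```python
SMALL = "a an the at by for in of on to up and as but or nor".split()


def title_by_padding(text):
    result = text.title()                      # capitalize every word first
    for word in SMALL:
        target = " " + word.capitalize() + " "
        # Repeat until stable: a single replace pass cannot handle two
        # neighbouring matches, because they share one space.
        while target in result:
            result = result.replace(target, " " + word + " ")
    return result


print(title_by_padding("and and and and"))
```

The first word has no space in front of it and the last word has none after it, so neither can ever equal a padded target. No explicit index check is needed.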


2

Java 7, 271 259 258 bytes

String c(String x){String a[]=x.split(" "),s=" ",r=w(a[0])+s;for(int i=0,l=a.length-1;i<l;r+=(!s.matches("^(a[nst]?|the|by|in|of|on|to|up|and|but|[fn]?or)$")|i==l?w(s):s)+" ")s=a[++i];return r;}String w(String w){return(char)(w.charAt(0)-32)+w.substring(1);}

Ungolfed & test code:

Try it here.

class M{
  static String c(String x){
    String a[] = x.split(" "),
           s = " ",
           r = w(a[0]) + s;
    for(int i = 0, l = a.length-1; i < l; r += (!s.matches("^(a[nst]?|the|by|in|of|on|to|up|and|but|[fn]?or)$") | i == l
                                                 ? w(s)
                                                 : s)   + " "){
      s = a[++i];
    }
    return r;
  }

  static String w(String w) {
    return (char)(w.charAt(0) - 32) + w.substring(1);
  }

  public static void main(String[] a){
    System.out.println(c("the rule of thumb for title capitalization"));
    System.out.println(c("programming puzzles and code golf"));
    System.out.println(c("the many uses of the letter a"));
    System.out.println(c("title"));
    System.out.println(c("and and and"));
    System.out.println(c("a an and as at but by for in nor of on or the to up"));
    System.out.println(c("on computable numbers with an application to the entscheidungsproblem"));
  }
}

Output:

The Rule of Thumb for Title Capitalization 
Programming Puzzles and Code Golf 
The Many Uses of the Letter A 
Title 
And and And 
A an and as at but by for in nor of on or the to Up 
On Computable Numbers With an Application to the Entscheidungsproblem 

1

Groovy, 131 129

Two bytes saved thanks to carusocomputing

{it.split()*.with{a->a in "a an the at by for in of on to up and as but or nor".split()?a:a.capitalize()}.join(" ").capitalize()}

Nice, I was at 137; you win. Remove the i-> and use it to save 2 bytes. {it.split()*.with{a->a in "a an the at by for in of on to up and as but or nor".split()?a:a.capitalize()}.join(" ").capitalize()}
Magic Octopus Urn

1
I don't know Groovy, but does this really capitalize the first and last word?
Emigna

@Emigna the final capitalize covers the case where the title starts with one of the words.
Magic Octopus Urn

@Emigna not really, I missed that requirement (that the last word needs to be capitalized). I need to adjust my answer.
Krzysztof Atłasik

The two uses of .capitalize() take up a lot of bytes. Is there a short way to make an alias for .capitalize()?
Cyoce

1

C#, 305 bytes

There is still plenty of room for improvement, but here you go:

s=>{;var b=s.Split(' ');b[0]=((char)(b[0][0]-32))+b[0].Substring(1);int i=0,n=b.Length;for(;++i<n;)if(!"a,an,the,at,by,for,in,of,on,to,up,and,as,but,or,nor".Split(',').Contains(b[i]))b[i]=((char)(b[i][0]-32))+b[i].Substring(1);b[n-1]=((char)(b[n-1][0]-32))+b[n-1].Substring(1);return string.Join(" ",b);};

1

Ruby, 123 117 111 102 bytes

->s{s.gsub(/ .|^./,&:upcase).gsub(/ (A[nts]?|The|By|In|To|Up|And|But|[NF]or|O[rnf])(?= )/,&:downcase)}

Sorry for all the edits - this should be the last one.


1

Python, 177 bytes

Delivered in function format for byte saving purposes. This is not an especially competitive answer, but it is one that doesn't require repr() or regex trickery. It is also version-agnostic; it works with Python 2 or 3.

Though it is perhaps a very by-the-rules solution.

def t(s):
 w="a an the the at by for in of on to up and as but or nor".split()
 l=[(s.title(),s)[s in w]for s in s.split()]
 for x in(0,-1):l[x]=l[x].title()
 return' '.join(l)

1

PHP, 109 142 bytes

<?=preg_replace_callback("# (A[snt]?|And|[FN]or|Up|By|But|The|To|In|O[rnf])(?= )#",function($m){return strtolower($m[0]);},ucwords($argv[1]));

A merger of user59178's and mbomb007's answers.

Uppercases the first letter of every word, then lowercases all words from the list that are surrounded by spaces.
Unfortunately, the callback has to operate on the complete match; this costs 29 bytes.


It does not work for a an and as at but by for in nor of on or the to up
Jörg Hülsermann

1

Racket 353 bytes

(define(cap i)(set! i(string-append i))(define c(string-ref i 0))(string-set! i 0(if(char-upper-case? c)c(integer->char(-(char->integer c)32))))i)
(let*((ex(list"a""an""the""at""by""for""in""of""on""to""up""and""as""but""or""and""nor"))(sl(string-split s)))
(string-join(for/list((i sl)(n(in-naturals)))(cond[(= n 0)(cap i)][(member i ex)i][(cap i)]))))

Ungolfed:

(define (f s)

  (define (cap i)                 ; sub-fn to capitalize first letter of a word
    (set! i (string-append i))
    (define c (string-ref i 0))
    (string-set! i 0
                 (if (char-upper-case? c)
                     c
                     (integer->char (-(char->integer c)32))))
    i)

  (let* ((ex (list "a""an""the""at""by""for""in""of""on""to""up""and""as""but""or""and""nor"))
         (sl (string-split s)))
    (string-join
     (for/list
         ((i sl)
          (n (in-naturals)))
       (cond
         [(= n 0) (cap i)]
         [(member i ex) i]
         [(cap i)]
         )))))

Testing:

(f "the rule of thumb for title capitalization")

Output:

"The Rule of Thumb for Title Capitalization"

1

Java 7, 431 317 311 bytes

Thanks to @KevinCruijssen for 114 bytes.
Thanks to @RosLup for saving 6 bytes.

String c(String s){String v="",x,l[]=s.split(" "),b[]={"a","an","the","at","but","by","for","in","of","on","to","up","as","or","and","nor"};int i=0,f=0,z=0;for(String c:l){for(f=0;f<b.length;z=c.equals(b[f++])|z>0?1:0);x=(char)(c.charAt(0)-32)+c.substring(1);v+=(z>0?i<1|i>l.length-2?x:c:x)+" ";i++;}return v;}

ungolfed

first answer above 250 bytes

static String c(String s) {
    String v = "", x, l[] = s.split(" "),
           b[] = {"a", "an", "the", "at", "by", "for", "in", "of", "on", "to",
                  "up", "and", "as", "or", "nor", "but"};
    int i, f, z = i = f = 0;
    for (String c : l) {
        for (f = 0; f < b.length; z = c.equals(b[f++]) | z > 0 ? 1 : 0);
        x = (char)(c.charAt(0) - 32) + c.substring(1);
        v += (z > 0 ? i < 1 | i > l.length - 2 ? x : c : x) + " ";
        i++;
    }
    return v;
}

1
It was too much to summarize in a comment, but you can golf it to this: String f(String s){String v="",x,l[]=s.split(" "),b[]={"a","an","the","at","by","for","in","of","on","to","up","and","as","but","or","and","nor"};int i=0,f=0,z=0;for(String c:l){for(f=0;f<b.length;z=c.equals(b[f++])|z>0?1:0);x=(char)(c.charAt(0)-32)+c.substring(1);v+=z>0?i<1|i++==l.length-1?x:c:x)+" ";}return v;} (314 bytes) I suggest taking a look at what I changed as tips for next time. :) PS: I've posted an answer with a different approach (259 bytes).
Kevin Cruijssen

1
Especially things like c.substring(0,1).toUpperCase()+c.substring(1,c.length())+" " which you did twice should make you think about re-using it somehow. And combined initializations like you did correctly with the int, but for some reason not with the String. Also, no need for the extra boolean when you can store it as an int 0 or 1 and then check it >0. And I would try to avoid brackets and break as much as possible; usually there is a trick to get rid of them, like the for(f=0;f<b.length;z=c.equals(b[f++])|z>0?1:0); I've shown. :)
Kevin Cruijssen

1
So much to learn and thanks for being helpful always (long live Nederland ;)
Numberknot

1
Oh, I've made a copy-paste error.. It should be this String c(String s){String v="",x,l[]=s.split(" "),b[]={"a","an","the","at","by","for","in","of","on","to","up","and","as","but","or","and","nor"};int i=0,f=0,z=0;for(String c:l){for(f=0;f<b.length;z=c.equals(b[f++])|z>0?1:0);x=(char)(c.charAt(0)-32)+c.substring(1);v+=(z>0?i<1|i++>l.length-2?x:c:x)+" ";}return v;} And no problem. :) I also learned a lot when I was new to code-golfing. I just make a list with every general codegolf tip I learn and look/update it sometimes. But my code still gets golfed by others a lot.
Kevin Cruijssen

1
In the string b[] there are 2 'and' is that ok?
RosLuP

1

PHP, 117 118 112 bytes

<?=strtr(ucwords(preg_replace("# (?=(a[snt]?|and|[fn]or|up|by|but|the|to|in|o[rnf]) )#","!",$argv[1])),'!',' ');

Uses the behaviour of ucwords() and escapes the relevant words that are surrounded by spaces then deletes the escape characters.
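
The escape-then-capitalize idea can be sketched in Python, with a small regex standing in for ucwords(). This is an illustration, not a port: `SMALL` and `title_by_escaping` are names invented here, and it assumes "!" never occurs in the input (the challenge guarantees only a to z and spaces).

```python
import re

# Same alternation as in the PHP answer.
SMALL = r"a[snt]?|and|[fn]or|up|by|but|the|to|in|o[rnf]"


def title_by_escaping(text):
    # Replace the space in front of an inner small word with "!".
    # The lookahead leaves the following space unconsumed, so two small
    # words in a row are both caught.
    marked = re.sub(r" (?=(?:" + SMALL + r") )", "!", text)
    # Rough stand-in for ucwords(): uppercase a letter at the start of
    # the string or directly after a space. Words after "!" are skipped.
    cased = re.sub(r"(?:^| )[a-z]", lambda m: m.group().upper(), marked)
    # Turn the markers back into spaces.
    return cased.replace("!", " ")


print(title_by_escaping("nice try for a start"))
```

The first word is never preceded by a space and the last word is never followed by one, so both always reach the capitalization step unmarked.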

I copied the (a[snt]?|and|[fn]or|up|by|but|the|to|in|o[rnf]) from Jörg Hülsermann's answer but as the approach is completely different I'm posting it as a separate answer.

edit: bug noticed by Titus, fixing it cost 1 byte. also: 6 bytes saved thanks to his helpful comment about strtr


Save 6 bytes with strtr instead of str_replace. Or prepend the words with <> and drop the str_replace and use HTML output.
Titus

In some cases you can use preg_filter instead of preg_replace. I have not tried it with your solution.
Jörg Hülsermann

The regex will not work for two words from the list in a row; test nice try for a start. Replacing one of the spaces with an assertion solves that (+4 bytes).
Titus

Unfortunately preg_filter would fail on the title test case, returning nothing.
user59178

1

Pure bash - 253

(no external programs called) - needs bash v4

declare -A b;for x in A An The At By For In Of On To Up And As But Or Nor;do b[$x]=1;done
while read -a w;do
n=${#w[@]};o[0]=${w[0]^}
for((i=1;i<n-1;i++)){
g=${w[$i]^};((${b[$g]}))&&o+=(${g,,})||o+=($g);}
((n>1))&&o[$n]=${w[-1]^}
echo ${o[@]};o=()
done

normal view with comments

#create the "blacklist"
declare -A b
for x in A An The At By For In Of On To Up And As But Or Nor
do
    b[$x]=1
done

# logic:
# read each line (split by words) into array
# and each word is assigned capitalized to the new output array
# but the blacklisted ones

#read each line to array w (split on spaces)
while read -a w
do
    n=${#w[@]}         # get the number of words
    o[0]=${w[0]^}          # copy the capitalized word1
    for((i=1 ; i<n-1 ; i++)) { # loop over 2 up to last -1 words

        g=${w[$i]^}    # for the given word
        # check if it is in the blacklisted ones
        # if yes - convert to lowercase, if not leave as it is
        # and append to the output array
        (( ${b[$g]} )) && o+=(${g,,}) || o+=($g)
    }
    # capitalize the last word if there is more than one word
    (( n>1 )) && o[$n]=${w[-1]^}
    # make a line from the words
    echo ${o[@]}
    o=() #cleanup
done

output

Title
And and And
The Rule of Thumb for Title Capitalization
Programming Puzzles and Code Golf
The Many Uses of the Letter A
A an and as at but by for in nor of on or the to Up
On Computable Numbers With an Application to the Entscheidungsproblem

1

Japt, 71 bytes

£`a  e  by f     up d  ¿t  n`¸aX >0©Y¦0©YĦZl ?X:Xg u +XÅ}S

Try it online!

Explanation:

£`a  e  by f     up d  ¿t  n`¸aX >0©Y¦0©YĦZl ?X:Xg u +XÅ}S
£`...`qS aX >0&&Y!=0&&Y!=UqS l -1?X:Xg u +Xs1}S

£                                            }S   // Split at spaces and map each item X by this function:
 `...`                                            //  Backticks are used to decompress strings
      qS                                          //  Split the decompressed string at spaces.
         aX >J                                    //  If this contains X
              &&Y!=0                              //  and the index is non-zero (it's not the first word)
                    &&Y!=UqS l -1                 //  and the index is not the length of the input -1 (it's not the last word),
                                 ?X               //  return X.
                                   :Xg u +Xs1     //  Else, return X capitalized. (Literally X[0].toUpperCase() + X.slice(1))
                                             }S   // Rejoin with spaces

One of my favorite Japt features is its string compression, which uses the shoco library.

You can compress a string by wrapping it in Oc"{string}":

Oc"a an the at by for in of on to up and as but or nor"

Then decompress it with backticks or Od"{compressed string}":

Od"a e by f up d ¿t n"


The -S flag was added after this challenge was posted, so your current solution is non-competing. However, I think you can do £...+XÅ}S, which would be competing for the same byte-count (Try it online!)
ETHproductions

How does shoco compare with Jelly's dictionary compression in your opinion?
Robert Fraser

@RobertFraser Compared to Jelly, it's not very good at compressing strings of English words, but it is very good at compressing strings of arbitrary lowercase letters, which comes in handy sometimes.
ETHproductions

1

Pure bash - 205 192 181 bytes

tc(){
while read -a x
do x=(${x[@]^})
for ((i=1;i<${#x[@]}-1;i++))
do
case "${x[i]}" in
A|A[nts]|The|By|[FN]or|In|O[fnr]|To|Up|And|But)x[i]=${x[i],};;
esac
done
echo ${x[@]}
done
}

Like jm66's answer tc accepts standard input.


0

Actually, 79 bytes

' ,ÿsd@p@`;0"A0An0The0At0By0For0In0Of0On0To0Up0And0As0But0Or0Nor"síu'ù*ƒ`Moq' j

Try it online!

Explanation:

' ,ÿsd@p@`;0"longstring"síu'ù*ƒ`Moq' j
' ,ÿs                                   title case input, split on spaces
     d@p@                               pop first and last words to stack
         `;0"longstring"síu'ù*ƒ`M       for every word except the first and last:
          ;0"longstring"s                 duplicate word, split the long string on 0s
                         íu               1-based index of word in list (0 if not found)
                           'ù*            "ù"*(index)
                              ƒ           execute the resulting string as a function (lowercases word if it's in the list)
                                 oq' j  put the first and last word back in the list, join with spaces
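
The stack shuffling above boils down to: title-case everything, set the first and last words aside, lowercase the listed words in the middle, then put the ends back. A Python sketch of that plan follows, with `SMALL_TITLED` and `title_pop_ends` as names made up for this illustration.

```python
# The exclusion list in title case, matching the capitalized long string.
SMALL_TITLED = set("A An The At By For In Of On To Up And As But Or Nor".split())


def title_pop_ends(text):
    words = text.title().split(" ")          # title case input, split on spaces
    first = words.pop(0)                      # set the first word aside
    last = words.pop() if words else None     # set the last word aside, if any
    # For every remaining word: lowercase it if it is in the list.
    middle = [w.lower() if w in SMALL_TITLED else w for w in words]
    # Put the first and last words back and join with spaces.
    ends = [first] + middle + ([last] if last is not None else [])
    return " ".join(ends)


print(title_pop_ends("the many uses of the letter a"))
```

The explicit check for a missing last word is needed in this sketch so that a one-word title is not duplicated or dropped.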

0

Batch, 323 bytes

@echo off
set s=
for %%w in (@%*@)do call:w %%w
echo%s%
exit/b
:w
for %%s in (a an the at by for in of on to up and as but or nor)do if %%s==%1 set s=%s% %1&exit/b
set w=%1
set w=%w:@=%
set f=%w:~0,1%
for %%c in (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z)do call set f=%%f:%%c=%%c%%
set s=%s% %f%%w:~1%

With comments:

@echo off
rem Start with an empty output string
set s=
rem Wrap the parameters in @ signs to identify the first and last words 
for %%w in (@%*@) do call :w %%w
rem Ignore the leading space when printing the result
echo%s%
exit/b
:w
rem Check whether this is a word that we don't change
for %%s in (a an the at by for in of on to up and as but or nor) do if %%s==%1 set s=%s% %1&exit/b
set w=%1
rem Delete any @ signs from the first and last words
set w=%w:@=%
rem Get the first character
set f=%w:~0,1%
rem Case insensitively replace each upper case letter with itself
for %%c in (A B C D E F G H I J K L M N O P Q R S T U V W X Y Z) do call set f=%%f:%%c=%%c%%
rem Concatenate with the rest of the word
set s=%s% %f%%w:~1%
Licensed under cc by-sa 3.0 with attribution required.