
My Alethi Font


Turos


I was messing around with FontForge, a free font creator, and decided to make an Alethi font from the Stormlight Archive. I realize there is a way better one being made in another topic, but this one is a simple, unflourished version to tide us over until it is finished!

Enjoy: AlethiTS Font.zip

Alternate Link: Here

If you are using a non-Windows operating system and .ttf fonts don't work on your computer, let me know what file type you need and I'll make one for ya.

------

I have made some updates, including fixing the size of the font, so you can use it at 12pt without problems even seeing it :P

I aligned the center lines on all characters so they match up now. I have also added the center line to the period character.

As a bonus, for people who want the entire sentence to fit on one center line, I have included a second font that adds the center line to the space character called AlethiTS_lined.ttf

Version 2: AlethiTS Fontv2.zip

Edited by Turos

I didn't have much to do today, so I whipped up a quick Java program to convert from normal text assuming Roman letters into Turos's convention.

An example of this would be converting from "This is a test. Chh." into ".Tis is a test .Ch"

EDIT: And then just copy the modified text into a Word document for font-ification.

I'll spoil the code here as well as attaching a text file (the forum doesn't like .java files). I haven't gotten my hands dirty actually coding in a while, so I apologize for any mistakes/laughable inefficiencies present in this first iteration. I invite anyone who knows what they're doing to suggest/submit improvements.

Instructions: Place the text you want to transliterate into a .txt file. It's easiest to put it in the same directory as the java file is being run from, with any sub-directories requiring that a path be given. Simply type in the file name (i.e. Example.txt) and a new file called Alethi_<YourFile>.txt will be created in the same directory as the original.
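One wrinkle with sub-directories: the program builds the output name as "Alethi_"+temp, so a path like notes/Example.txt would turn into Alethi_notes/Example.txt rather than a file next to the original. Here is a minimal sketch of deriving the output path instead. OutputPathDemo and outputPathFor are hypothetical names for illustration, not part of the posted program.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the posted program.
class OutputPathDemo {
    // Prefix only the file name with "Alethi_" and keep the original directory.
    static Path outputPathFor(String inputPath) {
        Path in = Paths.get(inputPath);
        String name = "Alethi_" + in.getFileName().toString();
        Path parent = in.getParent(); // null when the input is a bare file name
        return parent == null ? Paths.get(name) : parent.resolve(name);
    }

    public static void main(String[] args) {
        System.out.println(outputPathFor("Example.txt"));
        System.out.println(outputPathFor("notes/Example.txt"));
    }
}
```

The result could then be handed to writeFile() as destination.toString().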

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan

* @date 01/12/2012

* @version 1.0

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

public class AlethiTranslator_1_2_1

{

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter path to input file: ");

String temp = input.next();

//temp = "Test.txt";

String alethi = convertText(temp);

temp = "Alethi_"+temp;

writeFile(alethi,temp);

}

private static String convertText(String roman) throws IOException

{

char[] body = readFile(roman);

periodMover(body);

String alethi = replaceLetters(body);

return alethi;

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static char[] readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("Invalid file path");

}

whole=whole.substring(0,whole.length()-1); //to get rid of last \n

return whole.toCharArray();

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("File already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static void periodMover(char[] array)

{

int temp = 0;

for(int i=0;i<array.length;i++)

{

if(array[i]=='.')

if(!(((array.length - i) > 3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))) //ellipsis

{

twistRight(array,temp,i);

temp=i+1;

}

if(!inAlphabet(array[i]))

temp=i;

}

}

private static boolean inAlphabet(char character)

{

//stupid workaround

String temp = ""+character;

temp = temp.toLowerCase(); //Strings are immutable, so the result has to be reassigned

character = temp.charAt(0);

char[] library = new char[27];

library[26]=' ';

int place = 0;

for(int i = 97; i <=122; i++){

library[place] = (char)i;

place++;

}

for(int i = 0;i<library.length;i++)

if(library[i]==character)

return true;

return false;

}

private static void twistRight(char[] array, int start, int end)

{

if (start==end)

return;

char a = array[start];

char b;

array[start] = '.';

while(start!=end)

{

start++;

b = array[start];

array[start] = a;

a = b;

}

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(char[] array)

{

String body = new String(array);

body = body.toLowerCase();

body = replace(body, "th","T");

body = replace(body, "sh","S");

body = replace(body, "ch","C"); //took some liberties here, capitalized the C to make room for the implicit c->k conversion

body = replace(body, "ck","k"); //and now ck->k

body = replace(body, "c","k");

body = replace(body,"C","c"); //and now uncapitalizing the C

body = replace(body, "ks","X");

body = replace(body, "q","ku");

body = replace(body, "wa","ua");

body = replace(body, "we","ue");

body = replace(body, "wi","ui");

body = replace(body, "wo","uo");

body = replace(body, "wu","uu");

return body;

}

private static String replace (String body, String target, String sub)

{

int target_size = target.length();

for(int i = 0; i<=body.length()-target_size;i++)

{

if((body.substring(i,i+target_size).equals(target)))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=((sub.length()+1)-target_size);

}

}

return body;

}

}

EDIT:

I don't know why, but they have both a j and a y letter :/

I couldn't find anything differentiating vowels in the script :mellow:

Sorry, didn't see you there for some reason. The reason I ask is because this (admittedly not the Coppermind) wiki provides a chart that shows two different shapes for "U."

EDIT 2: Tell a lie! that other "U" is a V.

Any other specifications you can think of besides c->k, ck->k would also be useful. Do you know what's going on with "Wh", "gh," etc?

Edited by Kurkistan

Double post ftw!

Looking more in-depth at the problem of systematized transliteration, I'm considering taking this on as a mini-project. That 'e' modifier is annoying. From this point on, I'll only update the replaceLetters() function, where the replace() function is formatted (Entire text, Section to Replace, Replacement). Barring mistakes and inefficiencies in the original code, of course.

String body = new String(array);

body = body.toLowerCase();

//Necessary

body = replace(body, "th","T");

body = replace(body, "sh","S");

body = replace(body, "ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body, "ks","X");

body = replace(body, "q","ku");

body = replace(body, "wa","ua");

body = replace(body, "we","ue");

body = replace(body, "wi","ui");

body = replace(body, "wo","uo");

body = replace(body, "wu","uu");

//Ease of use

//c

body = replace(body, "ck","k"); //and now ck->k

body = replace(body, "sc","s");

body = replace(body, "cy","si");

body = replace(body, "cy","si");

body = replace(body, "cea","sea"); //?

body = replace(body, "cea","sea");

body = replace(body, "ce ","es");

body = replace(body, "ce\n","es");

body = replace(body, "ces","eses");

body = replace(body, "cing","sing");

body = replace(body, "c","k");

body = replace(body,"C","c"); //and now uncapitalizing the C

//wh

body = replace(body,"wha","ua");

body = replace(body,"whe","ue");

body = replace(body,"whi","ui");

body = replace(body,"whu","uu");

body = replace(body,"who ","hu");

//ph

body = replace(body,"ph","f");

//gh

body = replace(body,"gh ",""); //Can't think of a case where a "gh" at the end of a word actually changes anything

body = replace(body,"gh\n","");

body = replace(body,"gh","g");

//ere - their vs there, here

body = replace(body,"ere ","eir");

body = replace(body,"ere\n","eir");
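For anyone following along, the capital-C trick in the list above can be looked at in isolation. This is only a sketch: it uses String.replace from the standard library in place of my own replace() function, and DigraphDemo and convertC are hypothetical names.

```java
// Hypothetical demo class, not part of the posted program.
class DigraphDemo {
    // Protect "ch" with a placeholder before the blanket c -> k rule,
    // then restore it as the font's single-letter "c" (the ch glyph).
    static String convertC(String text) {
        String body = text.toLowerCase();
        body = body.replace("ch", "C"); // placeholder: no capitals exist after toLowerCase()
        body = body.replace("ck", "k"); // ck -> k
        body = body.replace("c", "k");  // every remaining c is treated as a hard c
        body = body.replace("C", "c");  // restore: the font's "c" key draws ch
        return body;
    }

    public static void main(String[] args) {
        System.out.println(convertC("Check the clock")); // prints "cek the klok"
    }
}
```

Without the placeholder, the c -> k rule would also eat the c in every "ch".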

P.S. You guys get to watch (and hopefully contribute to) my process, as opposed to me sitting alone in the dark for a few days and then throwing a list at you out of the blue (dark blue, presumably).

Edited by Kurkistan

@Kurkistan: ... you sir, are a mad genius! :blink::lol::blink::lol::o:P:lol:

Plus plus plus!

Ok, so from what I have noticed so far, you still need to include what to do with w and q at the end of a word. I think q would be a k sound probably and w, maybe just a u?

Then again, words that end with 'ow' make a much different sound:

How would technically be Hau

But this is where it gets tricky. Tow should be To, with a long 'o'. I don't know, maybe it would be best if I made a unique symbol for w for these cases. That would be taking quite a bit of artistic license, but what can you do?

Either way, great work bro! Yours is truly a heart of linguistic fervor!

PS: I also have some more thoughts on this section:

body = replace(body, "sc","s");
body = replace(body, "cy","si");
body = replace(body, "cy","si");
body = replace(body, "cea","sea"); //?

body = replace(body, "cea","sea");
body = replace(body, "ce ","es");
body = replace(body, "ce\n","es");

body = replace(body, "ces","eses");

'sc' as in 'scene' should definitely be 's', but there is the rare case like in 'skyscraper' where it should be 'sk' instead.

'cy' as in 'cycle' should for certain be 'si', but not in the case of 'fancy', where a long 'e' would be proper as opposed to a long 'i'.

for 'cea', just thought I would help you save a few bytes of data by pointing out you have it down twice. I can't think of a case where 'cea' is used, but that could be a proper translation, either 'sea' or maybe just 'se', depending.

Here's where I'm confused. 'ce' seems to me to make a 's' hiss sound, and not a pronunciation of the name of the letter as 'es'. An example would be 'since' = 'sins'. You've probably found a different case where 'ce' is used, but be aware of cases like 'since'.

Same would go for 'ces'. I don't see a need for the beginning 'e' before 'ses'. Also, if you wanna get really technical and muck things up like crazy, (almost) anything ending in 's', other than names, would sound like a 'z' instead. Again, I have no idea how much work would be involved in making such an exception, which really doesn't matter anyway.

Other than those bits, your program is amazingly accurate from what I can tell!

PPS: Oh ya! Since you are putting so much effort into this sweet script, let me know if you want a personalized version of the font to fit better with what you want to make the program do.

PPPS: Oh right, about the 'gh' thing, I can only think of two examples at the moment:

'bough' = 'bau' (like the act of bowing)

'cough' = 'kof'

And of course the lame silent 'gh' like in 'bought', and the weird 'gh' used in 'ghost'

Edited by Turos

@Kurkistan: ... you sir, are a mad genius! :blink::lol::blink::lol::o:P:lol:

Plus plus plus!

Ok, so from what I have noticed so far, you still need to include what to do with w and q at the end of a word. I think q would be a k sound probably and w, maybe just a u?

Then again, words that end with 'ow' make a much different sound:

How would technically be Hau

But this is where it gets tricky. Tow should be To, with a long 'o'. I don't know, maybe it would be best if I made a unique symbol for w for these cases. That would be taking quite a bit of artistic license, but what can you do?

Either way, great work bro! Yours is truly a heart of linguistic fervor!

Thanks for the triple plus-sing, and done, although "tow" will just have to cry in a corner by itself for now.

'sc' as in 'scene' should definitely be 's', but there is the rare case like in 'skyscraper' where it should be 'sk' instead.

'cy' as in 'cycle' should for certain be 'si', but not in the case of 'fancy', where a long 'e' would be proper as opposed to a long 'i'.

for 'cea', just thought I would help you save a few bytes of data by pointing out you have it down twice. I can't think of a case where 'cea' is used, but that could be a proper translation, either 'sea' or maybe just 'se', depending.

Here's where I'm confused. 'ce' seems to me to make a 's' hiss sound, and not a pronunciation of the name of the letter as 'es'. An example would be 'since' = 'sins'. You've probably found a different case where 'ce' is used, but be aware of cases like 'since'.

Same would go for 'ces'. I don't see a need for the beginning 'e' before 'ses'. Also, if you wanna get really technical and muck things up like crazy, (almost) anything ending in 's', other than names, would sound like a 'z' instead. Again, I have no idea how much work would be involved in making such an exception, which really doesn't matter anyway.

I think I may have implemented all of this correctly. It's a thorny issue though, so keep a sharp eye on it.

Towards the end of my changes, I thought to look at Dictionary.com. Its phonetic spellings are quite helpful. It wouldn't have occurred to me otherwise to use y so freely. I should probably take a few minutes to look over all of my changes again, but I need a break, and want to get some feedback before I get too caught up in a Kurkistan-only thought loop.

Changed "eses" to "seas." Don't know where my mind was there.

The Z problem is difficult. There are some soft S's in normal words, like Peas, that would be compromised. How about ses -> sez?

Other than those bits, your program is amazingly accurate from what I can tell!

PPS: Oh ya! Since you are putting so much effort into this sweet script, let me know if you want a personalized version of the font to fit better with what you want to make the program do.

PPPS: Oh right, about the 'gh' thing, I can only think of two examples at the moment:

'bough' = 'bau' (like the act of bowing)

'cough' = 'kof'

And of course the lame silent 'gh' like in 'bought', and the weird 'gh' used in 'ghost'

My goal for the program is just to enable your implementation of the font to the largest extent possible, so I think that a personalized version might undermine that a bit :P. I'll let you know if there are any irreconcilable issues, but I haven't found any yet. Thanks for the offer.

I did underestimate those -gh's. That section definitely needs work.

1.3.6.1 code:

String body = new String(array);

body = body.toLowerCase();

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//s at end

body = replace(body,"s\n","z\n"); //Needs to go before c->s conversion, since C's are all soft S's

//c

body = replace(body,"ck","k");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy"); //making changes from 1.3.0

body = replace(body,"sc","sk");

body = replace(body,"cy","si");

body = replace(body,"cea","sea");

body = replace(body,"ace","eys");

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ice","ahys"); //Long S.

body = replace(body,"ces","seez");

body = replace(body,"ce\n","s\n");

body = replace(body,"cing","sing");

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"c","k");

body = replace(body,"C","c"); //and now uncapitalizing the C

//wh

body = replace(body,"wha","ua");

body = replace(body,"whe","ue");

body = replace(body,"whi","ui");

body = replace(body,"whu","uu");

body = replace(body,"who\n","hu\n");

//ph

body = replace(body,"ph","f");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//ere - their vs there

body = replace(body,"ere\n","eir\n");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".too\n",".to\n");

body = replace(body," too\n"," to\n");

body = replace(body,".two\n",".to\n");

body = replace(body," two\n"," to\n");

//q at end

body = replace(body,"q\n","k\n");

//w at end

body = replace(body,"ow\n","au\n");

body = replace(body,"w\n","u\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"ks","X");

body = replace(body,"kz","X"); //In case of switching from S->Z

body = replace(body,"q","ku");

body = replace(body,"wa","ua");

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu");

return body;

Edited by Kurkistan

Dude! Sounds like you got it all figured out then. B)

I only found one problem:

body = replace(body,"ought","awt");

You might wanna make that convert to 'aut' or 'ot' since i never included a symbol for 'w'

But ya, no kidding! I know I'm gonna enjoy using this tool xD

Thanks for making my font easier to use! :lol:

Edited by Turos

Dude! Sounds like you got it all figured out then. B)

I only found one problem:

body = replace(body,"ought","awt");

You might wanna make that convert to 'aut' or 'ot' since i never included a symbol for 'w'

But ya, no kidding! I know I'm gonna enjoy using this tool xD

Thanks for making my font easier to use! :lol:

Let's not get too far ahead of ourselves! B) We probably still need some detail work, especially on the -gh's, so we might want to bring in a fresh set of eyes to double check everything.

That's why I moved all of the "necessary" changes down to the bottom: so I wouldn't have to worry about stuff like not putting W's into phonetic spellings. On that note, I did forget to throw in a simple w->u conversion if all else failed.
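To make the ordering idea concrete, here is a rough sketch of how a phonetic spelling containing a 'w' gets cleaned up by a later catch-all. It uses a LinkedHashMap and String.replace, which is not how my code is organized, and OrderedRulesDemo and applyRules are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class, not part of the posted program.
class OrderedRulesDemo {
    // LinkedHashMap keeps insertion order, so rules run top to bottom.
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("ought", "awt"); // specific phonetic spelling first
        RULES.put("gh", "g");      // more general gh rule after the special case
        RULES.put("w", "u");       // catch-all last: no 'w' survives, including the one in "awt"
    }

    static String applyRules(String body) {
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            body = body.replace(rule.getKey(), rule.getValue());
        }
        return body;
    }

    public static void main(String[] args) {
        System.out.println(applyRules("i thought we laughed")); // prints "i thaut ue lauged"
    }
}
```

Swapping the first two rules would turn "thought" into "thougt", which is why the specific cases have to come first.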

You're welcome, once again. It's no problem. Have you been able to actually use the program?

1.3.6.2 (I might want to amend my naming protocols. . .), end of segment:

//w at end

body = replace(body,"ow\n","au\n");

--Got rid of w\n->u\n

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"ks","X");

body = replace(body,"kz","X"); //In case of switching from S->Z

body = replace(body,"q","ku");

body = replace(body,"wa","ua");

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu");

//--New

body = replace(body,"w","u"); //exception catcher

Edited by Kurkistan

I had to download the JDK to run it from Command Prompt, but ya it works!

I didn't realize you made the periods move to the beginning of sentences. Nice.

I do have a bug report for you though, at least from the version you attached. Here's my pre-/post-conversion test file:

Before conversion:

Hello.

I bought a new computer today. It runs really fast and can handle any program I want.

Well, enjoy.

After conversion:

.hello

.i bought a new komputer today it runs really fast and kan handle any program .i uant

uell., enjoy

Notice the second sentence's period does move, but not to the front. Same with the third sentence.
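For comparison, here is a minimal sketch of the behaviour I expected: every period jumps to the first letter of its own sentence. It is not Kurkistan's algorithm, it ignores ellipses and every other kind of punctuation, and PeriodMoverDemo and movePeriods are hypothetical names.

```java
// Hypothetical demo class, not part of the posted program.
class PeriodMoverDemo {
    // Move each sentence-ending period in front of that sentence's first letter.
    static String movePeriods(String line) {
        StringBuilder out = new StringBuilder();
        int sentenceStart = -1; // index in 'out' of the current sentence's first letter
        for (char ch : line.toCharArray()) {
            if (ch == '.' && sentenceStart >= 0) {
                out.insert(sentenceStart, '.'); // period goes to the front
                sentenceStart = -1;             // the next letter starts a new sentence
            } else {
                if (sentenceStart < 0 && Character.isLetter(ch)) {
                    sentenceStart = out.length();
                }
                out.append(ch); // everything else is copied through unchanged
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(movePeriods("Hi there. Go now.")); // prints ".Hi there .Go now"
    }
}
```

A line with no period at all (a header, say) passes through unchanged.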

Anyway, I'm gonna add in your updates and let you know about anything else. Not sure how to compile the file as an executable, but that would help you get some more testers.

Seriously, everyone should try this tool out. It's great!

@Argent:

Now, someone just needs to translate The Way of Kings in Alethi...

Hey, when this gets polished up, you may get your wish! :P

------

That's my bad. I've just revamped the period-moving process. Also, you're not supposed to be using any punctuation besides periods, oh Alethi-writer.

We have a stable (to my knowledge) version now, and I need to incorporate some mildly significant changes to the existing code, so here's an attachment/spoiler of the newest version.

EDIT: Small change to allow for no-period headers and generalize twistRight(), as well as enable proper replacement at the end of the file.

EDIT 2: Smaller change to make sure ces->seez works even if S->Z conversion happens. Also changed name from "AlethiTranslator" to "AlethiTransliterator."

EDIT 3: Made a few changes to the "sc" conversions.

EDIT 4: Misc. fixes, tinkering with "sc" and s->z conversions

EDIT 5: Added more specifications for words beginning with 'c'.
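A note on the rule notation used in the code: a target ending in "\n" means "at the end of a word", and a target starting with "." means "at the start of a word"; the replace() function expands those into the space and newline variants. The same idea could be sketched with regular expressions instead. This is an alternative technique, not what the posted code does, and WordBoundaryDemo, endOfWord and startOfWord are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the posted program.
class WordBoundaryDemo {
    // Replace 'target' only where it ends a word.
    static String endOfWord(String body, String target, String sub) {
        return body.replaceAll(Pattern.quote(target) + "\\b", Matcher.quoteReplacement(sub));
    }

    // Replace 'target' only where it starts a word.
    static String startOfWord(String body, String target, String sub) {
        return body.replaceAll("\\b" + Pattern.quote(target), Matcher.quoteReplacement(sub));
    }

    public static void main(String[] args) {
        System.out.println(endOfWord("how now brown tower", "ow", "au")); // prints "hau nau brown tower"
        System.out.println(startOfWord("city circus", "ci", "si"));       // prints "sity sircus"
    }
}
```

One difference to keep in mind: \b also fires next to punctuation, while the posted replace() only looks for spaces and line breaks.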

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan

* @date 01/14/2012

* @version 1.4.5

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_4_5

{

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter path to input file: ");

String temp = input.next();

//temp = "Test.txt";

String alethi = convertText(temp);

temp = "Alethi_"+temp;

writeFile(alethi,temp);

}

private static String convertText(String roman) throws IOException

{

char[] body = readFile(roman);

periodMover(body);

String alethi = replaceLetters(body);

return alethi;

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static char[] readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("Invalid file path");

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole.toCharArray();

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("File already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static void periodMover(char[] array)

{

int temp = 0;

for(int i=0;i<array.length;i++)

{

if(array[i]=='.'){

if(!(((array.length - i) >= 3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))) //ellipsis

{

twistRight(array,temp,i);

i++;

while(i<array.length)

if(!inAlphabet(array[i]))

i++;

else

break; //Yes, the cardinal sin.

temp=i;

}

else if(((array.length-i)>=3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))

for(int j=0;j<3;j++)

twistRight(array,temp+j,i+j);

}

else if(array[i]=='\n')

temp=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

}

private static boolean inAlphabet(char character)

{

char[] library = new char[26];

int place = 0;

for(int i = 97; i <=122; i++){

library[place] = (char)i;

place++;

}

if(Arrays.binarySearch(library,character)>=0) //I felt embarrassed by my earlier search algorithm.

return true;

return false;

}

private static void twistRight(char[] array, int start, int end)

{

if (start==end)

return;

char a = array[start];

char b;

array[start] = array[end]; //'.', although this is generalized

while(start!=end)

{

start++;

b = array[start];

array[start] = a;

a = b;

}

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(char[] array)

{

String body = new String(array);

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//s at end

body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//c

body = replace(body,"ck","k");

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"cea","sea");

body = replace(body,"ace","eys");

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ice","ahys"); //Long S.

body = replace(body,"ces","seez");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"cing","sing");

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".ce",".se");

body = replace(body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = replace(body,"C","c"); //and now uncapitalizing the C

//wh

body = replace(body,"wha","ua");

body = replace(body,"whe","ue");

body = replace(body,"whi","ui");

body = replace(body,"whu","uu");

body = replace(body,"who\n","hu\n");

//ph

body = replace(body,"ph","f");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//ere - their vs there

body = replace(body,"ere\n","eir\n");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".too\n",".to\n");

body = replace(body,".two\n",".to\n");

//q at end

body = replace(body,"q\n","k\n");

//w at end

body = replace(body,"ow\n","au\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"ks","X");

body = replace(body,"kz\n","X\n"); //In case of switching from S->Z

body = replace(body,"q","ku");

body = replace(body,"wa","ua");

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu");

body = replace(body,"w","u"); //exception catcher

return body.substring(1,body.length()-1); //clipping first/last '\n'

}

private static String replace (String body, String target, String sub)

{

int target_size = target.length();

if(target.endsWith("\n"))

body = replace(body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub.length()-1)+" "));

if(target.startsWith(".")){

body = replace(body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub.length())));

body = replace(body,("\n"+target.substring(1,target_size)),("\n"+sub.substring(1,sub.length())));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=((sub.length()+1)-target_size);

}

}

return body;

}

}

Edited by Kurkistan

Yes, double post, but I think this one deserves its own post.

I added some debugging tools at the end of the main() and replaceLetters() functions, and used them to derive a large portion of the rules for 'e' modifiers at the end of words. I also moved the c-replacements to the bottom of the replaceLetters() function, so we'll see what happens there. As always, comments from a fresh set of eyes are more than welcome.
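The gist of the debugging check, stripped down to a sketch: flag anything in the output that the font convention shouldn't contain, and report it as Line:Violation. The forbidden list here (lowercase c, q, w, x and the pairs th, sh, ch) is my reading of the comments in allowedCharacters(), and ViolationCheckDemo and findViolations are hypothetical names, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the posted program.
class ViolationCheckDemo {
    // Return "line:violation" entries for leftovers the transliteration should have removed.
    static List<String> findViolations(String text) {
        List<String> violations = new ArrayList<>();
        String[] lines = text.split("\n", -1); // -1 keeps trailing empty lines
        for (int n = 0; n < lines.length; n++) {
            String line = lines[n];
            // Letter pairs that should already be T, S or the ch glyph.
            for (String pair : new String[] {"th", "sh", "ch"}) {
                if (line.contains(pair)) {
                    violations.add((n + 1) + ":" + pair);
                }
            }
            // Single letters assumed to have no place in the output.
            for (char ch : line.toCharArray()) {
                if ("cqwx".indexOf(ch) >= 0) {
                    violations.add((n + 1) + ":" + ch);
                }
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        System.out.println(findViolations(".Tis is fine\n.uhat a show")); // prints "[2:sh, 2:w]"
    }
}
```

Line numbers start at 1, and each pair is reported at most once per line.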

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan

* @date 01/14/2012

* @version 1.5.4

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_5_4

{

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter path to input file: ");

String temp = input.next();

//temp = "Test.txt";

String alethi = convertText(temp);

temp = "Alethi_"+temp;

writeFile(alethi,temp);

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

private static String convertText(String roman) throws IOException

{

char[] body = readFile(roman);

periodMover(body);

String alethi = replaceLetters(body);

return alethi;

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static char[] readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("Invalid file path");

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole.toCharArray();

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("File already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

char[] body_array = body.toCharArray();

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++)

for(int i = 0; i<=search;i++)

if(body_array[i]=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)){ //c, q, w, and x

library[place] = (char)i;

place++;

}

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body_array[i]=='\n')

line++;

else if(Arrays.binarySearch(library,body_array[i])<0) //not in library

violations = violations + (line+":"+body_array[i]) + "; ";

return violations;

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static void periodMover(char[] array)

{

int temp = 0;

for(int i=0;i<array.length;i++)

{

if(array[i]=='.'){

if(!(((array.length - i) >= 3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))) //ellipsis

{

twistRight(array,temp,i);

i++;

while(i<array.length)

if(!inAlphabet(array[i]))

i++;

else

break; //Yes, the cardinal sin.

temp=i;

}

else if(((array.length-i)>=3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))

for(int j=0;j<3;j++)

twistRight(array,temp+j,i+j);

}

else if(array[i]=='\n')

temp=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

}

private static boolean inAlphabet(char character)

{

char[] library = new char[26];

int place = 0;

for(int i = 97; i <=122; i++){

library[place] = (char)i;

place++;

}

if(Arrays.binarySearch(library,character)>=0) //I felt embarrassed by my earlier search algorithm.

return true;

return false;

}

private static void twistRight(char[] array, int start, int end)

{

if (start==end)

return;

char a = array[start];

char b;

array[start] = array[end]; //'.', although this is generalized

while(start!=end)

{

start++;

b = array[start];

array[start] = a;

a = b;

}

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(char[] array)

{

String body = new String(array);

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//E at end - Some interference possible with C's

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //? with missing e

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

body = replace(body,"eve\n","eev\n");

body = replace(body,"ile\n","hyl\n");

body = replace(body,"gle\n","guhl\n");

body = replace(body,"base\n","eys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","eys\n"); //Don't need to allow for c->k if c's are below

body = replace(body,"Case\n","eys\n"); //C == ch

body = replace(body,"erase\n","eys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,"come\n","cuhm\n"); //Need to move this up

body = replace(body,"ome\n","ohm\n");

body = replace(body,"vate\n","vit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"tle\n","l\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,"ine\n","ahyn\n");

body = replace(body,".one\n",".uhn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"ope\n","ohp\n");

String[] temp = {"en","st","un","c","f","g","s","t",""};

for(int i = 0; i<temp.length;i++)

body = replace(body,temp[i]+"able\n","eybuhl\n");

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"rue\n","roo\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"ade\n","eyd\n");

//ere - their vs there

body = replace(body,"ere\n","eir\n");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"aire\n","air\n");

body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","hrj\n");

body = replace(body,"orge\n","wrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"ble\n","buhl\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"sce\n","es\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"gue\n","eeg\n");

//s at end

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z converter to do its job, and the judgement on whether or not this messes things up is pending.

//c

body = replace(body,"ck","k");

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"cea","sea");

body = replace(body,"ace","eys");

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ice","ahys"); //Long S.

body = replace(body,"ces","seez");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"cing","sing");

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".ce",".se");

body = replace(body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = replace(body,"c\n","k\n");

//Generalized c

body = replace(body,".acc","aks"); //the dreaded double c's

body = replace(body,".ecc","eks");

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usd\n");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsy");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweynt");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"icc","ik");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,".occ","uhk");

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

//END C'S

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

//wh

body = replace(body,"wha","ua");

body = replace(body,"whe","ue");

body = replace(body,"whi","ui");

body = replace(body,"whu","uu");

body = replace(body,"who\n","hu\n");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".too\n",".to\n");

body = replace(body,".two\n",".to\n");

//q at end

body = replace(body,"q\n","k\n");

//w at end

body = replace(body,"ow\n","au\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"q","ku");

body = replace(body,"wa","ua");

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu");

body = replace(body,"w","u"); //exception catcher

//ph

body = replace(body,"ph","f");

// body = replace(body,"e\n","Q\n"); //Just for debugging

// body = replace(body,"TQ","Te");

return body.substring(1,body.length()-1); //clipping first/last '\n'

}

private static String replace (String body, String target, String sub)

{

int target_size = target.length();

if(target.endsWith("\n")){ //checks for spaces and for plurals, also does s->z conversion

body = replace(body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub.length()-1)+" "));

if(target.charAt(target_size-2)!='s')

body = replace(body,(target.substring(0,target_size-1)+"s\n"),(sub.substring(0,sub.length()-1)+"z\n"));

}

if(target.startsWith(".")){

body = replace(body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub.length())));

body = replace(body,("\n"+target.substring(1,target_size)),("\n"+sub.substring(1,sub.length())));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub.length()-target_size);

}

}

return body;

}

}
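As a quick check of what periodMover() and twistRight() in the listing above are meant to do, here is a cut-down sketch of the same idea: the period that ends a sentence is rotated to the front of that sentence. It assumes lower-case input (readFile() already lower-cases everything) and leaves out the ellipsis handling. The class name and the movePeriods() helper are made up for this example.

```java
class PeriodMoverSketch {
    // Rotates array[start..end] one step to the right, so the character
    // at 'end' lands at 'start' and everything else shifts up by one.
    static void twistRight(char[] array, int start, int end) {
        char moved = array[end];
        for (int k = end; k > start; k--) {
            array[k] = array[k - 1];
        }
        array[start] = moved;
    }

    static String movePeriods(String text) {
        char[] array = text.toCharArray();
        int sentenceStart = 0;
        for (int i = 0; i < array.length; i++) {
            if (array[i] == '.') {
                twistRight(array, sentenceStart, i);
                i++;
                // Skip spaces and other non-letters to find the next sentence.
                while (i < array.length && !(array[i] >= 'a' && array[i] <= 'z')) {
                    i++;
                }
                sentenceStart = i;
            } else if (array[i] == '\n') {
                sentenceStart = i + 1; // a line break always starts a new sentence
            }
        }
        return new String(array);
    }

    public static void main(String[] args) {
        System.out.println(movePeriods("this is a test. chh.")); // prints .this is a test .chh
    }
}
```

The output lines up with the example from earlier in the thread (".Tis is a test .Ch") once the th and ch replacements have been applied on top.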

Edited by Kurkistan

Okay, I'm a complete and total Java newbie and have no idea how to use this. I know I have Java installed but I don't know how to use the txt file for it at all. I can't even seem to be able to open a window for Java.

First of all, you should change the file extension to .java instead of .txt. The forum hates me, so it doesn't let me upload java files directly.

This site shows you how to run a Java file from Command Prompt. I'm using a free program called BlueJ to compile and test the code because it's what I'm used to, but Command Prompt works fine for Turos. Put the .txt that you want to convert in the same directory and then run the file. When prompted, enter the full file name (I've been testing with Test.txt, if you can believe it).

Any and all feedback would be greatly appreciated. I've been staring at this thing too long, so I have a very tenuous grasp on reality when it comes to any flaws.

Edited by Kurkistan

Okay, I got the .class file, and I have the .txt file in the same window (directory?) as the .java file and the .class file. When I run with ">java AlethiTransliterator_1_5_4" it prompts me with "Enter path to input file:" What do I put here? I've tried typing out the whole "C:\Blah\Blah\Blah\filename.txt" as well as just putting "filename.txt". Both give me the same answer:

Invalid file path

Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: -1

at java.lang.String.substring(Unknown Source)

at AlethiTransliterator_1_5_4.replaceLetters(AlethiTransliterator_1_5_4.java:501)

at AlethiTransliterator_1_5_4.convertText(AlethiTransliterator_1_5_4.java:47)

at AlethiTransliterator_1_5_4.main(AlethiTransliterator_1_5_4.java:31)

And then it just gives me the command prompt again. Argh! This is all gibberish to me! I've never felt computer illiterate before now! This is frustrating!

EDIT: Also tried %path%;C:\Blah\Blah\Blah\filename.txt Still no go. (Also, I replaced all of my actual folders with Blah Blah Blah, of course. They're not actually called that.)

EDIT2: Okay, maybe it's working now. I tried just the filename.txt again and now it's just blinking. I hope that means it's working. I probably should have picked a smaller file than 2,162KB. My entire copy of WoK in a txt file is pretty big!

EDIT3: Got it to work finally! Haha. Gave up on WoK and made a file with just "Szeth-son-son-Vallano, Truthless of Shinovar, wore white on the day he was to kill a king." Worked much better! That font is really cool, but I wish it were bigger! I have to blow it up to like 90pt in order to see it! Also, is there a way to change the space character to a line? That way they'd be all connected.
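For anyone else who hits that stack trace: it fits the following reading of the 1.5.4 listing. When the file can't be opened, readFile() prints "Invalid file path" but still hands back a lone "\n", and the last line of replaceLetters() then tries to clip the first and last character off a one-character string. A minimal sketch of the failure and one possible guard follows; ClipGuardSketch and clip() are made-up names, not part of the program.

```java
class ClipGuardSketch {
    // Mirrors the last line of replaceLetters(): drop the padding '\n' at each end,
    // but return an empty string when there is nothing between the padding.
    static String clip(String body) {
        if (body.length() < 2) {
            return "";
        }
        return body.substring(1, body.length() - 1);
    }

    public static void main(String[] args) {
        String afterFailedRead = "\n"; // what is left when the input file could not be read

        try {
            // Unguarded version: substring(1, 0) has its end before its start.
            afterFailedRead.substring(1, afterFailedRead.length() - 1);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("unguarded clip throws " + e.getClass().getSimpleName());
        }

        System.out.println("guarded clip returns \"" + clip(afterFailedRead) + "\"");
    }
}
```

Separately, and only as a guess at why a full path might be rejected: Scanner.next() stops reading at the first whitespace, so a path with a space in one of its folder names would arrive cut short.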

Edited by FeatherWriter

Here's a fancy tool to keep you from having to use Command Prompt every time:

You can make a batch file that will run all of the commands you put in it when you double-click it.

-Open Notepad.

-Paste this text in and change it to match your folder setup and file names:

cd \Your\Folder\Path\Here
set path=%path%;C:\Program Files\Java\jdk1.7.0_02\bin
java AlethiTranslator_1_2_1

-Make sure to change 'C:' to whatever your hard-drive letter is.

-Also change the name of Kurkistan's file to the current version and the jdk1.7.0_02 to whatever version of the JDK you are using.

-Save the file with a .bat extension.

When you run it, just enter the name of the .txt file you want to convert, and make sure it's in the same folder as your .class file.

And yes, I will make a version with a lined space, too. :)

Unfortunately, I'm still figuring out the font creator program, so I don't know how to make it larger yet, but will try to find out.

Edited by Turos

New version up with gerunds, proper permutations of ise\n, and easier to use debugging instructions (just change the two booleans at the top of the program).

EDIT: Added rules for nge\n, fixed issue with gerunds, error-proofed plurality checks
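Before the full listing, a small sketch of the gerund idea on its own: for a rule that ends in a silent e, such as "ake\n" -> "eyk\n", a second rule for the -ing form can be derived by dropping the e from the target and adding "ing" to both sides. This is a simplified illustration, not the exact branching in replace(); the class and method names are made up, and it assumes the target really does end in "e\n".

```java
class GerundRuleSketch {
    // Given an e-ender rule (target ends in "e\n", sub ends in "\n"),
    // returns {gerundTarget, gerundSub}.
    static String[] gerundRule(String target, String sub) {
        String stem = target.substring(0, target.length() - 2); // drop "e\n"
        String subStem = sub.substring(0, sub.length() - 1);    // drop "\n"
        return new String[] { stem + "ing\n", subStem + "ing\n" };
    }

    public static void main(String[] args) {
        String[] rule = gerundRule("ake\n", "eyk\n");
        // rule[0] is "aking\n", rule[1] is "eyking\n"
        String result = "making\n".replace(rule[0], rule[1]);
        System.out.println(result.trim()); // prints meyking
    }
}
```

The sing/singe problem mentioned in the listing's comments comes from exactly this step: "nge\n" produces an "nging\n" rule, which can't tell "singeing" territory apart from plain "singing", hence the list of hand-written fixes.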

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan

* @date 01/15/2012

* @version 1.6.4

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_6_4

{

static boolean debug_char = true;

static boolean debug_end_e = false;

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter path to input file: ");

String temp = input.next();

//temp = "Test.txt";

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

}

private static String convertText(String roman) throws IOException

{

char[] body = readFile(roman);

if((body.length==1)&&(body[0]=='&')) //invalid input, halt program

return "&";

periodMover(body);

String alethi = replaceLetters(body);

return alethi;

}

/**

* Load a text file's contents, lower-cased, as a <code>char[]</code>.

*

* @param file The input file

* @return The file contents as a <code>char[]</code>

* @exception IOException IO Error

*/

private static char[] readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("Invalid file path");

return "&".toCharArray();

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole.toCharArray();

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("File already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

char[] body_array = body.toCharArray();

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++)

for(int i = 0; i<=search;i++)

if(body_array[i]=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)){ //c, q, w, and x

library[place] = (char)i;

place++;

}

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body_array[i]=='\n')

line++;

else if(Arrays.binarySearch(library,body_array[i])<0) //not in library

violations = violations + (line+":"+body_array[i]) + "; ";

return violations;

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static void periodMover(char[] array)

{

int temp = 0;

for(int i=0;i<array.length;i++)

{

if(array[i]=='.'){

if(!(((array.length - i) >= 3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))) //ellipsis

{

twistRight(array,temp,i);

i++;

while(i<array.length)

if(!inAlphabet(array[i]))

i++;

else

break; //Yes, the cardinal sin.

temp=i;

}

else if(((array.length-i)>=3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))

for(int j=0;j<3;j++)

twistRight(array,temp+j,i+j);

}

else if(array[i]=='\n')

temp=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

}

private static boolean inAlphabet(char character)

{

char[] library = new char[26];

int place = 0;

for(int i = 97; i <=122; i++){

library[place] = (char)i;

place++;

}

if(Arrays.binarySearch(library,character)>=0) //I felt embarrassed by my earlier search algorithm.

return true;

return false;

}

private static void twistRight(char[] array, int start, int end)

{

if (start==end)

return;

char a = array[start];

char b;

array[start] = array[end]; //'.', although this is generalized

while(start!=end)

{

start++;

b = array[start];

array[start] = a;

a = b;

}

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(char[] array)

{

String body = new String(array);

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//E at end - Some interference possible with C's

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //? with missing e

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

body = replace(body,"eve\n","eev\n");

body = replace(body,"ile\n","hyl\n");

body = replace(body,"gle\n","guhl\n");

body = replace(body,"base\n","eys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","eys\n"); //Don't need to allow for c->k if c's are below

body = replace(body,"chase\n","eys\n"); //ch == C

body = replace(body,"erase\n","eys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,"come\n","cuhm\n"); //Need to move this up

body = replace(body,"ome\n","ohm\n");

body = replace(body,"vate\n","vit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"tle\n","l\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,"ine\n","ahyn\n");

body = replace(body,".one\n",".uhn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"ope\n","ohp\n");

String[] temp = {"en","st","un","c","f","g","s","t",""};

for(int i = 0; i<temp.length;i++)

body = replace(body,temp[i]+"able\n","eybuhl\n");

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"rue\n","roo\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"ade\n","eyd\n");

//ere - their vs there

body = replace(body,"ere\n","eir\n");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"aire\n","air\n");

body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","hrj\n");

body = replace(body,"orge\n","wrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"ble\n","buhl\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"sce\n","es\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"gue\n","eeg\n");

//-nge

body = replace(body,"nge\n","nj\n"); //problem with sing vs singe not really being separable at the gerund-testing level

body = replace(body,"sinjing\n","singing\n"); //comprehensive fix for gerund mishaps

body = replace(body,"slinjing\n","slinging\n");

body = replace(body,"strinjing\n","stringing\n");

body = replace(body,"swinjing\n","swinging\n");

body = replace(body,"brinjing\n","bringing\n");

body = replace(body,"flinjing\n","flinging\n");

body = replace(body,"prinjing\n","pringing\n");

body = replace(body,".winjing\n",".winging\n");

body = replace(body,".zinjing\n",".zinging\n");

body = replace(body,".dinjing\n",".dinging\n");

body = replace(body,".pinjing\n",".pinging\n");

//END E's

//s at end

body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z converter to do its job, and the judgement on whether or not this messes things up is pending.

//c

body = replace(body,"ck","k");

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"cea","sea");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ace","eys");

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ice","ahys"); //Long S.

body = replace(body,"ces","seez");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"cing","sing");

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = replace(body,"c\n","k\n");

body = replace(body,".acc","aks"); //the dreaded double c's

body = replace(body,".ecc","eks");

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usd\n");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsy");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweynt");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"icc","ik");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,".occ","uhk");

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

//END C'S

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

//wh

body = replace(body,"wha","ua");

body = replace(body,"whe","ue");

body = replace(body,"whi","ui");

body = replace(body,"whu","uu");

body = replace(body,"who\n","hu\n");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = replace(body,"q\n","k\n");

//w at end

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"q","ku");

body = replace(body,"wa","ua");

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu");

body = replace(body,"w","u"); //exception catcher

//ph

body = replace(body,"ph","f");

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te"); //substitutes keep the leading '.' so the space/newline variants in replace() line up

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

}

return body.substring(1,body.length()-1); //clipping first/last '\n'

}

private static String replace (String body, String target, String sub)

{

int target_size = target.length();

if(target.endsWith("\n")){ //checks for spaces and for plurals, also does s->z conversion where necessary

body = replace(body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub.length()-1)+" ")); //space substitution

if(target_size>=4){ //gerunds, include \n or space

if((!target.endsWith("ing\n"))&&(!target.endsWith("gs\n"))&&(!target.endsWith("gz\n"))) //leave no base uncovered; target always ends in '\n' here, so the "gz" check needs it too

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = replace(body,(target.substring(0,target_size-3)+"ying\n"),(sub.substring(0,sub.length()-1)+"ing\n")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = replace(body,(target.substring(0,target_size-2)+"ing\n"),(sub.substring(0,sub.length()-1)+"ing\n")); //removing 'e'

}else if((!target.endsWith("gs\n"))&&(!target.endsWith("gz\n"))) //no "ing\n" or s\z at end

body = replace(body,(target.substring(0,target_size-1)+"ing\n"),(sub.substring(0,sub.length()-1)+"ing\n")); //no e, presumably ends in consonant

if((target_size>=2)&&(target.charAt(target_size-2)!='s')&&(target.charAt(target_size-2)!='z'))

if(target.charAt(target_size-2)=='e')

body = replace(body,(target.substring(0,target_size-2)+"es\n"),(sub.substring(0,sub.length()-1)+"ez\n")); //s->z

else

body = replace(body,(target.substring(0,target_size-1)+"s\n"),(sub.substring(0,sub.length()-1)+"z\n")); //s->z

}

if(target.startsWith(".")){

body = replace(body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub.length())));

body = replace(body,("\n"+target.substring(1,target_size)),("\n"+sub.substring(1,sub.length())));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub.length()-target_size);

}

}

return body;

}

}

@FeatherWriter

Sorry about all the trouble you went through. I tend to take these things for granted. I'm sure you're not computer-illiterate :P.

File size is an issue. My program isn't exactly the most efficient one on the block right now, because the replace() function is resource-intensive and gets called a huge number of times. It should be able to chew through a file of any size if you give it time, although I would suggest breaking something like the WoK into more manageable chunks so that you don't have to wait so long that your computer ends up unattended should something start to burn.
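For anyone who wants to poke at the speed problem: much of the cost probably comes from rebuilding the whole string with substring() on every hit. Below is a minimal sketch of a single-pass alternative, assuming plain literal replacement is all that's needed. FastReplace and replaceAll are hypothetical names, not part of the posted program, and the sketch deliberately skips the space/plural/gerund handling that replace() does. It also never rescans the text it just substituted, so "sss" with "ss"->"s" gives "ss" here, whereas the loop in replace() would collapse it to "s".

```java
// Hypothetical helper, not part of the posted program.
final class FastReplace {

    /** Replaces every non-overlapping occurrence of target with sub in one left-to-right pass. */
    static String replaceAll(String body, String target, String sub) {
        if (target.isEmpty()) {
            return body; // an empty target would match everywhere
        }
        int hit = body.indexOf(target);
        if (hit < 0) {
            return body; // no match, so no copy is made
        }
        StringBuilder out = new StringBuilder(body.length());
        int from = 0;
        while (hit >= 0) {
            out.append(body, from, hit).append(sub); // copy the gap, then the substitute
            from = hit + target.length();            // resume after the matched text
            hit = body.indexOf(target, from);
        }
        out.append(body, from, body.length());       // copy the tail
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAll("the thin thing", "th", "T")); // prints "Te Tin Ting"
    }
}
```

The standard String.replace(CharSequence, CharSequence) method does the same kind of literal, left-to-right replacement, so it may be an even simpler drop-in for the final loop.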

I wouldn't suggest large-scale transliteration just yet, either, since I'm sure that we still have some letter combinations to work on. If you want to help, you could transliterate a few pages of the WoK to start with and do a close reading for accuracy. Be sure to get rid of any apostrophes and replace exclamation points and question marks with periods, though.
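The cleanup step could also be done in code before transliterating, rather than by hand. This is only a sketch: Preprocess and clean are made-up names, and treating the curly apostrophe (U+2019) the same as the straight one is my own assumption about what pasted book text tends to contain.

```java
// Hypothetical pre-pass, not part of the posted program.
final class Preprocess {

    /** Drops apostrophes and turns '!' and '?' into '.', as the instructions ask. */
    static String clean(String text) {
        return text
                .replace("'", "")       // straight apostrophe
                .replace("\u2019", "")  // curly apostrophe (assumed to be worth handling)
                .replace('!', '.')
                .replace('?', '.');
    }

    public static void main(String[] args) {
        System.out.println(clean("Don't stop! Why?")); // prints "Dont stop. Why."
    }
}
```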

@Turos - Thank you for helping people out there.

P.S. And, as FeatherWriter said, the period character is supposed to have a line as well.

EDIT:

EDIT3: Got it to work finally! Haha. Gave up on WoK and made a file with just "Szeth-son-son-Vallano, Truthless of Shinovar, wore white on the day he was to kill a king." Worked much better! That font is really cool, but I wish it were bigger! I have to blow it up to like 90pt in order to see it! Also, is there a way to change the space character to a line? That way they'd be all connected.

I just ran it on the full text of the Odyssey, and it took just about 7 minutes.

Stats: 597165 characters, 594 KB. I was running at 50% resources the entire time, with no unmanageable heat.

Edited by Kurkistan

Nice progress! It works really well for me.

Also, version two is up on the first post. Thank you both for the catch on that character missing the line. :lol:

Oh ya, don't worry, version two doesn't change anything that would make your trans-literator incompatible.

EDIT: Also, I noticed that line breaks are removed when the tool is run. Is there a way to keep the line breaks?
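One possible way to keep them, sketched without knowing exactly where the tool drops them: split the input on line breaks, convert each line separately, and join the results back together. KeepLineBreaks and perLine are hypothetical names, and the convert argument stands in for whatever method actually does the transliteration. Note that this sketch writes every break back as "\n", so Windows-style "\r\n" endings would be normalised.

```java
import java.util.function.UnaryOperator;

// Hypothetical wrapper, not part of the posted program.
final class KeepLineBreaks {

    /** Applies convert to each line on its own and restores the line breaks afterwards. */
    static String perLine(String text, UnaryOperator<String> convert) {
        String[] lines = text.split("\r?\n", -1); // limit -1 keeps trailing empty lines
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                out.append('\n'); // put back the break that split() consumed
            }
            if (!lines[i].isEmpty()) {
                out.append(convert.apply(lines[i])); // blank lines pass through untouched
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // toUpperCase is only a stand-in for the real conversion
        System.out.println(perLine("first line\n\nsecond line", String::toUpperCase));
    }
}
```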

And I was testing some of your new conversions. Seems to have a glitch:

Erase. Erasing. Chase. Chasing. Blindly. Ice. Ices.

becomes:

.eys .eysing .eys .eysing .blindly .ahys .ahysz

Er- from Erase and Erasing is removed, as is Ch- from Chase and Chasing.

Edited by Turos

This topic is now closed to further replies.