
My Alethi Font


Turos


Well, so much for that speedy response reputation thing...

However! Here's my bugtest from version 1.8.9.6 ;)

dustbrinjerz-Dustbringers(j appears instead of g)

lhrj-large(a or o missing)

skee-sky(interesting exception in English)

linjered-lingered(j appears instead of g)

tuhped-topped(uh should be o or oh)

uhposahyt-opposite(same)

gluhnsing-glancing(same, but uh should be a or ah)

abanduhned-abandoned(same)

intriseyt-intricate(s appears instead of k)

rlee-rely(missing e after r)

enou-enough(no f sound)

kayoozed-caused(LOL, this one blew up)

someohn-someone(one is converted to uuhn by itself)

landskape-landscape(i dunno, maybe should be landskaip or something)

This was tested from the prologue of The Way of Kings. I was amazed at how smoothly it handled the text. Especially nice job on the tab/period fix :lol: The period finally appears after the tab space like it should :)
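
For anyone following along, the period rule (an Alethi sentence starts with a period and ends with nothing) can be sketched in a few lines. This is only an illustration, not the code from the program: `PeriodSketch` and `movePeriods` are made-up names, and it builds the output with a StringBuilder rather than the program's substring juggling.

```java
final class PeriodSketch {
    // Moves each sentence-ending period to the front of its sentence.
    static String movePeriods(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int start = 0;                // index in 'out' where the current sentence begins
        boolean seekingStart = false; // true after a period, until the next letter shows up
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (seekingStart && Character.isLetter(ch)) {
                start = out.length(); // the next sentence begins at this letter
                seekingStart = false;
            }
            if (ch == '.') {
                out.insert(start, '.'); // period goes to the front instead of the end
                seekingStart = true;
            } else {
                out.append(ch);
                if (ch == '\n') {
                    start = out.length(); // a line break always starts a fresh sentence
                    seekingStart = false;
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(movePeriods("hello there. how are you."));
    }
}
```

With a tab between sentences, the second period lands after the tab and right before the first letter, which is the behavior described above.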

Like you have mentioned before, it seems like the project is just about perfected. Ha ha! I almost typed 'perfekted' after reading in your transliterated English :P

.uell Tis is turos signing out

.iksellent uork kurkistan

.

Edited by Turos


If you average together your response times, you still get a pretty good number, so you're still ahead.

Yeah, I've since rued that "endgame" comment. Not so much at that point. ;) I'm hesitant to say it again.

Just as a note (and/or excuse), my main computer just kicked the bucket, leaving me without the "Test.txt" file that contained just about every problem word ever, which I checked every time I updated to guard against accidental changes. So there might be some more mistakes of that nature in this version.
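
A problem-word list like that one could also live in code, so it gets checked automatically on every update. A minimal sketch, assuming a hypothetical `transliterate` stand-in with just two toy rules and a made-up table of expected spellings; neither comes from the actual program:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RegressionSketch {
    // Hypothetical stand-in for the real converter: only two toy rules.
    static String transliterate(String word) {
        return word.replace("ph", "f").replace("ck", "k");
    }

    // Returns how many words come out different from their expected spelling.
    static int countFailures(Map<String, String> expected) {
        int failures = 0;
        for (Map.Entry<String, String> entry : expected.entrySet()) {
            String actual = transliterate(entry.getKey());
            if (!actual.equals(entry.getValue())) {
                System.out.println(entry.getKey() + ": expected "
                        + entry.getValue() + ", got " + actual);
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("phone", "fone"); // made-up example entries
        expected.put("back", "bak");
        System.out.println("Failures: " + countFailures(expected));
    }
}
```

Any word that a rule change accidentally breaks would then show up as a printed mismatch instead of needing a manual read-through.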

Fixed all of Turos' bugs, made the replace() function a bit more efficient, then made it less efficient by adding on "ized" and "iest" suffixes. Made it so that hyphens are turned into spaces upon conversion. Threw in a few random grammars like "align" and "ape\n."
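
The hyphen handling, roughly: a simplified stand-in for the punctuation part of removeCharacters(). `PunctuationSketch` and `normalize` are made-up names, and apostrophes are simply dropped here instead of getting the possessive treatment.

```java
final class PunctuationSketch {
    // '?' and '!' become periods, hyphens become spaces, other punctuation is dropped.
    static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            boolean letter = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
            if (ch == '?' || ch == '!') {
                out.append('.');
            } else if (ch == '-') {
                out.append(' ');
            } else if (letter || ch == ' ' || ch == '.' || ch == '\n' || ch == '\t') {
                out.append(ch);
            }
            // every other character is silently removed
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("well-known, right?"));
    }
}
```

Turning the hyphen into a space means each half of a hyphenated word gets treated as its own word by the word-start and word-end rules.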

EDIT: Cleaned up some r-ender words interactions with suffixes, generalized -iest to just -est, added in .def, .fly, cite\n and city\n grammars.

EDIT 2: Added rules for eir\n, ere\n.
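
On the notation: in a rule, a leading '.' means "start of a word" and a trailing '\n' means "end of a word". A rough sketch of that idea, assuming the text has already been padded so that every word has a space on each side. `applyRule` is a made-up name, and this is a much-simplified take on what realReplace() does (no suffix handling at all):

```java
final class BoundarySketch {
    // Assumes 'sub' carries the same boundary markers as 'target'.
    static String applyRule(String padded, String target, String sub) {
        if (target.startsWith(".")) { // '.' marks the start of a word
            target = " " + target.substring(1);
            sub = " " + sub.substring(1);
        }
        if (target.endsWith("\n")) {  // '\n' marks the end of a word
            target = target.substring(0, target.length() - 1) + " ";
            sub = sub.substring(0, sub.length() - 1) + " ";
        }
        return padded.replace(target, sub);
    }

    public static void main(String[] args) {
        System.out.println(applyRule(" fly city flying ", ".fly", ".flahy"));
        System.out.println(applyRule(" city cityscape ", "city\n", "sitee\n"));
    }
}
```

One caveat with this shortcut: String.replace never reuses the shared space between two adjacent matching words, so a rule with both markers can miss the second of two neighbors.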

EDIT 3: I have no idea how to post attachments otherwise and don't feel like starting a Photobucket account just for my signature.


EDIT 4: Added in specific grammar for "Roman" and more general grammars for .rom so that my signature isn't false advertising. Might need to focus more on "an\n" as a suffix (Trojan, Sicilian, etc.). Added in rules for possessives, moved the counter for the replace() function down so that it doesn't double count calls, added in dle\n rules.
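
The possessive rules boil down to this: "s's" and "s'" both end up as "sez", a plain "'s" becomes "z", and any other apostrophe is dropped. A simplified sketch that skips the placeholder step the program uses; `PossessiveSketch` and `possessives` are made-up names:

```java
final class PossessiveSketch {
    // Order matters: the s-final forms have to be handled before the plain "'s".
    static String possessives(String text) {
        String out = text.replace("s's", "sez"); // Turos's  -> Turosez
        out = out.replace("s'", "sez");          // States'  -> Statesez
        out = out.replace("'s", "z");            // dog's    -> dogz
        return out.replace("'", "");             // don't    -> dont
    }

    public static void main(String[] args) {
        System.out.println(possessives("turos's states' dog's don't"));
    }
}
```

Contractions like "it's" get the same "z" as real possessives in this version, since nothing here tries to tell them apart.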

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan, with significant developmental input from Turos

* @date 01/28/2012

* @version 1.9.2.2

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_9_2_2

{

static boolean debug_char = false;

static boolean debug_end_e = false;

static boolean remove_illegal = true;

static boolean add_CR = true;

/* static String Targets = "";

static int min = 200;

static int max = 400; */

static int Count = 0;

static boolean Counting = true;

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

//putting carriage-returns back in to make it look pretty in Notepad. I can't tell what else they might do.

if(add_CR)

for(int i = 0; i<alethi.length();i++)

if(alethi.charAt(i)=='\n')

alethi = alethi.substring(0,i)+"\r"+alethi.substring(i++,alethi.length());

//writeFile(Targets,"TEMP.txt");

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

private static String convertText(String roman) throws IOException

{

roman = readFile(roman); //text file

if((roman.length()==1)&&(roman.charAt(0)=='&')) //invalid input, halt program

return "&";

if(remove_illegal)

roman = removeCharacters(roman);

roman = periodMover(roman);

roman = spaceEnds(roman);

String alethi = replaceLetters(roman);

return unSpaceEnds(alethi);

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static String readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("File not in directory or misspelled.");

return "&";

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole;

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercase cases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++){

line = 1; //restart the line count for every pair so that later pairs report the right line

for(int i = 0; i<=search;i++)

if(body.charAt(i)=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

}

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)) //c, q, w, and x

library[place++] = (char)i;

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body.charAt(i)=='\n')

line++;

else if(Arrays.binarySearch(library,body.charAt(i))<0) //not in library

violations = violations + (line+":"+body.charAt(i)) + "; ";

return violations;

}

private static String removeCharacters(String body)

{

char[] library = new char[56];

library[0] = '\t'; //tab

library[1] = '\n';

library[2] = ' ';

library[3] = '.';

int place = 4;

for(int i = 65; i <=90; i++)

library[place++] = (char)i;

for(int i = 97; i <=122; i++)

library[place++] = (char)i;

for(int i = 0; i < body.length(); i++)

if(Arrays.binarySearch(library,body.charAt(i))<0) //I felt embarrassed by my earlier search algorithm.

if((body.charAt(i)=='?')||(body.charAt(i)=='!'))

body = body.substring(0,i)+"."+body.substring(i+1,body.length());

else if(body.charAt(i)=='-')

body = body.substring(0,i)+" "+body.substring(i+1,body.length());

else if(body.charAt(i)==(char)39) //apostrophe character

if((i>0)&&(body.charAt(i-1)=='s')) //allowing for both United States' and United States's, as an example

if((i<body.length()-1)&&(body.charAt(i+1)=='s')) //"-s's"

body = body.substring(0,i)+" A"+body.substring((i++)+2,body.length()); //" A"->"ez"

else

body = body.substring(0,i)+" A"+body.substring((i++)+1,body.length()); //"-s'"

else if((i<body.length()-1)&&(body.charAt(i+1)=='s')) //"-'s"

body = body.substring(0,i)+" B"+body.substring((i++)+2,body.length()); //" B"->"z"

else

body = body.substring(0,i)+body.substring(i--+1,body.length()); //same as normal

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

return body;

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static String periodMover(String body)

{

int start = 0;

for(int i=0;i<body.length();i++)

{

if(body.charAt(i)=='.'){

while((i<body.length())&&(body.charAt(i)=='.')) //multiples

body = body.substring(0,start)+"."+body.substring(start,i)+body.substring((i++)+1,body.length());

while(i<body.length())

if(!inAlphabet(body.charAt(i)))

i++;

else

break; //Yes, the cardinal sin.

start = i;

}

else if(body.charAt(i)=='\n')

start=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

return body;

}

private static boolean inAlphabet(char character)

{

int value = (int)character;

if((value>=97)&&(value<=122)) //just checking lowercase letters

return true;

return false;

}

private static String spaceEnds(String body){

for(int i=0;i<body.length();i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+" "+body.substring((i++)+1,body.length());

else if(body.charAt(i)=='\n'){

body = body.substring(0,i)+" \n "+body.substring(i+1,body.length());

i+=2;

}

//System.out.println(body);

return body;

}

private static String unSpaceEnds(String body){

for(int i=1;i<body.length()-2;i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+body.substring(i+2,body.length());

else if(body.charAt(i)=='\n')

body = body.substring(0,i-1)+"\n"+body.substring((i--)+2,body.length());

if(body.charAt(body.length()-2)=='.')

body = body.substring(0,body.length()-1);

else if(body.charAt(body.length()-2)=='\n')

body = body.substring(0,body.length()-3)+"\n";

return body.substring(1,body.length()-1); //clipping first/last '\n';;

}

public static void test()

{

String body = "\nbutler\n";

String target = "ap\n";

String sub = "op\n";

System.out.println(replace(body,target,sub));

int target_size = target.length();

int sub_size = sub.length();

String sofar = "";

int j = 2;

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

System.out.println(body);

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(String body)

{

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//anti-

body = replace(body,".anti",".antahy");

//wh

body = replace(body,"who\n","hoo\n");

body = replace(body,"where","huair"); //changed w to u

body = replace(body,"whir","huur");

body = replace(body,"wh","hu"); //Might need more permutations

body = replace(body,".accr",".uhkr"); //many many many

body = replace(body,".acci",".aksi");

body = replace(body,".accord",".uhkawrd");

body = replace(body,".accomp",".uhkuhmp");

body = replace(body,".acco",".uhko");

body = replace(body,".accustom\n",".uhkuhstuhm\n");

body = replace(body,".accolade\n",".akuhleyd\n");

body = replace(body,".accus",".uhkyooz");

body = replace(body,".accurs",".uhkurs");

body = replace(body,".accur",".akyer");

body = replace(body,".accum",".uhkyoom");

body = replace(body,".accout",".uhkoot");

body = replace(body,".accoun",".uhkount");

body = replace(body,".acce",".akse"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

//E at end - Some interference possible with C's

body = replace(body,".cause",".kawz");

body = replace(body,"ause\n","awz\n");

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"have\n","hav\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

//body = replace(body,"ever\n","ever\n");

body = replace(body,"eve\n","eev\n"); //HOWEVER

body = replace(body,"eever\n","ever\n");

body = replace(body,"ile\n","ahyl\n");

//System.out.println(replace(replace("while ","wh","hu"),"ile\n","ahyl\n"));

//huahyl

body = replace(body,"gle\n","guhl\n");

body = replace(body,".key\n",".kee\n"); //special

body = realReplace("QQQ",body,".keys\n",".kees\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","keys\n");

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"Case\n","Ceys\n"); //necessary?

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,".one\n",".uuhn\n"); //special

body = replace(body,".someone\n",".suhmuuhn\n");

body = replace(body,".anyone\n",".eneeuuhn\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,".some",".suhm");

body = replace(body,"comedy","komidee");

body = replace(body,"come\n","cuhm\n"); //Need to move this up

body = replace(body,".come",".cuhm");

body = replace(body,"ome\n","ohm\n");

body = replace(body,"ttle\n","tl\n");

body = replace(body,"tle\n","tl\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,".discipline\n",".disipline\n");

body = replace(body,"cine\n","sin\n");

body = replace(body,"ine\n","ahyn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"op\n","ohp\n");

body = replace(body,"ope\n","ohp\n");

body = replace(body,"rue\n","roo\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"bead\n","beed\n");

body = replace(body,".read\n",".reed\n");

body = replace(body,"nead\n","need\n");

body = replace(body,"lead\n","leed\n");

body = replace(body,"ead\n","ed\n"); //general

body = replace(body,"ade\n","eyd\n");

//1.9.2.1

body = replace(body,"heir","air"); //general rule

body = replace(body,"eir\n","er\n");

//this one's touchy, I'm just throwing in "air" exemptions to the "eer" rule where I see them

body = replace(body,"where\n","hwair\n");

body = replace(body,".ere\n",".air\n");

body = replace(body,"there\n","thair\n");

body = replace(body,"ere\n","eer\n");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"tire","tahyuhr"); //NOT \n or e

body = replace(body,"aire\n","air\n");

//body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","ahrj\n");

body = replace(body,"orge\n","awrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"tigue\n","teeg\n");

body = replace(body,"sce\n","es\n");

body = replace(body,"que\n","k\n");

body = replace(body,"udge\n","uhj\n");

body = replace(body,"dge\n","j\n"); //NOT sure

body = replace(body,"age\n","aij\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"rgue\n","rgyoo\n");

body = replace(body,"gue\n","eeg\n");

//ible, might need to generalize downtown

body = replace(body,"ible\n","uhbuhl\n");

//-nge

//problem with sing, singer vs singe, singer not really being separable at the gerund-testing level

body = replace(body,"finger\n","fingger\n");

body = replace(body,"linger\n","lingger\n");

body = replace(body,"finger","fingger");

body = replace(body,"linger","lingger");

body = replace(body,".anger\n",".angger\n");

body = replace(body,".angry\n",".angree\n");//?

/* body = replace(body,"ringe\n","rinj\n"); //This is the best I can do for now.

body = replace(body,"hinge\n","hinj\n");

body = replace(body,".impinge\n",".impinj\n");

body = replace(body,"winge\n","winj\n");

body = replace(body,".binge\n",".binj\n");

body = replace(body,".singe\n",".sinj\n");

body = replace(body,".tinge\n",".tinj\n");

body = replace(body,".dinge\n",".dinj\n"); */

body = realReplace("",body,"ringe\n","rinj\n"); //This is the best I can do for now.

body = realReplace("r",body,"hinge\n","hinj\n");

body = realReplace("r",body,".impinge\n",".impinj\n");

body = realReplace("r",body,"winge\n","winj\n");

body = realReplace("r",body,".binge\n",".binj\n");

body = realReplace("r",body,".singe\n",".sinj\n");

body = realReplace("",body,".tinge\n",".tinj\n");

body = realReplace("",body,".dinge\n",".dinj\n");

body = replace(body,"ing\n","I\n"); //temporary

body = replace(body,"nge\n","nj\n");

body = replace(body,"I","ing");

/*

body = realReplace("QQQ",body,"nges\n","njez\n");

body = realReplace("QQQ",body,"ngely\n","njly\n");

body = realReplace("QQQ",body,"ngey\n","njee\n");

body = realReplace("QQQ",body,"ngeing\n","njing\n");

body = realReplace("QQQ",body,"nged\n","njed\n");

body = realReplace("QQQ",body,"ngeish\n","njish\n");

body = realReplace("QQQ",body,"ngeable\n","njuhbuhl\n");

body = replace(body,"ing\n","inQg\n");

body = realReplace("QQQ",body,"nger\n","njer\n");

body = realReplace("QQQ",body,"ngers\n","njerz\n");

body = realReplace("QQQ",body,"ngerly\n","njerlee\n");

body = realReplace("QQQ",body,"ngery\n","njeree\n");

body = realReplace("QQQ",body,"ngering\n","njering\n");

body = realReplace("QQQ",body,"ngered\n","njerd\n"); //that should do it. */

//END E's

//s at end - 1.7.4.5 -> unneeded, I think

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z converter to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that higher number of characters in target gets preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"preface\n","prefis\n"); //special

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,".tacit\n",".tasit\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"cate\n","kit\n");

body = replace(body,"vate\n","vit\n"); //pulled from E section, might be a sign of things to come

body = replace(body,"literate\n","literit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

body = replace(body,"cist\n","sist\n");

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

body = replace(body,"ct","kt"); //factual

body = replace(body,"tual\n","Cual\n");

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,"ierce\n","eers\n");

body = replace(body,"ince\n","ins\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance\n","Cahns\n");

body = replace(body,"cance\n","cahns\n");

body = replace(body,"lance\n","lahns\n");

body = replace(body,"vance\n","vahns\n");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"all\n","awl\n");

body = replace(body,".supp",".suhpp"); //just a general rule

body = replace(body,"appa","apuh");

body = replace(body,".appear",".uhpeer");

body = replace(body,"ppen","pen"); //double p's, might NOT be done

body = replace(body,"pplet\n","plit\n");

body = replace(body,"pple\n","puhl\n");

body = realReplace("QQQ",body,".supplement\n",".suhpluhment\n"); //special case

body = replace(body,"ppl","puhl");

body = replace(body,"upp\n","uhp\n");

body = replace(body,"oppor","oper");

body = replace(body,".opp",".ohp");

body = replace(body,".op",".ohp");

body = replace(body,"opp","uhp");

body = replace(body,"ypp","ip");

body = replace(body,"pp","p"); //Last ditch, should cover most before this

body = replace(body,"tice\n","tis\n");

body = replace(body,"arice\n","eris\n");

body = replace(body,"orice\n","uhis\n");

body = replace(body,"cipice\n","suhpis\n"); //patch for precipice

body = replace(body,"ipice\n","uhpis\n");

body = replace(body,".vice\n",".vahys\n");

body = replace(body,"vice\n","vis\n");

body = replace(body,"ice\n","ahys\n"); //Long S. NOT sure about \n's

body = replace(body,"egy\n","ijee\n"); //possibilities/strategies fix, I have no idea how they ended up "kiez"

body = replace(body,"city\n","sitee\n");

body = replace(body,"cite\n","sahyt\n");

body = replace(body,"ity\n","itee\n");

body = replace(body,"ite\n","ahyt\n");

body = replace(body,"irst\n","urst\n");

body = replace(body,"ong\n","ong\n");

body = replace(body,"ull\n","ool\n");

body = replace(body,"cide\n","sahyd\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ence\n","ens\n");

body = replace(body,"rend\n","rend\n");

//1.8.9 Pie-

body = replace(body,"piety","pahyitee");

body = replace(body,".pier\n",".peer\n");

body = replace(body,".pie\n",".pahy\n");

body = replace(body,".pie",".pee");

body = replace(body,"ces\n","seez\n");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"ci\n","sahy\n");

body = replace(body,"gan\n","gahn\n");

body = replace(body,"dle\n","dl\n");

body = replace(body,"align\n","uhlahyn\n");

body = replace(body,"oy\n","oi\n");

body = replace(body,"ace\n","eys\n");

body = replace(body,".chull\n",".as\n");

body = replace(body,".chull",".uhs"); //Assoc-

body = replace(body,".rely\n",".relahy\n");

body = replace(body,"ely\n","lee\n"); //MUST BE LAST IN \N

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"ply\n","plahy\n");

body = replace(body,".by\n",".bahy\n");

body = replace(body,".my\n",".mahy\n");

body = replace(body,".die\n",".dahy\n");

body = replace(body,".dye\n",".dahy\n");

body = replace(body,".bye\n",".bahy\n"); //conflict

body = replace(body,"hype","hahype");

body = replace(body,"hypo","hahypo");

body = replace(body,"hypn","hipn");

body = replace(body,"hyphen","hahyfuhn");

body = replace(body,"hyfen","hahyfuhn"); //ph->f

body = replace(body,"yp","ip");

body = replace(body,"duct","duhkt");

body = replace(body,"stion","sCuhn"); //1.8.9.4

body = replace(body,"tion","Suhn"); //1.8

body = replace(body,"ssion","Suhn"); //1.8.6

body = replace(body,"sion","zhuhn");

body = replace(body,"cean","Suhn");

body = replace(body,".abou",".uhbou");

body = replace(body,".abandon",".uhbanduhn");

body = replace(body,"ture","Cur");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"iew","yoo");

body = replace(body,".face",".feys");

body = replace(body,"face","feys");

//For-

body = replace(body,".fore",".fohr");

body = replace(body,".for",".fohr");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweyeynt");

body = replace(body,"cing","sing");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ight","ahyt");

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,".def",".dihf");

body = replace(body,"cif","sif"); //NOT sure

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

body = replace(body,"it\n","it\n"); //Tacked on for suffix reasons

body = replace(body,"ap\n","ap\n");

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

/* body = realReplace("QQQ",body,"C\n","k\n");

body = realReplace("QQQ",body,"ch\n","k\n"); */

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = replace(body,"ce","se"); //might want to move

body = replace(body,"ape\n","eyp\n");

body = realReplace("QQQ",body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = realReplace("QQQ",body,"c\n","k\n"); //to stop mischief

//END C'S

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

body = replace(body,".be\n",".bee\n");

body = replace(body,".maybe\n",".meybee\n");

//rom

body = realReplace("QQQ",body,".roman\n",".rohmahn\n");

body = replace(body,"rom","rohm");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"igh","ahy");

body = replace(body,".enough\n",".ihnuhf\n"); //special case

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = realReplace("QQQ",body,"q\n","k\n");

//w at end

body = replace(body,".low\n",".loh\n");//special cases

body = replace(body,".row\n",".roh\n");

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

body = replace(body,"pie\n","pahy\n"); // NOT normal, aka special

body = realReplace("qqq",body,".or",".awr");

body = replace(body,".sky",".skahy");

body = replace(body,".fly",".flahy");

body = replace(body,".ally\n",".alahy\n");

body = realReplace("qqq",body,"y\n","ee\n");

body = realReplace("qqq",body,"ehee\n","ehy\n");

body = realReplace("qqq",body,"ahee\n","ahy\n");

body = realReplace("qqq",body,"eee\n","ey\n"); //fixing issues raised by y->ee as compared to other phonetics

body = realReplace("qqq",body,"iest\n","eeest\n");

body = replace(body,"ize","ahz");

body = replace(body,"able","uhbuhl");

body = replace(body,"ably","uhblee"); //Last sweep

String[] temp = {"en","st","un","c","f","g","s","t",""};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

for(int i = 0; i<temp.length;i++)

if(temp[i].equals("c")) //c has already become k by this point

body = replace(body,"kable\n","keybuhl\n");

else

body = replace(body,temp[i]+"able\n",temp[i]+"eybuhl\n"); //keep the prefix, only the ending changes

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"ble\n","buhl\n");

//x's

body = replace(body,".xy",".zi");

body = replace(body,"xious","kSuhs");

//apostrophe possessive replacement, see removeCharacters()

body = replace(body," A","ez");

body = replace(body," B","z");

//General fixer for suffixes

//body = replace(body,"\n","\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"qu","ku");

//body = replace(body,"q","ku");

/* body = replace(body,"wa","ua"); //Unnecessary? I think not! I'm not sure why, but no.

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu"); */

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

body = replace(body,"eQ\n","ee\n");

body = replace(body,"Qy\n","ey\n");

body = replace(body,".hQ",".he");

body = replace(body,".shQ",".she");

}

return body;

}

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

//As of 1.8.8.1, '.' and '\n' are only codes for ' '. Spaces will be added before and after every \n, as well as after every period, then removed at the end.

//'.'==' '

if(target.startsWith("."))

return realReplace(sofar, body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub_size)));

else if(target.endsWith("\n"))

return realReplace(sofar, body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub_size-1)+" ")); //space substitution

/* if((min<Count++)&&(max>Count))

Targets+= target+"_"; */

if(Counting)

{

Count++;

if(target.equals("w"))

System.out.println("Replaces Run: "+Count);

}

if(target.endsWith(" "))

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

/* if(target.equals(" lingered "))

System.out.println(target); */

//I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if((!sofar.contains("z"))&&(!sofar.contains("l"))&&(!sofar.contains("t"))){

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z "));

else if((sub_size>=2)&&(sub.charAt(sub_size-2)=='y'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"ez ")); //s->z

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if(((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))||((sub_size>=2)||(sub.substring(sub_size-2,sub_size).equals("hy")))) //NOTE: the "||" makes this test true whenever sub_size>=2; "&&" may have been intended

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"z "));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"iez ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

/* //y

body = realReplace("qqq",body,"ay ","ey "); //stopgap, might want to revisit

body = replace(body,"ey ","ey ");

body = realReplace("qqq",body,"oy ","oi ");

body = realReplace("qqq",body,"uy ","ahy ");

body = realReplace("qqq",body,"y ","ee "); //might need generalized in replace()

body = replace(body,"ty","tahy"); */

//ly, focus on y as of 1.7.4.3 - It might need some work

if(target.equals("sly ")) //special case

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else{

//ly

if((target_size>=5)&&(target.substring(target_size-5,target_size-1).equals("able")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"y "),(sub.substring(0,sub_size-4)+"lee ")); //ably

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if((sub_size>=3)&&(sub.substring(sub_size-3,sub_size-1).equals("ee")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-3)+"ily "),(sub.substring(0,sub_size-3)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"ily "),(sub.substring(0,sub_size-2)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"pily "),(sub.substring(0,sub_size-1)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"tily "),(sub.substring(0,sub_size-1)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

//y

if((target_size>=2)&&(target.charAt(target_size-2)=='a')) //might need work

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ey "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"y "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"i "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='u'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ahy "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"py "),(sub.substring(0,sub_size-1)+"ee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"ty "),(sub.substring(0,sub_size-1)+"ee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee ")); //might not be needed

}

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

if((!target.endsWith("g "))&&(!target.endsWith("gs "))&&(!target.endsWith("gz "))) //leave no base uncovered

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-3)+"ying "),(sub.substring(0,sub_size-1)+"ing ")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){ //experiment

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ring "),(sub.substring(0,sub_size-1)+"ring ")); //rr

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //have to do both, sadly

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ping "),(sub.substring(0,sub_size-1)+"ing "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ting "),(sub.substring(0,sub_size-1)+"ing "));

else if((!target.endsWith("gs "))&&(!target.endsWith("gz "))) //no "ing\n" or s\z at end

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"pish "),(sub.substring(0,sub_size-1)+"ish "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"tish "),(sub.substring(0,sub_size-1)+"ish "));

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"ish "),(sub.substring(0,sub_size-1)+"ish "));

if(!sofar.contains("a")) //able

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"table "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){//experiment

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"rable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(target.equals("fly")||target.equals("unfly"))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"eybuhl "));

//1.9

//ize

if(!sofar.contains("x"))

if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"x",body,(target.substring(0,target_size-2)+"ize "),(sub.substring(0,sub_size-1)+"ahz ")); //removing 'e'

else

body = realReplace(sofar+"x",body,(target.substring(0,target_size-1)+"ize "),(sub.substring(0,sub_size-1)+"ahz "));

//est - was iest before 1.9.1.1

if((!sofar.contains("t")))

if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"t",body,(target.substring(0,target_size-2)+"iest "),(sub.substring(0,sub_size-1)+"eeest ")); //removing 'y'

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"t",body,(target.substring(0,target_size-2)+"est "),(sub.substring(0,sub_size-1)+"est "));

else

body = realReplace(sofar+"t",body,(target.substring(0,target_size-1)+"est "),(sub.substring(0,sub_size-1)+"est "));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(target.charAt(target_size-2)=='e')

if((target_size>=3)&&(target.charAt(target_size-3)=='c'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"st "));

else

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed ")); //NOT st

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

else if(target.charAt(target_size-2)=='r'){//experiment

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"red "),(sub.substring(0,sub_size-1)+"d "));

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"d "));

}

else if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("se")))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ped "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ted "),(sub.substring(0,sub_size-1)+"ed "));

else if((target.charAt(target_size-2)!='s')||((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

//er

if(!sofar.contains("r"))

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r "),(sub.substring(0,sub_size-1)+"er ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"per "),(sub.substring(0,sub_size-1)+"er "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){ //experiment

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"rer "),(sub.substring(0,sub_size-1)+"rer "));

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"ter "),(sub.substring(0,sub_size-1)+"er "));

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

/* //ate, not bothering with forbiddances - Never mind

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n")); */

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a, r

//r is forbidden by s, l, g, d

//y-y

//Not messing with forbidding now (1.8.8.2)

//x-ized, t-iest, t forbids all, don't care about anything else right now

//I think that forbiddance is total - no forbidden suffixes at any point before

}

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.charAt(i)==target.charAt(0))

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

return body;

}

}
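For anyone skimming the listing: once the suffix recursion is set aside, realReplace comes down to two steps. A leading '.' or a trailing '\n' in a rule is rewritten to a space, and the rule is then applied as a literal substitution. Below is a minimal sketch of only that core. BoundarySketch, expand and apply are names made up for illustration, String.replace stands in for the hand-written loop, and the sketch assumes target and sub always carry the same markers.

```java
final class BoundarySketch {
    // Rewrites the rule markers: a leading '.' and a trailing '\n' both stand for a space.
    static String expand(String pattern) {
        String p = pattern;
        if (p.startsWith(".")) {
            p = " " + p.substring(1);
        }
        if (p.endsWith("\n")) {
            p = p.substring(0, p.length() - 1) + " ";
        }
        return p;
    }

    // Literal, left-to-right replacement only; the suffix permutations are deliberately left out.
    static String apply(String body, String target, String sub) {
        return body.replace(expand(target), expand(sub));
    }

    public static void main(String[] args) {
        // The body is assumed to be space-padded already, as spaceEnds() does in the listing.
        String body = " the sky is low ";
        body = apply(body, ".sky", ".skahy");
        body = apply(body, ".low\n", ".loh\n");
        System.out.println("[" + body + "]");
    }
}
```

String.replace never rescans text it has just substituted. The loop in the listing resumes at i + sub_size - target_size + 1, so for targets longer than one character it can look again at the tail of the substitution; whether any rule depends on that has not been checked.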

Edited by Kurkistan
Link to comment
Share on other sites

Hmm some comments on the code:

-Having the entire file as one massive string seems a little dangerous to me, as well as unnecessary. From looking at what you are doing, it seems to me that reading and writing in chunks may be more efficient and put you at a much lower risk of running out of memory. Trying to use this method on a ~10 MB text file has caused headaches for me in the past.

-There seems to be a change you can make to the search and replace that would give you some speedup: when you have a space in the middle of your comparison frame, you continue iterating one character at a time, even though none of the patterns you are trying to match have a space in the middle. I was going to suggest a string tokenizer, but it might not work since you are including spaces in your patterns. Also, dropping the entire string into the string tokenizer might not be such a good idea either.

-All that and no regex? (I'm not exactly one to comment here because my knowledge and usage of regex is rather low, but still...)
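To make the regex point concrete, here is a rough sketch of how the word-start and word-end markers could be written with the \b word-boundary anchor of java.util.regex. The three rules are taken from the listing (".sky", ".low\n", "ow\n"); the class and field names are invented for illustration, and the suffix permutations that realReplace generates are not reproduced.

```java
import java.util.regex.Pattern;

final class RegexSketch {
    // "\\b" marks a word boundary, so no space padding is needed around the text.
    private static final Pattern WORD_START_SKY = Pattern.compile("\\bsky");
    private static final Pattern WHOLE_WORD_LOW = Pattern.compile("\\blow\\b");
    private static final Pattern WORD_END_OW = Pattern.compile("ow\\b");

    static String convert(String text) {
        String out = WORD_START_SKY.matcher(text).replaceAll("skahy");
        // The special case runs before the general "ow" rule, as in the listing.
        out = WHOLE_WORD_LOW.matcher(out).replaceAll("loh");
        out = WORD_END_OW.matcher(out).replaceAll("au");
        return out;
    }

    public static void main(String[] args) {
        System.out.println(convert("the sky hung low and how"));
    }
}
```

Compiling each Pattern once and reusing it avoids re-parsing the expression for every call. Replacement strings containing '$' or '\' would need Matcher.quoteReplacement; none of these rules do.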

I've got a radically different way of doing your find and replace that might turn out to be faster, but making it work would require some major changes (it also no longer seems that much faster after I realized you weren't doing whole-word comparisons for everything). The procedure loads all of the pairings into a map, with the string to be searched for as the key and the replacement as the value. The problem is that there seems to be a specific order to your replacements, and unless string length is the only thing determining that order, this probably won't work.
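A minimal sketch of the map idea, restricted to the part where it clearly fits: whole-word exceptions. The four pairs come from the listing (to, two, low, row); WordMapSketch and its members are hypothetical names, and partial-word rules such as "ph" or "ow\n" are outside what this covers.

```java
import java.util.HashMap;
import java.util.Map;

final class WordMapSketch {
    private static final Map<String, String> WHOLE_WORDS = new HashMap<>();
    static {
        WHOLE_WORDS.put("to", "too");
        WHOLE_WORDS.put("two", "too");
        WHOLE_WORDS.put("low", "loh");
        WHOLE_WORDS.put("row", "roh");
    }

    // Splits on single spaces, looks each word up once, and leaves unknown words untouched.
    static String convert(String line) {
        String[] words = line.split(" ", -1); // -1 keeps trailing empty strings, so spacing survives
        for (int i = 0; i < words.length; i++) {
            words[i] = WHOLE_WORDS.getOrDefault(words[i], words[i]);
        }
        return String.join(" ", words);
    }

    public static void main(String[] args) {
        System.out.println(convert("two men row to shore"));
    }
}
```

Each word costs one hash lookup regardless of how many exceptions are in the map, which is where the possible speedup would come from. Order does not matter here because a whole word matches at most one key.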

Link to comment
Share on other sites

Praise be to the Stormfather, someone who knows how to program!

I am by no means an expert programmer, and I haven't had reason to do much string searching or use regex before, so I don't have a thorough knowledge of either. I initially set this down as the bare bones of what would work, then focused primarily on the transliteration aspect. On top of that, I really didn't give much thought to the actual mechanics of searching and replacing words, and since I haven't done anything similar before, I didn't have a code library to draw upon.

Any improvements you have are welcome. In fact, the way you're talking suggests that you have entire functions that could simply be substituted for mine, which I would welcome even more.

EDIT: For some substantive replies to your specific suggestions:

You're right that I could fairly easily implement a chunks-based version of this, but I don't think it's strictly necessary at this point. The largest file I've converted is the Odyssey at 120,162 words and 594 KB, while the largest file we would probably ever convert is the WoK at ~400,000 words, which would still be only in the ballpark of 2 MB. This is mostly laziness talking right now, though, not any genuine objection to the concept.
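The size estimate checks out: 594 KB over 120,162 words is about 5 bytes per word, and 400,000 words at that rate comes to roughly 1,980 KB, so about 2 MB. For reference, a chunks-based version could look something like the sketch below, which converts one line at a time instead of building one large string. ChunkedSketch is a made-up name, the "ph" rule stands in for the full rule set, and the approach assumes no rule needs to see across a line break, which has not been checked against every rule.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.function.UnaryOperator;

final class ChunkedSketch {
    // Reads, converts and writes one line at a time, so memory use follows the longest line.
    static void convert(Reader in, Writer out, UnaryOperator<String> perLine) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            out.write(perLine.apply(line.toLowerCase()));
            out.write('\n');
        }
        out.flush();
    }

    // In-memory demo; FileReader and FileWriter could be passed to convert() in the same way.
    static String demo(String text) {
        StringWriter out = new StringWriter();
        try {
            convert(new StringReader(text), out, line -> line.replace("ph", "f"));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(demo("Phone\nGraph\n"));
    }
}
```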

Once again, a better search algorithm would be welcome. Also, we probably wouldn't be able to order them entirely by length, given that text segments at the beginning and ends of words require special treatment compared to general swaps.
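To show why length alone cannot set the order, here is a small sketch that keeps rules in a LinkedHashMap, which iterates in insertion order, and applies them one after another. The two rules are from the listing ("ause\n" and "use\n", with the trailing marker already written as a space); OrderedRulesSketch and its methods are illustrative names only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OrderedRulesSketch {
    // LinkedHashMap iterates in insertion order, so the specific rule is declared first.
    static Map<String, String> rules() {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("ause ", "awz ");
        r.put("use ", "yooz ");
        return r;
    }

    // Applies every rule, in the map's iteration order, to the whole body.
    static String apply(String body, Map<String, String> rules) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            body = body.replace(rule.getKey(), rule.getValue());
        }
        return body;
    }

    public static void main(String[] args) {
        System.out.println("[" + apply(" because we refuse ", rules()) + "]");
    }
}
```

If the two rules were swapped, "because" would be caught by the general "use" rule first and come out as "becayooz". A plain HashMap gives no ordering guarantee at all, so it would not be a safe container for rules like these.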

On the note of making things simpler, though, I now realize that I was introducing unnecessary complications into the existing code by not marking off phonetic text from un-converted text. My new version, besides throwing in some odd fixes and grammars, makes it so that all substituted strings are in CAPS, so that they are not overwritten twice. This might not give much more utility now, since I've made an effort to avoid such complications so far, but could vastly simplify any additions going forward, as well as possibly make the implementation of a mapping easier.

EDIT 2: Well, that was an un-fun experiment. Capitalization introduced too many problems: it made interactions between transformed and still-transformable text impossible, rather than merely problematic as in the status quo. Changes rolled back; I'm incorporating the new grammars into the non-CAPS version.
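For comparison, a different way to keep converted text from being rewritten, without capitalization, is a single left-to-right pass that appends output to a separate buffer. The sketch below is only an illustration of that idea, not what the transliterator does: SinglePassSketch is a hypothetical name, the four rules are borrowed from the "Necessary" section, and the scheme gives up the interactions between converted and unconverted text that the current rules rely on, so it would likely need the rule set rewritten.

```java
final class SinglePassSketch {
    // Rules are tried in this order at every position of the input.
    private static final String[][] RULES = {
        {"th", "T"}, {"sh", "S"}, {"qu", "ku"}, {"w", "u"}
    };

    static String convert(String body) {
        StringBuilder out = new StringBuilder(body.length());
        int i = 0;
        while (i < body.length()) {
            boolean matched = false;
            for (String[] rule : RULES) {
                if (body.startsWith(rule[0], i)) {
                    out.append(rule[1]);   // converted text goes to the output only
                    i += rule[0].length(); // and the input cursor skips past the match
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.append(body.charAt(i++));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(convert("the queen wished"));
    }
}
```

Because the input is never modified, no rule can match text produced by an earlier rule, which is the property the CAPS experiment was after.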

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan, with significant developmental input from Turos

* @date 01/29/2012

* @version 1.9.3

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_9_3 //recovering from CAPITALIZATION in 1.9.2.3 onward

{

static boolean debug_char = false;

static boolean debug_end_e = false;

static boolean remove_illegal = true;

static boolean add_CR = true;

/* static String Targets = "";

static int min = 200;

static int max = 400; */

static int Count = 0;

static boolean Counting = true;

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

//putting carriage-returns back in to make it look pretty in Notepad. I can't tell what else they might do.

if(add_CR)

for(int i = 0; i<alethi.length();i++)

if(alethi.charAt(i)=='\n')

alethi = alethi.substring(0,i)+"\r"+alethi.substring(i++,alethi.length());

//writeFile(Targets,"TEMP.txt");

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

private static String convertText(String roman) throws IOException

{

roman = readFile(roman); //text file

if((roman.length()==1)&&(roman.charAt(0)=='&')) //invalid input, halt program

return "&";

if(remove_illegal)

roman = removeCharacters(roman);

roman = periodMover(roman);

roman = spaceEnds(roman);

String alethi = replaceLetters(roman);

return unSpaceEnds(alethi);

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static String readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("File not in directory or misspelled.");

return "&";

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole;

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercase cases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++){

line = 1; //restart the line count for each pair so reported line numbers stay correct

for(int i = 0; i<=search;i++)

if(body.charAt(i)=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

}

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)) //c, q, w, and x

library[place++] = (char)i;

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body.charAt(i)=='\n')

line++;

else if(Arrays.binarySearch(library,body.charAt(i))<0) //not in library

violations = violations + (line+":"+body.charAt(i)) + "; ";

return violations;

}

private static String removeCharacters(String body)

{

char[] library = new char[56];

library[0] = '\t'; //tab

library[1] = '\n';

library[2] = ' ';

library[3] = '.';

int place = 4;

for(int i = 65; i <=90; i++)

library[place++] = (char)i;

for(int i = 97; i <=122; i++)

library[place++] = (char)i;

for(int i = 0; i < body.length(); i++)

if(Arrays.binarySearch(library,body.charAt(i))<0) //I felt embarrassed by my earlier search algorithm.

if((body.charAt(i)=='?')||(body.charAt(i)=='!'))

body = body.substring(0,i)+"."+body.substring(i+1,body.length());

else if(body.charAt(i)=='-')

body = body.substring(0,i)+" "+body.substring(i+1,body.length());

else if(body.charAt(i)==(char)39) //apostrophe character

if((i>0)&&(body.charAt(i-1)=='s')) //allowing for both United States' and United States's, as an example

if((i<body.length()-1)&&(body.charAt(i+1)=='s')) //"-s's"

body = body.substring(0,i)+" A"+body.substring((i++)+2,body.length()); //" A"->"ez"

else

body = body.substring(0,i)+" A"+body.substring((i++)+1,body.length()); //"-s'"

else if((i<body.length()-1)&&(body.charAt(i+1)=='s')) //"-'s"

body = body.substring(0,i)+" B"+body.substring((i++)+2,body.length()); //" B"->"z"

else

body = body.substring(0,i)+body.substring(i--+1,body.length()); //same as normal

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

return body;

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static String periodMover(String body)

{

int start = 0;

for(int i=0;i<body.length();i++)

{

if(body.charAt(i)=='.'){

while((i<body.length())&&(body.charAt(i)=='.')) //multiples

body = body.substring(0,start)+"."+body.substring(start,i)+body.substring((i++)+1,body.length());

while(i<body.length())

if(!inAlphabet(body.charAt(i)))

i++;

else

break; //Yes, the cardinal sin.

start = i;

}

else if(body.charAt(i)=='\n')

start=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

return body;

}

private static boolean inAlphabet(char character)

{

int value = (int)character;

if((value>=97)&&(value<=122)) //just checking lowercase letters

return true;

return false;

}

private static String spaceEnds(String body){

for(int i=0;i<body.length();i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+" "+body.substring((i++)+1,body.length());

else if(body.charAt(i)=='\n'){

body = body.substring(0,i)+" \n "+body.substring(i+1,body.length());

i+=2;

}

//System.out.println(body);

return body;

}

private static String unSpaceEnds(String body){

for(int i=1;i<body.length()-2;i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+body.substring(i+2,body.length());

else if(body.charAt(i)=='\n')

body = body.substring(0,i-1)+"\n"+body.substring((i--)+2,body.length());

if(body.charAt(body.length()-2)=='.')

body = body.substring(0,body.length()-1);

else if(body.charAt(body.length()-2)=='\n')

body = body.substring(0,body.length()-3)+"\n";

return body.substring(1,body.length()-1); //clipping first/last '\n';;

}

public static void test()

{

String body = "\nbutler\n";

String target = "ap\n";

String sub = "op\n";

System.out.println(replace(body,target,sub));

int target_size = target.length();

int sub_size = sub.length();

String sofar = "";

int j = 2;

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

System.out.println(body);

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(String body)

{

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//anti-

body = replace(body,".anti",".antahy");

body = replace(body,".whole",".hohl");

//wh

body = replace(body,"who\n","hoo\n");

body = replace(body,"where","huair"); //changed w to u

body = replace(body,"whir","huur");

body = replace(body,"wh","hu"); //Might need more permutations

body = replace(body,".accr",".uhkr"); //many many many

body = replace(body,".acci",".aksi");

body = replace(body,".accord",".uhkawrd");

body = replace(body,".accomp",".uhkuhmp");

body = replace(body,".acco",".uhko");

body = replace(body,".accustom\n",".uhkuhstuhm\n");

body = replace(body,".accolade\n",".akuhleyd\n");

body = replace(body,".accus",".uhkyooz");

body = replace(body,".accurs",".uhkurs");

body = replace(body,".accur",".akyer");

body = replace(body,".accum",".uhkyoom");

body = replace(body,".accout",".uhkoot");

body = replace(body,".accoun",".uhkount");

body = replace(body,".acce",".akse"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

//E at end - Some interference possible with C's

body = replace(body,".cause",".kawz");

body = replace(body,"ause\n","awz\n");

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"have\n","hav\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

//body = replace(body,"ever\n","ever\n");

body = replace(body,"eve\n","eev\n"); //HOWEVER

body = replace(body,"eever\n","ever\n");

body = replace(body,"ile\n","ahyl\n");

//System.out.println(replace(replace("while ","wh","hu"),"ile\n","ahyl\n"));

//huahyl

body = replace(body,"gle\n","guhl\n");

body = replace(body,".key\n",".kee\n"); //special

body = realReplace("QQQ",body,".keys\n",".kees\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","keys\n");

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"Case\n","Ceys\n"); //necessary?

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,".one\n",".uuhn\n"); //special

body = replace(body,".someone\n",".suhmuuhn\n");

body = replace(body,".anyone\n",".eneeuuhn\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,".some",".suhm");

body = replace(body,"comedy","komidee");

body = replace(body,"come\n","kuhm\n"); //Need to move this up

body = replace(body,".come",".kuhm");

body = replace(body,"ome\n","ohm\n");

body = replace(body,"title\n","tahytl\n");

body = replace(body,"ttle\n","tl\n");

body = replace(body,"tle\n","tl\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,".discipline\n",".disipline\n");

body = replace(body,"cine\n","sin\n");

body = replace(body,"ine\n","ahyn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"op\n","ohp\n");

body = replace(body,"ope\n","ohp\n");

body = replace(body,"rue\n","roo\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"bead\n","beed\n");

body = replace(body,".read\n",".reed\n");

body = replace(body,"nead\n","need\n");

body = replace(body,"lead\n","leed\n");

body = replace(body,"ead\n","ed\n"); //general

body = replace(body,"ade\n","eyd\n");

//1.9.2.1

body = replace(body,"heir","air"); //general rule

body = replace(body,"eir\n","er\n");

//this one's touchy, I'm just throwing in "air" exemptions to the "eer" rule where I see them

body = replace(body,"where\n","hwair\n");

body = replace(body,".ere\n",".air\n");

body = replace(body,"there\n","thair\n");

body = replace(body,"ere\n","eer\n");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"tire","tahyuhr"); //NOT \n or e

body = replace(body,"aire\n","air\n");

//body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","ahrj\n");

body = replace(body,"orge\n","wrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"sce\n","es\n");

body = replace(body,"que\n","k\n");

body = replace(body,"udge\n","uhj\n");

body = replace(body,"dge\n","j\n"); //NOT sure

body = replace(body,"age\n","aij\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"rgue\n","rgyoo\n");

body = replace(body,"gue\n","eeg\n");

//ible, might need to generalize downtown

body = replace(body,"ible\n","uhbuhl\n");

//-nge

//problem with sing, singer vs singe, singer not really being separable at the gerund-testing level

body = replace(body,"finger\n","fingger\n");

body = replace(body,"linger\n","lingger\n");

body = replace(body,"finger","fingger");

body = replace(body,"linger","lingger");

body = replace(body,".anger\n",".angger\n");

body = replace(body,".angry\n",".angree\n");//?

/* body = replace(body,"ringe\n","rinj\n"); //This is the best I can do for now.

body = replace(body,"hinge\n","hinj\n");

body = replace(body,".impinge\n",".impinj\n");

body = replace(body,"winge\n","winj\n");

body = replace(body,".binge\n",".binj\n");

body = replace(body,".singe\n",".sinj\n");

body = replace(body,".tinge\n",".tinj\n");

body = replace(body,".dinge\n",".dinj\n"); */

body = realReplace("",body,"ringe\n","rinj\n"); //This is the best I can do for now.

body = realReplace("r",body,"hinge\n","hinj\n");

body = realReplace("r",body,".impinge\n",".impinj\n");

body = realReplace("r",body,"winge\n","winj\n");

body = realReplace("r",body,".binge\n",".binj\n");

body = realReplace("r",body,".singe\n",".sinj\n");

body = realReplace("",body,".tinge\n",".tinj\n");

body = realReplace("",body,".dinge\n",".dinj\n");

body = replace(body,"ing\n","I\n"); //temporary

body = replace(body,"nge\n","nj\n");

body = replace(body,"I","ing");

/*

body = realReplace("QQQ",body,"nges\n","njez\n");

body = realReplace("QQQ",body,"ngely\n","njly\n");

body = realReplace("QQQ",body,"ngey\n","njee\n");

body = realReplace("QQQ",body,"ngeing\n","njing\n");

body = realReplace("QQQ",body,"nged\n","njed\n");

body = realReplace("QQQ",body,"ngeish\n","njish\n");

body = realReplace("QQQ",body,"ngeable\n","njuhbuhl\n");

body = replace(body,"ing\n","inQg\n");

body = realReplace("QQQ",body,"nger\n","njer\n");

body = realReplace("QQQ",body,"ngers\n","njerz\n");

body = realReplace("QQQ",body,"ngerly\n","njerlee\n");

body = realReplace("QQQ",body,"ngery\n","njeree\n");

body = realReplace("QQQ",body,"ngering\n","njering\n");

body = realReplace("QQQ",body,"ngered\n","njerd\n"); //that should do it. */

//END E's

//s at end - 1.7.4.5 -> unneeded, I think

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z convertor to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that targets with more characters get preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"preface\n","prefis\n"); //special

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,".tacit\n",".tasit\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"cate\n","keyt\n");

body = replace(body,"vate\n","vit\n"); //pulled from E section, might be a sign of things to come

body = replace(body,"literate\n","literit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

body = replace(body,"cist\n","sist\n");

body = replace(body,"duce\n","doos\n");

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

body = replace(body,"ct","kt"); //factual

body = replace(body,"tual\n","Cual\n");

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,"ierce\n","eers\n");

body = replace(body,"ince\n","ins\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance\n","Cahns\n");

body = replace(body,"cance\n","kahns\n");

body = replace(body,"lance\n","lahns\n");

body = replace(body,"vance\n","vahns\n");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"all\n","awl\n");

body = realReplace("QQQ",body,".supplement\n",".suhpluhment\n"); //special case

body = replace(body,".supp",".suhpp"); //just a general rule

body = replace(body,"appa","apuh");

body = replace(body,".appear",".uhpeer");

body = replace(body,"ppen","pen"); //double p's, might NOT be done

body = replace(body,"pplet\n","plit\n");

body = replace(body,"pple\n","puhl\n");

body = replace(body,"ppl","puhl");

body = replace(body,"upp\n","uhp\n");

body = replace(body,"oppor","oper");

body = replace(body,".opp",".ohp");

body = replace(body,".op",".ohp");

body = replace(body,"opp","uhp");

body = replace(body,"ypp","ip");

body = replace(body,"pp","p"); //Last ditch, should cover most before this

body = replace(body,"tice\n","tis\n");

body = replace(body,"arice\n","eris\n");

body = replace(body,"orice\n","uhis\n");

body = replace(body,"cipice\n","suhpis\n"); //patch for precipice

body = replace(body,"ipice\n","uhpis\n");

body = replace(body,".vice\n",".vahys\n");

body = replace(body,"vice\n","vis\n");

body = replace(body,"ice\n","ahys\n"); //Long S. NOT sure about \n's

body = replace(body,"egy\n","ijee\n"); //possibilities/strategies fix, I have no idea how they ended up "kiez"

body = replace(body,"city\n","sitee\n");

body = replace(body,"cite\n","sahyt\n");

body = replace(body,"ity\n","itee\n");

body = replace(body,"ite\n","ahyt\n");

body = replace(body,"irst\n","urst\n");

body = replace(body,"ong\n","ong\n");

body = replace(body,"ull\n","ool\n");

body = replace(body,"cide\n","sahyd\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ence\n","ens\n");

body = replace(body,"rend\n","rend\n");

//1.8.9 Pie-

body = replace(body,"piety","pahyitee");

body = replace(body,".pier\n"," peer\n");

body = replace(body,".pie\n"," pahy\n");

body = replace(body,".pie",".pee");

body = replace(body,"ces\n","seez\n");

body = replace(body,"cez\n","seez\n"); //Incase of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"ci\n","sahy\n");

body = replace(body,"gan\n","gahn\n");

body = replace(body,"dle\n","dl\n");

body = replace(body,"align\n","uhlahyn\n");

body = replace(body,"oy\n","oi\n");

body = replace(body,"ace\n","eys\n");

body = replace(body,".ass\n",".as\n");

body = replace(body,".ass",".uhs"); //Assoc-

body = replace(body,".rely\n",".relahy\n");

body = replace(body,"ely\n","lee\n"); //MUST BE LAST IN \N

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"ply\n","plahy\n");

body = replace(body,".by\n",".bahy\n");

body = replace(body,".my\n",".mahy\n");

body = replace(body,".die\n",".dahy\n");

body = replace(body,".dye\n",".dahy\n");

body = replace(body,".bye\n",".bahy\n"); //conflict

body = replace(body,"hype","hahype");

body = replace(body,"hypo","hahypo");

body = replace(body,"hypn","hipn");

body = replace(body,"hyphen","hahyfuhn");

body = replace(body,"hyfen","hahyfuhn"); //ph->f

body = replace(body,"yp","ip");

body = replace(body,"duct","duhkt");

body = replace(body,"stion","sCuhn"); //1.8.9.4

body = replace(body,"tion","Suhn"); //1.8

body = replace(body,"ssion","Suhn"); //1.8.6

body = replace(body,"sion","zhuhn");

body = replace(body,"cean","Suhn");

body = replace(body,".abou",".uhbou");

body = replace(body,".aband",".uhbanduhn");

body = replace(body,"ture","Cur");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"iew","yoo");

body = replace(body,".face",".feys");

body = replace(body,"face","feys");

//For-

body = replace(body,".fore",".fohr");

body = replace(body,".for",".fohr");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweynt");

body = replace(body,"cing","sing");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ight","ahyt");

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,".def",".dihf");

body = replace(body,"cif","sif"); //NOT sure

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

body = replace(body,"it\n","it\n"); //Tacked on for suffix reasons

body = replace(body,"ap\n","ap\n");

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

/* body = realReplace("QQQ",body,"C\n","k\n");

body = realReplace("QQQ",body,"ch\n","k\n"); */

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = replace(body,"ce","se"); //might want to move

body = replace(body,"ape\n","eyp\n");

body = realReplace("QQQ",body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = realReplace("QQQ",body,"c\n","k\n"); //to stop mischief

//END C'S

body = replace(body,".odyssey\n",".oduhsee\n"); //special

body = replace(body,"sey\n","zee\n");

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

body = replace(body,".be\n",".bee\n");

body = replace(body,".maybe\n",".meybee\n");

//rom

body = realReplace("QQQ",body,".roman\n",".rohmahn\n");

body = replace(body,"rom","rohm");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"igh","ahy");

body = replace(body,".enough\n",".ihnuhf\n"); //special case

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = realReplace("QQQ",body,"q\n","k\n");

//w at end

body = replace(body,".low\n",".loh\n");//special cases

body = replace(body,".row\n",".roh\n");

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

body = replace(body,"pie\n","pahy\n"); // NOT normal, aka special

body = realReplace("qqq",body,".or",".awr");

body = replace(body,".sky",".skahy");

body = replace(body,".fly",".flahy");

body = replace(body,".ally\n",".alahy\n");

body = realReplace("qqq",body,"y\n","ee\n");

body = realReplace("qqq",body,"ehee\n","ehy\n");

body = realReplace("qqq",body,"ahee\n","ahy\n");

body = realReplace("qqq",body,"eee\n","ey\n"); //fixing issues raised by y->ee as compared to other phonetics

body = realReplace("qqq",body,"iest\n","eeest\n");

body = replace(body,"ize","ahz");

body = replace(body,"able","uhbuhl");

body = replace(body,"ably","uhblee"); //Last sweep

String[] temp = {"en","st","un","c","f","g","s","t"};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

for(int i = 0; i<temp.length;i++)

if(temp[i].equals("c"))

body = replace(body,"kable\n","eybuhl\n");

else

body = replace(body,temp[i]+"able\n","eybuhl\n");

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"ble\n","buhl\n");

//x's

body = replace(body,".xy",".zi");

body = replace(body,"xious","kSuhs");

//apostrophe possessive replacement, see removeCharacters()

body = replace(body," A","ez");

body = replace(body," B","z");

//General fixer for suffixes

//body = replace(body,"\n","\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"qu","ku");

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

body = replace(body,"eQ\n","ee\n");

body = replace(body,"Qy\n","ey\n");

body = replace(body,".hQ",".he");

body = replace(body,".shQ",".she");

}

return body;

}
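
Two habits carry most of replaceLetters() above: rules run strictly top to bottom, so a narrow target has to come before the general one that would swallow it, and a temporary placeholder (the capital I) shields "ing" while the "nge" rules run. Here is a minimal sketch of both ideas, not the real code. The class name, the three-rule table, and the softenNge helper are made up for illustration, and plain String.replace stands in for realReplace. The "nging " line imitates the gerund rule that realReplace would derive from "nge\n".

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: names and the tiny rule table are illustrative only.
class RuleOrderSketch {
    // LinkedHashMap iterates in insertion order, so rules run top to bottom.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put(" cause ", " kawz "); // whole-word special case first
        RULES.put("ause ", "awz ");     // then the narrower ending
        RULES.put("use ", "yooz ");     // the general ending goes last
    }

    // Apply every rule in order; each rule sees the output of the previous one.
    static String apply(String body) {
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            body = body.replace(rule.getKey(), rule.getValue());
        }
        return body;
    }

    // Placeholder trick: optionally hide "ing " behind "I " while the nge rules run.
    // Assumes the text holds no capital I of its own, as in the original.
    static String softenNge(String body, boolean shield) {
        if (shield) {
            body = body.replace("ing ", "I ");
        }
        body = body.replace("nge ", "nj ");      // base rule
        body = body.replace("nging ", "njing "); // derived gerund rule (assumed form)
        if (shield) {
            body = body.replace("I", "ing");     // restore the gerunds
        }
        return body;
    }

    public static void main(String[] args) {
        System.out.println(apply(" because use cause "));
        System.out.println(softenNge(" hinge singing ", true));
        System.out.println(softenNge(" hinge singing ", false));
    }
}
```

With the shield off, "singing" comes out as "sinjing", which is the sing/singe clash the comments in the -nge section complain about. The price of the shield is that real -nge gerunds such as "changing" are left alone as well.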

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}
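
For anyone reading the rule list cold: a leading '.' in a target means "start of word" and a trailing '\n' means "end of word", and realReplace() turns both into plain spaces before matching. Below is a rough standalone sketch of that convention, under my own assumptions. The class and method names are hypothetical, String.replace stands in for the hand-written loop, and doubling the inner spaces is my workaround so that two neighbouring words do not fight over one shared space. One difference from the original: realReplace() rewrites the substitute by position (it drops its first or last character whenever the target carries a marker), so there the substitute must carry the same markers as the target, whereas this sketch decodes each string on its own.

```java
// Hypothetical sketch of the '.' / '\n' boundary shorthand; not the original code.
class BoundaryMarkerSketch {
    // Leading '.' = word start, trailing '\n' = word end; both become a space.
    static String decode(String pattern) {
        if (pattern.startsWith(".")) {
            pattern = " " + pattern.substring(1);
        }
        if (pattern.endsWith("\n")) {
            pattern = pattern.substring(0, pattern.length() - 1) + " ";
        }
        return pattern;
    }

    static String replace(String body, String target, String sub) {
        return body.replace(decode(target), decode(sub));
    }

    static String transliterate(String text) {
        // Pad both ends so the first and last word have boundaries, and
        // double the inner spaces so adjacent words each own a space.
        String body = " " + text.replace(" ", "  ") + " ";
        body = replace(body, ".one\n", ".uuhn\n"); // whole word "one"
        body = replace(body, "one\n", "ohn\n");    // any other word ending in "one"
        return body.trim().replace("  ", " ");     // strip the padding again
    }

    public static void main(String[] args) {
        System.out.println(transliterate("one bone someone"));
    }
}
```

Note that "someone" falls through to the general "one\n" rule here because this sketch has no ".someone\n" entry.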

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

//As of 1.8.8.1, '.' and '\n' are only codes for ' '. Spaces will be added before and after every \n, as well as after every period, then removed at the end.

//'.'==' '

if(target.startsWith("."))

return realReplace(sofar, body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub_size)));

else if(target.endsWith("\n"))

return realReplace(sofar, body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub_size-1)+" ")); //space substitution

/* if((min<Count++)&&(max>Count))

Targets+= target+"_"; */

if(Counting)

{

Count++;

if(target.equals("w"))

System.out.println("Replaces Run: "+Count);

}

if(target.endsWith(" "))

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

/* if(target.equals(" lingered "))

System.out.println(target); */

//I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if((!sofar.contains("z"))&&(!sofar.contains("l"))&&(!sofar.contains("t"))){

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z "));

else if((sub_size>=2)&&(sub.charAt(sub_size-2)=='y'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"ez ")); //s->z

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if(((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))||((sub_size>=3)&&(sub.substring(sub_size-3,sub_size-1).equals("hy"))))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"z "));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"iez ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

/* //y

body = realReplace("qqq",body,"ay ","ey "); //stopgap, might want to revisit

body = replace(body,"ey ","ey ");

body = realReplace("qqq",body,"oy ","oi ");

body = realReplace("qqq",body,"uy ","ahy ");

body = realReplace("qqq",body,"y ","ee "); //might need generalized in replace()

body = replace(body,"ty","tahy"); */

//ly, focus on y as of 1.7.4.3 - It might need some work

if(target.equals("sly ")) //special case

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else{

//ly

if((target_size>=5)&&(target.substring(target_size-5,target_size-1).equals("able")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"y "),(sub.substring(0,sub_size-4)+"lee ")); //ably

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if((sub_size>=3)&&(sub.substring(sub_size-3,sub_size-1).equals("ee")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-3)+"ily "),(sub.substring(0,sub_size-3)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"ily "),(sub.substring(0,sub_size-2)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"pily "),(sub.substring(0,sub_size-1)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"tily "),(sub.substring(0,sub_size-1)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

//y

if((target_size>=2)&&(target.charAt(target_size-2)=='a')) //might need work

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ey "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"y "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"i "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='u'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ahy "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"py "),(sub.substring(0,sub_size-1)+"ee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"ty "),(sub.substring(0,sub_size-1)+"ee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee ")); //might not be needed

}

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

if((!target.endsWith("g "))&&(!target.endsWith("gs "))&&(!target.endsWith("gz "))) //leave no base uncovered

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-3)+"ying "),(sub.substring(0,sub_size-1)+"ing ")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){ //experiment

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ring "),(sub.substring(0,sub_size-1)+"ring ")); //rr

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //have to do both, sadly

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ping "),(sub.substring(0,sub_size-1)+"ing "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ting "),(sub.substring(0,sub_size-1)+"ing "));

else if((!target.endsWith("gs "))&&(!target.endsWith("gz "))) //no "ing\n" or s\z at end

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"pish "),(sub.substring(0,sub_size-1)+"ish "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"tish "),(sub.substring(0,sub_size-1)+"ish "));

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"ish "),(sub.substring(0,sub_size-1)+"ish "));

if(!sofar.contains("a")) //able

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"table "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){//experiment

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"rable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(target.equals("fly")||target.equals("unfly"))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"eybuhl "));

//1.9

//ize

if(!sofar.contains("x"))

if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"x",body,(target.substring(0,target_size-2)+"ize "),(sub.substring(0,sub_size-1)+"ahyz ")); //removing 'e'

else

body = realReplace(sofar+"x",body,(target.substring(0,target_size-1)+"ize "),(sub.substring(0,sub_size-1)+"ahyz "));

//est - was iest before 1.9.1.1

if((!sofar.contains("t")))

if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"t",body,(target.substring(0,target_size-2)+"iest "),(sub.substring(0,sub_size-1)+"eeest ")); //removing 'y'

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"t",body,(target.substring(0,target_size-2)+"est "),(sub.substring(0,sub_size-1)+"est "));

else

body = realReplace(sofar+"t",body,(target.substring(0,target_size-1)+"est "),(sub.substring(0,sub_size-1)+"est "));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(target.charAt(target_size-2)=='e')

if((target_size>=3)&&(target.charAt(target_size-3)=='c'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"st "));

else

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed ")); //NOT st

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

else if(target.charAt(target_size-2)=='r'){//experiment

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"red "),(sub.substring(0,sub_size-1)+"d "));

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"d "));

}

else if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("se")))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ped "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ted "),(sub.substring(0,sub_size-1)+"ed "));

else if((target.charAt(target_size-2)!='s')||((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

//er

if(!sofar.contains("r"))

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r "),(sub.substring(0,sub_size-1)+"er ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"per "),(sub.substring(0,sub_size-1)+"er "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){ //experiment

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"rer "),(sub.substring(0,sub_size-1)+"rer "));

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"ter "),(sub.substring(0,sub_size-1)+"er "));

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

/* //ate, not bothering with forbiddances - Never mind

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n")); */

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a, r

//r is forbidden by s, l, g, d

//y-y

//Not messing with forbidding now (1.8.8.2)

//x-ized, t-iest, t forbids all, don't care about anything else right now

//I think that forbiddance is total - no forbidden suffixes at any point before

}

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.charAt(i)==target.charAt(0))

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

return body;

}

}

Edited by Kurkistan

  • 3 weeks later...

This has been a fairly long break, hasn't it? New version up: I finally (yes, I am lazy) implemented a simple efficiency boost to the search algorithm, and threw in a few odd grammars and fixes to existing grammars.
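A "grammar" in this program is a search-and-replace rule, and the order of the rules matters: a more specific rule (such as "ause" at the end of a word) has to run before a general one ("use") gets a chance to swallow it. The sketch below illustrates this with two rules that appear in the program. The class `RuleOrderSketch`, the `apply` helper, and the use of a trailing space as the end-of-word marker are assumptions made for illustration; the program's own replace() works differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RuleOrderSketch {
    /** Applies each rule in insertion order; a trailing space in a target marks "end of word". */
    static String apply(String text, Map<String, String> rules) {
        String body = " " + text + " "; // pad so the first and last words also have boundaries
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            body = body.replace(rule.getKey(), rule.getValue());
        }
        return body.trim();
    }

    public static void main(String[] args) {
        Map<String, String> rules = new LinkedHashMap<>(); // LinkedHashMap keeps insertion order
        rules.put("ause ", "awz ");  // specific rule first
        rules.put("use ", "yooz "); // general rule second
        System.out.println(apply("pause and fuse", rules)); // prints "pawz and fyooz"
    }
}
```

With the two rules swapped, "pause" would be caught by the general rule first and come out as "payooz", which is why the longer targets sit higher in the list.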

Big news of the day is that I've implemented a <safe>[...]</safe> tag that protects the text inside it from being touched by the transliteration part of the program. This way, particularly tricky words or proper names can be cordoned off and search-replaced manually. It's currently set to leave the tags in the final text, where they can be easily found and removed after manual transliteration.

Ex.

"<safe>Wow, Xanthophyll is not necessarily the most transmorgraphical name to pronounce, is it?</safe>" becomes:

"<safe>.wow xanthophyll is not necessarily the most transmorgraphical name to pronounce is it</safe>"

EDIT:

Added in a grammar for "indict", fixed some inefficiencies in how the <safe> tag was handled, and threw in some documentation and a rudimentary program flow for the benefit of Joe ST.
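For readers who want the idea behind the <safe> tag in isolation, here is a minimal sketch. It is not the code from the program, which records skip counts in skip_array; it assumes a hypothetical helper `convertOutsideSafe` and uses a regular expression to copy each <safe>...</safe> span through untouched while a stand-in converter handles everything else. Nested and unclosed tags are not handled here.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SafeTagSketch {
    // Matches one <safe>...</safe> span, non-greedy, including across line breaks.
    private static final Pattern SAFE = Pattern.compile("<safe>.*?</safe>", Pattern.DOTALL);

    /** Runs convert on the text outside <safe> spans; the spans are copied verbatim, tags included. */
    static String convertOutsideSafe(String text, UnaryOperator<String> convert) {
        StringBuilder out = new StringBuilder();
        Matcher m = SAFE.matcher(text);
        int last = 0;
        while (m.find()) {
            out.append(convert.apply(text.substring(last, m.start()))); // unprotected stretch
            out.append(m.group());                                       // protected span, untouched
            last = m.end();
        }
        out.append(convert.apply(text.substring(last))); // tail after the final span
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in "transliteration": a single toy rule, ph -> f.
        UnaryOperator<String> toy = s -> s.replace("ph", "f");
        System.out.println(convertOutsideSafe("phone <safe>xanthophyll</safe> graph", toy));
        // prints "fone <safe>xanthophyll</safe> graf"
    }
}
```

Keeping the tags in the output mirrors the current retain_tags setting, so protected words can still be found and transliterated by hand afterwards.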

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan, with significant developmental input from Turos

* @date 02/21/2012

* @version 1.9.4.1

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_9_4_1{

static boolean debug_char = false;

static boolean debug_end_e = false;

static boolean remove_illegal = true;

static boolean add_CR = true;

static boolean skip_protected = true;

static boolean retain_tags = true;

static boolean unbounded = false;

static int[] skip_array; //stores number of indexes to skip for <safe> tags

//^global booleans to turn certain parts of the program on/off

/* static String Targets = "";

static int min = 200;

static int max = 400; */

static int Count = 0;

static boolean Counting = true; //used to count number of replace operations run

/**

Program flow, 1.9.4.1:

main()

convertText()

readFile()

removeCharacters()*

periodMover()

inAlphabet()

spaceEnds()

buildSkipArray()*

safeSkip()

<Recursive>

replaceLetters()

replace()

realReplace()

<Recursive>

findReplace()

removeSkip()*

realReplace()

unSpaceEnds()

writeFile()

allowedCharacters*

* - Indicates possible call based on global boolean setting.

*/

/*

Function: main

Runs the program: asks for the input filename, writes the result to an outfile, and prints the execution time.

Parameters:

None

Returns:

void

*/

/**

* Any sequence of characters bracketed by <safe>[...]</safe> will not be touched by the program

*/

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

//putting carriage-returns back in to make it look pretty in Notepad. I can't tell what else they might do.

if(add_CR)

for(int i = 0; i<alethi.length();i++)

if(alethi.charAt(i)=='\n')

alethi = alethi.substring(0,i)+"\r"+alethi.substring(i++,alethi.length());

//writeFile(Targets,"TEMP.txt");

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

/*

Function: convertText

Turns English string into Roman-alphabet phonetic spelling

Parameters:

roman - Raw string of input file, still in roman.

Returns:

Roman-alphabet phonetic spelling of input string

*/

private static String convertText(String roman) throws IOException

{

roman = readFile(roman); //text file

if((roman.length()==1)&&(roman.charAt(0)=='&')) //invalid input, halt program

return "&";

if(remove_illegal)

roman = removeCharacters(roman);

roman = periodMover(roman);

roman = spaceEnds(roman);

if(skip_protected)

buildSkipArray(roman);

String alethi = replaceLetters(roman);

if(skip_protected){

alethi = removeSkip(alethi);

if(unbounded)

System.out.println("There is at least one unbounded '<safe>'");

}

return unSpaceEnds(alethi);

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static String readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("File not in directory or misspelled.");

return "&";

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole;

}

/*

Function: removeCharacters

Takes out non-allowed characters, replacing appropriate characters with their proper equivalent

Parameters:

body - The text to be corrected

Returns:

Character-pruned original text

*/

private static String removeCharacters(String body)

{

char[] library = new char[56];

library[0] = '\t'; //tab

library[1] = '\n';

library[2] = ' ';

library[3] = '.';

int place = 4;

for(int i = 65; i <=90; i++)

library[place++] = (char)i;

for(int i = 97; i <=122; i++)

library[place++] = (char)i;

for(int i = 0; i < body.length(); i++)

if(Arrays.binarySearch(library,body.charAt(i))<0) //I felt embarrassed by my earlier search algorithm.

if((body.charAt(i)=='?')||(body.charAt(i)=='!'))

body = body.substring(0,i)+"."+body.substring(i+1,body.length());

else if(body.charAt(i)=='-')

body = body.substring(0,i)+" "+body.substring(i+1,body.length());

else if(body.charAt(i)==(char)39) //apostrophe character

if((i>0)&&(body.charAt(i-1)=='s')) //allowing for both United States' and United States's, as an example

if((i<body.length()-1)&&(body.charAt(i+1)=='s')) //"-s's"

body = body.substring(0,i)+" A"+body.substring((i++)+2,body.length()); //" A"->"ez"

else

body = body.substring(0,i)+" A"+body.substring((i++)+1,body.length()); //"-s'"

else if((i<body.length()-1)&&(body.charAt(i+1)=='s')) //"-'s"

body = body.substring(0,i)+" B"+body.substring((i++)+2,body.length()); //" B"->"z"

else

body = body.substring(0,i)+body.substring(i--+1,body.length()); //same as normal

else if (body.charAt(i)=='<')

if(i<body.length()-("/safe>".length())) //no reason to have <safe> in very end, especially since there's always a \n

if(body.substring(i+1,i+7).equals("/safe>"))

i+=6;

else if(body.substring(i+1,i+6).equals("safe>"))

i+=5;

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

return body;

}

/*

Function: periodMover

In the Alethi alphabet, sentences start with a period '.' and don't end with anything. This models that.

Parameters:

body - Text to be manipulated

Returns:

Text with periods moved to beginning of sentences

*/

private static String periodMover(String body)

{

int start = 0;

for(int i=0;i<body.length();i++)

{

if(body.charAt(i)=='.'){

while((i<body.length())&&(body.charAt(i)=='.')) //multiples

body = body.substring(0,start)+"."+body.substring(start,i)+body.substring((i++)+1,body.length());

while(i<body.length())

if(!inAlphabet(body.charAt(i)))

i++;

else if(body.charAt(i-1)=='<') //skipping

i+=5;

else if(body.charAt(i-1)=='/') //skipping

i+=6;

else

break; //Yes, the cardinal sin.

start = i;

}

else if(body.charAt(i)=='\n')

start=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

return body;

}

/*

Function: inAlphabet

Returns whether or not a character is within the lower-case roman alphabet

Parameters:

character - char to be checked

Returns:

Boolean indicating whether or not the given char is in the lower-case roman alphabet

*/

private static boolean inAlphabet(char character){

int value = (int)character;

if((value>=97)&&(value<=122)) //just checking lowercase letters

return true;

return false;

}

/*

Function: spaceEnds

Adds 'space' buffers around periods, <safe> and </safe> tags, and endline characters to enable easier replacement of string segments at the ends of words.

Parameters:

body - Text to be manipulated

Returns:

Text with spaces added around periods, <safe> tags, and endline characters

*/

private static String spaceEnds(String body){

for(int i=0;i<body.length();i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+" "+body.substring((i++)+1,body.length());

else if(body.charAt(i)=='\n'){

body = body.substring(0,i)+" \n "+body.substring(i+1,body.length());

i+=2;

}

else if(body.charAt(i)=='>') //For skipping

body = body.substring(0,i+1)+" "+body.substring((i++)+1,body.length());

else if(body.charAt(i)=='<')

body = body.substring(0,i)+" "+body.substring((i++),body.length());

//System.out.println(body);

return body;

}

/*

Function: buildSkipArray

Sets the value of the global int[] array skip_array to store the number of indices which each successive <safe> tag indicates should be skipped.

Parameters:

body - Text to be read from, <safe> found in.

Returns:

Void. skip_array value set

*/

private static void buildSkipArray(String body){

String gradual = "";

int count = 0;

int temp;

for(int i = 0; i<=body.length()-1;i++)

if(body.charAt(i)=='<'){ //skipping

temp = safeSkip(body.substring(i+1,body.length()));

i+=temp;

gradual+=temp + ":";

count++;

}

//System.out.println(gradual);

skip_array = new int[count];

int place = 0;

for(int i = 0;i<count;i++){

temp = gradual.indexOf(':',place);

skip_array[i] = Integer.parseInt(gradual.substring(place,temp));

place = temp+1;

}

}

/*

Function: safeSkip

Returns the number of indices to be skipped until the end of a <safe>...</safe> sequence.

Parameters:

clip - Hopefully the segment of a larger body of text following directly after a '<' character.

Returns:

The number of indices until the ending '>', if it exists, the number until the end of the string otherwise.

*/

private static int safeSkip(String clip){

//assume that point just before clip was '<'

int skip = 0;

if(clip.length()>=("safe></safe>".length()))

if(clip.substring(0,5).equals("safe>"))

for(int i=5; i < (clip.length()-("</safe>".length()));i++)

if(clip.charAt(i)=='<'){

if(clip.substring(i,i+6).equals("<safe>"))

i += safeSkip(clip.substring(i+1,clip.length()));

else if(clip.substring(i,i+7).equals("</safe>")){

skip=(i+6);

break;

}

}

else if(i+1>=clip.length()-("</safe>".length())){

skip = clip.length()-1;

unbounded = true;

}

return skip;

}

/*

Function: removeSkip

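Removes the space padding that spaceEnds() added around <safe> and </safe> tags; the tags themselves are also removed unless retain_tags is true.

Parameters:

body - The text to be manipulated.

Returns:

The body with unpadded <safe> and </safe> tags, or with no tags at all if retain_tags is false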

*/

private static String removeSkip(String body){

skip_protected=false;

if(retain_tags){

body = realReplace("QQQ", body," <safe> ", "<safe>");

body = realReplace("QQQ", body," </safe> ", "</safe>"); //java didn't agree when I wanted to nest them

}

else{

body = realReplace("QQQ", body," <safe> ", "");

body = realReplace("QQQ", body," </safe> ", ""); //java didn't agree when I wanted to nest them

}

return body;

}

/*

Function: unSpaceEnds

Removes the 'space' buffers around periods, <safe> and </safe> tags, and endline characters to return text to proper formatting.

Parameters:

body - Text to be manipulated

Returns:

Text with spaces removed from around periods, <safe> tags, and endline characters

*/

private static String unSpaceEnds(String body){

for(int i=1;i<body.length()-2;i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+body.substring(i+2,body.length());

else if(body.charAt(i)=='\n')

body = body.substring(0,i-1)+"\n"+body.substring((i--)+2,body.length());

if(body.charAt(body.length()-2)=='.')

body = body.substring(0,body.length()-1);

else if(body.charAt(body.length()-2)=='\n')

body = body.substring(0,body.length()-3)+"\n";

return body.substring(1,body.length()-1); //clipping first/last '\n';;

}

/*

Function: writeFile

Writes the given string to an outfile

Parameters:

text - Text to be written.

destination - Name of outfile

Returns:

Void, outfile written to.

*/

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

/*

Function: allowedCharacters

Returns string of lines and types of characters which ought not be in the text upon output because Turos's Alethi font convention does not allow for them.

Parameters:

body - Text to be read

Returns:

String containing line numbers and types of violations of font conventions

*/

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercase versions of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++)

for(int i = 0; i<=search;i++)

if(body.charAt(i)=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)) //c, q, w, and x

library[place++] = (char)i;

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body.charAt(i)=='\n')

line++;

else if(Arrays.binarySearch(library,body.charAt(i))<0) //not in library

violations = violations + (line+":"+body.charAt(i)) + "; ";

return violations;

}

/*

Function: test

Generic function used to test odds and ends of code.

Parameters:

None

Returns:

Void

*/

public static void test()

{

String body = "\nbutler\n";

String target = "ap\n";

String sub = "op\n";

System.out.println(replace(body,target,sub));

int target_size = target.length();

int sub_size = sub.length();

String sofar = "";

int j = 2;

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

System.out.println(body);

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

/*

Function: replaceLetters

Body of program, replaces English spelling of text segments with phonetic spelling in Roman-alphabet

Parameters:

body - Text to be manipulated

Returns:

Text with Roman-alphabet phonetic spelling of English words.

*/

private static String replaceLetters(String body)

{

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//anti-

body = replace(body,".anti",".antahy");

body = replace(body,".whole",".hohl");

//wh

body = replace(body,"whose","hooz");

body = replace(body,"whom","hoom");

body = replace(body,"who\n","hoo\n");

body = replace(body,"where","huair"); //changed w to u

body = replace(body,"whir","huur");

body = replace(body,"wh","hu"); //Might need more permutations

body = replace(body,".accr",".uhkr"); //many many many

body = replace(body,".acci",".aksi");

body = replace(body,".accord",".uhkawrd");

body = replace(body,".accomp",".uhkuhmp");

body = replace(body,".acco",".uhko");

body = replace(body,".accustom\n",".uhkuhstuhm\n");

body = replace(body,".accolade\n",".akuhleyd\n");

body = replace(body,".accus",".uhkyooz");

body = replace(body,".accurs",".uhkurs");

body = replace(body,".accur",".akyer");

body = replace(body,".accum",".uhkyoom");

body = replace(body,".accout",".uhkoot");

body = replace(body,".accoun",".uhkoun");

body = replace(body,".acce",".akse"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

//E at end - Some interference possible with C's

body = replace(body,".cause",".kawz");

body = replace(body,"ause\n","awz\n");

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"have\n","hav\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

//body = replace(body,"ever\n","ever\n");

body = replace(body,"eve\n","eev\n"); //HOWEVER

body = replace(body,"eever\n","ever\n");

body = replace(body,"ile\n","ahyl\n");

//System.out.println(replace(replace("while ","wh","hu"),"ile\n","ahyl\n"));

//huahyl

body = replace(body,"gle\n","guhl\n");

body = replace(body,".key\n",".kee\n"); //special

body = realReplace("QQQ",body,".keys\n",".kees\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","keys\n");

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"Case\n","Ceys\n"); //necessary?

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,".one\n",".uuhn\n"); //special

body = replace(body,".someone\n",".suhmuuhn\n");

body = replace(body,".anyone\n",".eneeuuhn\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,".some",".suhm");

body = replace(body,"comedy","komidee");

body = replace(body,"come\n","kuhm\n"); //Need to move this up

body = replace(body,".come",".kuhm");

body = replace(body,"ome\n","ohm\n");

body = replace(body,"title\n","tahytl\n");

body = replace(body,"ttle\n","tl\n");

body = replace(body,"tle\n","tl\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,".discipline\n",".disipline\n");

body = replace(body,"cine\n","sin\n");

body = replace(body,"ine\n","ahyn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"op\n","ohp\n");

body = replace(body,"ope\n","ohp\n");

body = replace(body,"rue\n","roo\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"bead\n","beed\n");

body = replace(body,".read\n",".reed\n");

body = replace(body,"nead\n","need\n");

body = replace(body,"lead\n","leed\n");

body = replace(body,"ead\n","ed\n"); //general

body = replace(body,"ade\n","eyd\n");

//1.9.2.1

body = replace(body,"heir","air"); //general rule

body = replace(body,"eir\n","er\n");

//this one's touchy, I'm just throwing in "air" exemptions to the "eer" rule where I see them

body = replace(body,"where\n","hwair\n");

body = replace(body,".ere\n",".air\n");

body = replace(body,"there\n","thair\n");

body = replace(body,"sphere\n","sfeer\n");

body = realReplace("QQQ",body,".here\n",".heer\n");

body = realReplace("QQQ",body,".were\n",".wur\n");

body = replace(body,"sier\n","seer\n");

body = replace(body,"shier\n","Seer\n");

body = replace(body,"Sier\n","Seer\n");

body = replace(body,"cier\n","seer\n");

body = replace(body,".premiere\n",".primeer\n");

body = replace(body,"iere\n","yair\n");

body = replace(body,"soldier","sohljer");

body = replace(body,"iere\n","yair\n");

body = replace(body,".persevere\n",".pursuhveer\n");

body = replace(body,".revere\n",".riveer\n");

body = replace(body,"cere\n","seer\n");

body = replace(body,".interfere\n",".interfeer\n");

body = replace(body,"mmere","M");

body = replace(body,"mere\n","meer\n");

body = replace(body,"M","mmere");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"tire","tahyuhr"); //NOT \n or e

body = replace(body,"aire\n","air\n");

//body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","ahrj\n");

body = replace(body,"orge\n","wrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"sce\n","es\n");

body = replace(body,"que\n","k\n");

body = replace(body,"udge\n","uhj\n");

body = replace(body,"dge\n","j\n"); //NOT sure

body = replace(body,"age\n","aij\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"rgue\n","rgyoo\n");

body = replace(body,"gue\n","eeg\n");

//ible, might need to generalize downtown

body = replace(body,"ible\n","uhbuhl\n");

//-nge

//problem with sing, singer vs singe, singer not really being separable at the gerund-testing level

body = replace(body,"finger\n","fingger\n");

body = replace(body,"linger\n","lingger\n");

body = replace(body,"finger","fingger");

body = replace(body,"linger","lingger");

body = replace(body,".anger\n",".angger\n");

body = replace(body,".angry\n",".angree\n");//?

//body = realReplace("",body,"ringe\n","rinj\n"); //This is the best I can do for now.

body = replace(body,".cringe\n",".krinj\n");

body = replace(body,".fringe\n",".frinj\n");

body = replace(body,".constringe\n",".kuhnstrinj\n");

body = replace(body,".astringe\n",".uhstrinj\n");

body = replace(body,".infringe\n",".infrinj\n");

body = realReplace("R",body,"hinge\n","hinj\n");

body = realReplace("R",body,".impinge\n",".impinj\n");

body = realReplace("R",body,"winge\n","winj\n");

body = realReplace("R",body,".binge\n",".binj\n");

body = realReplace("",body,".tinge\n",".tinj\n");

body = realReplace("",body,".dinge\n",".dinj\n");

body = realReplace("QQQ",body,".singe\n",".sinj\n");

body = realReplace("QQQ",body,".singed\n",".sinjed\n");

body = realReplace("QQQ",body,".singeing\n",".sinjing\n");

body = realReplace("g",body,"inging\n","D\n"); //temporary

body = replace(body,"ing\n","I\n"); //temporary

body = replace(body,"nge\n","nj\n");

body = replace(body,"I","ing");

body = replace(body,"D","inging");

//END E's

//s at end - 1.7.4.5 -> unneeded, I think

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z convertor to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that higher number of characters in target gets preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"preface\n","prefis\n"); //special

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,".tacit\n",".tasit\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"cate\n","kit\n");

body = replace(body,"vate\n","vit\n"); //pulled from E section, might be a sign of things to come

body = replace(body,"literate\n","literit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

body = replace(body,"cist\n","sist\n");

body = replace(body,"duce\n","doos\n");

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

body = replace(body,"ct","kt"); //factual

body = replace(body,"tual\n","Cual\n");

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,"ierce\n","eers\n");

body = replace(body,"ince\n","ins\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance\n","Cahns\n");

body = replace(body,"cance\n","kahns\n");

body = replace(body,"lance\n","lahns\n");

body = replace(body,"vance\n","vahns\n");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"all\n","awl\n");

body = realReplace("QQQ",body,".supplement\n",".suhpluhment\n"); //special case

body = replace(body,".supp",".suhpp"); //just a general rule

body = replace(body,"ape\n","eYp\n");

body = replace(body,"appa","apuh");

body = replace(body,".appear",".uhpeer");

body = replace(body,"ppen","pen"); //double p's, might NOT be done

body = replace(body,"pplet\n","plit\n");

body = replace(body,"pple\n","puhl\n");

body = replace(body,"ppl","puhl");

body = replace(body,"upp\n","uhp\n");

body = replace(body,"oppor","oper");

body = replace(body,".opp",".ohp");

body = replace(body,".op",".ohp");

body = replace(body,"opp","uhp");

body = replace(body,"ypp","ip");

body = replace(body,"pp","p"); //Last ditch, should cover most before this

body = replace(body,"tice\n","tis\n");

body = replace(body,"arice\n","eris\n");

body = replace(body,"orice\n","uhis\n");

body = replace(body,"cipice\n","suhpis\n"); //patch for precipice

body = replace(body,"ipice\n","uhpis\n");

body = replace(body,".vice\n",".vahys\n");

body = replace(body,"vice\n","vis\n");

body = replace(body,"ice\n","ahys\n"); //Long S. NOT sure about \n's

body = replace(body,"egy\n","ijee\n"); //possibilities/strategies fix, I have no idea how they ended up "kiez"

body = replace(body,"city\n","sitee\n");

body = replace(body,"cite\n","sahyt\n");

body = replace(body,"ity\n","itee\n");

body = replace(body,"ite\n","ahyt\n");

body = replace(body,"irst\n","urst\n");

body = replace(body,"ong\n","ong\n");

body = replace(body,"ull\n","ool\n");

body = replace(body,"cide\n","sahyd\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ence\n","ens\n");

body = replace(body,"rend\n","rend\n");

//1.8.9 Pie-

body = replace(body,"piety","pahyitee");

body = replace(body,".pier\n"," peer\n");

body = replace(body,".pie\n"," pahy\n");

body = replace(body,".pie",".pee");

body = replace(body,"ces\n","seez\n");

body = replace(body,"cez\n","seez\n"); //Incase of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"ci\n","sahy\n");

body = replace(body,"gan\n","gahn\n");

body = replace(body,"dle\n","dl\n");

body = replace(body,"align\n","uhlahyn\n");

body = replace(body,"oy\n","oi\n");

body = replace(body,"ace\n","eys\n");

body = replace(body,".ass\n",".as\n");

body = replace(body,".ass",".uhs"); //Assoc-

body = replace(body,".rely\n",".relahy\n");

body = replace(body,"ely\n","lee\n"); //MUST BE LAST IN \N

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"ply\n","plahy\n");

body = replace(body,".by\n",".bahy\n");

body = replace(body,".my\n",".mahy\n");

body = replace(body,".die\n",".dahy\n");

body = replace(body,".dye\n",".dahy\n");

body = replace(body,".bye\n",".bahy\n"); //conflict

body = replace(body,"hype","hahype");

body = replace(body,"hypo","hahypo");

body = replace(body,"hypn","hipn");

body = replace(body,"hyphen","hahyfuhn");

body = replace(body,"hyfen","hahyfuhn"); //ph->f

body = replace(body,"yp","ip");

body = replace(body,"eYp","eyp"); //see ape->eyp

body = replace(body,"duct","duhkt");

body = replace(body,"stion","sCuhn"); //1.8.9.4

body = replace(body,"tion","Suhn"); //1.8

body = replace(body,"ssion","Suhn"); //1.8.6

body = replace(body,"sion","zhuhn");

body = replace(body,"cean","Suhn");

body = replace(body,".abou",".uhbou");

body = replace(body,".abandon",".uhbanduhn"); //match the whole stem so "on" is not left behind

body = replace(body,"ture","Cur");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"iew","yoo");

body = replace(body,".face",".feys");

body = replace(body,"face","feys");

body = replace(body,"indict","indahyt");

//For-

body = replace(body,".fore",".fohr");

body = replace(body,".for",".fohr");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweyeynt");

body = replace(body,"cing","sing");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ight","ahyt");

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,".def",".dihf");

body = replace(body,"cif","sif"); //NOT sure

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","SE");

body = replace(body,"SEyp","skeyp");

body = replace(body,"SE","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

body = replace(body,"it\n","it\n"); //Tacked on for suffix reasons

body = replace(body,"ap\n","ap\n");

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

/* body = realReplace("QQQ",body,"C\n","k\n");

body = realReplace("QQQ",body,"ch\n","k\n"); */

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = replace(body,"ce","se"); //might want to move

body = realReplace("QQQ",body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = realReplace("QQQ",body,"c\n","k\n"); //to stop mischief

//END C'S

body = replace(body,".odyssey\n",".oduhsee\n"); //special

body = replace(body,"sey\n","zee\n");

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

body = replace(body,".be\n",".bee\n");

body = replace(body,".maybe\n",".meybee\n");

//rom

body = realReplace("QQQ",body,".roman\n",".rohmahn\n"); //might want to generalize "-an" suffix

body = replace(body,"rom","rohm");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"igh","ahy");

body = replace(body,".enough\n",".ihnuhf\n"); //special case

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = realReplace("QQQ",body,"q\n","k\n");

//w at end

body = replace(body,".low\n",".loh\n");//special cases

body = replace(body,".row\n",".roh\n");

body = replace(body,".tow\n",".toh\n");

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

body = replace(body,"pie\n","pahy\n"); // NOT normal, aka special

body = realReplace("qqq",body,".or",".awr");

body = replace(body,".sky",".skahy");

body = replace(body,".fly",".flahy");

body = replace(body,".ally\n",".alahy\n");

body = realReplace("qqq",body,"y\n","ee\n");

body = realReplace("qqq",body,"ehee\n","ehy\n");

body = realReplace("qqq",body,"ahee\n","ahy\n");

body = realReplace("qqq",body,"eee\n","ey\n"); //fixing issues raised by y->ee as compared to other phonetics

body = realReplace("qqq",body,"iest\n","eeest\n");

body = replace(body,"izen","uhzen");

body = replace(body,"ize","ahz");

body = replace(body,"able","uhbuhl");

body = replace(body,"ably","uhblee"); //Last sweep

String[] temp = {"en","st","un","c","f","g","s","t"};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

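for(int i = 0; i<temp.length;i++)

if(temp[i].equals("c")) //compare the current prefix, not the array itself

body = replace(body,"kable\n","keybuhl\n"); //c has already become k by this point

else

body = replace(body,temp[i]+"able\n",temp[i]+"eybuhl\n"); //keep the prefix in the output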

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"ble\n","buhl\n");

//x's

body = replace(body,".xy",".zi");

body = replace(body,"xious","kSuhs");

//apostrophe possessive replacement, see removeCharacters()

body = replace(body," A","ez");

body = replace(body," B","z");

//General fixer for suffixes

//body = replace(body,"\n","\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling sometimes.

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"qu","ku");

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

body = replace(body,"eQ\n","ee\n");

body = replace(body,"Qy\n","ey\n");

body = replace(body,".hQ",".he");

body = replace(body,".shQ",".she");

}

return body;

}

/*

Function: replace

Buffer function for realReplace, adds on an empty string for generic case

Parameters:

body - Text to be searched/replaced

target - Text to be replaced

sub - Text to replace target

Returns:

Original text with target replaced by sub by realReplace

See Also:

<realReplace>

*/

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}

/*

Function: realReplace

Permutes (hopefully) all expected suffixes to replace a given string with a substitute string

Parameters:

sofar - Shorthand listing of the suffixes which have been added to the original target/sub combination up to this point. "QQQ" and "qqq" are used to denote a desire not to permute target/sub suffixes at all.

body - Text to be searched/replaced

target - Text to be replaced

sub - Text to replace target

Returns:

Text with every occurrence of target, and of each suffix permutation generated from it, replaced by the matching form of sub

*/

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

//As of 1.8.8.1, '.' and '\n' are only codes for ' '. Spaces will be added before and after every \n, as well as after every period, then removed at the end.

//'.'==' '

if(target.startsWith("."))

return realReplace(sofar, body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub_size)));

else if(target.endsWith("\n"))

return realReplace(sofar, body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub_size-1)+" ")); //space substitution

/* if((min<Count++)&&(max>Count))

Targets+= target+"_"; */

if(Counting)

{

Count++;

if(target.equals("w"))

System.out.println("Replaces Run: "+Count);

}

if(target.endsWith(" "))

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

/* if(target.equals(" lingered "))

System.out.println(target); */

//I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if((!sofar.contains("z"))&&(!sofar.contains("l"))&&(!sofar.contains("t"))){

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z "));

else if((sub_size>=2)&&(sub.charAt(sub_size-2)=='y'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"ez ")); //s->z

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if(((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))||((sub_size>=3)&&(sub.substring(sub_size-3,sub_size-1).equals("hy")))) //"hy" sits just before the trailing space

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"z "));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"iez ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

/* //y

body = realReplace("qqq",body,"ay ","ey "); //stopgap, might want to revisit

body = replace(body,"ey ","ey ");

body = realReplace("qqq",body,"oy ","oi ");

body = realReplace("qqq",body,"uy ","ahy ");

body = realReplace("qqq",body,"y ","ee "); //might need generalized in replace()

body = replace(body,"ty","tahy"); */

//ly, focus on y as of 1.7.4.3 - It might need some work

if(target.equals("sly ")) //special case

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else{

//ly

if((target_size>=5)&&(target.substring(target_size-5,target_size-1).equals("able")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"y "),(sub.substring(0,sub_size-4)+"lee ")); //ably

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if((sub_size>=3)&&(sub.substring(sub_size-3,sub_size-1).equals("ee")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-3)+"ily "),(sub.substring(0,sub_size-3)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"ily "),(sub.substring(0,sub_size-2)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"pily "),(sub.substring(0,sub_size-1)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"tily "),(sub.substring(0,sub_size-1)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

//y

if((target_size>=2)&&(target.charAt(target_size-2)=='a')) //might need work

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ey "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"y "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"i "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='u'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ahy "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"py "),(sub.substring(0,sub_size-1)+"ee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"ty "),(sub.substring(0,sub_size-1)+"ee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee ")); //might not be needed

}

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

//ing, gerunds

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-3)+"ying "),(sub.substring(0,sub_size-1)+"ing ")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){ //experiment

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ring "),(sub.substring(0,sub_size-1)+"ring ")); //rr

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //have to do both, sadly

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ping "),(sub.substring(0,sub_size-1)+"ing "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ting "),(sub.substring(0,sub_size-1)+"ing "));

else

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"pish "),(sub.substring(0,sub_size-1)+"ish "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"tish "),(sub.substring(0,sub_size-1)+"ish "));

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"ish "),(sub.substring(0,sub_size-1)+"ish "));

if(!sofar.contains("a")) //able

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"table "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){//experiment

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"rable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(target.trim().equals("fly")||target.trim().equals("unfly")) //target carries its boundary spaces here, so trim before comparing

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"eybuhl "));

//1.9

//ize

if(!sofar.contains("x"))

if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"x",body,(target.substring(0,target_size-2)+"ize "),(sub.substring(0,sub_size-1)+"ahyz ")); //removing 'e'

else

body = realReplace(sofar+"x",body,(target.substring(0,target_size-1)+"ize "),(sub.substring(0,sub_size-1)+"ahyz "));

//est - was iest before 1.9.1.1

if((!sofar.contains("t")))

if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"t",body,(target.substring(0,target_size-2)+"iest "),(sub.substring(0,sub_size-1)+"eeest ")); //removing 'y'

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"t",body,(target.substring(0,target_size-2)+"est "),(sub.substring(0,sub_size-1)+"est "));

else

body = realReplace(sofar+"t",body,(target.substring(0,target_size-1)+"est "),(sub.substring(0,sub_size-1)+"est "));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(target.charAt(target_size-2)=='e')

if((target_size>=3)&&(target.charAt(target_size-3)=='c'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"st "));

else

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed ")); //NOT st

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

else if(target.charAt(target_size-2)=='r'){//experiment

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"red "),(sub.substring(0,sub_size-1)+"d "));

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"d "));

}

else if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("se")))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ped "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ted "),(sub.substring(0,sub_size-1)+"ed "));

else if((target.charAt(target_size-2)!='s')||((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

//er

if((!sofar.contains("r"))&&(!sofar.contains("R"))) //inge special

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r "),(sub.substring(0,sub_size-1)+"er ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"per "),(sub.substring(0,sub_size-1)+"er "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){ //experiment

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"rer "),(sub.substring(0,sub_size-1)+"rer "));

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"ter "),(sub.substring(0,sub_size-1)+"er "));

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

/* //ate, not bothering with forbiddances - Never mind

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n")); */

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a, r

//r is forbidden by s, l, g, d

//y-y

//Not messing with forbidding now (1.8.8.2)

//x-ized, t-iest, t forbids all, don't care about anything else right now

//I think that forbiddance is total - no forbidden suffixes at any point before

}

}

return findReplace(body,target,target_size,sub,sub_size);

}

/*

Function: findReplace

Bog standard search/replace function for a given string and a given pair of target/substitute. Skips over <safe> tags if appropriate.

Parameters:

body - Text to be searched/replaced

target - Text to be replaced

target_size - Precalculated length of target string

sub - Text to replace target

sub_size - Precalculated length of sub string

Returns:

Text with every occurrence of target replaced by sub

*/

private static String findReplace(String body, String target, int target_size, String sub, int sub_size){

int safe_count = 0;

for(int i = 0; i<=body.length()-target_size;i++){

for(int j = 0; j <target_size; j++)

if(body.charAt(i+j)!=target.charAt(j))

break; //Once more unto the break

else if(j+1>=target_size){

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

if(skip_protected)

if(body.charAt(i)=='<') //skipping

i+=skip_array[safe_count++];

}

return body;

}

}

Edited by Kurkistan

  • 1 month later...

Very cool!

Though... I'm wondering if maybe you're taking the transliterating a little too far? I haven't actually run your code, but from looking over the spoiler blocks, it seems like you might be doing a lot of unnecessary replacements, particularly when it comes to vowels.

There was a quote from Peter Ahlstrom on the old forum: "The person who translated these books into English treated certain art pages different ways in order to tailor it to the English-speaking audience. The Navani pages are meant to give a flavor for what the writing looks like, yet still be something readers can figure out and understand."

Unless there is other information outside of the books that I've missed (which is entirely possible -- I haven't kept up with the forums much), I don't think it's really possible to transliterate English into *actual Alethi*, only the pseudo-Alethi that's used in the WOK artwork. So, if your goal is to produce Alethi script consistent with what we have in WOK, I don't really think you need to convert English into full phonetic notation before applying the Alethi script font. Apart from the trickiness with c > s/k, I think all you really need are some basic character-level replacements.

The examples we have of pseudo-Alethi (the excerpts from Navani's Notebook on pp. 762 & 856) follow English orthography pretty straightforwardly, as I recall -- apart from a few specific exceptions at the character level:

Th > /θ/

CH > /tʃ/

C > /k/ or /s/

W > /ʌ/

X > /ks/

I forget if there's anything canon regarding Qu, but I think it should realize as either /k/ or /kʌ/

For example, the device on p. 762 is labelled just as it would be spelled in English ("Pain Knife"), not as it would be pronounced (/pān nīf/). Furthermore, the English word "Joy" (/dʒoi/) is written using distinct Alethi characters for "J" and "Y" even though that combination of letters would be pronounced in modern Alethi the same as "Yoy" (/joi/).
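The character-level approach described above could be sketched roughly as follows. This is only an illustration, not anything from the posted program: the class name CharLevelSketch and the rule table are hypothetical, and the output symbols T, C, and X are borrowed from the capital-letter placeholders the posted transliterator uses for th, ch, and x. The c > s/k case is deliberately left alone, since it needs context.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CharLevelSketch {
    // Hypothetical rule table; insertion order matters, so digraphs
    // (th, ch, qu) are handled before single letters (x, w).
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("th", "T");  // single glyph for /θ/
        RULES.put("ch", "C");  // single glyph for /tʃ/
        RULES.put("qu", "ku"); // assumption: Qu realized as k + u
        RULES.put("x", "X");   // compound character for ks
        RULES.put("w", "u");   // W written with the u glyph
    }

    // Applies every rule in order to the lower-cased input.
    // Plain 'c' is left untouched on purpose.
    static String convert(String english) {
        String out = english.toLowerCase();
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            out = out.replace(rule.getKey(), rule.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(convert("Pain Knife")); // pain knife
        System.out.println(convert("The watch"));  // Te uatC
    }
}
```

Note that "Pain Knife" passes through unchanged apart from case, which matches the spelled-not-pronounced labelling on p. 762.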

But regardless, I'm impressed with what you folks have done!


@harakeke

A fair enough criticism. We do know the phonetic sounds of many of the letters, though, so I think it's acceptable to try to phonetically spell words instead of simply going for vague approximations.

I wasn't actually aware that I was putting more work into a true transliteration than the artist (Inkthinker, I think), but I see no reason to stop now that my evil plan has progressed so far!

As for over-replacing vowels--having a different line for essentially the same block of text except for needing to check which vowels precede or follow it--that's because I fail regex forever, which JoeST will hopefully fix as soon as we can get a JavaScript implementation off the ground.
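As a rough idea of what a regex version might look like, here is a minimal sketch that folds several of the "c before this letter" lines into two patterns using lookahead. It is an assumption-laden simplification (the class SoftHardCSketch and its method are made up for illustration), and it ignores the special cases such as "cial" or "cious" that the full grammar list handles.

```java
import java.util.regex.Pattern;

public class SoftHardCSketch {
    // Soft c: a 'c' directly followed by e, i, or y.
    private static final Pattern SOFT_C = Pattern.compile("c(?=[eiy])");
    // Hard c: any remaining 'c' that does not start "ch"
    // (ch is left alone so it can become its own glyph later).
    private static final Pattern HARD_C = Pattern.compile("c(?!h)");

    // Replaces soft c with s first, then every other non-"ch" c with k.
    static String convertC(String word) {
        String out = SOFT_C.matcher(word).replaceAll("s");
        return HARD_C.matcher(out).replaceAll("k");
    }

    public static void main(String[] args) {
        System.out.println(convertC("cactus")); // kaktus
        System.out.println(convertC("center")); // senter
        System.out.println(convertC("cycle"));  // sykle
        System.out.println(convertC("cheese")); // cheese
    }
}
```

The lookahead checks the following letter without consuming it, so one pattern stands in for a separate line per following vowel.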

Overall, thank you for your enthusiasm. It's been a bit sad and lonely around here for a while now. Also, welcome to the forums! You might want to head over to the Introduction section to get your cookie/waffle. Watch out for spikes, though!

EDIT: Harmony's forearms! I hadn't realized until now (a month later) that you're the guy who deciphered the text originally. Great work on that!

Also, I didn't mean anything negative when I said I was working harder on a "true transliteration" than Inkthinker. Reading over it now, it sounds harsh and a bit dismissive. I guess I've gotten a smidge caught up in the details of the transliteration, to the point where the thought of reverting to a "nuts and bolts" transliteration is near-sacrilege. ;)

Edited by Kurkistan

  • 4 weeks later...

This has been a fairly long break, hasn't it? New version up: I finally (yes, I am lazy) implemented a simple efficiency boost to the search algorithm, and threw in a few odd grammars and fixes to existing grammars.

Big news of the day is that I've implemented a <safe>[...]</safe> tag that protects the code within the tag from being touched by the transliteration aspect of the program. This way, particularly tricky words or proper names can be cordoned off and search-replaced manually. It's currently set to leave the tags in the final text, where they can be easily found and removed after manual transliteration.

Ex.

"<safe>Wow, Xanthophyll is not necessarily the most transmorgraphical name to pronounce, is it?</safe>" becomes:

"<safe>.wow xanthophyll is not necessarily the most transmorgraphical name to pronounce is it</safe>"
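For anyone who wants to see the idea in isolation, the sketch below shows one way to leave <safe> regions untouched while transforming everything else. It is not how the posted program does it (that one precomputes a skip_array of index offsets and still lower-cases the protected text); this version swaps in a regex split instead, and the class SafeTagSketch and its method are hypothetical names.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SafeTagSketch {
    // Non-greedy match so each <safe>...</safe> pair is found separately;
    // DOTALL lets a protected region span several lines.
    private static final Pattern SAFE =
            Pattern.compile("<safe>.*?</safe>", Pattern.DOTALL);

    // Applies 'transform' only to the text outside <safe> regions and
    // copies the protected regions (tags included) through unchanged.
    static String transformOutsideSafe(String text, UnaryOperator<String> transform) {
        StringBuilder out = new StringBuilder();
        Matcher m = SAFE.matcher(text);
        int last = 0;
        while (m.find()) {
            out.append(transform.apply(text.substring(last, m.start())));
            out.append(m.group());
            last = m.end();
        }
        out.append(transform.apply(text.substring(last)));
        return out.toString();
    }

    public static void main(String[] args) {
        String in = "the wall <safe>Shallan</safe> was tall";
        // Stand-in rule for illustration: all -> awl
        System.out.println(transformOutsideSafe(in, s -> s.replace("all", "awl")));
        // the wawl <safe>Shallan</safe> was tawl
    }
}
```

In this sketch an opening <safe> with no closing tag simply fails to match, so that text is transformed like any other.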

EDIT:

Added in grammar for "indict", fixed some inefficiencies in how the <safe> tag was handled, and threw in some documentation and a rudimentary program flow for the benefit of Joe ST.

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan, with significant developmental input from Turos

* @date 02/21/2012

* @version 1.9.4.1

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_9_4_1{

static boolean debug_char = false;

static boolean debug_end_e = false;

static boolean remove_illegal = true;

static boolean add_CR = true;

static boolean skip_protected = true;

static boolean retain_tags = true;

static boolean unbounded = false;

static int[] skip_array; //stores number of indexes to skip for <safe> tags

//^global booleans to turn certain parts of the program on/off

/* static String Targets = "";

static int min = 200;

static int max = 400; */

static int Count = 0;

static boolean Counting = true; //used to count number of replace operations run

/**

Program flow, 1.9.4.1:

main()

convertText()

readFile()

removeCharacters()*

periodMover()

inAlphabet()

spaceEnds()

buildSkipArray()*

safeSkip()

<Recursive>

replaceLetters()

replace()

realReplace()

<Recursive>

findReplace()

removeSkip()*

realReplace()

unSpaceEnds()

writeFile()

allowedCharacters*

* - Indicates possible call based on global boolean setting.

*/

/*

Function: main

Runs program: Asks for filename of input and writes to outfile, as well as printing out execution time run

Parameters:

None

Returns:

void

*/

/**

* Any sequence of characters bracketed by <safe>[...]</safe> will not be touched by the program

*/

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

//putting carriage-returns back in to make it look pretty in Notepad. I can't tell what else they might do.

if(add_CR)

for(int i = 0; i<alethi.length();i++)

if(alethi.charAt(i)=='\n')

alethi = alethi.substring(0,i)+"\r"+alethi.substring(i++,alethi.length());

//writeFile(Targets,"TEMP.txt");

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

/*

Function: convertText

Turns English string into Roman-alphabet phonetic spelling

Parameters:

roman - Raw string of input file, still in roman.

Returns:

Roman-alphabet phonetic spelling of input string

*/

private static String convertText(String roman) throws IOException

{

roman = readFile(roman); //text file

if((roman.length()==1)&&(roman.charAt(0)=='&')) //invalid input, halt program

return "&";

if(remove_illegal)

roman = removeCharacters(roman);

roman = periodMover(roman);

roman = spaceEnds(roman);

if(skip_protected)

buildSkipArray(roman);

String alethi = replaceLetters(roman);

if(skip_protected){

alethi = removeSkip(alethi);

if(unbounded)

System.out.println("There is at least one unbounded '<safe>'");

}

return unSpaceEnds(alethi);

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static String readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("File not in directory or misspelled.");

return "&";

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole;

}
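
/*

Sketch only, not part of the program flow above: one possible way to do the same read with a StringBuilder and try-with-resources, so the reader is closed even on failure. It assumes Java 8 or later. The names ReadSketch and readAll are hypothetical, and it takes a Reader rather than a filename so it can be tried without a file on disk. Unlike readFile() it does not return "&" on failure; it rethrows instead.

*/

static final class ReadSketch {

    // Lower-cases every line and frames the text with a leading and trailing '\n',
    // which is the shape readFile() hands to the rest of the program.
    static String readAll(java.io.Reader source) {
        StringBuilder whole = new StringBuilder("\n");
        try (java.io.BufferedReader in = new java.io.BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                whole.append(line.toLowerCase()).append('\n');
            }
        } catch (java.io.IOException e) {
            throw new java.io.UncheckedIOException(e);
        }
        return whole.toString();
    }
}

//Possible use: ReadSketch.readAll(new java.io.FileReader("Test.txt"))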

/*

Function: removeCharacters

Takes out non-allowed characters, replacing appropriate characters with their proper equivalent

Parameters:

body - The text to be corrected

Returns:

Character-pruned original text

*/

private static String removeCharacters(String body)

{

char[] library = new char[56];

library[0] = '\t'; //tab

library[1] = '\n';

library[2] = ' ';

library[3] = '.';

int place = 4;

for(int i = 65; i <=90; i++)

library[place++] = (char)i;

for(int i = 97; i <=122; i++)

library[place++] = (char)i;

for(int i = 0; i < body.length(); i++)

if(Arrays.binarySearch(library,body.charAt(i))<0) //I felt embarrassed by my earlier search algorithm.

if((body.charAt(i)=='?')||(body.charAt(i)=='!'))

body = body.substring(0,i)+"."+body.substring(i+1,body.length());

else if(body.charAt(i)=='-')

body = body.substring(0,i)+" "+body.substring(i+1,body.length());

else if(body.charAt(i)==(char)39) //apostrophe character

if((i>0)&&(body.charAt(i-1)=='s')) //allowing for both United States' and United States's, as an example

if((i<body.length()-1)&&(body.charAt(i+1)=='s')) //"-s's"

body = body.substring(0,i)+" A"+body.substring((i++)+2,body.length()); //" A"->"ez"

else

body = body.substring(0,i)+" A"+body.substring((i++)+1,body.length()); //"-s'"

else if((i<body.length()-1)&&(body.charAt(i+1)=='s')) //"-'s"

body = body.substring(0,i)+" B"+body.substring((i++)+2,body.length()); //" B"->"z"

else

body = body.substring(0,i)+body.substring(i--+1,body.length()); //same as normal

else if (body.charAt(i)=='<')

if(i<body.length()-("/safe>".length())) //no reason to have <safe> in very end, especially since there's always a \n

if(body.substring(i+1,i+7).equals("/safe>"))

i+=6;

else if(body.substring(i+1,i+6).equals("safe>"))

i+=5;

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

return body;

}

/*

Function: periodMover

In the Alethi alphabet, sentences start with a period '.' and don't end with anything. This models that.

Parameters:

body - Text to be manipulated

Returns:

Text with periods moved to beginning of sentences

*/

private static String periodMover(String body)

{

int start = 0;

for(int i=0;i<body.length();i++)

{

if(body.charAt(i)=='.'){

while((i<body.length())&&(body.charAt(i)=='.')) //multiples

body = body.substring(0,start)+"."+body.substring(start,i)+body.substring((i++)+1,body.length());

while(i<body.length())

if(!inAlphabet(body.charAt(i)))

i++;

else if(body.charAt(i-1)=='<') //skipping

i+=5;

else if(body.charAt(i-1)=='/') //skipping

i+=6;

else

break; //Yes, the cardinal sin.

start = i;

}

else if(body.charAt(i)=='\n')

start=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

return body;

}
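
/*

Sketch only: a simplified illustration of the period convention described above (a sentence starts with '.' and ends with nothing). It is not a replacement for periodMover(): it assumes single periods, ignores <safe> tags, and the names PeriodSketch and movePeriods are hypothetical.

*/

static final class PeriodSketch {

    static String movePeriods(String text) {
        StringBuilder out = new StringBuilder();
        StringBuilder sentence = new StringBuilder(); // text since the last period or line break
        for (char c : text.toCharArray()) {
            if (c == '.') {
                // Keep any leading spaces in front of the period, then emit the sentence.
                int first = 0;
                while (first < sentence.length() && !Character.isLetter(sentence.charAt(first))) {
                    first++;
                }
                out.append(sentence, 0, first).append('.').append(sentence, first, sentence.length());
                sentence.setLength(0);
            } else if (c == '\n') {
                // A line break ends the sentence without adding a period (headers stay bare).
                out.append(sentence).append('\n');
                sentence.setLength(0);
            } else {
                sentence.append(c);
            }
        }
        return out.append(sentence).toString();
    }
}

//Example: movePeriods("hello world. bye now.\n") gives ".hello world .bye now\n"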

/*

Function: inAlphabet

Returns whether or not a character is within the lower-case roman alphabet

Parameters:

character - char to be checked

Returns:

Boolean indicating whether or not the given char is in the lower-case roman alphabet

*/

private static boolean inAlphabet(char character){

int value = (int)character;

if((value>=97)&&(value<=122)) //just checking lowercase letters

return true;

return false;

}

/*

Function: spaceEnds

Adds 'space' buffers around periods, <safe> and </safe> tags, and endline characters to enable easier replacement of string segments at the ends of words.

Parameters:

body - Text to be manipulated

Returns:

Text with spaces added around periods, <safe> tags, and endline characters

*/

private static String spaceEnds(String body){

for(int i=0;i<body.length();i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+" "+body.substring((i++)+1,body.length());

else if(body.charAt(i)=='\n'){

body = body.substring(0,i)+" \n "+body.substring(i+1,body.length());

i+=2;

}

else if(body.charAt(i)=='>') //For skipping

body = body.substring(0,i+1)+" "+body.substring((i++)+1,body.length());

else if(body.charAt(i)=='<')

body = body.substring(0,i)+" "+body.substring((i++),body.length());

//System.out.println(body);

return body;

}

/*

Function: buildSkipArray

Sets the value of the global int[] array skip_array to store the number of indices which each successive <safe> tag indicates should be skipped.

Parameters:

body - Text to be read from, <safe> found in.

Returns:

Void. skip_array value set

*/

private static void buildSkipArray(String body){

String gradual = "";

int count = 0;

int temp;

for(int i = 0; i<=body.length()-1;i++)

if(body.charAt(i)=='<'){ //skipping

temp = safeSkip(body.substring(i+1,body.length()));

i+=temp;

gradual+=temp + ":";

count++;

}

//System.out.println(gradual);

skip_array = new int[count];

int place = 0;

for(int i = 0;i<count;i++){

temp = gradual.indexOf(':',place);

skip_array[i] = Integer.parseInt(gradual.substring(place,temp));

place = temp+1;

}

}

/*

Function: safeSkip

Returns the number of indices to be skipped until the end of a <safe>...</safe> sequence.

Parameters:

clip - Hopefully the segment of a larger body of text following directly after a '<' character.

Returns:

The number of indices until the ending '>', if it exists, the number until the end of the string otherwise.

*/

private static int safeSkip(String clip){

//assume that point just before clip was '<'

int skip = 0;

if(clip.length()>=("safe></safe>".length()))

if(clip.substring(0,5).equals("safe>"))

for(int i=5; i < (clip.length()-("</safe>".length()));i++)

if(clip.charAt(i)=='<'){

if(clip.substring(i,i+6).equals("<safe>"))

i += safeSkip(clip.substring(i+1,clip.length()));

else if(clip.substring(i,i+7).equals("</safe>")){

skip=(i+6);

break;

}

}

else if(i+1>=clip.length()-("</safe>".length())){

skip = clip.length()-1;

unbounded = true;

}

return skip;

}

/*

Function: removeSkip

Removes all <safe> and </safe> tags from the text

Parameters:

body - The text to be manipulated.

Returns:

The body without any <safe> or </safe> tags

*/

private static String removeSkip(String body){

skip_protected=false;

if(retain_tags){

body = realReplace("QQQ", body," <safe> ", "<safe>");

body = realReplace("QQQ", body," </safe> ", "</safe>"); //java didn't agree when I wanted to nest them

}

else{

body = realReplace("QQQ", body," <safe> ", "");

body = realReplace("QQQ", body," </safe> ", ""); //java didn't agree when I wanted to nest them

}

return body;

}

/*

Function: unSpaceEnds

Removes the 'space' buffers around periods, <safe> and </safe> tags, and endline characters to return text to proper formatting.

Parameters:

body - Text to be manipulated

Returns:

Text with spaces removed from around periods, <safe> tags, and endline characters

*/

private static String unSpaceEnds(String body){

for(int i=1;i<body.length()-2;i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+body.substring(i+2,body.length());

else if(body.charAt(i)=='\n')

body = body.substring(0,i-1)+"\n"+body.substring((i--)+2,body.length());

if(body.charAt(body.length()-2)=='.')

body = body.substring(0,body.length()-1);

else if(body.charAt(body.length()-2)=='\n')

body = body.substring(0,body.length()-3)+"\n";

return body.substring(1,body.length()-1); //clipping first/last '\n';;

}

/*

Function: writeFile

Writes the given string to an outfile

Parameters:

text - Text to be written.

destination - Name of outfile

Returns:

Void, outfile written to.

*/

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

/*

Function: allowedCharacters

Returns string of lines and types of characters which ought not be in the text upon output because Turos's Alethi font convention does not allow for them.

Parameters:

body - Text to be read

Returns:

String containing line numbers and types of violations of font conventions

*/

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercase cases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++)

for(int i = 0; i<=search;i++)

if(body.charAt(i)=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)) //c, q, w, and x

library[place++] = (char)i;

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body.charAt(i)=='\n')

line++;

else if(Arrays.binarySearch(library,body.charAt(i))<0) //not in library

violations = violations + (line+":"+body.charAt(i)) + "; ";

return violations;

}
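
/*

Sketch only: Arrays.binarySearch gives defined results only on a sorted array. The libraries above happen to be filled in ascending character order, so they work as written. If the allowed set ever changes, sorting it once up front avoids depending on fill order. The names AllowedSketch and isAllowed are hypothetical; the character set is the one listed in allowedCharacters().

*/

static final class AllowedSketch {

    private static final char[] ALLOWED = buildAllowed();

    private static char[] buildAllowed() {
        String letters = "abdefghijklmnoprstuvyz"; // a-z without c, q, w, x
        char[] set = ("\n .CSTX" + letters).toCharArray();
        java.util.Arrays.sort(set); // guarantees the precondition of binarySearch
        return set;
    }

    // True if the character may appear in the output under the font convention.
    static boolean isAllowed(char c) {
        return java.util.Arrays.binarySearch(ALLOWED, c) >= 0;
    }
}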

/*

Function: test

Generic function used to test odds and ends of code.

Parameters:

None

Returns:

Void

*/

public static void test()

{

String body = "\nbutler\n";

String target = "ap\n";

String sub = "op\n";

System.out.println(replace(body,target,sub));

int target_size = target.length();

int sub_size = sub.length();

String sofar = "";

int j = 2;

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

System.out.println(body);

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

/*

Function: replaceLetters

Body of program, replaces English spelling of text segments with phonetic spelling in Roman-alphabet

Parameters:

body - Text to be manipulated

Returns:

Text with Roman-alphabet phonetic spelling of English words.

*/

private static String replaceLetters(String body)

{

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//anti-

body = replace(body,".anti",".antahy");

body = replace(body,".whole",".hohl");

//wh

body = replace(body,"whose","hooz");

body = replace(body,"whom","hoom");

body = replace(body,"who\n","hoo\n");

body = replace(body,"where","huair"); //changed w to u

body = replace(body,"whir","huur");

body = replace(body,"wh","hu"); //Might need more permutations

body = replace(body,".accr",".uhkr"); //many many many

body = replace(body,".acci",".aksi");

body = replace(body,".accord",".uhkawrd");

body = replace(body,".accomp",".uhkuhmp");

body = replace(body,".acco",".uhko");

body = replace(body,".accustom\n",".uhkuhstuhm\n");

body = replace(body,".accolade\n",".akuhleyd\n");

body = replace(body,".accus",".uhkyooz");

body = replace(body,".accurs",".uhkurs");

body = replace(body,".accur",".akyer");

body = replace(body,".accum",".uhkyoom");

body = replace(body,".accout",".uhkoot");

body = replace(body,".accoun",".uhkoun");

body = replace(body,".acce",".akse"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

//E at end - Some interference possible with C's

body = replace(body,".cause",".kawz");

body = replace(body,"ause\n","awz\n");

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"have\n","hav\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

//body = replace(body,"ever\n","ever\n");

body = replace(body,"eve\n","eev\n"); //HOWEVER

body = replace(body,"eever\n","ever\n");

body = replace(body,"ile\n","ahyl\n");

//System.out.println(replace(replace("while ","wh","hu"),"ile\n","ahyl\n"));

//huahyl

body = replace(body,"gle\n","guhl\n");

body = replace(body,".key\n",".kee\n"); //special

body = realReplace("QQQ",body,".keys\n",".kees\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","keys\n");

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"Case\n","Ceys\n"); //necessary?

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,".one\n",".uuhn\n"); //special

body = replace(body,".someone\n",".suhmuuhn\n");

body = replace(body,".anyone\n",".eneeuuhn\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,".some",".suhm");

body = replace(body,"comedy","komidee");

body = replace(body,"come\n","kuhm\n"); //Need to move this up

body = replace(body,".come",".kuhm");

body = replace(body,"ome\n","ohm\n");

body = replace(body,"title\n","tahytl\n");

body = replace(body,"ttle\n","tl\n");

body = replace(body,"tle\n","tl\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,".discipline\n",".disipline\n");

body = replace(body,"cine\n","sin\n");

body = replace(body,"ine\n","ahyn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"op\n","ohp\n");

body = replace(body,"ope\n","ohp\n");

body = replace(body,"rue\n","roo\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"bead\n","beed\n");

body = replace(body,".read\n",".reed\n");

body = replace(body,"nead\n","need\n");

body = replace(body,"lead\n","leed\n");

body = replace(body,"ead\n","ed\n"); //general

body = replace(body,"ade\n","eyd\n");

//1.9.2.1

body = replace(body,"heir","air"); //general rule

body = replace(body,"eir\n","er\n");

//this one's touchy, I'm just throwing in "air" exemptions to the "eer" rule where I see them

body = replace(body,"where\n","hwair\n");

body = replace(body,".ere\n",".air\n");

body = replace(body,"there\n","thair\n");

body = replace(body,"sphere\n","sfeer\n");

body = realReplace("QQQ",body,".here\n",".heer\n");

body = realReplace("QQQ",body,".were\n",".wur\n");

body = replace(body,"sier\n","seer\n");

body = replace(body,"shier\n","Seer\n");

body = replace(body,"Sier\n","Seer\n");

body = replace(body,"cier\n","seer\n");

body = replace(body,".premiere\n",".primeer\n");

body = replace(body,"iere\n","yair\n");

body = replace(body,"soldier","sohljer");

body = replace(body,".persevere\n",".pursuhveer\n");

body = replace(body,".revere\n",".riveer\n");

body = replace(body,"cere\n","seer\n");

body = replace(body,".interfere\n",".interfeer\n");

body = replace(body,"mmere","M");

body = replace(body,"mere\n","meer\n");

body = replace(body,"M","mmere");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"tire","tahyuhr"); //NOT \n or e

body = replace(body,"aire\n","air\n");

//body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","ahrj\n");

body = replace(body,"orge\n","awrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"tigue\n","teeg\n");

body = replace(body,"sce\n","es\n");

body = replace(body,"que\n","k\n");

body = replace(body,"udge\n","uhj\n");

body = replace(body,"dge\n","j\n"); //NOT sure

body = replace(body,"age\n","aij\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"rgue\n","rgyoo\n");

body = replace(body,"gue\n","eeg\n");

//ible, might need to generalize downtown

body = replace(body,"ible\n","uhbuhl\n");

//-nge

//problem with sing, singer vs singe, singer not really being separable at the gerund-testing level

body = replace(body,"finger\n","fingger\n");

body = replace(body,"linger\n","lingger\n");

body = replace(body,"finger","fingger");

body = replace(body,"linger","lingger");

body = replace(body,".anger\n",".angger\n");

body = replace(body,".angry\n",".angree\n");//?

//body = realReplace("",body,"ringe\n","rinj\n"); //This is the best I can do for now.

body = replace(body,".cringe\n",".krinj\n");

body = replace(body,".fringe\n",".frinj\n");

body = replace(body,".constringe\n",".kuhnstrinj\n");

body = replace(body,".astringe\n",".uhstrinj\n");

body = replace(body,".infringe\n",".infrinj\n");

body = realReplace("R",body,"hinge\n","hinj\n");

body = realReplace("R",body,".impinge\n",".impinj\n");

body = realReplace("R",body,"winge\n","winj\n");

body = realReplace("R",body,".binge\n",".binj\n");

body = realReplace("",body,".tinge\n",".tinj\n");

body = realReplace("",body,".dinge\n",".dinj\n");

body = realReplace("QQQ",body,".singe\n",".sinj\n");

body = realReplace("QQQ",body,".singed\n",".sinjed\n");

body = realReplace("QQQ",body,".singeing\n",".sinjing\n");

body = realReplace("g",body,"inging\n","D\n"); //temporary

body = replace(body,"ing\n","I\n"); //temporary

body = replace(body,"nge\n","nj\n");

body = replace(body,"I","ing");

body = replace(body,"D","inging");

//END E's

//s at end - 1.7.4.5 -> unneeded, I think

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow the s->z convertor to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that a higher number of characters in the target gets preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"preface\n","prefis\n"); //special

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,".tacit\n",".tasit\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"cate\n","kit\n");

body = replace(body,"vate\n","vit\n"); //pulled from E section, might be a sign of things to come

body = replace(body,"literate\n","literit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

body = replace(body,"cist\n","sist\n");

body = replace(body,"duce\n","doos\n");

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

body = replace(body,"ct","kt"); //factual

body = replace(body,"tual\n","Cual\n");

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,"ierce\n","eers\n");

body = replace(body,"ince\n","ins\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance\n","Cahns\n");

body = replace(body,"cance\n","kahns\n");

body = replace(body,"lance\n","lahns\n");

body = replace(body,"vance\n","vahns\n");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"all\n","awl\n");

body = realReplace("QQQ",body,".supplement\n",".suhpluhment\n"); //special case

body = replace(body,".supp",".suhpp"); //just a general rule

body = replace(body,"ape\n","eYp\n");

body = replace(body,"appa","apuh");

body = replace(body,".appear",".uhpeer");

body = replace(body,"ppen","pen"); //double p's, might NOT be done

body = replace(body,"pplet\n","plit\n");

body = replace(body,"pple\n","puhl\n");

body = replace(body,"ppl","puhl");

body = replace(body,"upp\n","uhp\n");

body = replace(body,"oppor","oper");

body = replace(body,".opp",".ohp");

body = replace(body,".op",".ohp");

body = replace(body,"opp","uhp");

body = replace(body,"ypp","ip");

body = replace(body,"pp","p"); //Last ditch, should cover most before this

body = replace(body,"tice\n","tis\n");

body = replace(body,"arice\n","eris\n");

body = replace(body,"orice\n","uhis\n");

body = replace(body,"cipice\n","suhpis\n"); //patch for precipice

body = replace(body,"ipice\n","uhpis\n");

body = replace(body,".vice\n",".vahys\n");

body = replace(body,"vice\n","vis\n");

body = replace(body,"ice\n","ahys\n"); //Long S. NOT sure about \n's

body = replace(body,"egy\n","ijee\n"); //possibilities/strategies fix, I have no idea how they ended up "kiez"

body = replace(body,"city\n","sitee\n");

body = replace(body,"cite\n","sahyt\n");

body = replace(body,"ity\n","itee\n");

body = replace(body,"ite\n","ahyt\n");

body = replace(body,"irst\n","urst\n");

body = replace(body,"ong\n","ong\n");

body = replace(body,"ull\n","ool\n");

body = replace(body,"cide\n","sahyd\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ence\n","ens\n");

body = replace(body,"rend\n","rend\n");

//1.8.9 Pie-

body = replace(body,"piety","pahyitee");

body = replace(body,".pier\n"," peer\n");

body = replace(body,".pie\n"," pahy\n");

body = replace(body,".pie",".pee");

body = replace(body,"ces\n","seez\n");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"ci\n","sahy\n");

body = replace(body,"gan\n","gahn\n");

body = replace(body,"dle\n","dl\n");

body = replace(body,"align\n","uhlahyn\n");

body = replace(body,"oy\n","oi\n");

body = replace(body,"ace\n","eys\n");

body = replace(body,".ass\n",".as\n");

body = replace(body,".ass",".uhs"); //Assoc-

body = replace(body,".rely\n",".relahy\n");

body = replace(body,"ely\n","lee\n"); //MUST BE LAST IN \N

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"ply\n","plahy\n");

body = replace(body,".by\n",".bahy\n");

body = replace(body,".my\n",".mahy\n");

body = replace(body,".die\n",".dahy\n");

body = replace(body,".dye\n",".dahy\n");

body = replace(body,".bye\n",".bahy\n"); //conflict

body = replace(body,"hype","hahype");

body = replace(body,"hypo","hahypo");

body = replace(body,"hypn","hipn");

body = replace(body,"hyphen","hahyfuhn");

body = replace(body,"hyfen","hahyfuhn"); //ph->f

body = replace(body,"yp","ip");

body = replace(body,"eYp","eyp"); //see ape->eyp

body = replace(body,"duct","duhkt");

body = replace(body,"stion","sCuhn"); //1.8.9.4

body = replace(body,"tion","Suhn"); //1.8

body = replace(body,"ssion","Suhn"); //1.8.6

body = replace(body,"sion","zhuhn");

body = replace(body,"cean","Suhn");

body = replace(body,".abou",".uhbou");

body = replace(body,".aband",".uhbanduhn");

body = replace(body,"ture","Cur");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"iew","yoo");

body = replace(body,".face",".feys");

body = replace(body,"face","feys");

body = replace(body,"indict","indahyt");

//For-

body = replace(body,".fore",".fohr");

body = replace(body,".for",".fohr");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicism

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweynt");

body = replace(body,"cing","sing");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ight","ahyt");

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,".def",".dihf");

body = replace(body,"cif","sif"); //NOT sure

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","SE");

body = replace(body,"SEyp","skeyp");

body = replace(body,"SE","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

body = replace(body,"it\n","it\n"); //Tacked on for suffix reasons

body = replace(body,"ap\n","ap\n");

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

/* body = realReplace("QQQ",body,"C\n","k\n");

body = realReplace("QQQ",body,"ch\n","k\n"); */

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = replace(body,"ce","se"); //might want to move

body = realReplace("QQQ",body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = realReplace("QQQ",body,"c\n","k\n"); //to stop mischief

//END C'S

body = replace(body,".odyssey\n",".oduhsee\n"); //special

body = replace(body,"sey\n","zee\n");

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

body = replace(body,".be\n",".bee\n");

body = replace(body,".maybe\n",".meybee\n");

//rom

body = realReplace("QQQ",body,".roman\n",".rohmahn\n"); //might want to generalize "-an" suffix

body = replace(body,"rom","rohm");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"igh","ahy");

body = replace(body,".enough\n",".ihnuhf\n"); //special case

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = realReplace("QQQ",body,"q\n","k\n");

//w at end

body = replace(body,".low\n",".loh\n");//special cases

body = replace(body,".row\n",".roh\n");

body = replace(body,".tow\n",".toh\n");

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

body = replace(body,"pie\n","pahy\n"); // NOT normal, aka special

body = realReplace("qqq",body,".or",".awr");

body = replace(body,".sky",".skahy");

body = replace(body,".fly",".flahy");

body = replace(body,".ally\n",".alahy\n");

body = realReplace("qqq",body,"y\n","ee\n");

body = realReplace("qqq",body,"ehee\n","ehy\n");

body = realReplace("qqq",body,"ahee\n","ahy\n");

body = realReplace("qqq",body,"eee\n","ey\n"); //fixing issues raised by y->ee as compared to other phonetics

body = realReplace("qqq",body,"iest\n","eeest\n");

body = replace(body,"izen","uhzen");

body = replace(body,"ize","ahz");

body = replace(body,"able","uhbuhl");

body = replace(body,"ably","uhblee"); //Last sweep

String[] temp = {"en","st","un","c","f","g","s","t"};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

for(int i = 0; i<temp.length;i++)

if(temp[i].equals("c"))

body = replace(body,"kable\n","eybuhl\n");

else

body = replace(body,temp[i]+"able\n","eybuhl\n");

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"ble\n","buhl\n");

//x's

body = replace(body,".xy",".zi");

body = replace(body,"xious","kSuhs");

//apostrophe possessive replacement, see removeCharacters()

body = replace(body," A","ez");

body = replace(body," B","z");

//General fixer for suffixes

//body = replace(body,"\n","\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling sometimes.

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"qu","ku");

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

body = replace(body,"eQ\n","ee\n");

body = replace(body,"Qy\n","ey\n");

body = replace(body,".hQ",".he");

body = replace(body,".shQ",".she");

}

return body;

}

/*

Function: replace

Buffer function for realReplace, adds on an empty string for generic case

Parameters:

body - Text to be searched/replaced

target - Text to be replaced

sub - Text to replace target

Returns:

Original text with target replaced by sub by realReplace

See Also:

<realReplace>

*/

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}

/*

Function: realReplace

Permutes (hopefully) all expected suffixes to replace a given string with a substitute string

Parameters:

sofar - Shorthand listing of the suffixes which have been added to the original target/sub combination up to this point. "QQQ" and "qqq" are used to denote a desire not to permute target/sub suffixes at all.

body - Text to be searched/replaced

target - Text to be replaced

sub - Text to replace target

Returns:

Text with target, along with its permuted suffix forms, replaced by sub

*/

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

//As of 1.8.8.1, '.' and '\n' are only codes for ' '. Spaces will be added before and after every \n, as well as after every period, then removed at the end.

//'.'==' '

if(target.startsWith("."))

return realReplace(sofar, body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub_size)));

else if(target.endsWith("\n"))

return realReplace(sofar, body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub_size-1)+" ")); //space substitution

/* if((min<Count++)&&(max>Count))

Targets+= target+"_"; */

if(Counting)

{

Count++;

if(target.equals("w"))

System.out.println("Replaces Run: "+Count);

}

if(target.endsWith(" "))

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

/* if(target.equals(" lingered "))

System.out.println(target); */

//I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if((!sofar.contains("z"))&&(!sofar.contains("l"))&&(!sofar.contains("t"))){

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z "));

else if((sub_size>=2)&&(sub.charAt(sub_size-2)=='y'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"ez ")); //s->z

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if(((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))||((sub_size>=2)||(sub.substring(sub_size-2,sub_size).equals("hy"))))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"z "));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"iez ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

/* //y

body = realReplace("qqq",body,"ay ","ey "); //stopgap, might want to revisit

body = replace(body,"ey ","ey ");

body = realReplace("qqq",body,"oy ","oi ");

body = realReplace("qqq",body,"uy ","ahy ");

body = realReplace("qqq",body,"y ","ee "); //might need generalized in replace()

body = replace(body,"ty","tahy"); */

//ly, focus on y as of 1.7.4.3 - It might need some work

if(target.equals("sly ")) //special case

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else{

//ly

if((target_size>=5)&&(target.substring(target_size-5,target_size-1).equals("able")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"y "),(sub.substring(0,sub_size-4)+"lee ")); //ably

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if((sub_size>=3)&&(sub.substring(sub_size-3,sub_size-1).equals("ee")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-3)+"ily "),(sub.substring(0,sub_size-3)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"ily "),(sub.substring(0,sub_size-2)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"pily "),(sub.substring(0,sub_size-1)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"tily "),(sub.substring(0,sub_size-1)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

//y

if((target_size>=2)&&(target.charAt(target_size-2)=='a')) //might need work

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ey "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"y "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"i "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='u'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ahy "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"py "),(sub.substring(0,sub_size-1)+"ee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"ty "),(sub.substring(0,sub_size-1)+"ee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee ")); //might not be needed

}

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

//ing, gerunds

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-3)+"ying "),(sub.substring(0,sub_size-1)+"ing ")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){ //experiment

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ring "),(sub.substring(0,sub_size-1)+"ring ")); //rr

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //have to do both, sadly

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ping "),(sub.substring(0,sub_size-1)+"ing "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ting "),(sub.substring(0,sub_size-1)+"ing "));

else

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"pish "),(sub.substring(0,sub_size-1)+"ish "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"tish "),(sub.substring(0,sub_size-1)+"ish "));

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"ish "),(sub.substring(0,sub_size-1)+"ish "));

if(!sofar.contains("a")) //able

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"table "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){//experiment

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"rable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(target.equals("fly")||target.equals("unfly"))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"eybuhl "));

//1.9

//ize

if(!sofar.contains("x"))

if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"x",body,(target.substring(0,target_size-2)+"ize "),(sub.substring(0,sub_size-1)+"ahyz ")); //removing 'e'

else

body = realReplace(sofar+"x",body,(target.substring(0,target_size-1)+"ize "),(sub.substring(0,sub_size-1)+"ahyz "));

//est - was iest before 1.9.1.1

if((!sofar.contains("t")))

if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"t",body,(target.substring(0,target_size-2)+"iest "),(sub.substring(0,sub_size-1)+"eeest ")); //removing 'y'

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"t",body,(target.substring(0,target_size-2)+"est "),(sub.substring(0,sub_size-1)+"est "));

else

body = realReplace(sofar+"t",body,(target.substring(0,target_size-1)+"est "),(sub.substring(0,sub_size-1)+"est "));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(target.charAt(target_size-2)=='e')

if((target_size>=3)&&(target.charAt(target_size-3)=='c'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"st "));

else

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed ")); //NOT st

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

else if(target.charAt(target_size-2)=='r'){//experiment

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"red "),(sub.substring(0,sub_size-1)+"d "));

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"d "));

}

else if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("se")))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ped "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ted "),(sub.substring(0,sub_size-1)+"ed "));

else if((target.charAt(target_size-2)!='s')||((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

//er

if((!sofar.contains("r"))&&(!sofar.contains("R"))) //inge special

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r "),(sub.substring(0,sub_size-1)+"er ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"per "),(sub.substring(0,sub_size-1)+"er "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){ //experiment

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"rer "),(sub.substring(0,sub_size-1)+"rer "));

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"ter "),(sub.substring(0,sub_size-1)+"er "));

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

/* //ate, not bothering with forbiddances - Never mind

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n")); */

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a, r

//r is forbidden by s, l, g, d

//y-y

//Not messing with forbidding now (1.8.8.2)

//x-ized, t-iest, t forbids all, don't care about anything else right now

//I think that forbiddance is total - no forbidden suffixes at any point before

}

}

return findReplace(body,target,target_size,sub,sub_size);

}

/*

Function: findReplace

Bog standard search/replace function for a given string and a given pair of target/substitute. Skips over <safe> tags if appropriate.

Parameters:

body - Text to be searched/replaced

target - Text to be replaced

target_size - Precalculated length of target string

sub - Text to replace target

sub_size - Precalculated length of sub string

Returns:

Text with every unprotected occurrence of target replaced by sub

*/

private static String findReplace(String body, String target, int target_size, String sub, int sub_size){

int safe_count = 0;

for(int i = 0; i<=body.length()-target_size;i++){

for(int j = 0; j <target_size; j++)

if(body.charAt(i+j)!=target.charAt(j))

break; //Once more unto the break

else if(j+1>=target_size){

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

if(skip_protected)

if(body.charAt(i)=='<') //skipping

i+=skip_array[safe_count++];

}

return body;

}

}
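For anyone trying to follow the '.' and '\n' markers in the replace() calls: as the comment in realReplace() says, both are only codes for a space, and the text gets padded with spaces so that a pattern like ".to\n" hits the whole word and nothing else. Here is a minimal standalone sketch of that idea. It is not the code above: BoundarySketch and boundaryReplace are made-up names, it assumes the text is already space-padded, and it leans on String.replace instead of findReplace(), with no suffix permutation and no <safe> skipping.

```java
class BoundarySketch {
    // Hypothetical helper: a leading '.' or a trailing '\n' in the pattern
    // is treated as a word boundary by turning it into a space.
    // Assumes sub carries the same markers as target, as in the calls above.
    static String boundaryReplace(String body, String target, String sub) {
        if (target.startsWith(".")) {
            target = " " + target.substring(1);
            sub = " " + sub.substring(1);
        }
        if (target.endsWith("\n")) {
            target = target.substring(0, target.length() - 1) + " ";
            sub = sub.substring(0, sub.length() - 1) + " ";
        }
        return body.replace(target, sub);
    }

    public static void main(String[] args) {
        // Whole-word match: "top" is left alone because it does not end at a space.
        String a = boundaryReplace(" i like to go to the top ", ".to\n", ".too\n");
        System.out.println("[" + a + "]"); // prints [ i like too go too the top ]

        // End-of-word match: "brown" is left alone, "how", "now" and "cow" are not.
        String b = boundaryReplace(" how now brown cow ", "ow\n", "au\n");
        System.out.println("[" + b + "]"); // prints [ hau nau brown cau ]
    }
}
```

The real function does a good deal more (plurals, -ing, -ed and so on are generated from each pattern), but the boundary trick is the part that makes the special-case lines readable.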

How do you use this thing?


How do you use this thing?

Turos explains it pretty well here.

Also, I just updated the code a bit, so you might want to redownload.

I forget what changes I made from 1.9.4.1, actually. Oops.

EDIT: And the font itself is linked to in the OP.
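One thing worth knowing before reading the output: in the Alethi convention used here, a sentence starts with a period and ends with nothing, which is the job of periodMover() in the code below. A simplified standalone sketch of just that step follows. It assumes a single line of text with single periods only (no ellipses, no <safe> tags), and PeriodSketch and movePeriods are illustrative names, not part of the program.

```java
class PeriodSketch {
    // Moves each sentence-ending period to the front of its sentence.
    static String movePeriods(String line) {
        StringBuilder out = new StringBuilder();
        int start = 0; // index where the current sentence begins
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '.') {
                // Emit the period first, then the sentence text without it.
                out.append('.').append(line, start, i);
                start = i + 1;
                // Spaces between sentences stay where they were.
                while (start < line.length() && line.charAt(start) == ' ') {
                    out.append(' ');
                    start++;
                }
                i = start - 1;
            }
        }
        // Text with no closing period (a header, say) is copied unchanged.
        out.append(line, start, line.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(movePeriods("well this is turos signing out. excellent work kurkistan."));
        // prints: .well this is turos signing out .excellent work kurkistan
    }
}
```

The program itself does this on the whole file and also restarts the sentence at every real line break, so headers without periods come through untouched.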

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan, with significant developmental input from Turos

* @date 02/28/2012

* @version 1.9.4.4

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_9_4_4{

static boolean debug_char = false;

static boolean debug_end_e = false;

static boolean remove_illegal = true;

static boolean add_CR = true;

static boolean skip_protected = true;

static boolean retain_tags = true;

static boolean unbounded = false;

static int[] skip_array; //stores number of indexes to skip for <safe> tags

//^global booleans to turn certain parts of the program on/off

/* static String Targets = "";

static int min = 200;

static int max = 400; */

static int Count = 0;

static boolean Counting = true; //used to count number of replace operations run

/**

Program flow, 1.9.4.1:

main()

convertText()

readFile()

removeCharacters()*

periodMover()

inAlphabet()

spaceEnds()

buildSkipArray()*

safeSkip()

<Recursive>

replaceLetters()

replace()

realReplace()

<Recursive>

findReplace()

removeSkip()*

realReplace()

unSpaceEnds()

writeFile()

allowedCharacters*

* - Indicates possible call based on global boolean setting.

*/

/*

Function: main

Runs program: Asks for filename of input and writes to outfile, then prints out the execution time

Parameters:

None

Returns:

void

*/

/**

* Any sequence of characters bracketed by <safe>[...]</safe> will not be touched by the program

*/

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

//putting carriage-returns back in to make it look pretty in Notepad. I can't tell what else they might do.

if(add_CR)

for(int i = 0; i<alethi.length();i++)

if(alethi.charAt(i)=='\n')

alethi = alethi.substring(0,i)+"\r"+alethi.substring(i++,alethi.length());

//writeFile(Targets,"TEMP.txt");

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

/*

Function: convertText

Turns English string into Roman-alphabet phonetic spelling

Parameters:

roman - Name of the input file; its contents (still in roman) are read in and converted.

Returns:

Roman-alphabet phonetic spelling of input string

*/

private static String convertText(String roman) throws IOException

{

roman = readFile(roman); //text file

if((roman.length()==1)&&(roman.charAt(0)=='&')) //invalid input, halt program

return "&";

if(remove_illegal)

roman = removeCharacters(roman);

roman = periodMover(roman);

roman = spaceEnds(roman);

if(skip_protected)

buildSkipArray(roman);

String alethi = replaceLetters(roman);

if(skip_protected){

alethi = removeSkip(alethi);

if(unbounded)

System.out.println("There is at least one unbounded '<safe>'");

}

return unSpaceEnds(alethi);

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static String readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("File not in directory or misspelled.");

return "&";

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole;

}

/*

Function: removeCharacters

Takes out non-allowed characters, replacing appropriate characters with their proper equivalent

Parameters:

body - The text to be corrected

Returns:

Character-pruned original text

*/

private static String removeCharacters(String body)

{

char[] library = new char[56];

library[0] = '\t'; //tab

library[1] = '\n';

library[2] = ' ';

library[3] = '.';

int place = 4;

for(int i = 65; i <=90; i++)

library[place++] = (char)i;

for(int i = 97; i <=122; i++)

library[place++] = (char)i;

for(int i = 0; i < body.length(); i++)

if(Arrays.binarySearch(library,body.charAt(i))<0) //I felt embarrassed by my earlier search algorithm.

if((body.charAt(i)=='?')||(body.charAt(i)=='!'))

body = body.substring(0,i)+"."+body.substring(i+1,body.length());

else if(body.charAt(i)=='-')

body = body.substring(0,i)+" "+body.substring(i+1,body.length());

else if(body.charAt(i)==(char)39) //apostrophe character

if((i>0)&&(body.charAt(i-1)=='s')) //allowing for both United States' and United States's, as an example

if((i<body.length()-1)&&(body.charAt(i+1)=='s')) //"-s's"

body = body.substring(0,i)+" A"+body.substring((i++)+2,body.length()); //" A"->"ez"

else

body = body.substring(0,i)+" A"+body.substring((i++)+1,body.length()); //"-s'"

else if((i<body.length()-1)&&(body.charAt(i+1)=='s')) //"-'s"

body = body.substring(0,i)+" B"+body.substring((i++)+2,body.length()); //" B"->"z"

else if((i<body.length()-1)&&(body.charAt(i+1)=='d')) //Contractions

body = body.substring(0,i)+" D"+body.substring((i++)+2,body.length()); //" D"->d

else if((i<body.length()-2)&&(body.charAt(i+1)=='v')&&(body.charAt(i+2)=='e'))

body = body.substring(0,i)+" E"+body.substring((i++)+3,body.length()); //" E"->v

else if((i<body.length()-2)&&(body.charAt(i+1)=='l')&&(body.charAt(i+2)=='l'))

body = body.substring(0,i)+" F"+body.substring((i++)+3,body.length()); //" F"->l

else if((i<body.length()-1)&&(body.charAt(i+1)=='t'))

if((i>1))

if(body.charAt(i-1)=='n')

if((body.charAt(i-2)=='e')||(body.charAt(i-2)=='o'))

body = body.substring(0,i-1)+" G"+body.substring((i++)+2,body.length()); //" G"->nt

else if(body.charAt(i-2)=='a') //"can't" is covered by this

body = body.substring(0,i)+body.substring(i--+1,body.length()); //same as normal

else

body = body.substring(0,i-1)+" H"+body.substring((i++)+2,body.length()); //" H"->int

else

body = body.substring(0,i)+body.substring(i--+1,body.length()); //same as normal

else

body = body.substring(0,i)+body.substring(i--+1,body.length()); //same as normal

else if ((skip_protected)&&(body.charAt(i)=='<'))

if(i<body.length()-("/safe>".length())) //no reason to have <safe> in very end, especially since there's always a \n

if(body.substring(i+1,i+7).equals("/safe>"))

i+=6;

else if(body.substring(i+1,i+6).equals("safe>"))

i+=5;

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

return body;

}

/*

Function: periodMover

In the Alethi alphabet, sentences start with a period '.' and don't end with anything. This models that.

Parameters:

body - Text to be manipulated

Returns:

Text with periods moved to beginning of sentences

*/

private static String periodMover(String body)

{

int start = 0;

for(int i=0;i<body.length();i++)

{

if(body.charAt(i)=='.'){

while((i<body.length())&&(body.charAt(i)=='.')) //multiples

body = body.substring(0,start)+"."+body.substring(start,i)+body.substring((i++)+1,body.length());

while(i<body.length())

if(!inAlphabet(body.charAt(i)))

i++;

else if(body.charAt(i-1)=='<') //skipping

i+=5;

else if(body.charAt(i-1)=='/') //skipping

i+=6;

else

break; //Yes, the cardinal sin.

start = i;

}

else if(body.charAt(i)=='\n')

start=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

return body;

}

/*

Function: inAlphabet

Returns whether or not a character is within the lower-case roman alphabet

Parameters:

character - char to be checked

Returns:

Boolean indicating whether or not the given char is in the lower-case roman alphabet

*/

private static boolean inAlphabet(char character){

int value = (int)character;

if((value>=97)&&(value<=122)) //just checking lowercase letters

return true;

return false;

}

/*

Function: spaceEnds

Adds 'space' buffers around periods, <safe> and </safe> tags, and endline characters to enable easier replacement of string segments at the ends of words.

Parameters:

body - Text to be manipulated

Returns:

Text with spaces added around periods, <safe> tags, and endline characters

*/

private static String spaceEnds(String body){

for(int i=0;i<body.length();i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+" "+body.substring((i++)+1,body.length());

else if(body.charAt(i)=='\n'){

body = body.substring(0,i)+" \n "+body.substring(i+1,body.length());

i+=2;

}

else if(body.charAt(i)=='>') //For skipping

body = body.substring(0,i+1)+" "+body.substring((i++)+1,body.length());

else if(body.charAt(i)=='<')

body = body.substring(0,i)+" "+body.substring((i++),body.length());

//System.out.println(body);

return body;

}

/*

Function: buildSkipArray

Sets the value of the global int[] array skip_array to store the number of indices which each successive <safe> tag indicates should be skipped.

Parameters:

body - Text to be read from, <safe> found in.

Returns:

Void. skip_array value set

*/

private static void buildSkipArray(String body){

String gradual = "";

int count = 0;

int temp;

for(int i = 0; i<=body.length()-1;i++)

if(body.charAt(i)=='<'){ //skipping

temp = safeSkip(body.substring(i+1,body.length()));

i+=temp;

gradual+=temp + ":";

count++;

}

//System.out.println(gradual);

skip_array = new int[count];

int place = 0;

for(int i = 0;i<count;i++){

temp = gradual.indexOf(':',place);

skip_array[i] = Integer.parseInt(gradual.substring(place,temp));

place = temp+1;

}

}

/*

Function: safeSkip

Returns the number of indices to be skipped until the end of a <safe>...</safe> sequence.

Parameters:

clip - Hopefully the segment of a larger body of text following directly after a '<' character.

Returns:

The number of indices until the ending '>', if it exists, the number until the end of the string otherwise.

*/

private static int safeSkip(String clip){

//assume that point just before clip was '<'

int skip = 0;

if(clip.length()>=("safe></safe>".length()))

if(clip.substring(0,5).equals("safe>"))

for(int i=5; i < (clip.length()-("</safe>".length()));i++)

if(clip.charAt(i)=='<'){

if(clip.substring(i,i+6).equals("<safe>"))

i += safeSkip(clip.substring(i+1,clip.length()));

else if(clip.substring(i,i+7).equals("</safe>")){

skip=(i+6);

break;

}

}

else if(i+1>=clip.length()-("</safe>".length())){

skip = clip.length()-1;

unbounded = true;

}

return skip;

}

/*

Function: removeSkip

Removes all <safe> and </safe> tags from the text

Parameters:

body - The text to be manipulated.

Returns:

The body without any <safe> or </safe> tags

*/

private static String removeSkip(String body){

skip_protected=false;

if(retain_tags){

body = realReplace("QQQ", body," <safe> ", "<safe>");

body = realReplace("QQQ", body," </safe> ", "</safe>"); //java didn't agree when I wanted to nest them

}

else{

body = realReplace("QQQ", body," <safe> ", "");

body = realReplace("QQQ", body," </safe> ", ""); //java didn't agree when I wanted to nest them

}

return body;

}

/*

Function: unSpaceEnds

Removes the 'space' buffers around periods, <safe> and </safe> tags, and endline characters to return text to proper formatting.

Parameters:

body - Text to be manipulated

Returns:

Text with spaces removed from around periods, <safe> tags, and endline characters

*/

private static String unSpaceEnds(String body){

for(int i=1;i<body.length()-2;i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+body.substring(i+2,body.length());

else if(body.charAt(i)=='\n')

body = body.substring(0,i-1)+"\n"+body.substring((i--)+2,body.length());

if(body.charAt(body.length()-2)=='.')

body = body.substring(0,body.length()-1);

else if(body.charAt(body.length()-2)=='\n')

body = body.substring(0,body.length()-3)+"\n";

return body.substring(1,body.length()-1); //clipping first/last '\n';;

}

/*

Function: writeFile

Writes the given string to an outfile

Parameters:

text - Text to be written.

destination - Name of outfile

Returns:

Void, outfile written to.

*/

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

//try-with-resources closes the writer even if write() throws

try (BufferedWriter out = new BufferedWriter(new FileWriter(destination))) {

out.write(text);

}

System.out.println("File created successfully.");

}

}

/*

Function: allowedCharacters

Returns string of lines and types of characters which ought not be in the text upon output because Turos's Alethi font convention does not allow for them.

Parameters:

body - Text to be read

Returns:

String containing line numbers and types of violations of font conventions

*/

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercase cases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

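for(int j = 0;j<pairs.length;j++){

line = 1; //restart the line count for each pair, otherwise later pairs report inflated line numbers

for(int i = 0; i<=search;i++)

if(body.charAt(i)=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

}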

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)) //c, q, w, and x

library[place++] = (char)i;

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body.charAt(i)=='\n')

line++;

else if(Arrays.binarySearch(library,body.charAt(i))<0) //not in library

violations = violations + (line+":"+body.charAt(i)) + "; ";

return violations;

}

/*

Function: test

Generic function used to test odds and ends of code.

Parameters:

None

Returns:

Void

*/

public static void test()

{

String body = "\nbutler\n";

String target = "ap\n";

String sub = "op\n";

System.out.println(replace(body,target,sub));

int target_size = target.length();

int sub_size = sub.length();

String sofar = "";

int j = 2;

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

System.out.println(body);

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

/*

Function: replaceLetters

Body of program, replaces English spelling of text segments with phonetic spelling in Roman-alphabet

Parameters:

body - Text to be manipulated

Returns:

Text with Roman-alphabet phonetic spelling of English words.

*/

private static String replaceLetters(String body)

{

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//anti-

body = replace(body,".anti",".antahy");

body = replace(body,".whole",".hohl");

//wh

body = replace(body,"whose","hooz");

body = replace(body,"whom","hoom");

body = replace(body,"who\n","hoo\n");

body = replace(body,"where","huair"); //changed w to u

body = replace(body,"whir","huur");

body = replace(body,"wh","hu"); //Might need more permutations

body = replace(body,".accr",".uhkr"); //many many many

body = replace(body,".acci",".aksi");

body = replace(body,".accord",".uhkawrd");

body = replace(body,".accomp",".uhkuhmp");

body = replace(body,".acco",".uhko");

body = replace(body,".accustom\n",".uhkuhstuhm\n");

body = replace(body,".accolade\n",".akuhleyd\n");

body = replace(body,".accus",".uhkyooz");

body = replace(body,".accurs",".uhkurs");

body = replace(body,".accur",".akyer");

body = replace(body,".accum",".uhkyoom");

body = replace(body,".accout",".uhkoot");

body = replace(body,".accoun",".uhkoun");

body = replace(body,".acce",".akse"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

//E at end - Some interference possible with C's

body = replace(body,".cause",".kawz");

body = replace(body,"ause\n","awz\n");

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"have\n","hav\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

//body = replace(body,"ever\n","ever\n");

body = replace(body,"eve\n","eev\n"); //HOWEVER

body = replace(body,"eever\n","ever\n");

body = replace(body,"ile\n","ahyl\n");

//System.out.println(replace(replace("while ","wh","hu"),"ile\n","ahyl\n"));

//huahyl

body = replace(body,"gle\n","guhl\n");

body = replace(body,".key\n",".kee\n"); //special

body = realReplace("QQQ",body,".keys\n",".kees\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","keys\n");

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"Case\n","Ceys\n"); //necessary?

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,".one\n",".uuhn\n"); //special

body = replace(body,".someone\n",".suhmuuhn\n");

body = replace(body,".anyone\n",".eneeuuhn\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,".some",".suhm");

body = replace(body,"comedy","komidee");

body = replace(body,"come\n","kuhm\n"); //Need to move this up

body = replace(body,".come",".kuhm");

body = replace(body,"ome\n","ohm\n");

body = replace(body,"title\n","tahytl\n");

body = replace(body,"ttle\n","tl\n");

body = replace(body,"tle\n","tl\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,".discipline\n",".disipline\n");

body = replace(body,"cine\n","sin\n");

body = replace(body,"ine\n","ahyn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"op\n","ohp\n");

body = replace(body,"ope\n","ohp\n");

body = replace(body,"rue\n","roo\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"bead\n","beed\n");

body = replace(body,".read\n",".reed\n");

body = replace(body,"nead\n","need\n");

body = replace(body,"lead\n","leed\n");

body = replace(body,"ead\n","ed\n"); //general

body = replace(body,"ade\n","eyd\n");

//1.9.2.1

body = replace(body,"heir","air"); //general rule

body = replace(body,"eir\n","er\n");

//this one's touchy, I'm just throwing in "air" exemptions to the "eer" rule where I see them

body = replace(body,"where\n","hwair\n");

body = replace(body,".ere\n",".air\n");

body = replace(body,"there\n","thair\n");

body = replace(body,"sphere\n","sfeer\n");

body = realReplace("QQQ",body,".here\n",".heer\n");

body = realReplace("QQQ",body,".were\n",".wur\n");

body = replace(body,"sier\n","seer\n");

body = replace(body,"shier\n","Seer\n");

body = replace(body,"Sier\n","Seer\n");

body = replace(body,"cier\n","seer\n");

body = replace(body,".premiere\n",".primeer\n");

body = replace(body,"iere\n","yair\n");

body = replace(body,"soldier","sohljer");

body = replace(body,".persevere\n",".pursuhveer\n");

body = replace(body,".revere\n",".riveer\n");

body = replace(body,"cere\n","seer\n");

body = replace(body,".interfere\n",".interfeer\n");

body = replace(body,"mmere","M");

body = replace(body,"mere\n","meer\n");

body = replace(body,"M","mmere");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"tire","tahyuhr"); //NOT \n or e

body = replace(body,"aire\n","air\n");

//body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","ahrj\n");

body = replace(body,"orge\n","awrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"sce\n","es\n");

body = replace(body,"que\n","k\n");

body = replace(body,"udge\n","uhj\n");

body = replace(body,"dge\n","j\n"); //NOT sure

body = replace(body,"age\n","aij\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"rgue\n","rgyoo\n");

body = replace(body,"gue\n","eeg\n");

//ible, might need to generalize downtown

body = replace(body,"ible\n","uhbuhl\n");

//-nge

//problem with sing, singer vs singe, singer not really being separable at the gerund-testing level

body = replace(body,"finger\n","fingger\n");

body = replace(body,"linger\n","lingger\n");

body = replace(body,"finger","fingger");

body = replace(body,"linger","lingger");

body = replace(body,".anger\n",".angger\n");

body = replace(body,".angry\n",".angree\n");//?

//body = realReplace("",body,"ringe\n","rinj\n"); //This is the best I can do for now.

body = replace(body,".cringe\n",".krinj\n");

body = replace(body,".fringe\n",".frinj\n");

body = replace(body,".constringe\n",".kuhnstrinj\n");

body = replace(body,".astringe\n",".uhstrinj\n");

body = replace(body,".infringe\n",".infrinj\n");

body = realReplace("R",body,"hinge\n","hinj\n");

body = realReplace("R",body,".impinge\n",".impinj\n");

body = realReplace("R",body,"winge\n","winj\n");

body = realReplace("R",body,".binge\n",".binj\n");

body = realReplace("",body,".tinge\n",".tinj\n");

body = realReplace("",body,".dinge\n",".dinj\n");

body = realReplace("QQQ",body,".singe\n",".sinj\n");

body = realReplace("QQQ",body,".singed\n",".sinjed\n");

body = realReplace("QQQ",body,".singeing\n",".sinjing\n");

body = realReplace("g",body,"inging\n","J\n"); //temporary

body = replace(body,"ing\n","I\n"); //temporary

body = replace(body,"nge\n","nj\n");

body = replace(body,"I","ing");

body = replace(body,"J","inging");

//END E's

//s at end - 1.7.4.5 -> unneeded, I think

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z converter to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that higher number of characters in target gets preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"preface\n","prefis\n"); //special

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,".tacit\n",".tasit\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"cate\n","kit\n");

body = replace(body,"vate\n","vit\n"); //pulled from E section, might be a sign of things to come

body = replace(body,"literate\n","literit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

body = replace(body,"cist\n","sist\n");

body = replace(body,"duce\n","doos\n");

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

body = replace(body,"indict","indahyt");

body = replace(body,"ct","kt"); //factual

body = replace(body,"tual\n","Cual\n");

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,"ierce\n","eers\n");

body = replace(body,"ince\n","ins\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance\n","Cahns\n");

body = replace(body,"cance\n","kahns\n");

body = replace(body,"lance\n","lahns\n");

body = replace(body,"vance\n","vahns\n");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"all\n","awl\n");

body = realReplace("QQQ",body,".supplement\n",".suhpluhment\n"); //special case

body = replace(body,".supp",".suhpp"); //just a general rule

body = replace(body,"ape\n","eYp\n");

body = replace(body,"appa","apuh");

body = replace(body,".appear",".uhpeer");

body = replace(body,"ppen","pen"); //double p's, might NOT be done

body = replace(body,"pplet\n","plit\n");

body = replace(body,"pple\n","puhl\n");

body = replace(body,"ppl","puhl");

body = replace(body,"upp\n","uhp\n");

body = replace(body,"oppor","oper");

body = replace(body,".opp",".ohp");

body = replace(body,".op",".ohp");

body = replace(body,"opp","uhp");

body = replace(body,"ypp","ip");

body = replace(body,"pp","p"); //Last ditch, should cover most before this

body = replace(body,"tice\n","tis\n");

body = replace(body,"arice\n","eris\n");

body = replace(body,"orice\n","uhis\n");

body = replace(body,"cipice\n","suhpis\n"); //patch for precipice

body = replace(body,"ipice\n","uhpis\n");

body = replace(body,".vice\n",".vahys\n");

body = replace(body,"vice\n","vis\n");

body = replace(body,"ice\n","ahys\n"); //Long S. NOT sure about \n's

body = replace(body,"egy\n","ijee\n"); //possibilities/strategies fix, I have no idea how they ended up "kiez"

body = replace(body,"city\n","sitee\n");

body = replace(body,"cite\n","sahyt\n");

body = replace(body,"ity\n","itee\n");

body = replace(body,"ite\n","ahyt\n");

body = replace(body,"irst\n","urst\n");

body = replace(body,"ong\n","ong\n");

body = replace(body,"ull\n","ool\n");

body = replace(body,"cide\n","sahyd\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ence\n","ens\n");

body = replace(body,"rend\n","rend\n");

//1.8.9 Pie-

body = replace(body,"piety","pahyitee");

body = replace(body,".pier\n"," peer\n");

body = replace(body,".pie\n"," pahy\n");

body = replace(body,".pie",".pee");

body = replace(body,"ces\n","seez\n");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"ci\n","sahy\n");

body = replace(body,"gan\n","gahn\n");

body = replace(body,"dle\n","dl\n");

body = replace(body,"align\n","uhlahyn\n");

body = replace(body,"oy\n","oi\n");

body = replace(body,"ace\n","eys\n");

body = replace(body,".ass\n",".as\n");

body = replace(body,".ass",".uhs"); //Assoc-

body = replace(body,".rely\n",".relahy\n");

body = replace(body,"ely\n","lee\n"); //MUST BE LAST IN \N

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"ply\n","plahy\n");

body = replace(body,".by\n",".bahy\n");

body = replace(body,".my\n",".mahy\n");

body = replace(body,".die\n",".dahy\n");

body = replace(body,".dye\n",".dahy\n");

body = replace(body,".bye\n",".bahy\n"); //conflict

body = replace(body,"hype","hahype");

body = replace(body,"hypo","hahypo");

body = replace(body,"hypn","hipn");

body = replace(body,"hyphen","hahyfuhn");

body = replace(body,"hyfen","hahyfuhn"); //ph->f

body = replace(body,"yp","ip");

body = replace(body,"eYp","eyp"); //see ape->eyp

body = replace(body,"duct","duhkt");

body = replace(body,"stion","sCuhn"); //1.8.9.4

body = replace(body,"tion","Suhn"); //1.8

body = replace(body,"ssion","Suhn"); //1.8.6

body = replace(body,"sion","zhuhn");

body = replace(body,"cean","Suhn");

body = replace(body,".abou",".uhbou");

body = replace(body,".abandon",".uhbanduhn");

body = replace(body,"ture","Cur");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"iew","yoo");

body = replace(body,".face",".feys");

body = replace(body,"face","feys");

//For-

body = replace(body,".fore",".fohr");

body = replace(body,".for",".fohr");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweynt");

body = replace(body,"cing","sing");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ight","ahyt");

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,".def",".dihf");

body = replace(body,"cif","sif"); //NOT sure

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","SE");

body = replace(body,"SEyp","skeyp");

body = replace(body,"SE","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

body = replace(body,"it\n","it\n"); //Tacked on for suffix reasons

body = replace(body,"ap\n","ap\n");

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

/* body = realReplace("QQQ",body,"C\n","k\n");

body = realReplace("QQQ",body,"ch\n","k\n"); */

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = replace(body,"ce","se"); //might want to move

body = realReplace("QQQ",body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = realReplace("QQQ",body,"c\n","k\n"); //to stop mischief

//END C'S

body = replace(body,".odyssey\n",".oduhsee\n"); //special

body = replace(body,"sey\n","zee\n");

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

body = replace(body,".be\n",".bee\n");

body = replace(body,".maybe\n",".meybee\n");

//rom

body = realReplace("QQQ",body,".roman\n",".rohmahn\n"); //might want to generalize "-an" suffix

body = replace(body,"rom","rohm");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"igh","ahy");

body = replace(body,".enough\n",".ihnuhf\n"); //special case

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = realReplace("QQQ",body,"q\n","k\n");

//w at end

body = replace(body,".low\n",".loh\n");//special cases

body = replace(body,".row\n",".roh\n");

body = replace(body,".tow\n",".toh\n");

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

body = replace(body,"pie\n","pahy\n"); // NOT normal, aka special

body = realReplace("qqq",body,".or",".awr");

body = replace(body,".sky",".skahy");

body = replace(body,".fly",".flahy");

body = replace(body,".ally\n",".alahy\n");

body = realReplace("qqq",body,"y\n","ee\n");

body = realReplace("qqq",body,"ehee\n","ehy\n");

body = realReplace("qqq",body,"ahee\n","ahy\n");

body = realReplace("qqq",body,"eee\n","ey\n"); //fixing issues raised by y->ee as compared to other phonetics

body = realReplace("qqq",body,"iest\n","eeest\n");

body = replace(body,"izen","uhzen");

body = replace(body,"ize","ahz");

body = replace(body,"able","uhbuhl");

body = replace(body,"ably","uhblee"); //Last sweep

String[] temp = {"en","st","un","c","f","g","s","t"};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

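for(int i = 0; i<temp.length;i++)

if(temp[i].equals("c"))

body = replace(body,"kable\n","keybuhl\n"); //c has already become k by this point

else

body = replace(body,temp[i]+"able\n",temp[i]+"eybuhl\n"); //keep the leading letters, e.g. stable -> steybuhl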

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"ble\n","buhl\n");

body = realReplace("QQQ",body,".i\n",".ahy\n");

//x's

body = replace(body,".xy",".zi");

body = replace(body,"xious","kSuhs");

//apostrophe replacement, see removeCharacters()

boolean save =skip_protected;

skip_protected=false;

body = replace(body,".A","ez");

body = replace(body,".B","z");

body = replace(body,".D","d");

body = replace(body,".E","v");

body = replace(body,".F","l");

body = replace(body,".G","nt");

body = replace(body,".H","int");

skip_protected = save;

//General fixer for suffixes

//body = replace(body,"\n","\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling sometimes.

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"qu","ku");

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

body = replace(body,"eQ\n","ee\n");

body = replace(body,"Qy\n","ey\n");

body = replace(body,".hQ",".he");

body = replace(body,".shQ",".she");

}

return body;

}

/*

Function: replace

Buffer function for realReplace, adds on an empty string for generic case

Parameters:

body - Text to be searched/replaced

target - Text to be replaced

sub - Text to replace target

Returns:

Original text with target replaced by sub by realReplace

See Also:

<realReplace>

*/

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}

/*

Function: realReplace

Permutes (hopefully) all expected suffixes to replace a given string with a substitute string

Parameters:

sofar - Shorthand listing of the suffixes which have been added to the original target/sub combination up to this point. "QQQ" and "qqq" used to denote a desire not to permute target/sub suffixes at all.

body - Text to be searched/replaced

target - Text to be replaced

sub - Text to replace target

Returns:

Text with every occurrence of target (and of its permuted suffix forms) replaced by sub

*/

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

boolean rerun = false;

if(target.startsWith(".")){

rerun = true;

target=" "+target.substring(1,target_size);

}

if(target.endsWith("\n")){

rerun = true;

target = target.substring(0,target_size-1)+" ";

}

if(sub.startsWith(".")){

rerun = true;

sub = " "+sub.substring(1,sub_size);

}

if(sub.endsWith("\n")){

rerun = true;

sub = sub.substring(0,sub_size-1)+" ";

}

if(rerun)

return realReplace(sofar,body,target,sub);

//As of 1.8.8.1, '.' and '\n' are only codes for ' '. Spaces will be added before and after every \n, as well as after every period, then removed at the end.

//'.'==' '

/* if((min<Count++)&&(max>Count))

Targets+= target+"_"; */

if(Counting)

{

Count++;

if(target.equals("w"))

System.out.println("Replaces Run: "+Count);

}

if(target.endsWith(" "))

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

/* if(target.equals(" lingered "))

System.out.println(target); */

//I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if((!sofar.contains("z"))&&(!sofar.contains("l"))&&(!sofar.contains("t"))){

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z "));

else if((sub_size>=2)&&(sub.charAt(sub_size-2)=='y'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"ez ")); //s->z

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if(((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))||((sub_size>=2)||(sub.substring(sub_size-2,sub_size).equals("hy"))))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"z "));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"iez ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

/* //y

body = realReplace("qqq",body,"ay ","ey "); //stopgap, might want to revisit

body = replace(body,"ey ","ey ");

body = realReplace("qqq",body,"oy ","oi ");

body = realReplace("qqq",body,"uy ","ahy ");

body = realReplace("qqq",body,"y ","ee "); //might need generalized in replace()

body = replace(body,"ty","tahy"); */

//ly, focus on y as of 1.7.4.3 - It might need some work

if(target.equals("sly ")) //special case

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else{

//ly

if((target_size>=5)&&(target.substring(target_size-5,target_size-1).equals("able")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"y "),(sub.substring(0,sub_size-4)+"lee ")); //ably

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if((sub_size>=3)&&(sub.substring(sub_size-3,sub_size-1).equals("ee")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-3)+"ily "),(sub.substring(0,sub_size-3)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"ily "),(sub.substring(0,sub_size-2)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"pily "),(sub.substring(0,sub_size-1)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"tily "),(sub.substring(0,sub_size-1)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

//y

if((target_size>=2)&&(target.charAt(target_size-2)=='a')) //might need work

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ey "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"y "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"i "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='u'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ahy "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"py "),(sub.substring(0,sub_size-1)+"ee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"ty "),(sub.substring(0,sub_size-1)+"ee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee ")); //might not be needed

}

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

//ing, gerunds

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-3)+"ying "),(sub.substring(0,sub_size-1)+"ing ")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){ //experiment

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ring "),(sub.substring(0,sub_size-1)+"ring ")); //rr

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //have to do both, sadly

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ping "),(sub.substring(0,sub_size-1)+"ing "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ting "),(sub.substring(0,sub_size-1)+"ing "));

else

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"pish "),(sub.substring(0,sub_size-1)+"ish "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"tish "),(sub.substring(0,sub_size-1)+"ish "));

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"ish "),(sub.substring(0,sub_size-1)+"ish "));

if(!sofar.contains("a")) //able

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"table "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){//experiment

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"rable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(target.equals("fly")||target.equals("unfly"))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"eybuhl "));

//1.9

//ize

if(!sofar.contains("x"))

if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"x",body,(target.substring(0,target_size-2)+"ize "),(sub.substring(0,sub_size-1)+"ahyz ")); //removing 'y'

else

body = realReplace(sofar+"x",body,(target.substring(0,target_size-1)+"ize "),(sub.substring(0,sub_size-1)+"ahyz "));

//est - was iest before 1.9.1.1

if((!sofar.contains("t")))

if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"t",body,(target.substring(0,target_size-2)+"iest "),(sub.substring(0,sub_size-1)+"eeest ")); //removing 'y'

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"t",body,(target.substring(0,target_size-2)+"est "),(sub.substring(0,sub_size-1)+"est "));

else

body = realReplace(sofar+"t",body,(target.substring(0,target_size-1)+"est "),(sub.substring(0,sub_size-1)+"est "));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(target.charAt(target_size-2)=='e')

if((target_size>=3)&&(target.charAt(target_size-3)=='c'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"st "));

else

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed ")); //NOT st

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

else if(target.charAt(target_size-2)=='r'){//experiment

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"red "),(sub.substring(0,sub_size-1)+"d "));

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"d "));

}

else if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("se")))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ped "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ted "),(sub.substring(0,sub_size-1)+"ed "));

else if((target.charAt(target_size-2)!='s')||((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

//er

if((!sofar.contains("r"))&&(!sofar.contains("R"))) //inge special

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r "),(sub.substring(0,sub_size-1)+"er ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"per "),(sub.substring(0,sub_size-1)+"er "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='r')){ //experiment

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"rer "),(sub.substring(0,sub_size-1)+"rer "));

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"ter "),(sub.substring(0,sub_size-1)+"er "));

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

/* //ate, not bothering with forbiddances - Never mind

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n")); */

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a, r

//r is forbidden by s, l, g, d

//y-y

//Not messing with forbidding now (1.8.8.2)

//x-ized, t-iest, t forbids all, don't care about anything else right now

//I think that forbiddance is total - no forbidden suffixes at any point before

}

}

return findReplace(body,target,target_size,sub,sub_size);

}

/*

Function: findReplace

Bog standard search/replace function for a given string and a given pair of target/substitute. Skips over <safe> tags if appropriate.

Parameters:

body - Text to be searched/replaced

target - Text to be replaced

target_size - Precalculated length of target string

sub - Text to replace target

sub_size - Precalculated length of sub string

Returns:

Text with every occurrence of target replaced by sub

*/

private static String findReplace(String body, String target, int target_size, String sub, int sub_size){

int safe_count = 0;

for(int i = 0; i<=body.length()-target_size;i++){

for(int j = 0; j <target_size; j++)

if(body.charAt(i+j)!=target.charAt(j))

break; //Once more unto the break

else if(j+1>=target_size){

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

if(skip_protected)

if(body.charAt(i)=='<') //skipping

i+=skip_array[safe_count++];

}

return body;

}

}
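
For anyone skimming, here is a minimal standalone sketch of the plain search/replace idea behind findReplace. It is not the code above: the class name FindReplaceSketch and the method replaceAll are made up for illustration, it leaves out the <safe> tag skipping (skip_protected and skip_array), and unlike the original it never rescans text it has just substituted.

```java
public class FindReplaceSketch {

    // Hypothetical stand-in for findReplace: replaces every occurrence of
    // target in body with sub, scanning left to right.
    // Assumption: no <safe> tag handling, and substituted text is not rescanned.
    static String replaceAll(String body, String target, String sub) {
        if (target.isEmpty()) {
            return body; // nothing to search for
        }
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            if (body.startsWith(target, i)) {
                out.append(sub);        // emit the substitute
                i += target.length();   // resume right after the matched text
            } else {
                out.append(body.charAt(i)); // copy the unmatched character
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The trailing space in target and sub mirrors how the suffix rules
        // above anchor a replacement to the end of a word.
        System.out.println(replaceAll(" sky is sky ", "sky ", "skahy "));
    }
}
```

Building the result in a StringBuilder avoids re-copying the whole body on every hit, which is what the substring concatenation in findReplace does. Whether that matters depends on how long the input text is.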

Edited by Kurkistan

  • 3 months later...
  • 1 year later...
  • 1 year later...

First of all, congratulations!

 

There is one small problem I'm having, however. The tops of the words seem to be getting cut off. I am using Microsoft Office Word 2013. Changing spacing and/or font size seems to have no effect. Any ideas?

 

Thanks.


  • Chaos locked this topic