My Alethi Font


Turos

Nice progress! It works really well for me.

Also, version two is up on the first post. Thank you both for the catch on that character missing the line. :lol:

Oh yeah, don't worry: version two doesn't change anything that would make your transliterator incompatible.

EDIT: Also, I noticed that line breaks are removed when the tool is run. Is there a way to keep the line breaks?

And I was testing some of your new conversions. Seems to have a glitch:

Erase. Erasing. Chase. Chasing. Blindly. Ice. Ices.

becomes:

.eys .eysing .eys .eysing .blindly .ahys .ahysz

As far as I've seen, it doesn't take away line breaks. Could you link a .txt and its output for me to test?

I'm in the middle of making it compatible with -ed, -ly, -ish, and -able endings, as well as hunting down some fugitive c's. I'll try to get to your bug by the end of the night. Thanks for bringing it to my attention.

EDIT: Thanks for posting a new version, although I would suggest renaming the upload-version to just AlethiTS, since we want it to be compatible for the documents and posts of people who upgrade.

Edited by Kurkistan

As far as I've seen, it doesn't take away line breaks. Could you link a .txt and its output for me to test?

Oh, weird, it doesn't take out the line breaks when I open it in Notepad++, but does in Notepad. That's crazy. Maybe it's just my computer. :huh:
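One possible explanation for the Notepad vs. Notepad++ difference, offered as a guess rather than a diagnosis: the posted readFile() joins lines with a bare '\n' and writeFile() writes the text unchanged, and older versions of Windows Notepad only treat "\r\n" as a line break. Below is a minimal Java sketch of converting line endings before writing. The class NewlineFix and the method withLineEndings are hypothetical names used for illustration; they are not part of the transliterator.

```java
class NewlineFix {
    // Hypothetical helper: rewrite every line ending in text as eol.
    static String withLineEndings(String text, String eol) {
        // Collapse CRLF and lone CR to LF first, then expand LF to the target.
        String unix = text.replace("\r\n", "\n").replace("\r", "\n");
        return unix.replace("\n", eol);
    }

    public static void main(String[] args) {
        String alethi = ".line one\n.line two\n";
        // System.lineSeparator() is "\r\n" on Windows and "\n" on Linux/macOS.
        String out = withLineEndings(alethi, System.lineSeparator());
        System.out.print(out);
    }
}
```

If this guess is right, passing the converted string to writeFile() should make the output look the same in both editors.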

EDIT: Thanks for posting a new version, although I would suggest renaming the upload-version to just AlethiTS, since we want it to be compatible for the documents and posts of people who upgrade.

I only changed the name of the .zip file. The actual font is still AlethiTS.ttf. The additional font is AlethiTS_lined.ttf. It just adds another font for those who want the line in space characters. And both fonts have the line through the period character.

Take a look at both and see what I mean. If you want the lined spaces by default, I can change that.


Oh, weird, it doesn't take out the line breaks when I open it in Notepad++, but does in Notepad. That's crazy. Maybe it's just my computer. :huh:

I only changed the name of the .zip file. The actual font is still AlethiTS.ttf. The additional font is AlethiTS_lined.ttf. It just adds another font for those who want the line in space characters. And both fonts have the line through the period character.

Take a look at both and see what I mean. If you want the lined spaces by default, I can change that.

*Opens .doc with Alethi script at 72*

*Screen explodes*

Ah, I see what you did there. I assumed that "AlethiTS" was simply the old version. My mistake. I prefer no lines between words, and that's how it is in the notebook pages, so I think it's better to leave it as optional.

*Goes back to work*


I've reached the conclusion that we deserve medals for this. No rush, but it would be nice. B)

Substantial reordering of C category to simplify debugging, substantial number of grammars added to C category, various bugs squashed, -ed, -ly, -ish, -able, and -er suffixes added, timer added to show how long the transliteration took, various touch-ups.

As always, but particularly with so many changes, comments are welcome.

EDIT 2: added tests for y\n, not sure if I might want to put them in replace().

EDIT: Just ran the Odyssey again, 8 minutes, 7 seconds with no misses on 'c' except for names and weird compounds (washingcistern, panicstricken).

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan

* @date 01/15/2012

* @version 1.7.4.1

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_7_4_1

{

static boolean debug_char = true;

static boolean debug_end_e = false;

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

private static String convertText(String roman) throws IOException

{

char[] body = readFile(roman);

if((body.length==1)&&(body[0]=='&')) //invalid input, halt program

return "&";

periodMover(body);

String alethi = replaceLetters(body);

return alethi;

}

/**

* Load a text file's contents as a lowercased <code>char[]</code>.

*

* @param file The input file

* @return The file contents as a <code>char[]</code>, or the single character '&' if the file cannot be read

* @exception IOException IO Error

*/

private static char[] readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("Invalid file path");

return "&".toCharArray();

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole.toCharArray();

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

char[] body_array = body.toCharArray();

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++){

line = 1; //restart the line count for each pair

for(int i = 0; i<=search;i++)

if(body_array[i]=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

}

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)){ //c, q, w, and x

library[place] = (char)i;

place++;

}

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body_array[i]=='\n')

line++;

else if(Arrays.binarySearch(library,body_array[i])<0) //not in library

violations = violations + (line+":"+body_array[i]) + "; ";

return violations;

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static void periodMover(char[] array)

{

int temp = 0;

for(int i=0;i<array.length;i++)

{

if(array[i]=='.'){

if(!(((array.length - i) >= 3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))) //ellipsis

{

twistRight(array,temp,i);

i++;

while(i<array.length)

if(!inAlphabet(array[i]))

i++;

else

break; //Yes, the cardinal sin.

temp=i;

}

else if(((array.length-i)>=3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))

for(int j=0;j<3;j++)

twistRight(array,temp+j,i+j);

}

else if(array[i]=='\n')

temp=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

}

private static boolean inAlphabet(char character)

{

char[] library = new char[26];

int place = 0;

for(int i = 97; i <=122; i++){

library[place] = (char)i;

place++;

}

if(Arrays.binarySearch(library,character)>=0) //I felt embarrassed by my earlier search algorithm.

return true;

return false;

}

private static void twistRight(char[] array, int start, int end)

{

if (start==end)

return;

char a = array[start];

char b;

array[start] = array[end]; //'.', although this is generalized

while(start!=end)

{

start++;

b = array[start];

array[start] = a;

a = b;

}

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(char[] array)

{

String body = new String(array);

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//E at end - Some interference possible with C's

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

body = replace(body,"eve\n","eev\n");

body = replace(body,"ile\n","hyl\n");

body = replace(body,"gle\n","guhl\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","ceys\n"); //Don't need to allow for c->k if c's are below

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,"come\n","cuhm\n"); //Need to move this up

body = replace(body,"ome\n","ohm\n");

body = replace(body,"vate\n","vit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"tle\n","l\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,"ine\n","ahyn\n");

body = replace(body,".one\n",".uhn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"ope\n","ohp\n");

String[] temp = {"en","st","un","c","f","g","s","t",""};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

for(int i = 0; i<temp.length;i++)

if(temp[i].equals("c"))

body = replace(body,"kable\n","eybuhl\n");

else

body = replace(body,temp[i]+"able\n","eybuhl\n");

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"rue\n","roo\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"ade\n","eyd\n");

//ere - their vs there

body = replace(body,"ere\n","eir\n");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"aire\n","air\n");

body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","hrj\n");

body = replace(body,"orge\n","wrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"ble\n","buhl\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"sce\n","es\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"gue\n","eeg\n");

//-nge

body = replace(body,"nge\n","nj\n"); //problem with sing vs singe not really being separable at the gerund-testing level

body = replace(body,"sinjing\n","singing\n"); //comprehensive fix for gerund mishaps

body = replace(body,"slinjing\n","slinging\n");

body = replace(body,"strinjing\n","stringing\n");

body = replace(body,"swinjing\n","swinging\n");

body = replace(body,"brinjing\n","bringing\n");

body = replace(body,"flinjing\n","flinging\n");

body = replace(body,"prinjing\n","pringing\n");

body = replace(body,".winjing\n",".winging\n");

body = replace(body,".zinjing\n",".zinging\n");

body = replace(body,".dinjing\n",".dinging\n");

body = replace(body,".pinjing\n",".pinging\n");

//END E's

//s at end

body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z converter to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that higher number of characters in target gets preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

body = replace(body,".acc",".aks"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

body = replace(body,"cist\n","sist\n");

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweynt");

body = replace(body,"cing","sing");

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,"ence","ens");

body = replace(body,"ierce\n","eers\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance","Cahns");

body = replace(body,"cance","cahns");

body = replace(body,"lance","lahns");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ice","ahys"); //Long S.

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,"cif","suhf");

body = replace(body,"ces","seez");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,"ace","eys");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ci\n","sahy\n");

body = replace(body,"ce","se");

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = replace(body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = replace(body,"c\n","k\n");

//END C'S

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

//wh

body = replace(body,"wha","ua");

body = replace(body,"whe","ue");

body = replace(body,"whi","ui");

body = replace(body,"whu","uu");

body = replace(body,"who\n","hu\n");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = replace(body,"q\n","k\n");

//w at end

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

body = realReplace("qqq",body,"ay\n","ey\n"); //stopgap, might want to revisit

//body = replace(body,"ey\n","ey\n");

body = realReplace("qqq",body,"oy\n","oi\n");

body = realReplace("qqq",body,"uy\n","ahy\n");

body = realReplace("qqq",body,"y\n","ee\n"); //might need generalized in replace()

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"q","ku");

body = replace(body,"wa","ua");

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu");

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

}

return body.substring(1,body.length()-1); //clipping first/last '\n'

}

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

//'.'==' '

// if(target.startsWith("."))

// System.out.println(target);

if(target.startsWith(".")){

body = replace(body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub_size)));

body = replace(body,("\n"+target.substring(1,target_size)),("\n"+sub.substring(1,sub_size)));

}

if(target.endsWith("\n")){ //checks for spaces and for plurals, also does s->z conversion where necessary

body = replace(body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub_size-1)+" ")); //space substitution

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

if((!sofar.contains("z"))&&(!sofar.contains("l"))){ //I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)!='s')&&(target.charAt(target_size-2)!='z')) //Double-checking s/z

if(target.charAt(target_size-2)=='e')

body = realReplace(sofar+="z",body,(target.substring(0,target_size-2)+"es\n"),(sub.substring(0,sub_size-1)+"ez\n")); //s->z

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3)) //bug stopper

body = realReplace(sofar+="z",body,(target.substring(0,target_size-1)+"s\n"),(sub.substring(0,sub_size-1)+"z\n")); //s->z

//ly - It might need some work

if(target.equals("sly\n")) //special case

body = realReplace(sofar+="l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

if(((target_size>=5)&&(!target.substring(target_size-5,target_size-1).equals("able")))||(target_size<5))

body = realReplace(sofar+="l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n")); //ably

else

body = realReplace(sofar+="l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

if(target_size>=4){ //gerunds, include \n or space

if((!target.endsWith("g\n"))&&(!target.endsWith("gs\n"))&&(!target.endsWith("gz"))) //leave no base uncovered

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+="g",body,(target.substring(0,target_size-3)+"ying\n"),(sub.substring(0,sub_size-1)+"ing\n")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+="g",body,(target.substring(0,target_size-2)+"ing\n"),(sub.substring(0,sub_size-1)+"ing\n")); //removing 'e'

}else if((!target.endsWith("gs\n"))&&(!target.endsWith("gz"))) //no "ing\n" or s\z at end

body = realReplace(sofar+="g",body,(target.substring(0,target_size-1)+"ing\n"),(sub.substring(0,sub_size-1)+"ing\n")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+="i",body,(target.substring(0,target_size-1)+"ish\n"),(sub.substring(0,sub_size-1)+"ish\n"));

if(!sofar.contains("a")) //able

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+="a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

else if(target.equals("fly")||target.equals("unfly"))

body = realReplace(sofar+="a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+="a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"eybuhl\n"));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if(target.charAt(target_size-2)=='e')

body = realReplace(sofar+="d",body,(target.substring(0,target_size-1)+"d\n"),(sub.substring(0,sub_size-1)+"st\n"));

else if((target.charAt(target_size-2)!='s')||((target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+="d",body,(target.substring(0,target_size-1)+"ed\n"),(sub.substring(0,sub_size-1)+"st\n"));

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+="d",body,(target.substring(0,target_size-1)+"ed\n"),(sub.substring(0,sub_size-1)+"ed\n"));

else if(target.substring(target_size-3,target_size-1).equals("se"))

body = realReplace(sofar+="d",body,(target.substring(0,target_size-1)+"d\n"),(sub.substring(0,sub_size-1)+"ed\n"));

//er

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+="r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+="r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n"));

}

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a

//r is forbidden by s, l, g, d

//-ity?

//I think that forbiddance is total - no forbidden suffixes at any point before

//all of the checks for these are rather crude, but they are all-encompassing

}

}

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

return body;

}

}
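To make the period convention concrete, here is a simplified Java sketch of the result periodMover() is aiming for: each sentence's closing period moves to the front of that sentence, and a line break ends the current sentence so headers without periods are left alone. It is not a drop-in replacement for the posted method. It builds a new string instead of rotating a char array, it makes no attempt to handle ellipses, and the names PeriodMoverSketch and movePeriods are hypothetical.

```java
class PeriodMoverSketch {
    // Move each sentence-ending '.' to the start of its sentence.
    static String movePeriods(String text) {
        StringBuilder out = new StringBuilder();
        int start = 0; // index where the current sentence begins
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch == '.') {
                // Emit the period first, then the sentence it closed.
                out.append('.').append(text, start, i);
                // Copy spaces and other non-letters through unchanged, so the
                // next sentence starts at its first letter.
                int j = i + 1;
                while (j < text.length() && !Character.isLetter(text.charAt(j))) {
                    j++;
                }
                out.append(text, i + 1, j);
                start = j;
                i = j - 1;
            } else if (ch == '\n') {
                // A line break ends the sentence even without a period.
                out.append(text, start, i + 1);
                start = i + 1;
            }
        }
        out.append(text, start, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(movePeriods("hello world. foo bar.")); // .hello world .foo bar
        System.out.println(movePeriods("title\nfoo."));           // title, then .foo on the next line
    }
}
```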

Edited by Kurkistan

Mwahahahaha! My evil plan of destruction has destroyed your computer! ;)

:wacko: Now I need a new project :(

*Pulls vacuum-tube monitor out of basement*

Ah ha! Your evil scheme has failed!

I think we may actually be reaching the endgame here. Someone who isn't me needs to go over a relatively long and diverse text with a fine-toothed comb to find phonetic errors, and we'll probably need to rearrange the order of the grammars at some point before the day is over, but barring massive oversights on my part, I think we may have most everything we need written down in the program.
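For anyone wondering why the order of the grammars matters, here is a small Java sketch. It borrows two rules from the posted code ("chase" -> "Ceys" and "ase" -> "eez"), uses a trailing space instead of '\n' as the word boundary, and applies them with plain String.replace rather than the program's own replace(). The class RuleOrderDemo and its methods are hypothetical names for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class RuleOrderDemo {
    // Build an insertion-ordered rule table from target/substitute pairs.
    static Map<String, String> rules(String... pairs) {
        Map<String, String> table = new LinkedHashMap<>();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            table.put(pairs[i], pairs[i + 1]);
        }
        return table;
    }

    // Apply each rule in order, the way a chain of replace calls would.
    static String apply(Map<String, String> table, String text) {
        for (Map.Entry<String, String> rule : table.entrySet()) {
            text = text.replace(rule.getKey(), rule.getValue());
        }
        return text;
    }

    public static void main(String[] args) {
        Map<String, String> specificFirst = rules("chase ", "Ceys ", "ase ", "eez ");
        Map<String, String> generalFirst = rules("ase ", "eez ", "chase ", "Ceys ");

        System.out.println(apply(specificFirst, " chase ")); // " Ceys "
        System.out.println(apply(generalFirst, " chase "));  // " cheez "
    }
}
```

When the general "ase" rule runs first it consumes the ending before the specific "chase" rule gets a chance to match, which is why longer, more specific targets need to come earlier in the list.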

Updated "-ly" suffixes to cover all "-y" suffixes, reordered grammars, added various new grammars.

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan

* @date 01/16/2012

* @version 1.7.4.5

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_7_4_5

{

static boolean debug_char = true;

static boolean debug_end_e = false;

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

private static String convertText(String roman) throws IOException

{

char[] body = readFile(roman);

if((body.length==1)&&(body[0]=='&')) //invalid input, halt program

return "&";

periodMover(body);

String alethi = replaceLetters(body);

return alethi;

}

/**

* Load a text file's contents as a lowercased <code>char[]</code>.

*

* @param file The input file

* @return The file contents as a <code>char[]</code>, or the single character '&' if the file cannot be read

* @exception IOException IO Error

*/

private static char[] readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("Invalid file path");

return "&".toCharArray();

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole.toCharArray();

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

char[] body_array = body.toCharArray();

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++)

for(int i = 0; i<=search;i++)

if(body_array[i]=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)){ //c, q, w, and x

library[place] = (char)i;

place++;

}

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body_array[i]=='\n')

line++;

else if(Arrays.binarySearch(library,body_array[i])<0) //not in library

violations = violations + (line+":"+body_array[i]) + "; ";

return violations;

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static void periodMover(char[] array)

{

int temp = 0;

for(int i=0;i<array.length;i++)

{

if(array[i]=='.'){

if(!(((array.length - i) >= 3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))) //ellipsis

{

twistRight(array,temp,i);

i++;

while(i<array.length)

if(!inAlphabet(array[i]))

i++;

else

break; //Yes, the cardinal sin.

temp=i;

}

else if(((array.length-i)>=3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))

for(int j=0;j<3;j++)

twistRight(array,temp+j,i+j);

}

else if(array[i]=='\n')

temp=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

}

private static boolean inAlphabet(char character)

{

char[] library = new char[26];

int place = 0;

for(int i = 97; i <=122; i++){

library[place] = (char)i;

place++;

}

if(Arrays.binarySearch(library,character)>=0) //I felt embarrassed by my earlier search algorithm.

return true;

return false;

}

private static void twistRight(char[] array, int start, int end)

{

if (start==end)

return;

char a = array[start];

char b;

array[start] = array[end]; //'.', although this is generalized

while(start!=end)

{

start++;

b = array[start];

array[start] = a;

a = b;

}

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(char[] array)

{

String body = new String(array);

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//anti-

body = replace(body,".anti",".antahy");

//E at end - Some interference possible with C's

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

body = replace(body,"eve\n","eev\n");

body = replace(body,"ile\n","hyl\n");

body = replace(body,"gle\n","guhl\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","ceys\n"); //Don't need to allow for c->k if c's are below

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,"come\n","cuhm\n"); //Need to move this up

body = replace(body,"ome\n","ohm\n");

body = replace(body,"vate\n","vit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"tle\n","l\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,"ine\n","ahyn\n");

body = replace(body,".one\n",".uhn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"ope\n","ohp\n");

String[] temp = {"en","st","un","c","f","g","s","t",""};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

for(int i = 0; i<temp.length;i++)

if(temp[i].equals("c"))

body = replace(body,"kable\n","eybuhl\n");

else

body = replace(body,temp[i]+"able\n","eybuhl\n");

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"rue\n","roo\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"ade\n","eyd\n");

//ere - their vs there

body = replace(body,"ere\n","eir\n");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"tire","tahyuhr"); //NOT \n or e

body = replace(body,"aire\n","air\n");

body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","hrj\n");

body = replace(body,"orge\n","wrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"ble\n","buhl\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"sce\n","es\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"rgue\n","rgyoo\n");

body = replace(body,"gue\n","eeg\n");

//-nge

body = replace(body,"nge\n","nj\n"); //problem with sing vs singe not really being separable at the gerund-testing level

body = replace(body,"sinjing\n","singing\n"); //comprehensive fix for gerund mishaps

body = replace(body,"slinjing\n","slinging\n");

body = replace(body,"strinjing\n","stringing\n");

body = replace(body,"swinjing\n","swinging\n");

body = replace(body,"brinjing\n","bringing\n");

body = replace(body,"flinjing\n","flinging\n");

body = replace(body,"prinjing\n","pringing\n");

body = replace(body,".winjing\n",".winging\n");

body = replace(body,".zinjing\n",".zinging\n");

body = replace(body,".dinjing\n",".dinging\n");

body = replace(body,".pinjing\n",".pinging\n");

//END E's

//s at end - 1.7.4.5 -> unneeded, I think

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z converter to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that higher number of characters in target gets preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

body = replace(body,"preface\n","prefis\n"); //special

body = replace(body,"ace\n","eys\n");

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

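body = replace(body,"cist\n","sist\n"); //sub needs its own \n, otherwise the line break and the final 't' of the space variant get eaten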

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,"ierce\n","eers\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance","Cahns");

body = replace(body,"cance","cahns");

body = replace(body,"lance","lahns");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"all\n","awl\n");

body = replace(body,"ice\n","ahys\n"); //Long S. NOT sure about \n's

body = replace(body,"ces\n","seez\n");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"ci\n","sahy\n");

body = replace(body,".acc",".aks"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"ply\n","plahy\n");

body = replace(body,".by\n",".bahy\n");

body = replace(body,".my\n",".mahy\n");

body = replace(body,".die\n",".dahy\n");

body = replace(body,".dye\n",".dahy\n");

body = replace(body,".bye\n",".bahy\n"); //conflict

body = replace(body,"hype\n","hahyp\n");

body = replace(body,"hype","hahype");

body = replace(body,"hypo","hahypo");

body = replace(body,"hypn","hipn");

body = replace(body,"hyphen","hahyfuhn");

body = replace(body,"hyfen","hahyfuhn"); //ph->f

body = replace(body,"yp","ip");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"iew","yoo");

body = replace(body,".face",".feys");

body = replace(body,"face","feys");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweynt");

body = replace(body,"cing","sing");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,"ence","ens");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,"cif","suhf");

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ce","se");

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = replace(body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = replace(body,"c\n","k\n");

//END C'S

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

//wh

body = replace(body,"who\n","hu\n");

body = replace(body,"where","hwair");

body = replace(body,"whir","hwur");

body = replace(body,"wh,","hw"); //Might need more permutations

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = replace(body,"q\n","k\n");

//w at end

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

body = realReplace("qqq",body,"y\n","ee\n");

body = realReplace("qqq",body,"ehee\n","ehy\n");

body = realReplace("qqq",body,"ahee\n","ahy\n");

body = realReplace("qqq",body,"eee\n","ey\n"); //fixing issues raised by y->ee as compared to other phonetics

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"q","ku");

/* body = replace(body,"wa","ua"); //Unnecessary? I think not! I'm not sure why, but no.

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu"); */

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

body = replace(body,"eQ\n","ee\n");

body = replace(body,"Qy\n","ey\n");

body = replace(body,".hQ",".he");

body = replace(body,".shQ",".she");

}

return body.substring(1,body.length()-1); //clipping first/last '\n'

}

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

//'.'==' '

// if(target.startsWith("."))

// System.out.println(target);

if(target.startsWith(".")){

body = replace(body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub_size)));

body = replace(body,("\n"+target.substring(1,target_size)),("\n"+sub.substring(1,sub_size)));

}

if(target.endsWith("\n")){ //checks for spaces and for plurals, also does s->z conversion where necessary

body = replace(body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub_size-1)+" ")); //space substitution

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

if((!sofar.contains("z"))&&(!sofar.contains("l"))){ //I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)!='s')&&(target.charAt(target_size-2)!='z')) //Double-checking s/z

if(target.charAt(target_size-2)=='e')

body = realReplace(sofar+="z",body,(target.substring(0,target_size-2)+"es\n"),(sub.substring(0,sub_size-1)+"ez\n")); //s->z

else if(((target_size>=2)&&(target.substring(target_size-2,target_size-1).equals("y")))||(target_size<3)) //bug stopper

body = realReplace(sofar+="z",body,(target.substring(0,target_size-2)+"ies\n"),(sub.substring(0,sub_size-1)+"iez\n")); //s->z

else

body = realReplace(sofar+="z",body,(target.substring(0,target_size-1)+"s\n"),(sub.substring(0,sub_size-1)+"z\n")); //s->z

/* //y

body = realReplace("qqq",body,"ay\n","ey\n"); //stopgap, might want to revisit

body = replace(body,"ey\n","ey\n");

body = realReplace("qqq",body,"oy\n","oi\n");

body = realReplace("qqq",body,"uy\n","ahy\n");

body = realReplace("qqq",body,"y\n","ee\n"); //might need generalized in replace()

body = replace(body,"ty","tahy"); */

//ly, focus on y as of 1.7.4.3 - It might need some work

if(target.equals("sly\n")) //special case

body = realReplace(sofar+="l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

if((target_size>=5)&&(target.substring(target_size-5,target_size-1).equals("able")))

body = realReplace(sofar+="l",body,(target.substring(0,target_size-2)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n")); //ably

else if((target_size>=2)&&(target.charAt(target_size-2)=='a'))

body = realReplace(sofar+="l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-2)+"ey\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+="l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-1)+"y\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+="l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-1)+"i\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='u'))

body = realReplace(sofar+="l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-2)+"ahy\n"));

else

body = realReplace(sofar+="l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

if(target_size>=4){ //gerunds, include \n or space

if((!target.endsWith("g\n"))&&(!target.endsWith("gs\n"))&&(!target.endsWith("gz"))) //leave no base uncovered

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+="g",body,(target.substring(0,target_size-3)+"ying\n"),(sub.substring(0,sub_size-1)+"ing\n")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+="g",body,(target.substring(0,target_size-2)+"ing\n"),(sub.substring(0,sub_size-1)+"ing\n")); //removing 'e'

}else if((!target.endsWith("gs\n"))&&(!target.endsWith("gz"))) //no "ing\n" or s\z at end

body = realReplace(sofar+="g",body,(target.substring(0,target_size-1)+"ing\n"),(sub.substring(0,sub_size-1)+"ing\n")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+="i",body,(target.substring(0,target_size-1)+"ish\n"),(sub.substring(0,sub_size-1)+"ish\n"));

if(!sofar.contains("a")) //able

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+="a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

else if(target.equals("fly")||target.equals("unfly"))

body = realReplace(sofar+="a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+="a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"eybuhl\n"));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if(target.charAt(target_size-2)=='e')

body = realReplace(sofar+="d",body,(target.substring(0,target_size-1)+"d\n"),(sub.substring(0,sub_size-1)+"st\n"));

else if((target.charAt(target_size-2)!='s')||((target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+="d",body,(target.substring(0,target_size-1)+"ed\n"),(sub.substring(0,sub_size-1)+"st\n"));

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+="d",body,(target.substring(0,target_size-1)+"ed\n"),(sub.substring(0,sub_size-1)+"ed\n"));

else if(target.substring(target_size-3,target_size-1).equals("se"))

body = realReplace(sofar+="d",body,(target.substring(0,target_size-1)+"d\n"),(sub.substring(0,sub_size-1)+"ed\n"));

//er

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+="r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+="r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n"));

}

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a

//r is forbidden by s, l, g, d

//I think that forbiddance is total - no forbidden suffixes at any point before

}

}

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

return body;

}

}
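For anyone following along, the period rule from the Javadoc above (sentences start with a '.' and end with nothing) can be shown in isolation. This is only a minimal sketch under my own assumptions, not the code from the post: `PeriodSketch` and `movePeriods` are made-up names, it works on a single line at a time, and it does not treat ellipses specially the way `periodMover` does.

```java
final class PeriodSketch {

    /** Moves each sentence-ending '.' to just before the first letter of that sentence. */
    static String movePeriods(String line) {
        StringBuilder out = new StringBuilder();
        int start = 0; // index of the first letter of the current sentence
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) != '.') {
                continue;
            }
            // Emit the period first, then the sentence it used to end.
            out.append('.').append(line, start, i);
            // Copy spaces and other non-letters through unchanged, so the
            // next period lands directly in front of the next word.
            int next = i + 1;
            while (next < line.length() && !Character.isLetter(line.charAt(next))) {
                next++;
            }
            out.append(line, i + 1, next);
            start = next;
            i = next - 1;
        }
        // Text with no closing period (a header, for example) is left alone.
        out.append(line, start, line.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(movePeriods("the cat sat. it ran."));
    }
}
```

A line without any period comes back unchanged, which matches the "no-period headers" behaviour the comment in `periodMover` mentions.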

Edited by Kurkistan

It is a little difficult to translate into a phonetics system another person has devised when my own system of phonetics is different. But I'll look at your code to get an idea. B)

Also, here's a list of English suffixes, which might help for finding rarer ones: http://www.michigan-proficiency-exams.com/suffix-list.html

tion > Sun, should also affect 'tionate', 'tionately' and 'tioning'.

sion > Sun, should also affect 'sionate' and 'sionately'.

tech > tek

ocean > oSun

ture > Cur

advance > advans, not advuhns

row > ro, not rau

low > lo, not lau

age\n > aij\n

'ged\n' > 'jed'

'ridge' could be 'rij' instead of 'ridge' (or instead of 'ridje' if the above suggestion were changed)

ques\n > ks\n, not kuues\n, but this should not interfere with 'question'.

'que\n' > 'k\n', but this should not interfere with 'question'.

'reading' becomes 'reeyding', but 'reader' doesn't change.

strategies should not become 'strategkiez'.

possibilities should not become 'posibilitkiez'.

ture > Cur

ely\n > lee\n

'typically' should have the -lee suffix, but this should not interfere with the word 'ally' when a change is made.

'specific' should not be 'spesuhfik', but 'spesifik'.

tual\n > Cual\n

'one' should be 'uuhn', not 'uhn' I guess... kind of a weird one without long and short vowel characters.

'while' could be 'uhyl' instead of 'uhhyl'.

Here's a tough one: 'disciplines' should have 'plinez' and not 'plahynez', but 'plahynez' is perfect for 'splines'.

'insight' should be 'insahyt', not 'insigt'.

'rightly' should be 'rahytlee', not 'rigtlee'.

'wrongly' should be 'ronglee', not 'rongly'.

'deciding' should be 'desahyding', not 'dekahyding'.

'services' should be 'servisez', not 'servahysez'.

'associated' should be 'asoSiated', not 'asociated'. Not sure how that 'c' got through.

'accomplishing' should be 'akompliSing', not 'aksompliSing'.

'highly' should be 'hahylee', not 'hilee'.

'have' should be 'hav', not 'heyv'.

'references' should be 'referensez', not 'referenseez'.

'quickly' should be 'kuiklee', not 'kuuiklee'. This one is not a big deal, just has an extra letter.

'tacit' > 'tasit'. It's a word that didn't get the 'c' converted.

'practice' should be 'praktis', not 'practahys'.

'quite' should be 'kuahyt', not 'kuuite'. Notice the extra 'u' again.

There was a reference to T.S. Eliot. As you can imagine, this messed with the punctuation. Don't know how to even suggest tackling something like this...

Also, abbreviations tend to have extra periods. Maybe a list of common abbreviations would be good to include for the purpose of removing those extra periods. Sounds hard to me though. Here are a few that are common in books, used in referencing other works: p. > page; pg. > page; pgs. > pages; lit. > literally; no. > number (man this one sucks...); who knows what others... maybe these are a "That's too bad" scenario, haha.
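To make the abbreviation idea concrete, here is a rough sketch of a pre-pass that could run before the periods get moved. Everything in it is an assumption on my part: the class and method names are invented, the text is assumed to be lowercase already, and an abbreviation is only expanded when a digit follows it, so that "i said no." is left alone. 'lit.' is deliberately left out because it cannot be told apart from the ordinary word by this test.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AbbreviationSketch {

    // Longest keys first so that "pgs" is tried before "pg" and "p".
    private static final Map<String, String> EXPANSIONS = new LinkedHashMap<>();
    static {
        EXPANSIONS.put("pgs", "pages");
        EXPANSIONS.put("pg", "page");
        EXPANSIONS.put("p", "page");
        EXPANSIONS.put("no", "number");
    }

    /** Expands reference-style abbreviations that are followed by a number. */
    static String expand(String text) {
        String result = text;
        for (Map.Entry<String, String> e : EXPANSIONS.entrySet()) {
            // \b keeps "p." from matching the tail of a word such as "stop.";
            // the lookahead requires optional whitespace and then a digit.
            Pattern p = Pattern.compile("\\b" + Pattern.quote(e.getKey()) + "\\.(?=\\s*\\d)");
            result = p.matcher(result).replaceAll(Matcher.quoteReplacement(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(expand("see pg. 12 and no. 4."));
    }
}
```

The digit test is a compromise: it misses abbreviations that are not followed by a number, but it avoids swallowing real sentence-ending periods.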

I found an instance where a sentence was indented with the tab key. When the period was moved to the front, it was placed before the indentation with a gap before the first word of the sentence.
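A guess at the cause, based only on the posted code: after a line break `periodMover` sets the sentence start to the very next character (`temp=i+1`), which would be the tab itself. A small sketch of one way around it, assuming a helper (hypothetical name `skipIndent`) that advances the start index past leading spaces and tabs:

```java
final class IndentSketch {

    /** Returns the index of the first character at or after 'from' that is not a space or tab. */
    static int skipIndent(char[] text, int from) {
        int i = from;
        while (i < text.length && (text[i] == ' ' || text[i] == '\t')) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        char[] line = "\tindented sentence.".toCharArray();
        // The period would be moved to this index instead of index 0.
        System.out.println(skipIndent(line, 0));
    }
}
```

With something like this, the line-break branch could use `temp = skipIndent(array, i+1)` so the period lands directly in front of the first word.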

I found one case where the word 'poetry' was translated to 'poetree', but a few words later, another instance of the same word followed by a comma was not converted. Maybe commas mess with the '-try' suffix? Perhaps removing commas first thing might be beneficial.

Also, is it possible to convert all number characters into their spelled forms? Actually, maybe I will add number symbols. That will make it easier.
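In case spelled-out numbers end up being the route taken, here is a minimal sketch of the simplest version I can think of: spelling each digit on its own ("42" becomes "four two") rather than building full number names. The class name `DigitSketch` and the method `spellDigits` are invented for illustration.

```java
final class DigitSketch {

    private static final String[] WORDS = {
        "zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"
    };

    /** Replaces every digit with its English word, keeping words separated by spaces. */
    static String spellDigits(String text) {
        StringBuilder out = new StringBuilder();
        boolean prevDigit = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= '0' && c <= '9') {
                // Separate the digit word from whatever was written before it.
                if (out.length() > 0 && !Character.isWhitespace(out.charAt(out.length() - 1))) {
                    out.append(' ');
                }
                out.append(WORDS[c - '0']);
                prevDigit = true;
            } else {
                // A letter directly after a digit gets a space; punctuation does not.
                if (prevDigit && Character.isLetter(c)) {
                    out.append(' ');
                }
                out.append(c);
                prevDigit = false;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(spellDigits("room 42"));
    }
}
```

The output would then go through the normal letter replacements like any other text.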

Is it possible to remove all extra characters such as ' , ; : @ # $ % ^ & * ( ) - _ = + / \ | ` ~ < > [ ] { }?
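A sketch of what such a filter could look like, assuming it runs on the whole text before any other replacement. The names `StripSketch` and `stripExtras` are placeholders, and the character list is simply the one above; this would also take care of the comma after 'poetry'.

```java
final class StripSketch {

    // The characters listed above; the backslash has to be escaped in Java source.
    private static final String EXTRA = "',;:@#$%^&*()-_=+/\\|`~<>[]{}";

    /** Drops every character that appears in EXTRA and keeps everything else. */
    static String stripExtras(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (EXTRA.indexOf(c) < 0) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripExtras("poetry, and (more)"));
    }
}
```

One thing to decide is whether a hyphen should be dropped or turned into a space; the sketch drops it, so "well-known" becomes "wellknown".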

Hope this helps! And regardless of the existing glitches, this program is amazing. I see lots of complex words transliterated perfectly throughout the article. Great work man!

Edited by Turos

Ow. Very thorough. Thank you very much for doing this: I doubt that I could have stood going through another transliteration of the Odyssey looking for errors. I see much work ahead, but at least the end is in sight. *knocks on wood* The system of phonetics that I'm using is the one used on Dictionary.com, simply because of ease of use and consistency.

I'm going to say goodbye to efficiency for now and just tack on all of those suffixes without looking for necessary conflicts between them. A three-suffix limit should accomplish the job, although it will be less efficient than looking for individual conflicts.
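To give a feel for what a flat three-suffix limit costs, here is a small sketch that simply chains suffixes onto a stem with a depth counter. It is an illustration under assumptions, not the actual `realReplace` logic: the suffix list is a guess at the ones discussed in this thread, the names are invented, and spelling changes (dropping a final 'e', 'y' to 'ie') are ignored.

```java
import java.util.ArrayList;
import java.util.List;

final class SuffixLimitSketch {

    static final String[] SUFFIXES = {"s", "ed", "ly", "ing", "ish", "able", "er"};
    static final int MAX_DEPTH = 3; // the three-suffix limit

    /** Returns the stem plus every form made by chaining up to MAX_DEPTH suffixes. */
    static List<String> expand(String stem) {
        List<String> out = new ArrayList<>();
        expand(stem, 0, out);
        return out;
    }

    private static void expand(String word, int depth, List<String> out) {
        out.add(word);
        if (depth == MAX_DEPTH) {
            return; // recursion counter reached the limit
        }
        for (String suffix : SUFFIXES) {
            expand(word + suffix, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // 1 + 7 + 49 + 343 forms for a single stem
        System.out.println(expand("talk").size());
    }
}
```

With seven suffixes that is 400 candidate forms per rule, most of them nonsense, which is the efficiency trade-off compared with tracking which suffixes forbid which.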

If you do end up adding number characters, be sure to tell me so that I can add them to the "allowed" list when removing forbidden characters.

I'll whip something up to remove forbidden characters, but I think that abbreviations and acronyms will just have to go the way of the dinosaurs for now, given the complexity involved in fixing them and the relative ease with which they can be avoided.

"Poetry," was an example of the comma not being recognized, so getting rid of them would solve that.

I do warn you: I'm going to take a bit of a break for now, and I'll probably start working on this in a few hours at the earliest. I'm a bit burned out just now.

EDIT: Looking at the suffix list, I think I'll just leave well enough alone for now. Most of them are just the ends of existing words, not "tacked on."

Edited by Kurkistan

I do warn you: I'm going to take a bit of a break for now, and I'll probably start working on this in a few hours at the earliest. I'm a bit burned out just now.

Shoot, if I were you, I'd take a week off after all that logic work :lol:

EDIT:

Ignore this one. A comma came after the word:

'wrongly' should be 'ronglee', not 'rongly'.

Edited by Turos

Shoot, if I were you, I'd take a week off after all that logic work :lol:

EDIT:

Ignore this one. A comma came after the word:

'wrongly' should be 'ronglee', not 'rongly'.

That was fun. Thanks again for putting in all of that work. Now you get to re-check everything to make sure my fixes didn't mess anything else up! Yeah! I probably need to sit down and reorganize the grammars to eliminate interference, which was the cause of a fair amount of your issues, but I'm too close to it right now.

I didn't get your tab problem, so if you still have it for this version, then please send me the before and after text files that contain that specific error. I also disagree with your categorical "ged\n"->"jeg\n." There's some nuance there.

Fixed all of Turos's most recent bugs, added in "pp" rules, as well as rules for suffixes of words ending in 'p.'

EDIT: Deleted some of the old versions to make room in my attachments.

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan, with significant developmental input from Turos

* @date 01/18/2012

* @version 1.8.5

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_8_5

{

static boolean debug_char = false;

static boolean debug_end_e = false;

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

private static String convertText(String roman) throws IOException

{

char[] body = readFile(roman);

if((body.length==1)&&(body[0]=='&')) //invalid input, halt program

return "&";

periodMover(body);

roman = new String(body);

if(!debug_char)

roman = removeCharacters(roman);

String alethi = replaceLetters(roman);

return alethi;

}

/**

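* Load a text file's contents as a lowercase <code>char[]</code>, padded with a leading and trailing newline.

*

* @param file The input file

* @return The file contents as a <code>char[]</code>, or <code>{'&'}</code> if the file could not be read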

* @exception IOException IO Error

*/

private static char[] readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("File not in directory or misspelled.");

return "&".toCharArray();

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole.toCharArray();

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

char[] body_array = body.toCharArray();

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++){

line = 1; //restart the line count for each pair

for(int i = 0; i<=search;i++)

if(body_array[i]=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

}

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)){ //c, q, w, and x

library[place] = (char)i;

place++;

}

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body_array[i]=='\n')

line++;

else if(Arrays.binarySearch(library,body_array[i])<0) //not in library

violations = violations + (line+":"+body_array[i]) + "; ";

return violations;

}

private static String removeCharacters(String body)

{

char[] library = new char[56];

library[0] = '\t'; //tab

library[1] = '\n';

library[2] = ' ';

library[3] = '.';

int place = 4;

for(int i = 65; i <=90; i++){

library[place] = (char)i;

place++;

}

for(int i = 97; i <=122; i++){

library[place] = (char)i;

place++;

}

for(int i = 0; i < body.length(); i++)

if(Arrays.binarySearch(library,body.charAt(i))<0){ //I felt embarrassed by my earlier search algorithm.

body = body.substring(0,i)+body.substring(i+1,body.length());

i--;

}

return body;

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static void periodMover(char[] array)

{

int temp = 0;

for(int i=0;i<array.length;i++)

{

if(array[i]=='.'){

if(!(((array.length - i) >= 3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))) //ellipsis

{

twistRight(array,temp,i);

i++;

while(i<array.length)

if(!inAlphabet(array[i]))

i++;

else

break; //Yes, the cardinal sin.

temp=i;

}

else if(((array.length-i)>=3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2])) {

for(int j=0;j<3;j++)

twistRight(array,temp+j,i+j);

i+=3;

while(i<array.length)

if(!inAlphabet(array[i]))

i++;

else

break; //Yes, the cardinal sin.

temp=i;

}

}

else if(array[i]=='\n')

temp=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

}

private static boolean inAlphabet(char character)

{

char[] library = new char[26];

int place = 0;

for(int i = 97; i <=122; i++){

library[place] = (char)i;

place++;

}

if(Arrays.binarySearch(library,character)>=0) //I felt embarrassed by my earlier search algorithm.

return true;

return false;

}

private static void twistRight(char[] array, int start, int end)

{

if (start==end)

return;

char a = array[start];

char b;

array[start] = array[end]; //'.', although this is generalized

while(start!=end)

{

start++;

b = array[start];

array[start] = a;

a = b;

}

}

public static void test()

{

String body = "\nsnapping snapper snappily snappy snaps snap snapped snappable snappably\n";

//snapping snapper snappily snappy snaps snap snapped snappable snappably.

String target = "ap\n";

String sub = "op\n";

System.out.println(replace(body,target,sub));

int target_size = target.length();

int sub_size = sub.length();

String sofar = "";

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

System.out.println(body);

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(String body)

{

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//anti-

body = replace(body,".anti",".antahy");

//wh

body = replace(body,"who\n","hoo\n");

body = replace(body,"where","huair"); //changed w to u

body = replace(body,"whir","huur");

body = replace(body,"wh","hu"); //Might need more permutations

body = replace(body,".accr",".uhkr"); //many many many

body = replace(body,".acci",".aksi");

body = replace(body,".accord",".uhkawrd");

body = replace(body,".accomp",".uhkuhmp");

body = replace(body,".acco",".uhko");

body = replace(body,".accustom\n",".uhkuhstuhm\n");

body = replace(body,".accolade\n",".akuhleyd\n");

body = replace(body,".accus",".uhkyooz");

body = replace(body,".accurs",".uhkurs");

body = replace(body,".accur",".akyer");

body = replace(body,".accum",".uhkyoom");

body = replace(body,".accout",".uhkoot");

body = replace(body,".accoun",".uhkount");

body = replace(body,".acce",".akse"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

//E at end - Some interference possible with C's

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"have\n","hav\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

body = replace(body,"eve\n","eev\n");

body = replace(body,"ile\n","ahyl\n");

//System.out.println(replace(replace("while ","wh","hu"),"ile\n","ahyl\n"));

//huahyl

body = replace(body,"gle\n","guhl\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","ceys\n"); //Don't need to allow for c->k if c's are below

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,"come\n","cuhm\n"); //Need to move this up

body = replace(body,"ome\n","ohm\n");

body = replace(body,"tle\n","l\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,".discipline\n",".disipline\n");

body = replace(body,"ine\n","ahyn\n");

body = replace(body,".one\n",".uuhn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"ope\n","ohp\n");

body = replace(body,"rue\n","roo\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"bead\n","beed\n");

body = replace(body,".read\n",".reed\n");

body = replace(body,"nead\n","need\n");

body = replace(body,"lead\n","leed\n");

body = replace(body,"ead\n","ed\n"); //general

body = replace(body,"ade\n","eyd\n");

//ere - their vs there

body = replace(body,"ere\n","eir\n");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"tire","tahyuhr"); //NOT \n or e

body = replace(body,"aire\n","air\n");

body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","hrj\n");

body = replace(body,"orge\n","wrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"sce\n","es\n");

body = replace(body,"que\n","k\n");

body = replace(body,"udge\n","uhj\n");

body = replace(body,"dge\n","j\n"); //NOT sure

body = replace(body,"age\n","aij\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"rgue\n","rgyoo\n");

body = replace(body,"gue\n","eeg\n");

//-nge

body = replace(body,"nge\n","nj\n"); //problem with sing vs singe not really being separable at the gerund-testing level

body = replace(body,"sinjing\n","singing\n"); //comprehensive fix for gerund mishaps

body = replace(body,"slinjing\n","slinging\n");

body = replace(body,"strinjing\n","stringing\n");

body = replace(body,"swinjing\n","swinging\n");

body = replace(body,"brinjing\n","bringing\n");

body = replace(body,"flinjing\n","flinging\n");

body = replace(body,"prinjing\n","pringing\n");

body = replace(body,".winjing\n",".winging\n");

body = replace(body,".zinjing\n",".zinging\n");

body = replace(body,".dinjing\n",".dinging\n");

body = replace(body,".pinjing\n",".pinging\n");

//END E's

//s at end - 1.7.4.5 -> unneeded, I think

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z convertor to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that higher number of characters in target gets preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"preface\n","prefis\n"); //special

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,".tacit\n",".tasit\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"vate\n","vit\n"); //pulled from E section, might be a sign of things to come

body = replace(body,"literate\n","literit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

body = replace(body,"cist\n","sist\n"); //keep the trailing \n so the word boundary is not swallowed

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

body = replace(body,"tual\n","Cual\n"); //keep the trailing \n so the word boundary is not swallowed

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,"ierce\n","eers\n");

body = replace(body,"ince\n","ins\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance","Cahns");

body = replace(body,"cance","cahns");

body = replace(body,"lance","lahns");

body = replace(body,"vance","vahns");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"all\n","awl\n");

body = replace(body,"tice\n","tis\n");

body = replace(body,"arice\n","eris\n");

body = replace(body,"orice\n","uhis\n");

body = replace(body,"cipice\n","suhpis\n"); //patch for precipice

body = replace(body,"ipice\n","uhpis\n");

body = replace(body,".vice\n",".vahys\n"); //leading '.' is needed in the substitute too, or the 'v' is dropped

body = replace(body,"vice\n","vis\n");

body = replace(body,"ice\n","ahys\n"); //Long S. NOT sure about \n's

body = replace(body,"egy\n","ijee\n"); //possibilities/strategies fix, I have no idea how they ended up "kiez"

body = replace(body,"ity\n","itee\n");

body = replace(body,"ite\n","ahyt\n");

body = replace(body,"ong\n","ong\n");

body = replace(body,"ull\n","ool\n");

body = replace(body,"cide\n","sahyd\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ence\n","ens\n");

body = replace(body,"ces\n","seez\n");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"ci\n","sahy\n");

body = replace(body,"oy\n","oi\n");

body = replace(body,"ace\n","eys\n");

body = replace(body,"ely\n","lee\n"); //MUST BE LAST IN \N

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"ply\n","plahy\n");

body = replace(body,".by\n",".bahy\n");

body = replace(body,".my\n",".mahy\n");

body = replace(body,".die\n",".dahy\n");

body = replace(body,".dye\n",".dahy\n");

body = replace(body,".bye\n",".bahy\n"); //conflict

body = replace(body,"hype","hahype");

body = replace(body,"hypo","hahypo");

body = replace(body,"hypn","hipn");

body = replace(body,"hyphen","hahyfuhn");

body = replace(body,"hyfen","hahyfuhn"); //ph->f

body = replace(body,"yp","ip");

body = replace(body,"tion","Suhn"); //1.8

body = replace(body,"sion","zhuhn");

body = replace(body,"cean","Suhn");

body = replace(body,"ture","Cur");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"iew","yoo");

body = replace(body,".face",".feys");

body = replace(body,"face","feys");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweynt");

body = replace(body,"cing","sing");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ight","ahyt");

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,"cif","sif"); //NOT sure

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

body = replace(body,"ap\n","ap\n");

body = replace(body,"ppen","pen"); //double p's, might NOT be done

body = replace(body,"ppl","puhl");

body = replace(body,"upp\n","uhp\n"); //keep the trailing \n so the word boundary is not swallowed

body = replace(body,"oppor","oper");

body = replace(body,"opp","uhp");

body = replace(body,"ypp","ip");

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

body = realReplace("QQQ",body,"C\n","k\n");

body = realReplace("QQQ",body,"ch\n","k\n");

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ce","se");

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = realReplace("QQQ",body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = realReplace("QQQ",body,"c\n","k\n"); //to stop mischief

//END C'S

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"igh","ahy");

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = realReplace("QQQ",body,"q\n","k\n");

//w at end

body = replace(body,".low\n",".loh\n");//special cases

body = replace(body,".row\n",".roh\n");

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

body = realReplace("qqq",body,"y\n","ee\n");

body = realReplace("qqq",body,"ehee\n","ehy\n");

body = realReplace("qqq",body,"ahee\n","ahy\n");

body = realReplace("qqq",body,"eee\n","ey\n"); //fixing issues raised by y->ee as compared to other phonetics

String[] temp = {"en","st","un","c","f","g","s","t",""};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

for(int i = 0; i<temp.length;i++)

if(temp[i].equals("c"))

body = replace(body,"kable\n","eybuhl\n");

else

body = replace(body,temp[i]+"able\n","eybuhl\n");

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"ble\n","buhl\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"qu","ku");

//body = replace(body,"q","ku");

/* body = replace(body,"wa","ua"); //Unnecessary? I think not! I'm not sure why, but no.

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu"); */

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

body = replace(body,"eQ\n","ee\n");

body = replace(body,"Qy\n","ey\n");

body = replace(body,".hQ",".he");

body = replace(body,".shQ",".she");

}

return body.substring(1,body.length()-1); //clipping first/last '\n'

}

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

//'.'==' '

if(target.startsWith(".")){

body = replace(body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub_size)));

body = replace(body,("\n"+target.substring(1,target_size)),("\n"+sub.substring(1,sub_size)));

/*

//re-

if(((target_size>=5)&&(!target.substring(1,5).equals("rere")))||(target_size<3)) //clumsy

body = replace(body,".re"+target.substring(1,target_size),".ree"+sub.substring(1,target_size)); */

}

if(target.endsWith("\n")){ //checks for spaces and for plurals, also does s->z conversion where necessary

body = replace(body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub_size-1)+" ")); //space substitution

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

if((!sofar.contains("z"))&&(!sofar.contains("l"))){ //I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)!='s')&&(target.charAt(target_size-2)!='z')) //Double-checking s/z

if(target.charAt(target_size-2)=='e')

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s\n"),(sub.substring(0,sub_size-1)+"z\n"));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s\n"),(sub.substring(0,sub_size-1)+"ez\n")); //s->z

else if(((target_size>=2)&&(target.charAt(target_size-2)=='y'))||(target_size<3)) //bug stopper

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies\n"),(sub.substring(0,sub_size-1)+"z\n"));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies\n"),(sub.substring(0,sub_size-1)+"iez\n")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s\n"),(sub.substring(0,sub_size-1)+"z\n")); //s->z

/* //y

body = realReplace("qqq",body,"ay\n","ey\n"); //stopgap, might want to revisit

body = replace(body,"ey\n","ey\n");

body = realReplace("qqq",body,"oy\n","oi\n");

body = realReplace("qqq",body,"uy\n","ahy\n");

body = realReplace("qqq",body,"y\n","ee\n"); //might need generalized in replace()

body = replace(body,"ty","tahy"); */

//ly, focus on y as of 1.7.4.3 - It might need some work

if(target.equals("sly\n")) //special case

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

//ly

if((target_size>=5)&&(target.substring(target_size-5,target_size-1).equals("able")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"y\n"),(sub.substring(0,sub_size-4)+"lee\n")); //ably

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"ily\n"),(sub.substring(0,sub_size-2)+"uhlee\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"pily\n"),(sub.substring(0,sub_size-1)+"uhlee\n"));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

//y

if((target_size>=2)&&(target.charAt(target_size-2)=='a'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-2)+"ey\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-1)+"y\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-1)+"i\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='u'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-2)+"ahy\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"py\n"),(sub.substring(0,sub_size-1)+"ee\n"));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

if((!target.endsWith("g\n"))&&(!target.endsWith("gs\n"))&&(!target.endsWith("gz"))) //leave no base uncovered

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-3)+"ying\n"),(sub.substring(0,sub_size-1)+"ing\n")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ing\n"),(sub.substring(0,sub_size-1)+"ing\n")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ping\n"),(sub.substring(0,sub_size-1)+"ing\n"));

else if((!target.endsWith("gs\n"))&&(!target.endsWith("gz"))) //no "ing\n" or s\z at end

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing\n"),(sub.substring(0,sub_size-1)+"ing\n")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"ish\n"),(sub.substring(0,sub_size-1)+"ish\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"pish\n"),(sub.substring(0,sub_size-1)+"ish\n"));

if(!sofar.contains("a")) //able

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

else if(target.equals("fly")||target.equals("unfly"))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"eybuhl\n"));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if(target.charAt(target_size-2)=='e')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d\n"),(sub.substring(0,sub_size-1)+"ed\n")); //NOT st

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed\n"),(sub.substring(0,sub_size-1)+"ed\n"));

else if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("se"))) //length guard avoids a negative substring index on two-character targets

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d\n"),(sub.substring(0,sub_size-1)+"ed\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ped\n"),(sub.substring(0,sub_size-1)+"ed\n"));

else if((target.charAt(target_size-2)!='s')||((target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed\n"),(sub.substring(0,sub_size-1)+"st\n"));

//er

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"per\n"),(sub.substring(0,sub_size-1)+"er\n"));

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n"));

}

/* //ate, not bothering with forbiddances - Never mind

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n")); */

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a

//r is forbidden by s, l, g, d

//I think that forbiddance is total - no forbidden suffixes at any point before

}

}

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

return body;

}

}

Edited by Kurkistan

Is there a way for the Transliterated Alethi to be Transliterated back?

I'm gonna pull a 'no' on this one, as there are multiple letter combinations that convert to the same sound in phonetics. Ex.: 'ks' converts to 'X' in this program. 'x' also converts to 'X'. It would be impossible to differentiate between the two. There are other cases, but it's still too early in the morning for me to come up with them, haha.

It would be possible if Kurkistan made a secondary file that recorded every single conversion and listed the location on the page, but I don't dare ask him to put himself through something like that... and it wouldn't work on text someone typed out in Alethi in the first place.

Sorry.

----------------------

@Kurkistan:

'ritual' > 'riCual', you got the 'tual' to be 'Cua', but the 'l' is missing.

'factual' is 'facCua' now. Both missing 'l' and 'c' didn't convert to 'k' like it did before.

'introduction' used to convert to 'introduktion' and now converts to 'introducSuhn'. It seems the 'tion' conversion caused the 'c' conversion from before to cancel. Same thing with 'section' losing the 'c' convert for the 'tion' convert.

'each' used to be 'eaC', now it's 'eak'.

'research' was 'researC', now 'researk'.

'which' used to convert to 'uhiC' and now converts to 'huik'.

I like the 'sion' to 'zhuhn' conversion you added, very nice. It is off for cases like 'passion' but perfect for 'diversion'. I use the 'zh' as well in phonetics, and it's cool to see it used elsewhere! I think maybe if it's 'sion', 'zhuhn' is correct, but maybe 'ssion' will always be 'Suhn'. Not a big deal though.

'case,' used to convert to 'kase,', but now to 'seys'. I wouldn't imagine the comma removal causing this glitch, but thought it's best to indicate it was there before.

'however,' used to convert to 'houever,', but now to 'houeever'. Extra 'e'. Not important, but perhaps a clue to the comma removal actually causing issues.

'associated' used to convert to 'asociated', but now to 'asoSeeeyted'. No comma involved this time, weird. Oh, I think I understand. It's a combo of 'See' with a long e and 'eyted' with a long a sound. Never mind ^^

'accomplishing' was 'aksompliSing', now 'uhkuhmpliSing'. Not a big deal, but it gives a heads up to how changes were made to other conversions. Weird how 'accessible' converts to 'aksesibuhl' with no problem in both versions.

I notice you mentioned attacking 'pp'. Not sure how you meant, but here's one still: 'apparatus'.

Ah shoot... I hate English exceptions. I never thought that the 'tion' conversion would screw up words like 'bastion' where 'Suhn' would be improper and 'Cuhn' would be more fitting. I don't think this is really worth tackling, though. Darn you English language!!! Spanish would be so much easier.

Anywho, ya the tab problem still happened. I'll post the before and after attachments. Tab happens on the fourth paragraph. If it helps, here's a table of character values. The site lists the "horizontal tab" character as number 9 in the list. Something about ASCII values. http://www.asciitable.com/

Before: test4.txt

After: Alethi_test4.txt

(I'll get rid of these attachments after you respond to my glitch update.)
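
On the tab character: here is a rough sketch of how a whitelist filter treats it. This is not Kurkistan's actual removeCharacters(); the class and method names (KeepTabsSketch, allowed, filter) are made up for illustration. The point is only that the tab (character value 9) survives the filter if it is listed explicitly, while punctuation such as ':' and ',' is dropped.

```java
class KeepTabsSketch {
    // Hypothetical whitelist: tab, newline, space, period, and ASCII letters.
    static boolean allowed(char ch) {
        return ch == '\t' || ch == '\n' || ch == ' ' || ch == '.'
                || (ch >= 'A' && ch <= 'Z')
                || (ch >= 'a' && ch <= 'z');
    }

    // Copies only whitelisted characters into a new string.
    static String filter(String body) {
        StringBuilder kept = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); i++) {
            char ch = body.charAt(i);
            if (allowed(ch)) {
                kept.append(ch);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        // The horizontal tab is character value 9 in ASCII.
        System.out.println((int) '\t');
        // The colon and comma are removed, the tab stays.
        System.out.println(filter("Note:\tsee, here."));
    }
}
```

Building the result in a StringBuilder also avoids re-copying the whole string every time a character is removed.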

Edited by Turos

*Applause*

*Bows*

I'm gonna pull a 'no' on this one, as there are multiple letter combinations that convert to the same sound in phonetics. Ex.: 'ks' converts to 'X' in this program. 'x' also converts to 'X'. It would be impossible to differentiate between the two. There are other cases, but it's still too early in the morning for me to come up with them, haha.

It would be possible if Kurkistan made a secondary file that recorded every single conversion and listed the location on the page, but I don't dare ask him to put himself through something like that... and it wouldn't work on text someone typed out in Alethi in the first place.

Sorry.

Second that. If you want a Latin/Roman transliteration of an Alethi file, then your best bet is just to hope that the file was originally typed in English, and acquire that source file.

It might be possible to transliterate raw Alethi, despite the fact that our Alethi alphabet has fewer characters: someone might be able to reverse-engineer my program or create one of their own from scratch to transliterate it into English. The problem is that the spelling of many English words is phonetically arbitrary, so you won't be able to get proper spelling unless you put in a ludicrous amount of work, and maybe not even then.
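
To make the ambiguity concrete, here is a toy sketch. It assumes only the two rules mentioned above ('ks' and 'x' both becoming 'X'); the class and method names are hypothetical and are not part of the transliterator. Going forward is a function, but going backward gives more than one candidate spelling for every 'X'.

```java
import java.util.ArrayList;
import java.util.List;

class AmbiguitySketch {
    // Toy forward rules: both spellings collapse to the same glyph 'X'.
    static String forward(String word) {
        return word.replace("ks", "X").replace("x", "X");
    }

    // Lists every Roman spelling that the toy rules could have mapped
    // to the given Alethi string. Each 'X' doubles the candidate count.
    static List<String> candidates(String alethi) {
        List<String> results = new ArrayList<>();
        results.add("");
        for (char ch : alethi.toCharArray()) {
            List<String> next = new ArrayList<>();
            for (String prefix : results) {
                if (ch == 'X') {
                    next.add(prefix + "ks");
                    next.add(prefix + "x");
                } else {
                    next.add(prefix + ch);
                }
            }
            results = next;
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(forward("tax"));     // same output as the next line
        System.out.println(forward("taks"));
        System.out.println(candidates("taX"));  // two possible sources
    }
}
```

With the full rule set the candidate list would grow much faster, which is why a dictionary lookup (or the original source file) would be needed to pick the right spelling.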

I'm away from my computer and working files for a few more hours, but these all look like relatively simple issues, not the death-log of your last test. Most of them are just stupid mistakes I made, like forgetting the second '\n' when going from "tual\n" to "Cual\n".

I'll take a look at that tab problem as well, although that's almost certainly just an itsy bitsy coding issue, not an implication-ridden grammatical error.

EDIT: Found it in the spoiled code already. I need to move removeCharacters() higher up in the function: there was a colon just before that tab, and that's what the period was moving to.
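
For reference, a minimal sketch of the period-moving step on its own, assuming the text has already had punctuation such as colons and commas stripped out. It is a simplified stand-in for periodMover(), not the real thing: it builds a new string instead of rotating a char array in place, it does not handle ellipses, and the class and method names are invented for the example.

```java
class PeriodMoverSketch {
    // Moves each sentence's closing '.' to the front of that sentence.
    // A line break also ends a sentence, so headers without a period
    // are left alone. Ellipses are NOT handled in this sketch.
    static String movePeriods(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int start = 0; // index where the current sentence begins
        int i = 0;
        while (i < text.length()) {
            char ch = text.charAt(i);
            if (ch == '.') {
                // Emit the period first, then the sentence it closed.
                out.append('.').append(text, start, i);
                i++;
                // Copy separators until the next sentence's first letter.
                while (i < text.length() && !Character.isLetter(text.charAt(i))) {
                    out.append(text.charAt(i));
                    i++;
                }
                start = i;
            } else if (ch == '\n') {
                // No period on this line: copy it through unchanged.
                out.append(text, start, i + 1);
                i++;
                start = i;
            } else {
                i++;
            }
        }
        out.append(text, start, text.length()); // trailing text, if any
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(movePeriods("erase. erasing."));
    }
}
```

Because the period is placed wherever the current sentence is believed to start, any stray non-letter left in the text can shift that starting point, which is why filtering first matters.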

I agree wholeheartedly that Spanish would be easier. English spelling is what happens when you take one alphabet, use it to generate choose-your-own-adventure spelling for two completely different families of languages, and then mash those languages back together again, stealing from a few others along the way.

Edited by Kurkistan

Dealt with Turos' bugs, added rules for suffixes which add a 't' onto the end of words, added a few more "pp" rules, although there might be a few more floating around.

Specifically, a few of the bugs that Turos pointed out were actually intentional on my part based upon Dictionary.com phonetics:

I meant which->huiC.

I messed up with case->seys, but it should have been case->keys all along.

Many of the cases of .a->.uh are actually intentional, although it varies by word, and so is still worth double checking.
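
Since several of these bugs came down to which rule fires first, here is a small sketch of the ordering problem. It uses plain String.replace as a stand-in for the program's own replace() (which also handles suffixes), and the class name and rule arrays are hypothetical. The two rules themselves ("case\n" to "keys\n" and "ase\n" to "eez\n") are taken from the listing below.

```java
class RuleOrderSketch {
    // Applies each {target, substitute} pair in the order given.
    static String apply(String body, String[][] rules) {
        for (String[] rule : rules) {
            body = body.replace(rule[0], rule[1]);
        }
        return body;
    }

    public static void main(String[] args) {
        String[][] specificFirst = {
            {"case\n", "keys\n"},
            {"ase\n", "eez\n"}
        };
        String[][] generalFirst = {
            {"ase\n", "eez\n"},
            {"case\n", "keys\n"}
        };
        // Specific rule first: "case" is caught before the general rule sees it.
        System.out.print(apply("\ncase\n", specificFirst));
        // General rule first: "ase" is consumed, so the "case" rule never matches.
        System.out.print(apply("\ncase\n", generalFirst));
    }
}
```

This is the reason the longer, more specific targets sit above the shorter general ones in each block of replaceLetters().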

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan, with significant developmental input from Turos

* @date 01/18/2012

* @version 1.8.6

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_8_6

{

static boolean debug_char = false;

static boolean debug_end_e = false;

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

private static String convertText(String roman) throws IOException

{

roman = readFile(roman); //text file

if((roman.length()==1)&&(roman.charAt(0)=='&')) //invalid input, halt program; checked before removeCharacters() can strip the '&' marker

return "&";

if(!debug_char)

roman = removeCharacters(roman);

char[] body = roman.toCharArray();

periodMover(body);

roman = new String(body);

String alethi = replaceLetters(roman);

return alethi;

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static String readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("File not in directory or misspelled.");

return "&";

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole;

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

char[] body_array = body.toCharArray();

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++){

line = 1; //restart the line count for each pair

for(int i = 0; i<=search;i++)

if(body_array[i]=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

}

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)){ //c, q, w, and x

library[place] = (char)i;

place++;

}

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body_array[i]=='\n')

line++;

else if(Arrays.binarySearch(library,body_array[i])<0) //not in library

violations = violations + (line+":"+body_array[i]) + "; ";

return violations;

}

private static String removeCharacters(String body)

{

char[] library = new char[56];

library[0] = '\t'; //tab

library[1] = '\n';

library[2] = ' ';

library[3] = '.';

int place = 4;

for(int i = 65; i <=90; i++){

library[place] = (char)i;

place++;

}

for(int i = 97; i <=122; i++){

library[place] = (char)i;

place++;

}

for(int i = 0; i < body.length(); i++)

if(Arrays.binarySearch(library,body.charAt(i))<0){ //I felt embarrassed by my earlier search algorithm.

body = body.substring(0,i)+body.substring(i+1,body.length());

i--;

}

return body;

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static void periodMover(char[] array)

{

int temp = 0;

for(int i=0;i<array.length;i++)

{

if(array[i]=='.'){

if(!(((array.length - i) >= 3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2]))) //not an ellipsis

{

twistRight(array,temp,i);

i++;

while(i<array.length)

if(!inAlphabet(array[i]))

i++;

else

break; //Yes, the cardinal sin.

temp=i;

}

else if(((array.length-i)>=3)&&(array[i]==array[i+1])&&(array[i+1]==array[i+2])) { //ellipsis

for(int j=0;j<3;j++)

twistRight(array,temp+j,i+j);

i+=3;

while(i<array.length)

if(!inAlphabet(array[i]))

i++;

else

break; //Yes, the cardinal sin.

temp=i;

}

}

else if(array[i]=='\n')

temp=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

}

private static boolean inAlphabet(char character)

{

char[] library = new char[26];

int place = 0;

for(int i = 97; i <=122; i++){

library[place] = (char)i;

place++;

}

if(Arrays.binarySearch(library,character)>=0) //I felt embarrassed by my earlier search algorithm.

return true;

return false;

}

private static void twistRight(char[] array, int start, int end)

{

if (start==end)

return;

char a = array[start];

char b;

array[start] = array[end]; //'.', although this is generalized

while(start!=end)

{

start++;

b = array[start];

array[start] = a;

a = b;

}

}

public static void test()

{

String body = "\nsnapping snapper snappily snappy snaps snap snapped snappable snappably\n";

//snapping snapper snappily snappy snaps snap snapped snappable snappably.

String target = "ap\n";

String sub = "op\n";

System.out.println(replace(body,target,sub));

int target_size = target.length();

int sub_size = sub.length();

String sofar = "";

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

System.out.println(body);

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(String body)

{

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//anti-

body = replace(body,".anti",".antahy");

//wh

body = replace(body,"who\n","hoo\n");

body = replace(body,"where","huair"); //changed w to u

body = replace(body,"whir","huur");

body = replace(body,"wh","hu"); //Might need more permutations

body = replace(body,".accr",".uhkr"); //many many many

body = replace(body,".acci",".aksi");

body = replace(body,".accord",".uhkawrd");

body = replace(body,".accomp",".uhkuhmp");

body = replace(body,".acco",".uhko");

body = replace(body,".accustom\n",".uhkuhstuhm\n");

body = replace(body,".accolade\n",".akuhleyd\n");

body = replace(body,".accus",".uhkyooz");

body = replace(body,".accurs",".uhkurs");

body = replace(body,".accur",".akyer");

body = replace(body,".accum",".uhkyoom");

body = replace(body,".accout",".uhkoot");

body = replace(body,".accoun",".uhkount");

body = replace(body,".acce",".akse"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

//E at end - Some interference possible with C's

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"have\n","hav\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

//body = replace(body,"ever\n","ever\n");

body = replace(body,"eve\n","eev\n"); //HOWEVER

body = replace(body,"eever\n","ever\n");

body = replace(body,"ile\n","ahyl\n");

//System.out.println(replace(replace("while ","wh","hu"),"ile\n","ahyl\n"));

//huahyl

body = replace(body,"gle\n","guhl\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","keys\n");

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"Case\n","Ceys\n"); //necessary?

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,"come\n","cuhm\n"); //Need to move this up

body = replace(body,"ome\n","ohm\n");

body = replace(body,"tle\n","l\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,".discipline\n",".disipline\n");

body = replace(body,"ine\n","ahyn\n");

body = replace(body,".one\n",".uuhn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"ope\n","ohp\n");

body = replace(body,"rue\n","roo\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"bead\n","beed\n");

body = replace(body,".read\n",".reed\n");

body = replace(body,"nead\n","need\n");

body = replace(body,"lead\n","leed\n");

body = replace(body,"ead\n","ed\n"); //general

body = replace(body,"ade\n","eyd\n");

//ere - their vs there

body = replace(body,"ere\n","eir\n");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"tire","tahyuhr"); //NOT \n or e

body = replace(body,"aire\n","air\n");

body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","hrj\n");

body = replace(body,"orge\n","wrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"sce\n","es\n");

body = replace(body,"que\n","k\n");

body = replace(body,"udge\n","uhj\n");

body = replace(body,"dge\n","j\n"); //NOT sure

body = replace(body,"age\n","aij\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"rgue\n","rgyoo\n");

body = replace(body,"gue\n","eeg\n");

//-nge

body = replace(body,"nge\n","nj\n"); //problem with sing vs singe not really being separable at the gerund-testing level

body = replace(body,"sinjing\n","singing\n"); //comprehensive fix for gerund mishaps

body = replace(body,"slinjing\n","slinging\n");

body = replace(body,"strinjing\n","stringing\n");

body = replace(body,"swinjing\n","swinging\n");

body = replace(body,"brinjing\n","bringing\n");

body = replace(body,"flinjing\n","flinging\n");

body = replace(body,"prinjing\n","pringing\n");

body = replace(body,".winjing\n",".winging\n");

body = replace(body,".zinjing\n",".zinging\n");

body = replace(body,".dinjing\n",".dinging\n");

body = replace(body,".pinjing\n",".pinging\n");

//END E's

//s at end - 1.7.4.5 -> unneeded, I think

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z converter to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that higher number of characters in target gets preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"preface\n","prefis\n"); //special

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,".tacit\n",".tasit\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"vate\n","vit\n"); //pulled from E section, might be a sign of things to come

body = replace(body,"literate\n","literit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

body = replace(body,"cist\n","sist\n"); //trailing \n restored in the substitute

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

body = replace(body,"ct","kt"); //factual

body = replace(body,"tual\n","Cual\n");

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,".key\n",".kee\n"); //special

body = realReplace("QQQ",body,".keys\n",".kees\n");

body = replace(body,"ierce\n","eers\n");

body = replace(body,"ince\n","ins\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance","Cahns");

body = replace(body,"cance","cahns");

body = replace(body,"lance","lahns");

body = replace(body,"vance","vahns");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"all\n","awl\n");

body = replace(body,"appa","apuh");

body = replace(body,"ppen","pen"); //double p's, might NOT be done

body = replace(body,"pple\n","puhl\n");

body = replace(body,"ppl","puhl");

body = replace(body,"upp\n","uhp\n"); //trailing \n restored in the substitute

body = replace(body,"oppor","oper");

body = replace(body,"opp","uhp");

body = replace(body,"ypp","ip");

body = replace(body,"pp","p"); //Last ditch, should cover most before this

body = replace(body,"tice\n","tis\n");

body = replace(body,"arice\n","eris\n");

body = replace(body,"orice\n","uhis\n");

body = replace(body,"cipice\n","suhpis\n"); //patch for precipice

body = replace(body,"ipice\n","uhpis\n");

body = replace(body,".vice\n",".vahys\n"); //leading '.' restored in the substitute

body = replace(body,"vice\n","vis\n");

body = replace(body,"ice\n","ahys\n"); //Long S. NOT sure about \n's

body = replace(body,"egy\n","ijee\n"); //possibilities/strategies fix, I have no idea how they ended up "kiez"

body = replace(body,"ity\n","itee\n");

body = replace(body,"ite\n","ahyt\n");

body = replace(body,"ong\n","ong\n");

body = replace(body,"ull\n","ool\n");

body = replace(body,"cide\n","sahyd\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ence\n","ens\n");

body = replace(body,"ces\n","seez\n");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"ci\n","sahy\n");

body = replace(body,"oy\n","oi\n");

body = replace(body,"ace\n","eys\n");

body = replace(body,".ass\n",".as\n");

body = replace(body,".ass",".uhs"); //Assoc-

body = replace(body,"ely\n","lee\n"); //MUST BE LAST IN \N

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"ply\n","plahy\n");

body = replace(body,".by\n",".bahy\n");

body = replace(body,".my\n",".mahy\n");

body = replace(body,".die\n",".dahy\n");

body = replace(body,".dye\n",".dahy\n");

body = replace(body,".bye\n",".bahy\n"); //conflict

body = replace(body,"hype","hahype");

body = replace(body,"hypo","hahypo");

body = replace(body,"hypn","hipn");

body = replace(body,"hyphen","hahyfuhn");

body = replace(body,"hyfen","hahyfuhn"); //ph->f

body = replace(body,"yp","ip");

body = replace(body,"duct","duhkt");

body = replace(body,"tion","Suhn"); //1.8

body = replace(body,"ssion","Suhn"); //1.8.6

body = replace(body,"sion","zhuhn");

body = replace(body,"cean","Suhn");

body = replace(body,"ture","Cur");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"iew","yoo");

body = replace(body,".face",".feys");

body = replace(body,"face","feys");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweynt");

body = replace(body,"cing","sing");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ight","ahyt");

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,"cif","sif"); //NOT sure

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

body = replace(body,"it\n","it\n"); //Tacked on for suffix reasons

body = replace(body,"ap\n","ap\n");

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

/* body = realReplace("QQQ",body,"C\n","k\n");

body = realReplace("QQQ",body,"ch\n","k\n"); */

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ce","se");

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = realReplace("QQQ",body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = realReplace("QQQ",body,"c\n","k\n"); //to stop mischief

//END C'S

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

body = replace(body,".be\n",".bee\n");

body = replace(body,".maybe\n",".meybee\n");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"igh","ahy");

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = realReplace("QQQ",body,"q\n","k\n");

//w at end

body = replace(body,".low\n",".loh\n");//special cases

body = replace(body,".row\n",".roh\n");

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

body = realReplace("qqq",body,"y\n","ee\n");

body = realReplace("qqq",body,"ehee\n","ehy\n");

body = realReplace("qqq",body,"ahee\n","ahy\n");

body = realReplace("qqq",body,"eee\n","ey\n"); //fixing issues raised by y->ee as compared to other phonetics

String[] temp = {"en","st","un","c","f","g","s","t",""};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

for(int i = 0; i<temp.length;i++)

if(temp[i].equals("c"))

body = replace(body,"kable\n","eybuhl\n");

else

body = replace(body,temp[i]+"able\n","eybuhl\n");

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"ble\n","buhl\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"qu","ku");

//body = replace(body,"q","ku");

/* body = replace(body,"wa","ua"); //Unnecessary? I think not! I'm not sure why, but no.

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu"); */

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

body = replace(body,"eQ\n","ee\n");

body = replace(body,"Qy\n","ey\n");

body = replace(body,".hQ",".he");

body = replace(body,".shQ",".she");

}

return body.substring(1,body.length()-1); //clipping first/last '\n'

}

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

//'.'==' '

if(target.startsWith(".")){

body = replace(body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub_size)));

body = replace(body,("\n"+target.substring(1,target_size)),("\n"+sub.substring(1,sub_size)));

/*

//re-

if(((target_size>=5)&&(!target.substring(1,5).equals("rere")))||(target_size<3)) //clumsy

body = replace(body,".re"+target.substring(1,target_size),".ree"+sub.substring(1,target_size)); */

}

if(target.endsWith("\n")){ //checks for spaces and for plurals, also does s->z conversion where necessary

body = replace(body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub_size-1)+" ")); //space substitution

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

if((!sofar.contains("z"))&&(!sofar.contains("l"))){ //I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)!='s')&&(target.charAt(target_size-2)!='z')) //Double-checking s/z

if(target.charAt(target_size-2)=='e')

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s\n"),(sub.substring(0,sub_size-1)+"z\n"));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s\n"),(sub.substring(0,sub_size-1)+"ez\n")); //s->z

else if(((target_size>=2)&&(target.charAt(target_size-2)=='y'))||(target_size<3)) //bug stopper

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies\n"),(sub.substring(0,sub_size-1)+"z\n"));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies\n"),(sub.substring(0,sub_size-1)+"iez\n")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s\n"),(sub.substring(0,sub_size-1)+"z\n")); //s->z

/* //y

body = realReplace("qqq",body,"ay\n","ey\n"); //stopgap, might want to revisit

body = replace(body,"ey\n","ey\n");

body = realReplace("qqq",body,"oy\n","oi\n");

body = realReplace("qqq",body,"uy\n","ahy\n");

body = realReplace("qqq",body,"y\n","ee\n"); //might need generalized in replace()

body = replace(body,"ty","tahy"); */

//ly, focus on y as of 1.7.4.3 - It might need some work

if(target.equals("sly\n")) //special case

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

//ly

if((target_size>=5)&&(target.substring(target_size-5,target_size-1).equals("able")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"y\n"),(sub.substring(0,sub_size-4)+"lee\n")); //ably

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"ily\n"),(sub.substring(0,sub_size-2)+"uhlee\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"pily\n"),(sub.substring(0,sub_size-1)+"uhlee\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"tily\n"),(sub.substring(0,sub_size-1)+"uhlee\n"));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

//y

if((target_size>=2)&&(target.charAt(target_size-2)=='a'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-2)+"ey\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-1)+"y\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-1)+"i\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='u'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"y\n"),(sub.substring(0,sub_size-2)+"ahy\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"py\n"),(sub.substring(0,sub_size-1)+"ee\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ty\n"),(sub.substring(0,sub_size-1)+"ee\n"));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly\n"),(sub.substring(0,sub_size-1)+"lee\n"));

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

if((!target.endsWith("g\n"))&&(!target.endsWith("gs\n"))&&(!target.endsWith("gz"))) //leave no base uncovered

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-3)+"ying\n"),(sub.substring(0,sub_size-1)+"ing\n")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ing\n"),(sub.substring(0,sub_size-1)+"ing\n")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ping\n"),(sub.substring(0,sub_size-1)+"ing\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ting\n"),(sub.substring(0,sub_size-1)+"ing\n"));

else if((!target.endsWith("gs\n"))&&(!target.endsWith("gz"))) //no "ing\n" or s\z at end

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing\n"),(sub.substring(0,sub_size-1)+"ing\n")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"pish\n"),(sub.substring(0,sub_size-1)+"ish\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"tish\n"),(sub.substring(0,sub_size-1)+"ish\n"));

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"ish\n"),(sub.substring(0,sub_size-1)+"ish\n"));

if(!sofar.contains("a")) //able

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"table\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

else if(target.equals("fly")||target.equals("unfly"))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"eybuhl\n"));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if(target.charAt(target_size-2)=='e')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d\n"),(sub.substring(0,sub_size-1)+"ed\n")); //NOT st

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed\n"),(sub.substring(0,sub_size-1)+"ed\n"));

else if(target.substring(target_size-3,target_size-1).equals("se"))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d\n"),(sub.substring(0,sub_size-1)+"ed\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ped\n"),(sub.substring(0,sub_size-1)+"ed\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ted\n"),(sub.substring(0,sub_size-1)+"ed\n"));

else if((target.charAt(target_size-2)!='s')||((target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed\n"),(sub.substring(0,sub_size-1)+"st\n"));

//er

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"per\n"),(sub.substring(0,sub_size-1)+"er\n"));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"ter\n"),(sub.substring(0,sub_size-1)+"er\n"));

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n"));

}

/* //ate, not bothering with forbiddances - Never mind

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n")); */

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a

//r is forbidden by s, l, g, d

//I think that forbiddance is total - no forbidden suffixes at any point before

}

}

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

return body;

}

}


Many of the cases of .a->.uh are actually intentional, although it varies by word, and so is still worth double checking.

Ah, makes sense. Awesome! I'll check another article over tomorrow and watch in awe at how shiny it looks after conversion B)


Ah, makes sense. Awesome! I'll check another article over tomorrow and watch in awe at how shiny it looks after conversion B)

I warn you, it's gotten slightly longer to convert things. The Odyssey was bumped up from 8 minutes to 2 hours, 16 minutes :o.

EDIT: That's a 1600% increase, for all you folks at home.
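The arithmetic, spelled out as a quick check: 2 hours 16 minutes is 136 minutes, which is 17 times the original 8, so the run takes 1600% longer. A throwaway sketch (the class and method names are made up for illustration and are not part of the transliterator):

```java
// Hypothetical helper, only here to double-check the percentage quoted above.
final class SlowdownCheck {

    // Percentage increase going from 'before' to 'after' (same units for both).
    static double percentIncrease(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        double before = 8;           // minutes, old run
        double after = 2 * 60 + 16;  // 136 minutes, new run
        System.out.println(percentIncrease(before, after) + "% increase");
    }
}
```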

Edited by Kurkistan

EDIT:

Turos and Kurkistan -- Firstly, you guys are awesome. Secondly, I'm envious of the amount of free time you have.

Firstly: Thank you.

Secondly: B)

Ah, makes sense. Awesome! I'll check another article over tomorrow and watch in awe at how shiny it looks after conversion B)

You may want to hold off for a moment. I'm doing some rather large revisions to boost efficiency, which are having odd side-effects.

EDIT: Okay, done with that. Also added in some .pie rules. Essentially, I made a very foolish programming error that resulted in running about 20 times as many replace() calls as I needed to. This spiked run-times by a small amount.

As evidence, I ran the Odyssey for only 18 minutes, 16 seconds, despite having more grammars than the last time I ran it.
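A side note on where the rest of the time probably goes: the replace loop in the earlier listing rebuilds the whole text with substring() on every single match, so each hit costs time proportional to the length of the file. Below is a minimal sketch of a single-pass alternative. It assumes plain literal targets and leaves out all of the suffix recursion; `LinearReplace` and `replaceAll` are hypothetical names, not something from the program, and unlike the original loop it never re-scans text it has already substituted, so edge cases could come out differently.

```java
// Hypothetical helper, not part of AlethiTransliterator: a literal
// find-and-replace that copies the text once instead of once per match.
final class LinearReplace {

    static String replaceAll(String body, String target, String sub) {
        if (target.isEmpty()) {
            return body; // nothing sensible to search for
        }
        StringBuilder out = new StringBuilder(body.length());
        int from = 0;                          // start of the not-yet-copied text
        int hit = body.indexOf(target, from);  // next match, or -1
        while (hit >= 0) {
            out.append(body, from, hit);       // text before the match
            out.append(sub);                   // the substitution itself
            from = hit + target.length();      // continue after the match
            hit = body.indexOf(target, from);
        }
        out.append(body, from, body.length()); // whatever is left over
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAll("the ship", "sh", "S")); // prints "the Sip"
    }
}
```

Java's built-in String.replace(CharSequence, CharSequence) performs the same kind of literal, left-to-right replacement, so for rules with no suffix handling it may be enough on its own.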

EDIT 2: Made periodMover() a bit more efficient as well as allowing it to work on an arbitrary number of periods, added in a few rules for xious\n, irst\n, stion\n, the pp's.
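For readers who have not seen the convention: in the Alethi script a sentence starts with its period instead of ending with it, and a run such as "..." moves as one unit. Here is a minimal sketch of that idea on a single line of text, assuming a sentence boundary is simply a run of '.' characters. `PeriodSketch` and `movePeriods` are hypothetical names, and this is a simplified illustration, not the actual periodMover() from the program.

```java
// Hypothetical illustration of "periods go at the start of the sentence".
final class PeriodSketch {

    static String movePeriods(String line) {
        StringBuilder out = new StringBuilder();
        int start = 0; // where the current sentence begins in 'line'
        int i = 0;
        while (i < line.length()) {
            if (line.charAt(i) != '.') {
                i++;
                continue;
            }
            int end = i; // sentence text is line[start, end)
            while (i < line.length() && line.charAt(i) == '.') {
                i++;     // swallow the whole run of periods
            }
            int dots = i - end;
            int first = start; // skip spaces etc. so the period hugs the first letter
            while (first < end && !Character.isLetter(line.charAt(first))) {
                first++;
            }
            out.append(line, start, first);
            for (int d = 0; d < dots; d++) {
                out.append('.');
            }
            out.append(line, first, end);
            start = i;
        }
        out.append(line, start, line.length()); // trailing text with no period
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(movePeriods("hello world. goodbye.")); // prints ".hello world .goodbye"
    }
}
```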

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan, with significant developmental input from Turos

* @date 01/20/2012

* @version 1.8.9.4

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_8_9_4

{

static boolean debug_char = false;

static boolean debug_end_e = false;

static boolean remove_illegal = true;

static boolean add_CR = true;

/* static String Targets = "";

static int min = 200;

static int max = 400; */

static int Count = 0;

static boolean Counting = true;

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

//putting carriage-returns back in to make it look pretty in Notepad. I can't tell what else they might do.

if(add_CR)

for(int i = 0; i<alethi.length();i++)

if(alethi.charAt(i)=='\n')

alethi = alethi.substring(0,i)+"\r"+alethi.substring(i++,alethi.length());

//writeFile(Targets,"TEMP.txt");

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

private static String convertText(String roman) throws IOException

{

roman = readFile(roman); //text file

if((roman.length()==1)&&(roman.charAt(0)=='&')) //invalid input, halt program

return "&";

if(remove_illegal)

roman = removeCharacters(roman);

roman = periodMover(roman);

roman = spaceEnds(roman);

String alethi = replaceLetters(roman);

return unSpaceEnds(alethi);

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static String readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("File not in directory or misspelled.");

return "&";

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole;

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercase cases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++)

for(int i = 0; i<=search;i++)

if(body.charAt(i)=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)) //c, q, w, and x

library[place++] = (char)i;

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body.charAt(i)=='\n')

line++;

else if(Arrays.binarySearch(library,body.charAt(i))<0) //not in library

violations = violations + (line+":"+body.charAt(i)) + "; ";

return violations;

}

private static String removeCharacters(String body)

{

char[] library = new char[56];

library[0] = '\t'; //tab

library[1] = '\n';

library[2] = ' ';

library[3] = '.';

int place = 4;

for(int i = 65; i <=90; i++)

library[place++] = (char)i;

for(int i = 97; i <=122; i++)

library[place++] = (char)i;

for(int i = 0; i < body.length(); i++)

if(Arrays.binarySearch(library,body.charAt(i))<0) //I felt embarrassed by my earlier search algorithm.

if((body.charAt(i)=='?')||(body.charAt(i)=='!'))

body = body.substring(0,i)+"."+body.substring(i+1,body.length());

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

return body;

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static String periodMover(String body)

{

int start = 0;

for(int i=0;i<body.length();i++)

{

if(body.charAt(i)=='.'){

while((i<body.length())&&(body.charAt(i)=='.')) //multiples

body = body.substring(0,start)+"."+body.substring(start,i)+body.substring((i++)+1,body.length());

while(i<body.length())

if(!inAlphabet(body.charAt(i)))

i++;

else

break; //Yes, the cardinal sin.

start = i;

}

else if(body.charAt(i)=='\n')

start=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

return body;

}

private static boolean inAlphabet(char character)

{

char[] library = new char[26];

int place = 0;

for(int i = 97; i <=122; i++)

library[place++] = (char)i;

if(Arrays.binarySearch(library,character)>=0) //I felt embarrassed by my earlier search algorithm.

return true;

return false;

}

private static String spaceEnds(String body){

for(int i=0;i<body.length();i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+" "+body.substring((i++)+1,body.length());

else if(body.charAt(i)=='\n'){

body = body.substring(0,i)+" \n "+body.substring(i+1,body.length());

i+=2;

}

//System.out.println(body);

return body;

}

private static String unSpaceEnds(String body){

for(int i=1;i<body.length()-2;i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+body.substring(i+2,body.length());

else if(body.charAt(i)=='\n')

body = body.substring(0,i-1)+"\n"+body.substring((i--)+2,body.length());

if(body.charAt(body.length()-2)=='.')

body = body.substring(0,body.length()-1);

else if(body.charAt(body.length()-2)=='\n')

body = body.substring(0,body.length()-3)+"\n";

return body.substring(1,body.length()-1); //clipping first/last '\n'

}

public static void test()

{

String body = "\nbutler\n";

String target = "ap\n";

String sub = "op\n";

System.out.println(replace(body,target,sub));

int target_size = target.length();

int sub_size = sub.length();

String sofar = "";

int j = 2;

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

System.out.println(body);

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(String body)

{

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//anti-

body = replace(body,".anti",".antahy");

//wh

body = replace(body,"who\n","hoo\n");

body = replace(body,"where","huair"); //changed w to u

body = replace(body,"whir","huur");

body = replace(body,"wh","hu"); //Might need more permutations

body = replace(body,".accr",".uhkr"); //many many many

body = replace(body,".acci",".aksi");

body = replace(body,".accord",".uhkawrd");

body = replace(body,".accomp",".uhkuhmp");

body = replace(body,".acco",".uhko");

body = replace(body,".accustom\n",".uhkuhstuhm\n");

body = replace(body,".accolade\n",".akuhleyd\n");

body = replace(body,".accus",".uhkyooz");

body = replace(body,".accurs",".uhkurs");

body = replace(body,".accur",".akyer");

body = replace(body,".accum",".uhkyoom");

body = replace(body,".accout",".uhkoot");

body = replace(body,".accoun",".uhkount");

body = replace(body,".acce",".akse"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

//E at end - Some interference possible with C's

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"have\n","hav\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

//body = replace(body,"ever\n","ever\n");

body = replace(body,"eve\n","eev\n"); //HOWEVER

body = replace(body,"eever\n","ever\n");

body = replace(body,"ile\n","ahyl\n");

//System.out.println(replace(replace("while ","wh","hu"),"ile\n","ahyl\n"));

//huahyl

body = replace(body,"gle\n","guhl\n");

body = replace(body,".key\n",".kee\n"); //special

body = realReplace("QQQ",body,".keys\n",".kees\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","keys\n");

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"Case\n","Ceys\n"); //necessary?

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,"come\n","cuhm\n"); //Need to move this up

body = replace(body,"ome\n","ohm\n");

body = replace(body,"ttle\n","tl\n");

body = replace(body,"tle\n","tl\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,".discipline\n",".disipline\n");

body = replace(body,"ine\n","ahyn\n");

body = replace(body,".one\n",".uuhn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"ope\n","ohp\n");

body = replace(body,"rue\n","roo\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"bead\n","beed\n");

body = replace(body,".read\n",".reed\n");

body = replace(body,"nead\n","need\n");

body = replace(body,"lead\n","leed\n");

body = replace(body,"ead\n","ed\n"); //general

body = replace(body,"ade\n","eyd\n");

//ere - their vs there

body = realReplace("QQQ",body,"ere\n","eir\n");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"tire","tahyuhr"); //NOT \n or e

body = replace(body,"aire\n","air\n");

//body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","hrj\n");

body = replace(body,"orge\n","wrj\n");

body = replace(body,"ime\n","ahym\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"igue\n","teeg\n");

body = replace(body,"sce\n","es\n");

body = replace(body,"que\n","k\n");

body = replace(body,"udge\n","uhj\n");

body = replace(body,"dge\n","j\n"); //NOT sure

body = replace(body,"age\n","aij\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"rgue\n","rgyoo\n");

body = replace(body,"gue\n","eeg\n");

//-nge

body = replace(body,"nge\n","nj\n"); //problem with sing vs singe not really being separable at the gerund-testing level

body = replace(body,"sinjing\n","singing\n"); //comprehensive fix for gerund mishaps

body = replace(body,"slinjing\n","slinging\n");

body = replace(body,"strinjing\n","stringing\n");

body = replace(body,"swinjing\n","swinging\n");

body = replace(body,"brinjing\n","bringing\n");

body = replace(body,"flinjing\n","flinging\n");

body = replace(body,"prinjing\n","pringing\n");

body = replace(body,".winjing\n",".winging\n");

body = replace(body,".zinjing\n",".zinging\n");

body = replace(body,".dinjing\n",".dinging\n");

body = replace(body,".pinjing\n",".pinging\n");

//END E's

//s at end - 1.7.4.5 -> unneeded, I think

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow the s->z converter to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that higher number of characters in target gets preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"preface\n","prefis\n"); //special

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,".tacit\n",".tasit\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"vate\n","vit\n"); //pulled from E section, might be a sign of things to come

body = replace(body,"literate\n","literit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

body = replace(body,"cist\n","sist\n"); //sub needs its own \n, otherwise realReplace() strips the final t

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

body = replace(body,"ct","kt"); //factual

body = replace(body,"tual\n","Cual\n");

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,"ierce\n","eers\n");

body = replace(body,"ince\n","ins\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance","Cahns");

body = replace(body,"cance","cahns");

body = replace(body,"lance","lahns");

body = replace(body,"vance","vahns");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"all\n","awl\n");

body = replace(body,".supp",".suhpp"); //just a general rule

body = replace(body,"appa","apuh");

body = replace(body,"ppen","pen"); //double p's, might NOT be done

body = replace(body,"pplet\n","plit\n");

body = replace(body,"pple\n","puhl\n");

body = realReplace("QQQ",body,".supplement\n",".suhpluhment\n"); //special case

body = replace(body,"ppl","puhl");

body = replace(body,"upp\n","uhp\n"); //sub needs its own \n, otherwise realReplace() strips the final p

body = replace(body,"oppor","oper");

body = replace(body,"opp","uhp");

body = replace(body,"ypp","ip");

body = replace(body,"pp","p"); //Last ditch, should cover most before this

body = replace(body,"tice\n","tis\n");

body = replace(body,"arice\n","eris\n");

body = replace(body,"orice\n","uhis\n");

body = replace(body,"cipice\n","suhpis\n"); //patch for precipice

body = replace(body,"ipice\n","uhpis\n");

body = replace(body,".vice\n",".vahys\n"); //sub needs its own leading '.', otherwise realReplace() strips the v

body = replace(body,"vice\n","vis\n");

body = replace(body,"ice\n","ahys\n"); //Long S. NOT sure about \n's

body = replace(body,"egy\n","ijee\n"); //possibilities/strategies fix, I have no idea how they ended up "kiez"

body = replace(body,"ity\n","itee\n");

body = replace(body,"ite\n","ahyt\n");

body = replace(body,"irst\n","urst\n");

body = replace(body,"ong\n","ong\n");

body = replace(body,"ull\n","ool\n");

body = replace(body,"cide\n","sahyd\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ence\n","ens\n");

body = replace(body,"rend\n","rend\n");

//1.8.9 Pie-

body = replace(body,"piety","pahyitee");

body = replace(body,".pier\n"," peer\n");

body = replace(body,".pie\n"," pahy\n");

body = replace(body,".pie",".pee");

body = replace(body,"ces\n","seez\n");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"ci\n","sahy\n");

body = replace(body,"oy\n","oi\n");

body = replace(body,"ace\n","eys\n");

body = replace(body,".chull\n",".as\n");

body = replace(body,".chull",".uhs"); //Assoc-

body = replace(body,"ely\n","lee\n"); //MUST BE LAST IN \N

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"ply\n","plahy\n");

body = replace(body,".by\n",".bahy\n");

body = replace(body,".my\n",".mahy\n");

body = replace(body,".die\n",".dahy\n");

body = replace(body,".dye\n",".dahy\n");

body = replace(body,".bye\n",".bahy\n"); //conflict

body = replace(body,"hype","hahype");

body = replace(body,"hypo","hahypo");

body = replace(body,"hypn","hipn");

body = replace(body,"hyphen","hahyfuhn");

body = replace(body,"hyfen","hahyfuhn"); //ph->f

body = replace(body,"yp","ip");

body = replace(body,"duct","duhkt");

body = replace(body,"stion","sCuhn"); //1.8.9.4

body = replace(body,"tion","Suhn"); //1.8

body = replace(body,"ssion","Suhn"); //1.8.6

body = replace(body,"sion","zhuhn");

body = replace(body,"cean","Suhn");

body = replace(body,"ture","Cur");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"iew","yoo");

body = replace(body,".face",".feys");

body = replace(body,"face","feys");

//For-

body = replace(body,".fore",".fohr");

body = replace(body,".for",".fohr");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"cial","shul");

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"acquaint","uhkweyeynt");

body = replace(body,"cing","sing");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ight","ahyt");

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,"cif","sif"); //NOT sure

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

body = replace(body,"it\n","it\n"); //Tacked on for suffix reasons

body = replace(body,"ap\n","ap\n");

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

/* body = realReplace("QQQ",body,"C\n","k\n");

body = realReplace("QQQ",body,"ch\n","k\n"); */

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ce","se");

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = realReplace("QQQ",body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = realReplace("QQQ",body,"c\n","k\n"); //to stop mischief

//END C'S

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

body = replace(body,".be\n",".bee\n");

body = replace(body,".maybe\n",".meybee\n");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"igh","ahy");

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = realReplace("QQQ",body,"q\n","k\n");

//w at end

body = replace(body,".low\n",".loh\n");//special cases

body = replace(body,".row\n",".roh\n");

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

body = replace(body,"pie\n","pahy\n"); // NOT normal, aka special

body = realReplace("qqq",body,".or\n",".awr\n");

body = realReplace("qqq",body,"y\n","ee\n");

body = realReplace("qqq",body,"ehee\n","ehy\n");

body = realReplace("qqq",body,"ahee\n","ahy\n");

body = realReplace("qqq",body,"eee\n","ey\n"); //fixing issues raised by y->ee as compared to other phonetics

String[] temp = {"en","st","un","c","f","g","s","t",""};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

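for(int i = 0; i<temp.length;i++) //temp is an array: index it, since comparing or concatenating the array itself never matches anything

if(temp[i].equals("c"))

body = replace(body,".kable\n",".keybuhl\n"); //c->k has already run by this point

else

body = replace(body,"."+temp[i]+"able\n","."+temp[i]+"eybuhl\n"); //assumed intent: whole short words only (enable, stable, table, ...), keeping their prefix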

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"ble\n","buhl\n");

//x's

body = replace(body,".xy",".zi");

body = replace(body,"xious","kSuhs");

//General fixer for suffixes

//body = replace(body,"\n","\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"qu","ku");

//body = replace(body,"q","ku");

/* body = replace(body,"wa","ua"); //Unnecessary? I think not! I'm not sure why, but no.

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu"); */

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

body = replace(body,"eQ\n","ee\n");

body = replace(body,"Qy\n","ey\n");

body = replace(body,".hQ",".he");

body = replace(body,".shQ",".she");

}

return body;

}

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

/* if((min<Count++)&&(max>Count))

Targets+= target+"_"; */

if(Counting)

{

Count++;

if(target.equals("w"))

System.out.println("Replaces Run: "+Count);

}

//As of 1.8.8.1, '.' and '\n' are only codes for ' '. Spaces will be added before and after every \n, as well as after every period, then removed at the end.

//'.'==' '

if(target.startsWith("."))

return realReplace(sofar, body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub_size)));

else if(target.endsWith("\n"))

return realReplace(sofar, body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub_size-1)+" ")); //space substitution

if(target.endsWith(" "))

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

if(target.equals("y "))

System.out.println(target);

if((!sofar.contains("z"))&&(!sofar.contains("l"))){ //I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)!='s')&&(target.charAt(target_size-2)!='z')) //Double-checking s/z

if(target.charAt(target_size-2)=='e')

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z "));

else if((sub_size>=2)&&(sub.charAt(sub_size-2)=='y'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"ez ")); //s->z

else if(((target_size>=2)&&(target.charAt(target_size-2)=='y'))||(target_size<3)) //bug stopper

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"z "));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"iez ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

/* //y

body = realReplace("qqq",body,"ay ","ey "); //stopgap, might want to revisit

body = replace(body,"ey ","ey ");

body = realReplace("qqq",body,"oy ","oi ");

body = realReplace("qqq",body,"uy ","ahy ");

body = realReplace("qqq",body,"y ","ee "); //might need generalized in replace()

body = replace(body,"ty","tahy"); */

//ly, focus on y as of 1.7.4.3 - It might need some work

if(target.equals("sly ")) //special case

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else{

//ly

if((target_size>=5)&&(target.substring(target_size-5,target_size-1).equals("able")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"y "),(sub.substring(0,sub_size-4)+"lee ")); //ably

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if((sub_size>=3)&&(sub.substring(sub_size-3,sub_size-1).equals("ee")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-3)+"ily "),(sub.substring(0,sub_size-3)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"ily "),(sub.substring(0,sub_size-2)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"pily "),(sub.substring(0,sub_size-1)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"tily "),(sub.substring(0,sub_size-1)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

//y

if((target_size>=2)&&(target.charAt(target_size-2)=='a')) //might need work

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ey "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"y "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"i "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='u'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ahy "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"py "),(sub.substring(0,sub_size-1)+"ee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"ty "),(sub.substring(0,sub_size-1)+"ee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee ")); //might not be needed

}

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

if((!target.endsWith("g "))&&(!target.endsWith("gs "))&&(!target.endsWith("gz "))) //leave no base uncovered

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-3)+"ying "),(sub.substring(0,sub_size-1)+"ing ")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ping "),(sub.substring(0,sub_size-1)+"ing "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ting "),(sub.substring(0,sub_size-1)+"ing "));

else if((!target.endsWith("gs "))&&(!target.endsWith("gz "))) //no "ing\n" or s\z at end

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"pish "),(sub.substring(0,sub_size-1)+"ish "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"tish "),(sub.substring(0,sub_size-1)+"ish "));

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"ish "),(sub.substring(0,sub_size-1)+"ish "));

if(!sofar.contains("a")) //able

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"table "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(target.equals("fly ")||target.equals("unfly ")) //target always carries its trailing space in this block

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"eybuhl "));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(target.charAt(target_size-2)=='e')

if((target_size>=3)&&(target.charAt(target_size-3)=='c'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"st "));

else

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed ")); //NOT st

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("se")))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ped "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ted "),(sub.substring(0,sub_size-1)+"ed "));

else if((target.charAt(target_size-2)!='s')||((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

//er

if(!sofar.contains("r"))

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r "),(sub.substring(0,sub_size-1)+"er ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"per "),(sub.substring(0,sub_size-1)+"er "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"ter "),(sub.substring(0,sub_size-1)+"er "));

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

/* //ate, not bothering with forbiddances - Never mind

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n")); */

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a, r

//r is forbidden by s, l, g, d

//y-y

//Not messing with forbidding now (1.8.8.2)

//I think that forbiddance is total - no forbidden suffixes at any point before

}

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

return body;

}

}
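A note for anyone reading the listing above: every rule passed to replace() uses '.' at the front to mean "start of a word" and '\n' at the end to mean "end of a word", and realReplace() turns both into spaces before matching. Here is a minimal sketch of just that convention, with all of the suffix recursion left out. The names BoundarySketch and replaceMarked are made up for illustration, and it leans on String.replace instead of the hand-written scan in the listing, so treat it as an approximation and not the real thing.

```java
final class BoundarySketch {

    // Hypothetical, simplified stand-in for the posted replace(): no suffix handling.
    static String replaceMarked(String body, String target, String sub) {
        // A leading '.' means "start of word": swap it for a space on both strings.
        if (target.startsWith(".")) {
            target = " " + target.substring(1);
            sub = " " + sub.substring(1);
        }
        // A trailing '\n' means "end of word": swap it for a space on both strings.
        if (target.endsWith("\n")) {
            target = target.substring(0, target.length() - 1) + " ";
            sub = sub.substring(0, sub.length() - 1) + " ";
        }
        return body.replace(target, sub);
    }

    public static void main(String[] args) {
        // Words are padded with spaces, which is what spaceEnds() arranges in the listing.
        String body = " one stone done ";
        body = replaceMarked(body, ".one\n", ".uuhn\n"); // the whole word "one" only
        body = replaceMarked(body, "done\n", "duhn\n");  // any word ending in "done"
        body = replaceMarked(body, "one\n", "ohn\n");    // any remaining word ending in "one"
        System.out.println(body);
    }
}
```

Rule order matters here just as it does in the listing: the specific rules have to run before the general "one\n" rule. Also note that the first or last character of the replacement is dropped along with the marker, so a rule whose replacement is missing the matching '.' or '\n' quietly loses a letter.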

Edited by Kurkistan

Turos and Kurkistan -- Firstly, you guys are awesome. Secondly, I'm envious of the amount of free time you have.

1st: ;)

2nd: ;)

I apologize for not having done another run-through yet. I have been distracted with my new Kindle Touch :lol:

I will schedule a test tomorrow morning, as I have to go to sleep now and I work graveyard :(

Note the graveyard shift = no social life = extra free time! :D ... ( :( )

Edited by Turos

1st: ;)

2nd: ;)

I apologize for not having done another run-through yet. I have been distracted with my new Kindle Touch :lol:

I will schedule a test tomorrow morning, as I have to go to sleep now and I work graveyard :(

Note the graveyard shift = no social life = extra free time! :D ... ( :( )

No problem, you've had a remarkable turnaround speed so far. This also gives me extra time to polish the next version before you test it. Speaking of that. . .

Generalized .or\n to just .or

EDIT: Sped up inAlphabet() and fixed file name within the code so that it can actually run.
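On the Notepad line-break question from earlier in the thread: the code re-inserts carriage returns before writing the output (the add_CR flag), since Notepad only shows a break for "\r\n". For anyone who wants to do the same thing outside the tool, here is a small sketch of that step on its own. The names CrlfSketch and toCrlf are hypothetical and not part of the transliterator, and it uses String.replace where the listing uses a character loop.

```java
final class CrlfSketch {

    // Hypothetical helper: make every line break a Windows-style "\r\n".
    static String toCrlf(String text) {
        // Collapse any existing "\r\n" to "\n" first so nothing ends up as "\r\r\n".
        return text.replace("\r\n", "\n").replace("\n", "\r\n");
    }

    public static void main(String[] args) {
        String converted = toCrlf(".first line\n.second line\r\n");
        // Show the control characters explicitly so the result is visible on a console.
        System.out.println(converted.replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Collapsing first makes the helper safe to run twice on the same text, which matters if the output of one run gets fed back in as input.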

/**

* Goal: Provide an easy means of transliterating Roman letters into Alethi script using Turos's font conventions.

*

*

* @author Kurkistan, with significant developmental input from Turos

* @date 01/24/2012

* @version 1.8.9.6

*/

import java.io.FileReader;

import java.io.FileWriter;

import java.io.BufferedWriter;

import java.io.InputStreamReader;

import java.io.File;

import java.io.PrintWriter;

import java.io.IOException;

import java.util.Scanner;

import java.io.BufferedReader;

import java.util.Arrays;

public class AlethiTransliterator_1_8_9_6

{

static boolean debug_char = false;

static boolean debug_end_e = false;

static boolean remove_illegal = true;

static boolean add_CR = true;

/* static String Targets = "";

static int min = 200;

static int max = 400; */

static int Count = 0;

static boolean Counting = true;

public static void main (String[] arg) throws IOException{

Scanner input=new Scanner(System.in);

System.out.print("Enter input file (full name of file in same directory): ");

String temp = input.next();

//temp = "Test.txt";

final double startTime = System.currentTimeMillis();

final double endTime;

try {

String alethi = convertText(temp);

if(alethi.equals("&"))

return;

//putting carriage-returns back in to make it look pretty in Notepad. I can't tell what else they might do.

if(add_CR)

for(int i = 0; i<alethi.length();i++)

if(alethi.charAt(i)=='\n')

alethi = alethi.substring(0,i)+"\r"+alethi.substring(i++,alethi.length());

//writeFile(Targets,"TEMP.txt");

temp = "Alethi_"+temp;

writeFile(alethi,temp);

if(debug_char){

String violations = allowedCharacters(alethi); //debugging blatant errors

if(!violations.equals(""))

System.out.println("Unauthorized sections in text (Line:Violation):"+"\n"+violations);

}

} finally {

endTime = System.currentTimeMillis();

}

final double duration = endTime - startTime;

System.out.println("Execution time: "+(duration/1000)+" seconds");

}

private static String convertText(String roman) throws IOException

{

roman = readFile(roman); //text file

if((roman.length()==1)&&(roman.charAt(0)=='&')) //invalid input, halt program

return "&";

if(remove_illegal)

roman = removeCharacters(roman);

roman = periodMover(roman);

roman = spaceEnds(roman);

String alethi = replaceLetters(roman);

return unSpaceEnds(alethi);

}

/**

* Load a text file's contents as a <code>String</code>.

*

* @param file The input file

* @return The file contents as a <code>String</code>

* @exception IOException IO Error

*/

private static String readFile(String file) throws IOException

{

String whole = "";

try {

BufferedReader in = new BufferedReader(new FileReader(file));

String str;

while ((str = in.readLine()) != null) {

whole = whole + str + '\n';

//process(str);

}

in.close();

} catch (IOException e) {

System.out.println("File not in directory or misspelled.");

return "&";

}

whole="\n"+whole.toLowerCase(); //convert to lower - keeping an extra \n at the end and beginning for replacement ease of use, will get rid of it

return whole;

}

private static void writeFile(String text, String destination) throws IOException

{

File file = new File(destination);

boolean exist = file.createNewFile();

if (!exist)

{

System.out.println("Output file already exists.");

System.exit(0);

}

else

{

FileWriter fstream = new FileWriter(destination);

BufferedWriter out = new BufferedWriter(fstream);

out.write(text);

out.close();

System.out.println("File created successfully.");

}

}

private static String allowedCharacters(String body)

{

//c, q, w, x, th, sh, ch - Forbidden; I assume no lowercase cases of the special characters (C, X)

//\n, ' ', '.', C, S/s, T/t, X, - Allowed

char[] library = new char[29];

String[] pairs = {"th","sh","ch"}; //These shouldn't trigger unless I made a serious mistake in the "necessary" section.

String violations = "";

int line = 1; //for all of those +1ers out there

int target_size = 2;

int search = body.length() - target_size;

for(int j = 0;j<pairs.length;j++){

line = 1; //restart the count for each pair so the reported line numbers stay correct

for(int i = 0; i<=search;i++)

if(body.charAt(i)=='\n')

line++;

else if(body.substring(i,i+target_size).equals(pairs[j]))

violations = violations + (line+":"+pairs[j]) + "; ";

}

library[0] = '\n';

library[1] = ' ';

library[2] = '.';

library[3] = 'C';

library[4] = 'S';

library[5] = 'T';

library[6] = 'X';

int place = 7;

for(int i = 97; i <=122; i++){

if((i!=99)&&(i!=113)&&(i!=119)&&(i!=120)) //c, q, w, and x

library[place++] = (char)i;

}

line = 1; //resetting

for(int i = 0;i<body.length();i++)

if(body.charAt(i)=='\n')

line++;

else if(Arrays.binarySearch(library,body.charAt(i))<0) //not in library

violations = violations + (line+":"+body.charAt(i)) + "; ";

return violations;

}

private static String removeCharacters(String body)

{

char[] library = new char[56];

library[0] = '\t'; //tab

library[1] = '\n';

library[2] = ' ';

library[3] = '.';

int place = 4;

for(int i = 65; i <=90; i++)

library[place++] = (char)i;

for(int i = 97; i <=122; i++)

library[place++] = (char)i;

for(int i = 0; i < body.length(); i++)

if(Arrays.binarySearch(library,body.charAt(i))<0) //I felt embarrassed by my earlier search algorithm.

if((body.charAt(i)=='?')||(body.charAt(i)=='!'))

body = body.substring(0,i)+"."+body.substring(i+1,body.length());

else

body = body.substring(0,i)+body.substring(i--+1,body.length());

return body;

}

/**

* In the Alethi alphabet, sentences start with a period '.' and don't end with anything.

*/

private static String periodMover(String body)

{

int start = 0;

for(int i=0;i<body.length();i++)

{

if(body.charAt(i)=='.'){

while((i<body.length())&&(body.charAt(i)=='.')) //multiples

body = body.substring(0,start)+"."+body.substring(start,i)+body.substring((i++)+1,body.length());

while(i<body.length())

if(!inAlphabet(body.charAt(i)))

i++;

else

break; //Yes, the cardinal sin.

start = i;

}

else if(body.charAt(i)=='\n')

start=i+1; //Doesn't allow sentences to continue after true line breaks. Enables no-period headers and whatnot.

}

return body;

}

private static boolean inAlphabet(char character)

{

int value = (int)character;

if((value>=97)&&(value<=122)) //just checking lowercase letters

return true;

return false;

}

private static String spaceEnds(String body){

for(int i=0;i<body.length();i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+" "+body.substring((i++)+1,body.length());

else if(body.charAt(i)=='\n'){

body = body.substring(0,i)+" \n "+body.substring(i+1,body.length());

i+=2;

}

//System.out.println(body);

return body;

}

private static String unSpaceEnds(String body){

for(int i=1;i<body.length()-2;i++)

if(body.charAt(i)=='.')

body = body.substring(0,i+1)+body.substring(i+2,body.length());

else if(body.charAt(i)=='\n')

body = body.substring(0,i-1)+"\n"+body.substring((i--)+2,body.length());

if(body.charAt(body.length()-2)=='.')

body = body.substring(0,body.length()-1);

else if(body.charAt(body.length()-2)=='\n')

body = body.substring(0,body.length()-3)+"\n";

return body.substring(1,body.length()-1); //clipping first/last '\n';;

}

public static void test()

{

String body = "\nbutler\n";

String target = "ap\n";

String sub = "op\n";

System.out.println(replace(body,target,sub));

int target_size = target.length();

int sub_size = sub.length();

String sofar = "";

int j = 2;

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able\n"),(sub.substring(0,sub_size-1)+"uhbuhl\n"));

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

System.out.println(body);

}

/**

* Special characters:

For t, use lower case t.

For th, use capital T.

For s, use lower case s.

For sh, use capital S.

For ch, use c.

X will print a combination of k and s.

For q and w, use your imagination. Technically speaking, q is a

combination of k and u. W is basically a combination of a long u

("oo") and any other vowel: a e i o and short u ("uh")

*/

private static String replaceLetters(String body)

{

//Ease of use

//1.3.5-Threw in an If statement in the replace function to deal with space and \n at the same time

//ph

body = replace(body,"ph","f");

//anti-

body = replace(body,".anti",".antahy");

//wh

body = replace(body,"who\n","hoo\n");

body = replace(body,"where","huair"); //changed w to u

body = replace(body,"whir","huur");

body = replace(body,"wh","hu"); //Might need more permutations

body = replace(body,".accr",".uhkr"); //many many many

body = replace(body,".acci",".aksi");

body = replace(body,".accord",".uhkawrd");

body = replace(body,".accomp",".uhkuhmp");

body = replace(body,".acco",".uhko");

body = replace(body,".accustom\n",".uhkuhstuhm\n");

body = replace(body,".accolade\n",".akuhleyd\n");

body = replace(body,".accus",".uhkyooz");

body = replace(body,".accurs",".uhkurs");

body = replace(body,".accur",".akyer");

body = replace(body,".accum",".uhkyoom");

body = replace(body,".accout",".uhkoot");

body = replace(body,".accoun",".uhkount");

body = replace(body,".acce",".akse"); //the dreaded double c's

body = replace(body,".ecc",".eks");

body = replace(body,"ucca","uhka");

body = replace(body,"ucco","uhko");

body = replace(body,"uccu","uhku");

body = replace(body,".occ",".uhk");

body = replace(body,"ucce","uhkse");

body = replace(body,"ucci","uhksi");

body = replace(body,"occup","okyuh"); //very special case

body = replace(body,"occa","uhkah");

body = replace(body,"occi","oksi");

body = replace(body,"occe","ochee"); //?

body = replace(body,"occo","okuh");

body = replace(body,"occu","okuh"); //Just went down the list on http://www.morewords.com/contains/cc - Useful, if laborious

//E at end - Some interference possible with C's

body = replace(body,"use\n","yooz\n");

body = replace(body,"used\n","yoozd\n"); //special case

//Note: Need to make sure that plurals of e-enders are covered, i.e. wives.

body = replace(body,"like\n","lahyk\n");

body = replace(body,"ole\n","ohl\n"); //hyperbole will suffer

body = replace(body,"ose\n","ohz\n");

body = replace(body,"ame\n","eym\n");

body = replace(body,"ese\n","eez\n");

body = replace(body,"have\n","hav\n");

body = replace(body,"ave\n","eyv\n");

body = replace(body,"eive\n","eev\n");

body = replace(body,"vive\n","vahyv\n");

body = replace(body,"ive\n","iv\n");

//body = replace(body,"ever\n","ever\n");

body = replace(body,"eve\n","eev\n"); //HOWEVER

body = replace(body,"eever\n","ever\n");

body = replace(body,"ile\n","ahyl\n");

//System.out.println(replace(replace("while ","wh","hu"),"ile\n","ahyl\n"));

//huahyl

body = replace(body,"gle\n","guhl\n");

body = replace(body,".key\n",".kee\n"); //special

body = realReplace("QQQ",body,".keys\n",".kees\n");

body = replace(body,"base\n","beys\n"); //And now the ends-with function on scrabblefinder.com was useful

body = replace(body,"case\n","keys\n");

body = replace(body,"chase\n","Ceys\n"); //ch == C

body = replace(body,"Case\n","Ceys\n"); //necessary?

body = replace(body,"erase\n","ihreys\n");

body = replace(body,"ase\n","eez\n");

body = replace(body,"olve\n","olv\n");

body = replace(body,"alve\n","ahv\n");

body = replace(body,"elve\n","elv\n");

body = replace(body,"some\n","suhm\n");

body = replace(body,"come\n","cuhm\n"); //Need to move this up

body = replace(body,"ome\n","ohm\n");

body = replace(body,"ttle\n","tl\n");

body = replace(body,"tle\n","tl\n"); //This is what dictionary.com said to do, and I live to serve

body = replace(body,".discipline\n",".disipline\n");

body = replace(body,"ine\n","ahyn\n");

body = replace(body,".one\n",".uuhn\n");

body = replace(body,"done\n","duhn\n");

body = replace(body,"none\n","nuhn\n");

body = replace(body,"one\n","ohn\n");

body = replace(body,"ake\n","eyk\n");

body = replace(body,"ope\n","ohp\n");

body = replace(body,"rue\n","roo\n");

body = replace(body,"ife\n","ahyf\n");

body = replace(body,"bead\n","beed\n");

body = replace(body,".read\n",".reed\n");

body = replace(body,"nead\n","need\n");

body = replace(body,"lead\n","leed\n");

body = replace(body,"ead\n","ed\n"); //general

body = replace(body,"ade\n","eyd\n");

//ere - their vs there

body = realReplace("QQQ",body,"ere\n","eir\n");

body = replace(body,".are\n",".ahr\n");

body = replace(body,"are\n","air\n");

body = replace(body,"oke\n","ohk\n");

body = replace(body,"tire","tahyuhr"); //NOT \n or e

body = replace(body,"aire\n","air\n");

//body = replace(body,"ire\n","yuhr\n"); //?

body = replace(body,"ype\n","ahyp\n");

body = replace(body,"urge\n","urj\n");

body = replace(body,"erge\n","urj\n"); //Not a mistake

body = replace(body,"arge\n","ahrj\n"); //large -> lahrj

body = replace(body,"orge\n","awrj\n"); //forge -> fawrj

body = replace(body,"ime\n","ahym\n");

body = replace(body,"sle\n","ahyl\n");

body = replace(body,"promise\n","promis\n");

body = replace(body,"aise\n","eyz\n");

body = replace(body,"ise\n","ahyz\n");

body = replace(body,"lse\n","ls\n");

body = replace(body,"tigue\n","teeg\n"); //fatigue

body = replace(body,"igue\n","eeg\n"); //intrigue

body = replace(body,"sce\n","es\n");

body = replace(body,"que\n","k\n");

body = replace(body,"udge\n","uhj\n");

body = replace(body,"dge\n","j\n"); //NOT sure

body = replace(body,"age\n","aij\n");

//gue - This one was irritating, might not be right

body = replace(body,"logue\n","awg\n");

body = replace(body,"gogue\n","awg\n");

body = replace(body,".morgue\n",".mawrg\n");

body = replace(body,".fugue\n",".fyoog\n");

body = replace(body,".segue\n",".segwey\n");

body = replace(body,"rgue\n","rgyoo\n");

body = replace(body,"gue\n","eeg\n");

//-nge

body = replace(body,"nge\n","nj\n"); //problem with sing vs singe not really being separable at the gerund-testing level

body = replace(body,"sinjing\n","singing\n"); //comprehensive fix for gerund mishaps

body = replace(body,"slinjing\n","slinging\n");

body = replace(body,"strinjing\n","stringing\n");

body = replace(body,"swinjing\n","swinging\n");

body = replace(body,"brinjing\n","bringing\n");

body = replace(body,"flinjing\n","flinging\n");

body = replace(body,"prinjing\n","pringing\n");

body = replace(body,".winjing\n",".winging\n");

body = replace(body,".zinjing\n",".zinging\n");

body = replace(body,".dinjing\n",".dinging\n");

body = replace(body,".pinjing\n",".pinging\n");

//END E's

//s at end - 1.7.4.5 -> unneeded, I think

//body = replace(body,"es\n","ez\n"); //Needs to go before c->s conversion, since C's are all soft S's

//This is a big thing. I moved the c down mainly to allow for the s->z convertor to do its job, and the judgement on whether or not this messes things up is pending.

//START C 1.7 - moved so that higher number of characters in target gets preference, blocks kept cohesive

//Stolen from the "necessary" bin.

body = replace(body,"ch","C"); //Although both versions of C work, I'm assuming capitalized, so no lowercase c's are allowed in the text

body = replace(body,"accent","aksent");

body = replace(body,"exercise\n","eksersahyz\n");

body = replace(body,".once",".wuhns");

body = replace(body,"preface\n","prefis\n"); //special

body = replace(body,"icise\n","uhsahyz\n");

body = replace(body,"rcise\n","ruhsahyz\n");

body = replace(body,".tacit\n",".tasit\n");

body = replace(body,"ciate\n","sheeeyt\n");

body = replace(body,"vate\n","vit\n"); //pulled from E section, might be a sign of things to come

body = replace(body,"literate\n","literit\n");

body = replace(body,"ate\n","eyt\n");

body = replace(body,"cision\n","sizhuhn\n");

body = replace(body,"cise\n","sahys\n");

body = replace(body,"cist\n","sist\n"); //sub needs the trailing \n too, otherwise the last letter gets clipped

body = replace(body,"uce\n","us\n");

body = replace(body,"uces\n","usez\n"); //z incorporated

body = replace(body,"uced\n","usst\n"); //D's

body = replace(body,"came\n","keym\n");

body = replace(body,"came","kamuh");

body = replace(body,"ct","kt"); //factual

body = replace(body,"tual\n","Cual\n");

body = replace(body,".acid\n",".asid\n");

body = replace(body,".aci",".uhsi");

body = replace(body,"ierce\n","eers\n");

body = replace(body,"ince\n","ins\n");

//body = replace(body,".ance",".ahns");

body = replace(body,".trance",".trahns");

body = replace(body,"dance\n","dahns\n");

body = replace(body,"Cance","Cahns");

body = replace(body,"cance","cahns");

body = replace(body,"lance","lahns");

body = replace(body,"vance","vahns");

body = replace(body,"ance\n","uhns\n");

body = replace(body,"all\n","awl\n");

body = replace(body,".supp",".suhpp"); //just a general rule

body = replace(body,"appa","apuh");

body = replace(body,"ppen","pen"); //double p's, might NOT be done

body = replace(body,"pplet\n","plit\n");

body = replace(body,"pple\n","puhl\n");

body = realReplace("QQQ",body,".supplement\n",".suhpluhment\n"); //special case

body = replace(body,"ppl","puhl");

body = replace(body,"upp\n","uhp\n"); //sub needs the trailing \n too, otherwise the last letter gets clipped

body = replace(body,"oppor","oper");

body = replace(body,"opp","uhp");

body = replace(body,"ypp","ip");

body = replace(body,"pp","p"); //Last ditch, should cover most before this

body = replace(body,"tice\n","tis\n");

body = replace(body,"arice\n","eris\n");

body = replace(body,"orice\n","uhis\n");

body = replace(body,"cipice\n","suhpis\n"); //patch for precipice

body = replace(body,"ipice\n","uhpis\n");

body = replace(body,".vice\n",".vahys\n"); //sub needs the leading '.' too, otherwise the first letter gets clipped

body = replace(body,"vice\n","vis\n");

body = replace(body,"ice\n","ahys\n"); //Long S. NOT sure about \n's

body = replace(body,"egy\n","ijee\n"); //possibilities/strategies fix, I have no idea how they ended up "kiez"

body = replace(body,"ity\n","itee\n");

body = replace(body,"ite\n","ahyt\n");

body = replace(body,"irst\n","urst\n");

body = replace(body,"ong\n","ong\n");

body = replace(body,"ull\n","ool\n");

body = replace(body,"cide\n","sahyd\n");

body = replace(body,"ide\n","ahyd\n");

body = replace(body,"ence\n","ens\n");

body = replace(body,"rend\n","rend\n");

//1.8.9 Pie-

body = replace(body,"piety","pahyitee");

body = replace(body,".pier\n"," peer\n");

body = replace(body,".pie\n"," pahy\n");

body = replace(body,".pie",".pee");

body = replace(body,"ces\n","seez\n");

body = replace(body,"cez\n","seez\n"); //In case of S->Z

body = replace(body,"ce\n","s\n");

body = replace(body,"ci\n","sahy\n");

body = replace(body,"oy\n","oi\n");

body = replace(body,"ace\n","eys\n");

body = replace(body,".ass\n",".as\n");

body = replace(body,".ass",".uhs"); //Assoc-

body = replace(body,"ely\n","lee\n"); //MUST BE LAST IN \N

body = replace(body,".scie",".sahye"); //For Science!

body = replace(body,"sciou","shuh"); //For Conscience!

body = replace(body,"cious","shuhs"); //For Ithaca!

body = replace(body,"scio","shuh");

body = replace(body,"scie","shuh");

body = replace(body,"ply\n","plahy\n");

body = replace(body,".by\n",".bahy\n");

body = replace(body,".my\n",".mahy\n");

body = replace(body,".die\n",".dahy\n");

body = replace(body,".dye\n",".dahy\n");

body = replace(body,".bye\n",".bahy\n"); //conflict

body = replace(body,"hype","hahype");

body = replace(body,"hypo","hahypo");

body = replace(body,"hypn","hipn");

body = replace(body,"hyphen","hahyfuhn");

body = replace(body,"hyfen","hahyfuhn"); //ph->f

body = replace(body,"yp","ip");

body = replace(body,"duct","duhkt");

body = replace(body,"stion","sCuhn"); //1.8.9.4

body = replace(body,"tion","Suhn"); //1.8

body = replace(body,"ssion","Suhn"); //1.8.6

body = replace(body,"sion","zhuhn");

body = replace(body,"cean","Suhn");

body = replace(body,"ture","Cur");

body = replace(body,"cies","seez"); //prophecies

body = replace(body,"ciez","seez"); //s->z already done

body = replace(body,"iew","yoo");

body = replace(body,".face",".feys");

body = replace(body,"face","feys");

//For-

body = replace(body,".fore",".fohr");

body = replace(body,".for",".fohr");

//ore, as in fore, bore

body = replace(body,"ore","ohr");

body = replace(body,"acen","eysuhn"); //Don't get complacent

body = replace(body,"ician","ishuhn"); //musician

body = replace(body,"cism","sizuhm"); //anglicanism

body = replace(body,"cial","shul");

body = replace(body,"acquaint","uhkweynt"); //has to run before the general .acq rule, or it never matches at the start of a word

body = replace(body,".acq",".akw"); //might need refinement

body = replace(body,"cque","ke");

body = replace(body,"cing","sing");

//1.6.5 - odyssey test

body = replace(body,"exce","ikse");

body = replace(body,"excit","iksahyt");

body = replace(body,"excis","eksahyz");

body = replace(body,"ici","isi"); //Sicily

body = replace(body,"iec","ees"); //Piece/Peace -> Pees

body = replace(body,"eac","ees");

body = replace(body,"ight","ahyt");

body = replace(body,"cep","sep");

body = replace(body,"cin","sin");

body = replace(body,".cit",".sit");

body = replace(body,"cip","sip");

body = replace(body,"cif","sif"); //NOT sure

body = replace(body,"icc","ik");

body = replace(body,"icn","ikn");

body = replace(body,"sce","se");

body = replace(body,"sci","si");

body = replace(body,"scy","sahy");

//body = replace(body,"sco","sko");

body = replace(body,"cea","sea");

body = replace(body,"nci","nsi"); //might need refinement

body = replace(body,"ncy","nsee");

body = replace(body,"cei","see");

body = replace(body,"cee","see");

body = replace(body,"cent","sent"); //odyssey

body = replace(body,"it\n","it\n"); //Tacked on for suffix reasons

body = replace(body,"ap\n","ap\n");

//starting with c

body = replace(body,".cy",".sahy");

body = replace(body,".cir",".sur");

body = replace(body,".cid",".sahyd");

body = replace(body,".ci",".si");

body = replace(body,".cer",".sur");

body = replace(body,".ce",".se");

body = replace(body,"ck","k");

/* body = realReplace("QQQ",body,"C\n","k\n");

body = realReplace("QQQ",body,"ch\n","k\n"); */

body = replace(body,"sc","sk");

body = replace(body,"cy","see"); //1.4.3 - si->see

body = replace(body,"ce","se");

body = replace(body,"ca","ka");

body = replace(body,"co","ko");

body = replace(body,"cu","ku");

body = replace(body,"ct","kt");

body = replace(body,"cl","kl");

body = replace(body,"cr","kr");

body = realReplace("QQQ",body,".c",".k"); //This can possibly leave lowercase c's in the text, although I think that all properly spelled words should be covered here.

body = realReplace("QQQ",body,"c\n","k\n"); //to stop mischief

//END C'S

//Not sure where to put this section

//ss

body = replace(body,"ss","s");

body = replace(body,".be\n",".bee\n");

body = replace(body,".maybe\n",".meybee\n");

//gh

body = replace(body,"gha","gah"); //This section needs work

body = replace(body,"gho","goh");

body = replace(body,"ought","awt");

body = replace(body,"though","thoh");

body = replace(body,"bough","bou");

body = replace(body,"cough","kof");

body = replace(body,"igh","ahy");

body = replace(body,"gh\n","\n");

body = replace(body,"gh","g");

//to, too, two - Just a quick patch for those three words, not a general solution to any problem I can see

body = replace(body,".to\n",".too\n");

body = replace(body,".two\n",".too\n");

//q at end

body = realReplace("QQQ",body,"q\n","k\n");

//w at end

body = replace(body,".low\n",".loh\n");//special cases

body = replace(body,".row\n",".roh\n");

body = replace(body,"ow\n","au\n");

//.sy

body = replace(body,".syr",".suhr"); //Moved up to e-enders

body = replace(body,".syr",".sir");

body = replace(body,".sly",".slahy");

body = replace(body,".lying\n",".lahying\n");

body = replace(body,".ly",".li");

//sz->siz - The coward's way out. I need to sit down and make this thing more cohesive

body = replace(body,"sz\n","siz\n");

body = replace(body,"pie\n","pahy\n"); // NOT normal, aka special

body = realReplace("qqq",body,".or",".awr");

body = realReplace("qqq",body,"y\n","ee\n");

body = realReplace("qqq",body,"ehee\n","ehy\n");

body = realReplace("qqq",body,"ahee\n","ahy\n");

body = realReplace("qqq",body,"eee\n","ey\n"); //fixing issues raised by y->ee as compared to other phonetics

String[] temp = {"en","st","un","c","f","g","s","t",""};

body = replace(body,"ctable\n","kteybuhl\n"); //save the c's!

for(int i = 0; i<temp.length;i++)

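if(temp[i].equals("c")) //temp is an array, so test the element; c has already become k by this point

body = replace(body,".kable\n",".keybuhl\n");

else

body = replace(body,"."+temp[i]+"able\n","."+temp[i]+"eybuhl\n"); //assuming whole short words only (table, stable, enable...), prefix kept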

body = replace(body,"able\n","uhbuhl\n"); //This one is either "eybuhl" for a few short words or "uhbuhl" for all others

body = replace(body,"ble\n","buhl\n");

//x's

body = replace(body,".xy",".zi");

body = replace(body,"xious","kSuhs");

//General fixer for suffixes

//body = replace(body,"\n","\n");

//The annoying part is the hodge-podgeness of English. The only workable route may be just to demand phonetic spelling in cases like "Tow"

//Necessary --Moved down to make ease-of-use conversions easier

body = replace(body,"th","T");

body = replace(body,"sh","S");

//body = replace(body,"ch","C"); //took some liberties here, capitalized the C to make room for the c->k/s conversion

body = replace(body,"x","X"); //Consistency - x is really a compound character of ks.

body = replace(body,"qu","ku");

//body = replace(body,"q","ku");

/* body = replace(body,"wa","ua"); //Unnecessary? I think not! I'm not sure why, but no.

body = replace(body,"we","ue");

body = replace(body,"wi","ui");

body = replace(body,"wo","uo");

body = replace(body,"wu","uu"); */

body = replace(body,"w","u"); //exception catcher

if(debug_end_e){

body = replace(body,"e\n","Q\n"); //Just for debugging

body = replace(body,".TQ",".Te");

body = replace(body,".bQ",".be");

body = replace(body,".seQ",".seee");

body = replace(body,".mQ",".me");

body = replace(body,"eQ\n","ee\n");

body = replace(body,"Qy\n","ey\n");

body = replace(body,".hQ",".he");

body = replace(body,".shQ",".she");

}

return body;

}

private static String replace(String body, String target, String sub){

return realReplace("",body,target,sub);

}

private static String realReplace(String sofar, String body, String target, String sub)

{

int target_size = target.length();

int sub_size = sub.length();

/* if((min<Count++)&&(max>Count))

Targets+= target+"_"; */

if(Counting)

{

Count++;

if(target.equals("w"))

System.out.println("Replaces Run: "+Count);

}

//As of 1.8.8.1, '.' and '\n' are only codes for ' '. Spaces will be added before and after every \n, as well as after every period, then removed at the end.

//'.'==' '

if(target.startsWith("."))

return realReplace(sofar, body,(" "+target.substring(1,target_size)),(" "+sub.substring(1,sub_size)));

else if(target.endsWith("\n"))

return realReplace(sofar, body,(target.substring(0,target_size-1)+" "),(sub.substring(0,sub_size-1)+" ")); //space substitution

if(target.endsWith(" "))

if(sofar.length()<=2){ //that took longer than it should have. Anyone who can suggest improvements is welcome to try.

if(target.equals("y "))

System.out.println(target);

if((!sofar.contains("z"))&&(!sofar.contains("l"))){ //I think contains() covers it. It saves time over endsWith() if it stops unnecessary calls to realReplace(), as long as it doesn't cut out possible permutations

if(!sofar.contains("i"))// s->z

if((target_size>=2)&&(target.charAt(target_size-2)!='s')&&(target.charAt(target_size-2)!='z')) //Double-checking s/z

if(target.charAt(target_size-2)=='e')

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z "));

else if((sub_size>=2)&&(sub.charAt(sub_size-2)=='y'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"ez ")); //s->z

else if(((target_size>=2)&&(target.charAt(target_size-2)=='y'))||(target_size<3)) //bug stopper

if((sub_size>=2)&&(sub.charAt(sub_size-2)=='e'))

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"z "));

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-2)+"ies "),(sub.substring(0,sub_size-1)+"iez ")); //s->z

else

body = realReplace(sofar+"z",body,(target.substring(0,target_size-1)+"s "),(sub.substring(0,sub_size-1)+"z ")); //s->z

/* //y

body = realReplace("qqq",body,"ay ","ey "); //stopgap, might want to revisit

body = replace(body,"ey ","ey ");

body = realReplace("qqq",body,"oy ","oi ");

body = realReplace("qqq",body,"uy ","ahy ");

body = realReplace("qqq",body,"y ","ee "); //might need generalized in replace()

body = replace(body,"ty","tahy"); */

//ly, focus on y as of 1.7.4.3 - It might need some work

if(target.equals("sly ")) //special case

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else{

//ly

if((target_size>=5)&&(target.substring(target_size-5,target_size-1).equals("able")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"y "),(sub.substring(0,sub_size-4)+"lee ")); //ably

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='y'))

if((sub_size>=3)&&(sub.substring(sub_size-3,sub_size-1).equals("ee")))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-3)+"ily "),(sub.substring(0,sub_size-3)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-2)+"ily "),(sub.substring(0,sub_size-2)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"pily "),(sub.substring(0,sub_size-1)+"uhlee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"tily "),(sub.substring(0,sub_size-1)+"uhlee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee "));

//y

if((target_size>=2)&&(target.charAt(target_size-2)=='a')) //might need work

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ey "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"y "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='o'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-1)+"i "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='u'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"y "),(sub.substring(0,sub_size-2)+"ahy "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"py "),(sub.substring(0,sub_size-1)+"ee "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"y",body,(target.substring(0,target_size-1)+"ty "),(sub.substring(0,sub_size-1)+"ee "));

else

body = realReplace(sofar+"l",body,(target.substring(0,target_size-1)+"ly "),(sub.substring(0,sub_size-1)+"lee ")); //might not be needed

}

if((!sofar.contains("g"))&&(!sofar.contains("i"))&&(!sofar.contains("r"))){ //covers multiple

if((!target.endsWith("g "))&&(!target.endsWith("gs "))&&(!target.endsWith("gz "))) //leave no base uncovered

if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ie")))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-3)+"ying "),(sub.substring(0,sub_size-1)+"ing ")); //replacing 'ie' before gerund

else if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-2)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ping "),(sub.substring(0,sub_size-1)+"ing "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ting "),(sub.substring(0,sub_size-1)+"ing "));

else if((!target.endsWith("gs "))&&(!target.endsWith("gz "))) //no "ing\n" or s\z at end

body = realReplace(sofar+"g",body,(target.substring(0,target_size-1)+"ing "),(sub.substring(0,sub_size-1)+"ing ")); //no e, presumably ends in consonant

if((!sofar.contains("a"))&&(!sofar.contains("d"))) //ish

if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"pish "),(sub.substring(0,sub_size-1)+"ish "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"tish "),(sub.substring(0,sub_size-1)+"ish "));

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ed")))||(target_size<3))

body = realReplace(sofar+"i",body,(target.substring(0,target_size-1)+"ish "),(sub.substring(0,sub_size-1)+"ish "));

if(!sofar.contains("a")) //able

if((target_size>=2)&&(target.charAt(target_size-2)=='p')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"pable "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if((target_size>=2)&&(target.charAt(target_size-2)=='t')){

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"table "),(sub.substring(0,sub_size-1)+"uhbuhl "));

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

}

else if(((target_size>=3)&&(!target.substring(target_size-3,target_size-1).equals("ly")))||(target_size<3))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(target.equals("fly ")||target.equals("unfly ")) //target always ends with a space at this point

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"uhbuhl "));

else if(((target_size>=4)&&(target.substring(target_size-4,target_size-1).equals("ing")))||(target_size<4))

body = realReplace(sofar+"a",body,(target.substring(0,target_size-1)+"able "),(sub.substring(0,sub_size-1)+"eybuhl "));

}

if((!sofar.contains("g"))&&(!sofar.contains("d"))){ //covers multiple

if(target_size>=2) //d at end

if(target.charAt(target_size-2)=='e')

if((target_size>=3)&&(target.charAt(target_size-3)=='c'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"st "));

else

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed ")); //NOT st

else if(target.charAt(target_size-2)=='s')

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("se")))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"d "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ped "),(sub.substring(0,sub_size-1)+"ed "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ted "),(sub.substring(0,sub_size-1)+"ed "));

else if((target.charAt(target_size-2)!='s')||((target_size>=3)&&(target.substring(target_size-3,target_size-1).equals("ss"))))

body = realReplace(sofar+"d",body,(target.substring(0,target_size-1)+"ed "),(sub.substring(0,sub_size-1)+"ed "));

//er

if(!sofar.contains("r"))

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r "),(sub.substring(0,sub_size-1)+"er ")); //removing 'e'

else if((target_size>=2)&&(target.charAt(target_size-2)=='p'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"per "),(sub.substring(0,sub_size-1)+"er "));

else if((target_size>=2)&&(target.charAt(target_size-2)=='t'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"ter "),(sub.substring(0,sub_size-1)+"er "));

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er "),(sub.substring(0,sub_size-1)+"er "));

}

/* //ate, not bothering with forbiddances - Never mind

if((target_size>=2)&&(target.charAt(target_size-2)=='e'))

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"r\n"),(sub.substring(0,sub_size-1)+"er\n")); //removing 'e'

else

body = realReplace(sofar+"r",body,(target.substring(0,target_size-1)+"er\n"),(sub.substring(0,sub_size-1)+"er\n")); */

//Why do these need to be dealt with here?

//Because these permutations need to be available to figure out which \n grammars to apply

//ed, ish, ly, ing, able, edly, ishly, ably, lying, eding, abling

//Dirty method - add a recursion counter to replace()

//6 max - ed ish ly ing able z

//ablingly, lyingly - 3

//ablinger

//s-z, ly-l, ing-g, d-d, ish-i, able-a

//everything abides i, nothing abides s/l //nevermind, not much likes i either

//a allows l/s/d,

//a forbids a, i

//d forbids d, i

//g forbids d, g, i, a

//i forbids s, g, i, a

//er-r

//r forbids g, i, a, r

//r is forbidden by s, l, g, d

//y-y

//Not messing with forbidding now (1.8.8.2)

//I think that forbiddance is total - no forbidden suffixes at any point before

}

}

for(int i = 0; i<=body.length()-target_size;i++)

{

if(body.substring(i,i+target_size).equals(target))

{

body = body.substring(0,i)+sub+body.substring(i+target_size,body.length());

i+=(sub_size-target_size);

}

}

return body;

}

}
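
In case it helps anyone reading along: the '.' and '\n' markers in the replace() calls are just codes for "start of word" and "end of word". Here's a minimal standalone sketch of that idea, assuming the text has already been padded with spaces the way spaceEnds() does it. The names BoundarySketch, pad and boundaryReplace are made up for this example, and it leaves out all of the suffix handling that realReplace() does.

```java
class BoundarySketch {
    // Hypothetical helper, not part of the posted class: surrounds the text
    // with spaces so the first and last words also have boundaries.
    static String pad(String text) {
        return " " + text + " ";
    }

    // Translates the '.' (word start) and '\n' (word end) codes into spaces,
    // then does a plain left-to-right substitution.
    // Assumption: sub carries the same markers as target.
    static String boundaryReplace(String body, String target, String sub) {
        if (target.startsWith(".")) {
            target = " " + target.substring(1);
            sub = " " + sub.substring(1);
        }
        if (target.endsWith("\n")) {
            target = target.substring(0, target.length() - 1) + " ";
            sub = sub.substring(0, sub.length() - 1) + " ";
        }
        return body.replace(target, sub);
    }

    public static void main(String[] args) {
        String body = pad("who knows whose key");
        body = boundaryReplace(body, "who\n", "hoo\n");   // only the whole word "who"
        body = boundaryReplace(body, ".key\n", ".kee\n"); // only the whole word "key"
        System.out.println(body.trim());
    }
}
```

The thing to watch is that the target and the substitute need the same markers, since one character is clipped off each of them before the spaces go in.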

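Another note on the character filtering: Arrays.binarySearch() only gives meaningful answers when the array is sorted, which the library arrays above happen to be because of the order they're filled in. Below is a rough sketch of the same filter that sorts explicitly and builds the result with a StringBuilder instead of re-slicing the string. WhitelistSketch, buildAllowed and filter are names I made up for illustration, so treat it as one possible way to do it rather than a drop-in replacement.

```java
import java.util.Arrays;

class WhitelistSketch {
    // Builds the table of allowed characters: tab, newline, space, period,
    // and both cases of the Latin alphabet. Sorted explicitly so that
    // binarySearch stays valid even if entries are added out of order later.
    static char[] buildAllowed() {
        StringBuilder sb = new StringBuilder("\t\n .");
        for (char c = 'A'; c <= 'Z'; c++) sb.append(c);
        for (char c = 'a'; c <= 'z'; c++) sb.append(c);
        char[] allowed = sb.toString().toCharArray();
        Arrays.sort(allowed);
        return allowed;
    }

    // Keeps allowed characters, turns '?' and '!' into '.', drops the rest.
    static String filter(String body, char[] allowed) {
        StringBuilder out = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (Arrays.binarySearch(allowed, c) >= 0) {
                out.append(c);
            } else if (c == '?' || c == '!') {
                out.append('.');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(filter("Wait, what? No way!", buildAllowed()));
    }
}
```

Appending to a StringBuilder also avoids the index juggling that comes with deleting characters from the string while looping over it.
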
Edited by Kurkistan