Alien Pranksters
A meteor falls in backwoods Montana. The farmer who discovers it contacts the media, and it immediately becomes a global sensation. It is a perfect shiny sphere with no evidence of damage, covered in strange etchings. Radiography is eventually able to discern what appears to be a book inside. After furious debate, a hole is etched with a laser and the book is extracted.
It is remarkable. The title is etched with some kind of synthetic gemstone that seems to glow with its own light. 512 pages of text are densely written on a seemingly indestructible material, with a lustrous, iridescent ink. The characters are packed on each page, without apparent spaces, and are like no language ever seen. They are hauntingly beautiful. Looking at a page, one's head seems to swim in ethereal eddies of alien thought.
It is universally acknowledged as the most profound artifact ever discovered. The world economy lurches to a near halt as everyone is glued to their phone, waiting for the smallest update. After furious demands and even protests, the entire contents are made available in the public domain. Every professional and amateur linguist drops what they are doing, and the race to decode is on.
135 distinct glyphs are immediately identified, including 20 modifiers that seem to serve as punctuation. Characters, words, and groups of words are picked out, with a language-like distribution of frequencies. Humanity vibrates with excitement; the pace of discovery foretells a breakthrough, always just around the corner.
Over weeks, and then months, enthusiasm slowly transforms to an increasingly grim determination. Where are the prime numbers? The pictograms? Nothing whatsoever, just pages of these beautifully maddening, indecipherable glyphs. For such a seemingly advanced alien race, the lack of any assistance seems to betray a curious absence of mental empathy. One camp claims that the codex, despite its spectacular appearance and delivery, was not intended for a human audience. Another, that the book is a kind of test, and is intentionally inscrutable, so that only the most intelligent species may decode it. An increasing subset of these despair that humanity is doomed to fail it.
Months turn to years, and years into decades. Public interest has long subsided, but the book becomes the basis of an entire academic field. Devotees spend their lives obsessed with it. Multiple religions are born with The Codex as their holy text, meditating daily on elaborate reproductions of its pages. As the decades roll past, increasingly powerful technology is brought to bear on the problem. AI that has long ago transcended AI which itself had long ago usurped the crown of ingenuity from humanity. Quantum computers which make today's look like glorified abaci.
In truth, what some suspected, only half in jest, turned out to be correct. The text was a practical joke played on humanity by a cruel and whimsical alien species. It is complete nonsense, random gibberish, imbued with enough regularity to look like a plausible language, but no more.
The question is this: given enough time and computing power, can humanity eventually "discover" an interpretation that renders the text coherent? While in truth, inventing one out of whole cloth? Or will the text remain indecipherable forever?
Comments (57)
No interpretation. It's not a language.
Language (and what you wrote about is only the appearance of a 'language') has logical steps and intentional word/sound connections. Behind any language, there are minds that want to express an idea or ideas.
This is a hermeneutics question, asking if there is a single correct perspective for linguistic interpretation. The aliens' interpretation scheme is based upon their culture and life and is only valid to earth culture if earth culture adheres to the hermeneutic that we are to interpret alien language as if we were alien.
In this example though we have limited knowledge of alien life to consider.
"To understand a text always means to apply it to ourselves and thus to find its meaning."
Hans-Georg Gadamer, Truth and Method
Maybe. But that is just semantics. "Is it an interpretation or isn't it" is ultimately definitional. I'm interested if meaning can be constructed in noise.
Did you miss
Quoting hypericin
Noise without intention and a look back to build up what's ahead is just...noise.
Are we talking about any interpretation at all? Or specifically one that would comport with what we might expect intelligent aliens (who have decided to communicate with us) to have to say to us?
Any interpretation at all is too permissive; only our alien expectations is too restrictive. What I am asking is, can an incontrovertible message be derived (and in doing so, likely a language)?
*, **, ***, ****, *****
I wonder if intelligent aliens would find this universal, along with demonstrated basic arithmetic operations.
Moving from the universal to local meanings seems like it'd be supremely difficult if not impossible, if alien life is nothing like human life.
It seems to me there must be a kernel of meaning, or perhaps some arbitrary carry-over from the aliens' actual means of written expression, to the codex, for there to be some sort of incontrovertible message to be derived in the codex. That is to say, that across all possible combinations of the arbitrarily created, meaningless characters that could be created according to the potentially spurious linguistic rules, there is a particular kernel of meaning that needs to manifest in just the right combination of characters to create an incontrovertible message - the combination we see in the codex. From there we could perhaps extrapolate some sort of language? I'm not sure.
This kernel of meaning might not even originate with the creation of the codex, but rather be related to the openness of all the possible, valid combinations of the alien characters as a sort of commonly occurring connection arising from some emergent meta-rules.
Quoting hypericin
That's two totally different questions:
1. Can we see the meaning in the text?
2. Would we fool ourselves that a reasoning we imposed on the text was in the text when it was not?
My answer to both is no, probably not (definitely not for me, but I can't speak for everyone).
Seems reasonable to assume it is language and text. But maybe it isn't. But still seems reasonable to assume it was made by something sentient, like us, but maybe not. Until we find a Rosetta Stone, or a decryption key, confirming it is indeed a language at all, or even an artifact of a knowing being, I think most people would never get too far convincing others about the meaning of its language.
Did you see the movie Contact with Jodie Foster? They had a similar alien text problem. The aliens built in a decryption key to help other intelligent species learn the language. Neat movie.
I think we are on the same page. It is not possible to derive a message from noise. But that is just my intuition.
Quoting Fire Ologist
Everything about the presentation screams "language". And note that the aliens embedded language-like statistical patterns into the noise.
Quoting wonderer1
I had indeed heard of it, this is probably the closest real life analog to my post. Epistemically that is, we still don't know what it is (which surprises me).
Quoting Nils Loc
You missed some key parts of the op.
The key was probably contained in the etchings on the sphere, and a key part was where they cut the hole to remove the book.
Out what jelly mould or cake tin was this truth turned? It is sometimes difficult for me to say with certainty even on this site and in English whether some controversial complex science laden post is too hard for me to understand, or too incoherent to be worth reading at all. But in this case, I'm going to go out on a limb and call nonsense.
https://www.poetryfoundation.org/poems/43909/the-hunting-of-the-snark
Isn't imposing a false meaning on the text achievable with a considerable bit of work? It's just mapping a known language/meaning onto a novel set of symbols. The text could probably serve as code for innumerable different meanings. I guess it really depends on the patterns/regularities of the text in question.
Apparently you can encode information in noise. Binary code looks like digital noise. In your story there is the sense that there is no original message in the noise anyway because it is in truth a practical joke, so there is no deriving a message, only imposing/inventing one.
No. Imagine how the symbols are arranged in a language, vs noise. There is a lot of structure and repetition in a language, whereas noise has none.
Quoting Nils Loc
Noise by definition carries no information. Binary code does not look like noise unless it has been perfectly compressed.
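One rough way to make the "structure versus noise" distinction above concrete is to see how well each compresses. This is only a sketch: the sample sentence, the sizes, and the choice of `zlib` as the compressor are all assumptions for illustration, and compressibility is a proxy for statistical structure, not for meaning.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower = more structure)."""
    return len(zlib.compress(data, 9)) / len(data)


# Pseudo-random bytes stand in for "noise" (seeded so the run is repeatable).
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(4096))

# A repetitive, language-like sample (made up for this example).
language_like = b"the aliens wrote a book and the book was a joke. " * 80

print("noise:        ", round(compression_ratio(noise), 3))
print("language-like:", round(compression_ratio(language_like), 3))
```

The random bytes barely compress at all, while the repetitive text shrinks to a small fraction of its size. A codex with language-like frequencies would presumably land somewhere between the two, which says it has regularities but nothing about whether they mean anything.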
Quoting unenlightened
We can with the benefit of omniscience. I wonder if the earthlings ever can.
But your alien text has structure and repetition, plausibly functions like a language as a carrier of information, like the Voynich manuscript. Otherwise it wouldn't be interesting to the experts.
If I was sitting in a classroom in which everyone was talking and I was trying to understand what the lecturer was saying over other discussions, the unwanted interference of other coherent conversations could be considered noise, even if the only thing I could understand was the very thing I didn't want to listen to (the noise).
Noise is relative to the receiver, as what interferes in the transmission/reception of a message. A concern for random noise (if that is how you are defining it) isn't that relevant to your hypothetical text because if it looked like random noise to begin with no one would consider whether it could carry meaning.
Quoting hypericin
With respect to using the text as the basis for the creation of a language, which could possibly make original text arbitrarily coherent in some new meaning, the syntactical/structural content is all that matters. The semantic content is gibberish (or lost) but the syntactic content could be useful and is not random.
In any case, we could use a machine learning expert who is also a linguist to weigh in.
Sorry, let me try again, I might have gotten mixed up in my last attempt to reply to this.
I guess what I am asking is precisely this. CAN a false meaning be imposed on such a text? It seems genuinely unclear to me.
Consider one symbol, A. Literally any meaning can be imposed. Now two symbols, AB. Still, any meaning pair will do. Now repeat those in some pattern:
ABBAAB
Now, meaning already becomes quite constrained. There are only so many values we can assign to A and B such that the string makes sense (for instance, it might be instructions to enter a code to a lock where there are two options). Now consider the codex. 512 pages of words appearing with some probability distribution, and phrases in some probability distribution. But with no underlying semantic content. By page 5 the constraints are already bad, by 512 they are crushing. Can ANY meaning at all be imposed on this thing? It is just not clear to me.
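The way repetition squeezes the options can be sketched with a letter-level analogy, the same trick used on substitution ciphers: a glyph string can only stand for a word with the same repetition pattern. The vocabulary below is a tiny made-up list, and treating glyphs as letters rather than as meaning-bearing units is an assumption for illustration only.

```python
def pattern(seq):
    """Canonical repetition pattern, e.g. 'noon' -> (0, 1, 1, 0)."""
    first_seen = {}
    return tuple(first_seen.setdefault(ch, len(first_seen)) for ch in seq)


# Hypothetical toy vocabulary; a real attempt would use a full dictionary.
VOCAB = ["at", "on", "noon", "deed", "door", "tree", "peep",
         "level", "hello", "banana", "bamboo"]


def candidates(glyphs, vocab=VOCAB):
    """Words whose repetition pattern matches the glyph string."""
    target = pattern(glyphs)
    return [word for word in vocab if pattern(word) == target]


print(candidates("AB"))      # any two-letter word with distinct letters
print(candidates("ABBA"))    # far fewer fit
print(candidates("ABBAAB"))  # nothing in this toy list fits
```

With two glyphs almost anything fits; with six, nothing in this little list does. Scaling the same pressure up to 512 pages of interlocking words and phrases is what makes the question feel so open.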
Right. If we consider avenues of meaning corresponding to one-dimensional strings of information, such as what might unlock a certain combination lock, we can impose meaning somewhat easily on the codex - we just need a corresponding lock or something that will accept the codex as a raw input. However, since you suggest that the codex appears to be written in a language due to probabilistic distributions of characters and phrases, we are inclined to consider different meanings.
Indeed, this is what I proposed in my last comment: if there is a kernel of meaning insofar as a certain combination of the characters could have an incontrovertible meaning, then the kernel of meaning must manifest in the specific combination of characters and phrases we see in the codex. It being a one-dimensional combination/string would simplify this. But it would be incredibly unlikely that this raw input is useful, I think. Alternatively, we could consider it the way you have laid out - as a piece of written communication in a language, which is more difficult to parse.
Therefore, I think that if we could determine if when fragments from the codex are treated as one dimensional strings they derange in predictable patterns - that is to say they are only useful up to a point when tested for being a model for a more straightforward, transparent meaning - then we know that somewhere in there is a statement conveying meaning that subsumes the demands of the corresponding, meaningful one-dimensional string it is being tested to model up to that point.
Think of the combination ABBAB. If we were to say that ABBAB in alien characters means "always eat the pizza crust first, except on Wednesdays", and ABBABA corresponds to an alphabetical combination lock's code, they agree up to the last B in the truncated code. However, if "always eat the pizza crust first, except on Wednesdays and Thursdays" is then evaluated in alien characters because that is the fragment being considered from the codex, and it changes the actual string from ABBAB to ABBABC, then we know that there is disagreement between the lock's code and the meaning of the sentence in words.
This allows us to guess at the meaning of fragments of the codex by logging the valid one-dimensional strings of meaning and then guessing at their potential meaning as written pieces of communication by substituting alien characters with (perhaps arbitrarily assigned) meanings until the agreement with those one-dimensional strings terminates and then repeat the process.
This would require a probabilistic character generator that could differentiate between and calculate both semi-correct but incomplete one-dimensional strings and phrases and fragments of written language, but I think that could be created given the work the Aliens put into the prank.
So yes, given enough time and computing power, a meaning can be imposed on the codex, I think.
The thread isn't really about creating something indecipherable, but that's pretty cool, too.
If it is an encrypted alien message, the task is certainly hopeless. But if it was, then it wouldn't display the language-like regularities I mentioned.
In any case the op is different from decryption. Decryption is about decoding meaning from a signal. The op is about mapping meaning onto nonsense.
I don't follow what you are proposing. What is a "valid one-dimensional string of meaning"?
By "one-dimensional string of meaning" I mean a combination of characters that has a function or a meaning insofar as a one-dimensional string of characters can. That is, for example, things like lock combinations, a series of inputs into a particular algorithm, etc. In the context of the codex, valid one-dimensional strings of meaning would be those strings that model something more complex in terms of fragments from the codex (although imperfectly), and it's pretty open-ended what their function and meaning could be predicated on. However, since we are specifically concerned with the content of a written "language", they would be at least partially predicated on written content.
Ha, I had this idea for a short story, although in my version an alien race was receiving the slowly scrolling text of the whole of Wikipedia through an indestructible but otherwise inert screen. The main question I had thought of was if the text, having no relationship to their world, really meant anything in their context.
In theory, any medium with enough measurable variance can encode any message, with more variance needed to capture more complexity. But seemingly endless amounts of complexity can also be off-loaded to the perceiver. So for instance, in analytic philosophy papers, signs like "S1," "S2," and "S3," will be used as stand-ins for sentences that are themselves high level summaries for very complex ideas. Yet every time we read "S2" in this context, we "unpack" it into a wider meaning. Or, similarly, through good training, thousands of men on a warship might be taught to all respond differently, and to begin completing complex tasks, from a single piercing alarm tone, a totally on or off signal.
To make gibberish not appear to be random noise, it would have to have some structure, and this would be in some sense meaningful, even if it didn't correspond to an alien language. Your artifact would still have information about its production process. It would still have aspects that were invariant, that could be traced and understood. Knowing that the aliens understand linear algebra, etc., seems important enough.
While I think that this is true, we are talking about imposing an incontrovertible meaning on this particular alien text. That means that out of the infinitude of possible messages a given text written in the (statistically simulated) alien language could convey, we have to limit our analysis (at least initially) to deriving a meaning for this one specifically. Or maybe we could use it as a basis for a more complex analysis, although I'm not entirely confident in the method I have proposed.
But what possible combination of characters could have an incontrovertible meaning, given that there is in fact no meaning at all to the codex?
But this is cheating, and would be readily apparent to the community of decoders. You are essentially putting the decoding into the decoder.
Suppose someone came up with an ad-hoc decoder, like
" :rage: :naughty: :heart:" = "Call me Ishmael", when it appears on page 1.
Someone else could come up with a decoder, much smaller than the ad-hoc decoder, that decodes the ad-hoc decoder itself into the ad-hoc decoding.
I see what you are saying, but I'm mostly laying out what conditions would be necessary for an interpretation to be incontrovertible; I'm not saying that such a kernel of meaning exists without prosecution of the problem. Actually, to humanity, this kernel of meaning exists in a sense de facto, even if it must be doubted. Even further, I would say that any endeavor to interpret the text in a meaningful way probably has to assume that the codex could theoretically have a discoverable, incontrovertible meaning, even if it cannot possibly be truly identified - because it is the limiting case.
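The "putting the decoding into the decoder" worry can be sketched by comparing how much has to be written down for each kind of decoder. Everything here is hypothetical: the glyph string, the target sentence, and using JSON length as a crude stand-in for decoder size are assumptions for illustration, not a real measure of linguistic plausibility.

```python
import json


def description_length(decoder):
    """Crude size measure: characters needed to write the decoder as JSON."""
    return len(json.dumps(decoder))


def adhoc_decoder(glyphs, wanted):
    """One rule per position: 'glyph g at position i means character c'."""
    return [[i, g, c] for i, (g, c) in enumerate(zip(glyphs, wanted))]


def substitution_decoder(glyphs, wanted):
    """One rule per glyph, or None if no consistent key spells `wanted`."""
    key = {}
    for g, c in zip(glyphs, wanted):
        if key.setdefault(g, c) != c:
            return None  # same glyph would need two different meanings
    return key


glyphs = "ABBAAB" * 50
wanted = ("Call me Ishmael. " * 18)[:len(glyphs)]

adhoc = adhoc_decoder(glyphs, wanted)
print(description_length(adhoc), "characters of rules for", len(glyphs), "glyphs")
print(substitution_decoder(glyphs, wanted))    # no consistent key exists
print(substitution_decoder("ABBAAB", "noonno"))  # a two-rule key suffices
```

The ad-hoc table can force any text out of any glyphs, but it is longer than the text it "explains", so all the content lives in the decoder. A consistent key is tiny, but it only exists when the glyphs really do have the shape of the target, which is the part the pranksters never supplied.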
Thus, even if we cannot say there is definitely an incontrovertible meaning, I would say that we can approach it from a probabilistic standpoint that might get us close to virtual incontrovertibility. That is to say that if we could, across the distribution of meanings the codex could take on, narrow down the likelihoods of certain interpretations over others, there is probably one that is most likely, although I don't know to what degree, or what degree to which it would have to be the case to be considered the correct interpretation.
Quoting ToothyMaw
I think this is right, and I think coherence is the only criterion we can use to decide likelihood (the fact that these are aliens means we have to make huge allowances for things that don't make sense, which makes this evaluation much more difficult).
I don't understand this assumption. Does every novel have a single incontrovertible meaning? Take for instance idioms/metaphors, which bring forth the issue/conflict of literal versus figurative meanings. Both the following passages are coherent on two levels (?), but they have two different meanings based on whether or not you have knowledge of what the idiomatic content actually means.
[quote=ChatGPT paragraph in Idioms]I decided to bite the bullet and hit the road early, hoping to beat the clock, but when push came to shove, traffic was a whole different ball game. By the time I made it to the office, I was running on fumes, yet I still had to jump through hoops to get the project off the ground. At the end of the day, though, we pulled it off by the skin of our teeth.[/quote]
It gets even more bizarre if you translate foreign idioms:
[quote=ChatGPT paragraph in foreign idioms]I woke up feeling like I had an octopus on my face, but I decided to tie my stomach and head to work. The meeting was chaos: everyone was watering their salad while the boss was trying to give birth to a mountain. When it was my turn to speak, I almost dropped my face, but somehow I managed to hang noodles on everyone's ears. By the end, we were all pressing the cucumber, pretending everything was fine.[/quote]
If the alien codex were an idiomatic prank that was deciphered at a literal level, the meaning would still be lost. This would be compounded by the gulf between what is universal between species and what is hopelessly local and perhaps untranslatable.
Really I should have said "translation", not "meaning". And it is true, not every earth-language translation is the same. What I really meant was, the assumption has to be that the thing isn't War and Peace (in spaaace) and a dietetic guidebook.
Where did I say that? I suppose that my method would, ideally, approach creating a single string of meaning if it were applied over and over again, but I don't think we start with that translation or that it would be absolutely incontrovertible. Furthermore, it could arise out of analysis of the coherence of various possible translations.
Couldn't it be possible that there are actually hundreds to billions of variations of meaning that can be imposed on the codex that satisfy the level of coherence hypericin/humanity is looking for? If this was known to be the likelihood, the meaning of any can be disputed within/against that set of all possibilities. What exactly makes the manufactured meaning of the text incontrovertible? Are we assuming only one meaning can fit the codex?
Like I said in an earlier post:
Quoting ToothyMaw
Good point. If one coherent (whatever that means) interpretation can be produced it seems likely innumerable can be. This will call the legitimacy of all of them into question. There might be advocates of each of them.
This is one logical outcome. However I still intuitively feel that no coherent (whatever that means) translation can ever be produced.
The likelihood of arriving at one meaning might be a consequence of how difficult it is to make the codex coherent though. If you had the set of all possible coherent meanings, which might be numerically staggering, what exactly would help you to pick the "one that is most likely"?
We could just do rote textual analysis by a reader, I guess. Although, that is hardly feasible given the potential multitudes of valid meanings, so I guess we would need some sort of efficient process or algorithm or something. I'll get back to you on that.
It would be kind of like stipulating: "only really big masses can balance this scale" and then measuring various masses on a scale until we find one that gets the closest to balancing the scale and then saying that that mass qualifies as being the closest to being really big.
So, to make it as clear as possible, that means that only an incontrovertible meaning has a 100% chance of being the correct meaning, and every other interpretation has a chance of being correct that aligns with a probability assigned according to how close it is to being incontrovertible.
Suppose there are 10 different civilizations similar to ours in their intelligence/knowledge/life, that all receive the same hoax codex, and the syntactical nature of the codex serves perfectly as any language emptied of original meaning might. Each of these civilizations go to work at imposing meaning onto the script in a way that achieves a compelling level of coherence such that they have, in their expert opinion, reached a stage of incontrovertible meaning, which really just means they've achieved a remarkable coherence/intelligibility that seems indisputable.
What is the likelihood that the meaning of these 10 different efforts in different parts of the universe yield the same understanding? My intuition is that every completed codex would be radically different in meaning, yet perfectly intelligible and complete. The attitude that forms as to why the text's meaning is incontrovertible comes simply from the fact that it is way too difficult to try again afresh on any planet. Therefore there is no absolute incontrovertible meaning of any version, except with regard to all the work already done. It is only deemed incontrovertible because the meaning created "out of whole cloth" works, but that fails to keep in mind what else could work.
Is there any way we can ground our speculation as to whether there are many possible perfect impositions of meaning, or just a few, or only one that works for the codex?
"Incontrovertible" seems far from a rigorous, objective term. It is a "know it when I see it" kind of thing. At one end are completely coherent novels, or the musings of an alien Aristotle. At the other end is gibberish. But between them is a whole hazy spectrum of material that kind of makes sense, if you squint hard enough, make ample allowances for alien references and ways of thinking, and don't pay too much attention to all the contradictions. I suspect that something along these lines would be the best case scenario. Here, one person's "incontrovertible" is another's "horseshit".
But that is only half the problem. The other half is the method by which the translation was achieved. You can imagine a perfectly ad hoc method, like, "XYZ means ABC, when seen on page one". This might yield an "incontrovertible" text: "One million moons ago our 12-eyed ancestors first descended from the trees...", but that is meaningless because the method was bullshit. On the other end, you can have a beautiful, logical grammar. Again, in between these two lies a spectrum of complications, exceptions, and hacks.
Both translation and method have to be evaluated, not one or the other.
Quoting Nils Loc
Or none. But that is the question. Is there a linguist in the house?
I came up with a semi-rigorous way of defining incontrovertibility: if a translation can be modeled by a one-dimensional string or series of strings that do have an incontrovertible meaning, and the linguistic content of the translation would be correct only in the case that the content of that one-dimensional string is 100% correct or realized, then it could be an incontrovertible interpretation. Other interpretations would have probabilities of being correct associated with the likelihood of the one-dimensional strings modelling them being correct or realized both generally and with reference to modeling the text itself in a coherent way.
The method behind finding these translations is beyond me.
Even though it's wasted effort, the feeling that there are important secrets that we might figure out soon might make us more aspirational and hopeful.
I've sometimes had the same thought with [possible hijack incoming] the idea of aliens coming to earth, deciding we aren't that interesting, and leaving immediately, for good. As frustrating as it would be, human progress would skyrocket. Knowing that interstellar travel is doable, practically, and that there are species out there would be tantalizing.
Think Maw is just considering translation from an insufficient sample of text with known (incontrovertible) meaning.
The Rosetta stone would probably be an interesting case to read up on. Modern day Coptic was a vital source for deciphering Egyptian hieroglyphics because of the strong phonetic correspondences between the two languages. If they had the Rosetta stone but spoken Coptic was extinct, would they still be able to crack the hieroglyphics? Possibly not. Coptic furnishes most of the clues to reconstructing the meaning that the Rosetta stone translation does not contain.
Trying to reconstruct a foreign dictionary with just a handful of entries sounds impossible and absurd, as would be finding meaning in the alien codex.
The largest outcome naturally is that we aren't alone, as a 512-page book with obscure writing doesn't form by accident in the universe. The real problem simply is that there's no way of knowing just what "the book" is about or what it is meant for. It can look to us like a book, but that is the only thing we understand. We can just guess, and this makes cracking any code difficult.
Indeed we had to have the Rosetta stone to finally crack the ancient hieroglyphs. Even before that, we could guess what they were saying: praising the greatness of the Pharaohs etc. What else do you write in temples? In this case, people would be having arguments about just what the whole function of the "book" is.
Yes, this is the other side of the coin that I don't think has been mentioned yet. It may be that even if the contents were perfectly meaningful, we would never be able to crack it.
I think if this book arrived today, this would be the case. We would need a Rosetta Stone, or something, to assist, beyond the text itself. This is why I appealed not just to the focused effort of all the world's linguists, but to transhuman AI, and fully matured quantum computers that could evaluate millions of grammars in a second. Would this be enough? Not sure!
But the core premise is that there is no meaning at all in the text.
Quoting Nils Loc
Perhaps. But it would be interesting to see how the Coptic/Egyptian case, minus the Coptic, would have played out had even today's technology been available. With far-future technology, maybe decoding even a (meaningful) alien text would be tractable. It is, after all, ultimately a computational problem: there are only so many grammars that can produce the text, and only so many meanings that can be assigned to individual semantic units. The alien factor would certainly compound the problem, though: there are surely many alien meanings that would have no earthly correspondence and would be impossible to anticipate. There would have to be allowance for a fraction of words with unknowable meanings.
Quoting hypericin
@Nils Loc basically has it. I am suggesting we use a string that is incontrovertible in meaning (yet meaningful independent of any meaning we might assign to the codex) to scaffold interpretations. To start, we would need to find a string that is both incontrovertible in meaning and can model the codex. By "model the codex" I mean for the string and the codex to exist such that they are arranged in an identical combination of characters (whatever they might actually look like or represent for each). Then, if this string is both incontrovertible in meaning and the content of a particular interpretation of the codex hinges on the content of this string being absolutely confirmed in reality, much like a common proposition might be considered to be true, then this interpretation is potentially making a coherent statement about reality by virtue of being both semantically and materially meaningful.
This would work because there is a sort of interface between the meaning of the string and that of the codex that gives an interpretation an indisputable meaning in a virtual sense. I don't know if that qualifies as real incontrovertibility, though.
So if in the codex we encounter,
:smile: :hearts: :smirk: :point: :lol: :wink: :nerd: :love: :roll: :monkey: :nerd:
Are you proposing we can map this to, say, "Dogs are Cute", and then proceed from there, with more mappings?
If not, please ground with a simple example.
Yes, that mapping works, but the process would be more like modeling your string of emojis after an interpretation that says "Dogs are Cute" - although it could be done this way too, I suppose. Furthermore, if that string of emojis were to actually express, say, that "all four-legged animals that bark are cute" in emojis, then we have pretty much successfully executed what I have described and can proceed with more mappings if we so desire.
On average, 40-60% of the distinct words in any written work are used only once. This is a problem for translators if they don't have other works in which the same words appear to help them infer meaning.
An interesting easy exercise would be to scrub any book of its hapax legomena (unique words that appear only once) and see how much meaning is lost for the reader. How much work does the remaining context do to interpret the missing 40-60%?
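The scrubbing exercise is easy to try on any plain-text book. Here is a minimal sketch, assuming a crude regex notion of a "word" and a made-up helper name (`scrub_hapaxes`). It blanks every word that occurs exactly once and reports what share of the distinct words those were:

```python
import re
from collections import Counter

WORD = re.compile(r"[A-Za-z']+")  # assumption: a "word" is a run of letters/apostrophes


def scrub_hapaxes(text, placeholder="____"):
    """Replace every word that occurs exactly once in `text` with a placeholder.

    Returns the scrubbed text and the share of distinct words that were hapaxes.
    """
    counts = Counter(w.lower() for w in WORD.findall(text))
    hapaxes = {w for w, c in counts.items() if c == 1}

    def blank(match):
        word = match.group(0)
        return placeholder if word.lower() in hapaxes else word

    scrubbed = WORD.sub(blank, text)
    share = len(hapaxes) / len(counts) if counts else 0.0
    return scrubbed, share


sample = "The whale is white. The whale is huge."
scrubbed, share = scrub_hapaxes(sample)
print(scrubbed)
print(f"{share:.0%} of distinct words occur only once")
```

Feed it a whole novel instead of the toy sentence and read the result: how much of the plot survives is the interesting part, and the code can't measure that for you.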
If the alien codex was actually a version of English gibberish with fine syntax and was entirely original (had no other copy or translation on Earth), even with a known sentence with incontrovertible meaning, I still believe it's fully untranslatable. The ratio of known meaning to unknown is really vital to the possibility of deciphering/translating language.
J.L. Borges wrote a story inspired by the thought experiment of the set of all possible books given a certain text length and symbol set. The number of combinations exceeds the estimated number of atoms in the universe, and it grows exponentially as the length of text and the symbol set increase. I've always wanted to know about specific qualitative sets within the space of all possible books given those stipulations. Using the English alphabet, what percent of the set of all possible books would be complete and comprehensible for any reader today? This question is unanswerable, but I intuit the proportion is tiny, maybe the number of atoms in the solar system or galaxy out of the number of atoms in the universe. The mystery makes for an itch that can't be scratched.
How big would the set of translation variants of Moby Dick in English be? Can we replace the whale with a small land animal and consider it a variant of Moby Dick?
I think this is far more answerable than my question.
Assume the set of all 100 page books, 1500 characters per page. Ignore punctuation.
In the random case, that is 26^150000, or about 10^212246, vastly, vastly larger than the number of atoms in the observable universe, just 10^80!
Now the coherent subset. Assume an average word length of 5 letters, so about 6 characters per word counting the space that follows it. That's about 150000/6 = 25000 words. Given an incomplete text in a natural language, there are roughly, on average, 35 plausible word choices that may follow (according to ChatGPT, who would know!). So that's about 35^25000, or roughly 10^38602 "coherent" books (I suspect this is generous).
That still dwarfs the number of atoms in the universe, but is utterly dominated by the number of random texts. If the number of possible books was represented by all the atoms in the universe, the number of coherent books would be far, far, far, far less than one atom's worth!
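These exponents are easy to check by working in base-10 logarithms, since the numbers themselves are far too large to be worth printing. A quick sketch, using only the figures from the post (the variable names are mine):

```python
import math

PAGES = 100
CHARS_PER_PAGE = 1500
ALPHABET = 26            # letters only, punctuation ignored
WORDS = 25_000           # word count used in the post
CHOICES_PER_WORD = 35    # plausible next words, the post's rough figure

total_chars = PAGES * CHARS_PER_PAGE

# log10 of 26^150000 and of 35^25000
log10_all = total_chars * math.log10(ALPHABET)
log10_coherent = WORDS * math.log10(CHOICES_PER_WORD)

print(f"all books:      ~10^{round(log10_all)}")
print(f"coherent books: ~10^{round(log10_coherent)}")
print(f"ratio:          ~10^{round(log10_all - log10_coherent)}")
```

The last line is the number that matters: the coherent books are outnumbered by a factor whose exponent alone has six digits.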
:up:
Good luck on gaining any insight into your original problem. Let me know when you've figured it out. :sweat:
I realized I did a horrible job conveying just how tiny the coherent subset of every possible book is. One atom vs. the whole observable universe doesn't nearly do it justice.
Every possible book is represented by all 10^80 atoms in a special universe,
Where every single atom contains a sub-universe containing 10^80 atoms,
And each of those atoms contains a sub-universe containing 10^80 atoms,
And each of those atoms contains a sub-universe containing 10^80 atoms...
(repeat this 2167 more times)
Then one of those atoms represents the coherent subset.
And yet, to enumerate the coherent subset at one atom per book, you would still need a nested universe like this, 482 layers deep!
And of course, that is just for paltry 100-pagers. It gets much, much worse as the page count goes up...
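For anyone who wants to check the nesting depths, or see how they grow with page count, here is a rough sketch. It assumes each layer of nesting multiplies the atom count by 10^80, and it takes 250 words per page from the 25000 words per 100 pages above. The function name is made up for illustration.

```python
import math

ATOMS_LOG10 = 80  # log10 of atoms in the observable universe, as in the post


def nesting_depths(pages, chars_per_page=1500, words_per_page=250):
    """Return (layers separating all books from the coherent ones,
    full layers needed to hold the coherent books at one atom each)."""
    log10_all = pages * chars_per_page * math.log10(26)
    log10_coherent = pages * words_per_page * math.log10(35)
    gap_layers = math.ceil((log10_all - log10_coherent) / ATOMS_LOG10)
    coherent_layers = math.floor(log10_coherent / ATOMS_LOG10)
    return gap_layers, coherent_layers


for pages in (100, 200, 512):
    print(pages, nesting_depths(pages))
```

For 100 pages the first number comes out to 2171, which matches the four layers spelled out above plus the 2167 repeats, and the second comes out to 482.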