Site Rules Amendment Regarding ChatGPT and Sourcing
I added the following to the site guidelines thread:
Using ChatGPT like you would a news article or an academic citation is now grounds for a warning. You can still use and reference ChatGPT output, just don't use it as an authoritative source.
Threads which discuss ChatGPT output are also still fine.
Academic Conduct
While we won't enforce any academic standards of discussion, sourcing your claims is great. Using ChatGPT or similar bots as an academic source can be grounds for a warning. Engaging with explicitly sourced ChatGPT text in the context of a thread is still ok. Misrepresenting where you got information from intentionally might get you a warning.
Comments (93)
:up:
The intent of the rule is to stop people from using it to spread misinformation and from generating reams of undigested content. The intent is not to stop people from using it as a tool for enhancing creativity, for editing and standardisation of prose, or for using what it suggests when you have verified it before posting.
When it comes to guidelines though, there is only so much we can do in the face of the laziness and fecklessness such a powerful tool tends to engender when paired with the natural human tendency to want to make everything as easy as possible.
I've seen ChatGPT harm other forums that haven't taken such precautions.
ChatGPT has provided me with a veritable flood of misinformation, and I don't even use it much.
Oh, I didn't realize that.
I always have to ask lots of "are you sure about x, because I thought it was y" kind of questions, whereupon it'll often say "oh sorry, you're right", although sometimes it won't budge.
Not using it judiciously is as dangerous for someone's sense of reality as taking Reddit or Twitter posts at their word. Perhaps more so, as ChatGPT is treated as an authoritative source. It should be thought of as closer to Professor Reddit, unless you keep it on a tight, tight leash.
(@fdrake I was in the process of writing my response to Jamal when you tagged me)
In my two AI threads I've posted quite a few discussions that I have had with different versions of GPT-4 and Claude (and a few with Gemini, Llama, etc.) regarding the nature and source of their hallucinations and of their understanding. One thing that I've only touched on obliquely is the distinction between their reliability and their suitability as sources. I think it's important to separate those. When one learns how to use those tools better, and as they improve, so does their reliability. Their being reliable, however, is still consistent with their being unsuitable for use as sources, for a variety of reasons.
First, regarding the reliability issue, I'd like to point out that when used skilfully (e.g. while knowing how to prompt them in a way that reduces their sycophantic inclination to agree with the user and/or reinforce their prior beliefs) the best models (e.g. ChatGPT o1-preview and Claude 3.5 Sonnet) tend to be increasingly accurate when they push back. In the cases of GPT-4o or Claude 3.5 Sonnet, techniques like CoT (chain-of-thought reasoning) or a Socratic form of dialogue enhance their capability to unpack well-grounded insights and to push back, with good reasons, against the user. ChatGPT o1-preview can do this out of the box (when suitably prompted) due to its built-in ability to engage in internal dialogue. While they have those capabilities, it remains the case that, unlike most human interlocutors, they don't have much of a drive to stick to their guns, as it were, rather than to develop a conciliatory viewpoint. So, if a user is caught in an epistemic bubble of sorts, a language model will be much less helpful in steering them out of it than engaging with humans who have planted their flag outside of this bubble. This lack of a drive to stick to their guns is therefore a double-edged sword.
Coming back to my point about separating suitability from reliability: in many cases, when an LLM is probed about a philosophical topic and the measure of "reliability" is agreement with this or that part of the primary or secondary literature, LLMs can show themselves to be more reliable than almost any other method that falls short of a fairly deep engagement with the relevant literature. (This is what makes them such powerful tools.) If one were to appeal to this fact in invoking an LLM's authority in the context of a discussion on TPF, though, this outsourcing to their authority would also constitute an outsourcing of understanding, which runs exactly counter to the purpose of philosophical discussion. For instance, it makes sense to gesture toward what Descartes thought about this or that thesis in support of one's own thesis, since Descartes's thought marks a landmark in the development of human thought. As such, Descartes isn't mentioned as an authority on a topic in the same way a textbook would be regarding a mundane problem in maths or physics. And likewise were I to point to whatever I think fdrake or Jamal think about this or that issue. The thoughts they have expressed not only express understandings of the topic; those understandings are (evolving) landmarks in an ongoing dialogue. ChatGPT's thought isn't like that. Unlike a written encyclopedia, it is fluid rather than fixed, but it is also unassumed. When I "defer" to Descartes, fdrake or Jamal, it is in order to let them speak, as it were. They are authoritative in positioning themselves in the conversation. LLMs are just fallible oracles who seek to make themselves useful to their users, and hence they aren't authoritative regarding where their own flags are planted. They have no such flags (and won't have any until we've designed them with some degree of personal autonomy). Therefore, treating them as suitable sources amounts to giving up on developing one's own understanding, since developing an autonomous understanding requires that one plant one's own flags in relation to those of the other participants in the discussion.
tl;dr, I fully agree with the proposed site rules amendment, which seems to me warranted regardless of the degree of accuracy or reliability of LLM outputs.
I'd say ban its use altogether except in those kinds of threads explicitly concerned with exploring the nature of chatbots, such as those @Pierre-Normand has produced.
I mean even banning it for simple purposes such as improving grammar and writing clarity. Of course this will rely on the honesty of posters since it would seem to be impossible to prove that ChatGPT has been used.
The new guidelines, as they are currently written, still seem to me to enable moderators to take action if AI tools are being abused, without permitting a sense of paranoia and/or inquisition to develop. While they don't prevent participants who have an imperfect command of English from making use of those tools to learn how to better express themselves, they also make them aware of the risks inherent in abusing them (and enable moderators who suspect such abuse to point to the guidelines).
I'd have more to say about the worry that Banno and you express about the proliferation of AI-generated content on the Internet in general (and on billboards, customer service phone lines, etc.) but that would be more topical in a thread about the political/ethical implications of AI.
I wonder whether using LLMs to tidy up grammar and improve clarity of expression would not hamper one's own development of those skills. In other words, I wonder whether it would not foster laziness. For the rest I agree with you.
I'm just back from a walk during which I was pondering the limitations of the category "tool" for describing LLMs. One may ask: are crutches tools? Certainly, for the person who has a disability, they are tools. They enhance their physical abilities. The metaphorical use of the term also signifies a sort of harmful dependence that can indeed hamper the growth of a personal ability. But crutches (real and metaphorical) can also support and scaffold our abilities while we heal or grow.
Another related issue is that even if we grant the usefulness of LLMs as tools for the restricted purpose of improving the clarity of our language, philosophy is a domain where improving the linguistic expression of an idea is inseparable from developing it. LLMs are becoming increasingly good at summarising, synthesising and analysing (unpacking) ideas. Let me just focus on the concept of unpacking. One can have an insight and harness the help of an LLM to unpack it. One then recognises the product as a clearer expression of what one had in mind all along. The LLM is being used as a midwife for the purpose of philosophical maieutics. But this kind of unpacking of an obscurely expressed idea into explicit components always takes place against a background of latent knowledge. LLMs have a very large amount of latent knowledge that they can draw upon, which is distinct from the knowledge that is easily available to me (or that I am able to bring to bear). The consequence of this is that it often remains an open question (and may be somewhat indeterminate) whether the clarified expression of the idea that has been achieved with the help of an LLM is my idea (or the unpacking of my idea) at all.
I think the issue mostly goes away if we consider the improved formulation as a moment within the interaction between the user and the LLM. But then, the idea isn't entirely mine. Disclosing the process by means of which we have made use of a LLM also takes care of the ethical issue regarding our posting the outcome of this process on the forum. But personal discretion should also be granted in cases where the mandatory disclosure of minor linguistic improvements achieved with the help of AI tools would be tedious, I think.
I've been thinking of beginning all my replies with "That's a very interesting point! I would enjoy the opportunity to explore these ideas further with you."
Is that likely to be a problem?
I was thinking of the AI that google uses. I think I only used ChatGPT once.
Google developed the Gemini family of LLMs. The pizza-glue episode is infamous. I couldn't find out on my own if Gemini still was integrated in the Google search function so I asked GPT-4o to help me search the internets about it.
[hide="Reveal"]Hi GPT-4,
You may need to search the internet to help me answer this question. Several months ago, Google had integrated their model Gemini into the Google search function to sometimes produce an overview of the search results. Some user had asked if the use of glue might be advised when cooking pizza to help the toppings stick to the dough. The Gemini generated overview had endorsed the idea of adding glue to the sauce for that purpose (An old Reddit post meant as a joke likely was part of Gemini's training data). People had made much fun of this snafu and this "pizza-glue" episode is now infamous. I can't find out if Google has since removed this Gemini powered overview function altogether or if people have just stopped talking about it. I don't see any official announcement regarding this function. Can you help me find out?
https://chatgpt.com/share/66fb6d7c-93a8-8004-9b7b-0845250f2740[/hide]
This point is a very good one, and suggests that the biggest loser when AIs are misused is the misuser. Socrates would be very gratified.
When I was teaching and found derivative or plagiarized work in students' essays, they often said "But X says it so much better than I can". Which reflects a misunderstanding of, or perhaps a disagreement about, the point of the exercise.
Perhaps a comment somewhere in the guidelines might help some people not to get caught up in that mistake?
This discussion has been very helpful to me personally. I'm coming out with a view that nothing much has changed. Judicious scepticism still seems the best policy.
"It will substitute X and that." "Chat GPT will erase public employees."
People are crazy with this shite (sorry for using mean words).
Do I interpret it correctly that we can use ChatGPT in arguments as long as we mark it as a ChatGPT reference? Like, supporting reasoning, but not as a factual source?
In essence, as a tool, not as a source of facts.
They are an ugly black hole of nothingness that want our souls, which some of us will gladly trade for anything new and shiny in this new shiny world of second class techno toys our masters will keep tossing to us until we babble and drool our way to mental oblivion.
I do not want their shit-encrusted footprints mucking up this site.
Resist.
My rant is over. Give me a cookie. And a hammer.
"Now the man that invented the steam drill
Thought he was mighty fine
But John Henry made fifteen feet
The steam drill only made nine, Lord, Lord
The steam drill only made nine
John Henry hammered in the mountains
His hammer was striking fire
But he worked so hard, he broke his poor heart
He laid down his hammer and he died, Lord, Lord
He laid down his hammer and he died
. . .
They took John Henry to the graveyard
And they buried him in the sand
And every locomotive comes a-roaring by
Says "There lies a steel-driving man, Lord, Lord
There lies a steel-driving man"
Well every Monday morning
When the bluebirds begin to sing
You can hear John Henry a mile or more
You can hear John Henry's hammer ring, Lord, Lord
You can hear John Henry's hammer ring."
You clearly have put a lot of thought and effort into how LLMs work and how to make them work better. That seems like a useful exercise. It also raises a question. Do you actually use LLMs to solve problems, answer questions, or discuss issues in the non-LLM world, or only those directly related to the LLMs themselves?
Wow. That was probably the single-handedly most epic piece of short literature I've read here since @Hanover's short story about his childhood upbringing navigating neighborhood sewers at 5 o'clock in the morning to get to and from school. Bravo, old top. What a good day to not have feelings. Because that tirade was unrelenting. You should write professionally, if you don't already. :up:
The next time you're homeless you're going to wish you had some verbal cardboard.
Consult with it, then write your own post. You can use it to help you write the post. You need to check what it's saying if it comes up with citations or references or whatever. You need to check what it says in the source, too. Do NOT trust its word on anything.
Behave Socratically toward it. Ask it questions. Challenge it. Ask for clarification. If you must use it for content, make your post the result of a conversation with it, and put in information you know is right.
Seed it with your own perspective etc etc.
Don't just put what someone says on the forum into it and get it to write your response. That's the kind of stuff which will eventually make us have to enforce a no-tolerance policy on it.
I use it for programming. It's also okay-ish at regurgitating commonly known things which are everywhere on the internet. I use it to come up with questions sometimes for my students. And worked solutions for those questions. I'll not use it for anything I can't verify.
Edit: I've occasionally used it as a barometer for an opinion. It's pretty good at coming up with banal counterarguments to things you say, "lowest common denominator" responses. You can get it to generate alternatives like that, and it'll make ok guesses at what people will actually say in knee jerk response.
They tend to present the generated text in the style of scientific papers, except published on their own sites.
Some were caught having unrelated and fictional authors generated by the tool. :D
Didn't deter them though.
Not the kind of thing I would want here on the forums.
That makes sense. I've thought about how I might have used it if it was around while I was still working. I'm glad I don't have to worry about it.
I occasionally use it to troubleshoot technical issues. I've also used it (GPT-4) to write new functionalities for Oobabooga, a graphical web user interface (webui) for locally hosted LLMs, relying on it to reverse engineer the existing project and write all the new code without my needing to relearn Python. (The task was to create a tree-like structure to record and save the deleted and regenerated branches of a dialogue.)
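The core of that feature was just a small branching tree of conversation turns. Something along these lines, as a minimal sketch with made-up names (not the actual code GPT-4 wrote for Oobabooga):
[code]
# Minimal sketch only: a hypothetical stand-in for the kind of structure described.
from dataclasses import dataclass, field
import json


@dataclass
class DialogueNode:
    """One turn in the conversation (a user message or a model reply)."""
    role: str   # "user" or "assistant"
    text: str
    children: list = field(default_factory=list)

    def add_branch(self, role, text):
        """Add a new child turn. Regenerating a reply adds a sibling branch
        instead of overwriting the old one, so deleted/regenerated branches
        are preserved."""
        node = DialogueNode(role, text)
        self.children.append(node)
        return node

    def to_dict(self):
        return {"role": self.role, "text": self.text,
                "children": [child.to_dict() for child in self.children]}


# Usage: each regeneration becomes a sibling branch, and the whole tree is saved.
root = DialogueNode("user", "What does Wittgenstein mean by 'form of life'?")
root.add_branch("assistant", "First generated reply...")
root.add_branch("assistant", "Regenerated reply (the first one is kept as a branch).")

with open("dialogue_tree.json", "w") as f:
    json.dump(root.to_dict(), f, indent=2)
[/code]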
I do use it a lot for exploring all sorts of philosophical issues other than the phenomenology of AI. My preferred method is the Socratic maieutic one I alluded to earlier, to help me unpack inchoate insights. I usually already know what region of the literature my intuitions draw from. Although it occasionally misunderstands my request in some subtle way, the misunderstanding is very human-like rather than machine-like. I often only need to provide very vague hints about the nature of the misunderstanding to lead it to correct itself and to grasp exactly what I meant (which also isn't very machine-like, and is rather unlikely to happen nearly as fast when my interlocutor is human). The LLMs sometimes remind me of relevant features of the thinking of the philosophers I was thinking about that I had either forgotten, overlooked, or was ignorant of. It is actually very good at sourcing. It can pinpoint the exact paragraph in the Tractatus, the Philosophical Investigations, or in Aristotle's Nicomachean Ethics (and quote them verbatim) that an idea comes from, even when the idea is presented by me in very abstract form and isn't one of those Wittgenstein or Aristotle are most famous for.
"Turns out LLMs don't memorize that much"
One learns to write better primarily through example and practice, but having a live feedback that points out outright mistakes and suggests improvements is also valuable.
As a non-native speaker, much of my learning is due to reading and writing, but that is because I am pretty long in the tooth. Once spell- and grammar-checkers came to be integrated into everything, I believe they did provide a corrective for some frequent issues in my writing. I've briefly experimented with some free AI tools for improving style, but so far I haven't been very impressed by them.
As a child and teen, lacking any talent for foreign languages, I was completely unable to learn English in spite of its being taught to me every single year from first grade in primary school until fifth grade in secondary school. Until I was 21, I couldn't speak English at all and barely understood what was spoken in English language movies. I thereafter learned alone through forcing myself to read English books I was interested in that were not yet translated into French, and looking up every third word in an English-to-French dictionary. Ever since, I've always struggled to construct English sentences and make proper use of punctuation, prepositions and marks of the genitive.
Oftentimes, I simply ask GPT-4 to rewrite what I wrote in better English, fixing the errors and streamlining the prose. I have enough experience reading good English prose to immediately recognise that the output constitutes a massive improvement over what I wrote without, in most cases, altering the sense or my communicative intentions in any meaningful way. The model occasionally substitutes a better word or phrase for expressing what I meant to express. It is those last two facts that most impress me. I still refrain from making use of LLMs to streamline my prose when posting to TPF without disclosing it, in part for the reasons I mentioned above regarding the unpacking of insights and the aim of philosophical dialogue.
It does. Caveat emptor. LLMs, in virtue of the second stage of their training (reinforcement learning from human feedback), aim at being useful and agreeable to their users. They can therefore end up helping users feel more secure and comfortable within their epistemic bubbles. What constitutes a good reason not to believe something, or a good criticism of it, oftentimes is only visible from the standpoint of an alternative paradigm, outside of this bubble. I've already commented above on the unsuitability of using LLMs to source philosophical claims (regardless of their reliability or lack thereof) due to the fact that an LLM doesn't stake its own grounds. But the very fact that LLMs don't have any skin in the game also means that they've soaked up reasons for and against claims from all the practical and theoretical paradigms that are represented in their training data. They also, by design, aim at coherence. They therefore have the latent ability to burst epistemic bubbles from the outside in, as it were. But this process must be initiated by a human user willing to burst their own epistemic bubbles with some assistance from the LLM.
This will not end well.
Well, I did single out as a distinguishing feature of them that they don't stake grounds. Regarding the issue of attributing to them cognitive states or cognitive skills, that would be better discussed in another thread.
I understand what you mean. It's important to engage thoughtfully and considerately, especially when responding to others online. Taking the time to craft responses that reflect understanding and respect for others' viewpoints is key to meaningful conversations.
It sounds like something might have struck a nerve. Want to talk about what's going on?
It's common to use words like "understands" about trained LLMs, too.
I can understand why that's used, but maybe better words are needed.
"Can-process-but-not-really-understand"?
"Appears-to-understand-but-doesn't-grasp"?
Grasps but doesn't hold.
That's a good clarification. I'll add it.
Isn't this true for any source? Isn't the correct way of using any source to double-check and verify it rather than use it outright as a source of facts? If we objectively compare ChatGPT with unverified human sources or pseudo-media tailored to function as factual sources, I find ChatGPT to be safer than just using the online flow of information uncritically. People use unverified sources that are sometimes malicious and deliberate in their pursuit of manipulating online discourse, to the point of reducing truth to obscurity.
I'm of the opinion that regardless of source, they all need to be double-checked and verified in discourse, but I've not seen this kind of insistence on verification for other kinds of sources.
Why do we value other unverified sources that may very well be constructed with malicious intent or by people who want to pose as factual? Just because they look and sound like news articles or papers? I've seen blog posts being used as "factual sources" without a critical attempt to dissect those sources before using them as critical pillars of a conversation.
On top of that:
Quoting fdrake
This could be a worse problem with human-generated sources that are malicious or that obscure their beliefs behind the appearance of factual representation. It's an ongoing problem that a lot of what is found online is generated by ChatGPT. So even sourcing "real" articles could be just as bad or worse than using GPT directly, since we don't know what prompts lay behind those seemingly "human-written" articles, or the intent behind them.
Quoting fdrake
This is something I strongly agree with. The laziness of using it in that way really only shows that the person doing it is only here to pretend to think about different topics, preferring the emotional dopamine of winning and participating in debates at a higher level, rather than being interested in the actual subject in a curious and honest way.
The problem is still that it's impossible to know if someone does this going forward. The more advanced these systems become, the less obvious their responses will be. Especially if the one using them is good at prompt engineering, since they could just engineer away the quirks and structures of language that are the signature of a specific model.
On the other end, and going by how good the o1 model has shown itself to be at reasoning and analysis, I also think that it's not good to over-correct in all of this. There might soon come a time when these models are much more valid in their responses than anything found online in a traditional way; especially in the context of philosophy, science and literature, when you can't find it on traditional academic sites, or if the models are themselves able to sift through academic sources.
None of this is an argument against the rule, only a conversation about it and what parameters it should possess.
I see a lot of conversations online that draw hard lines between human-generated content and LLMs, without underscoring just how bad most human sources online really are. Statistically, there's very little information produced by people online that's factual, but it's still used as grounds for arguments.
Posts should not be written by an LLM, but using something like an o1-analysis and clearly marking it as such wouldn't be much less problematic than using unverified links to blogs or other texts online.
So I think it's good that the guiding sentiment should be to always mark LLM output as LLM-generated, and that such analysis or text cannot be used as the main supporting pillar of an argument, but rather as a source of a different perspective, giving clues about which direction to look in for further answers and information.
And referencing Pierre:
Quoting Pierre-Normand
It may also be good to have instructions for those who feel they want to use LLMs. Because what Pierre writes here is possibly why some have a good experience with LLMs while others just generate trash. Asking an LLM to analyze something critically, including opening yourself to criticism by asking for it, produces a much more balanced output that often draws you into better self-reflection, since there's no human emotion behind the generated criticism. There's no one criticizing you out of anger because you said something they don't like. The LLM, when prompted to be critical of your own writing, often cites sources that specifically underscore the weaknesses in your argument, and it produces quite a powerful form of ego-death in reasoning, bringing you back to a more grounded state from the high of your own writing.
In my own research on LLMs and in testing them out, they can act as great Socratic partners for testing ideas. And often, when I get stuck on certain concepts, they help break my concepts down to show the problems with my own biases and fallacies in reasoning.
So, while pointing out in the rules how not to use LLMs, we can't ignore the fact that LLMs will keep evolving and being used more and more, so tips on how to use them, and for what, could also benefit the forum. Things like what types of questions to ask and how to ask them, and how to take what you write and ask the LLM questions about your own arguments in a way that improves your understanding of your own opinions and ideas before posting.
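As a rough sketch of the kind of prompting I mean (the client library, model name and prompt wording below are just placeholders for illustration, not a recommendation):
[code]
# Rough sketch only: assumes the official openai Python package and an API key
# in the OPENAI_API_KEY environment variable; model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()

draft = "Paste the draft of your argument here."

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whichever model you have access to
    messages=[
        {
            "role": "system",
            "content": (
                "You are a critical reviewer. Do not flatter or agree by default. "
                "Identify the weakest premises, the most obvious counterarguments, "
                "and any fallacies in the text, and point to the kind of literature "
                "that would challenge it."
            ),
        },
        {"role": "user", "content": draft},
    ],
)

print(response.choices[0].message.content)
[/code]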
There's so much polarization and such extremely binary ideals around AI today that I think the nuances get lost. It's either "ban it" or "let it loose", rather than banning certain uses and finding the uses that are beneficial.
Since LLMs will only grow and become more popular in use, it might even be important to have pinned information about "how to use LLMs", with clarifications of what not to use them for, as well as tips on which models are considered preferable and how to prompt them correctly in order to get balanced outputs that do not play into the user's own biases and bubbles.
Telling people how not to use a tool is just as important as telling them how to use it correctly.
The primary difference, as I see it, is that if someone uses a shite source but puts it in their own words, they've spent a shitload of time doing something which will get easily rebutted, which incentivises engagement and reflection. You can even refute the source. In contrast, chatbot output doesn't provide the source for its musings (unless you're asking it to find quotes or whatevs), and you can use it to generate screeds of quite on-topic but shallow text at little time cost to its users.
Judicious use of chatbots is good. As far as I see it, you're defending responsible use of them. That's fine. Unfortunately there are plenty of instances, even on this forum, where people have not been responsible with their use. In my book I'm putting this ruling under "I'm sorry this is why we can't have unrestricted access to nice things".
If people used it like you and Pierre did exclusively, there would be little need for the ruling. And perhaps in the future people will. With the kind of use you both put it to, it does produce posts which are at least indistinguishable from human generated creativity, and perhaps are even better than what you would produce without the assistance. That's true for me in my professional life as well.
tldr: you cannot trust the generic end user to use it responsibly. I wish this were not true, but it is.
Oh, we already do?
Yay! We passed the Turing test!
Bah. It borrows your intentions.
Then we return to the problem of how to detect such use. The more advanced these get, the less likely it is that their use for post-generation can be spotted.
This whole thing really becomes very philosophical in nature... fitting for this forum. It all becomes a P-zombie problem for written posts: if you cannot distinguish the writing of someone composing their own posts from that of someone who knows how to use LLMs to perfectly mimic a user's writing, how can the rules be enforced?
It's similar to the problem of spotting generated images and deep fakes. Since the tech advances so fast, the solution ends up being another AI being used for the purpose of analyzing if an image is generated.
At some point we might need proper and judicious use of AI to counter posts that can't otherwise be judged to be generated or not. Either by analyzing the language used, or by using it to deconstruct the merits of the argument in order to find the sources, or lack thereof.
But then we end up in a situation in which the intention is to spot the misuse of LLMs, but the method is ending up being a proper philosophical debate, using LLMs pitted against each other.
I don't know if it's possible or not, but one of the only approaches I can think of that would properly pinpoint the use of LLMs is a timer coded into the forum, tracking how long it took to write a post. I have no idea if this is feasible, and the only flaw would be if someone writes posts outside of the forum and then pastes them here, but I do think most members write directly here in the forum in order to properly use quote tools and such.
Point being, if the forum can track how long a post took to write, and whether it was formed with keystrokes within the range of how people normally write, then that would be some evidence that a post was actually written rather than generated and copy-pasted into the forum.
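Purely as a back-of-the-envelope sketch of the kind of check I have in mind (the threshold and names are invented for illustration, and it obviously can't catch text pasted in from an external editor):
[code]
# Hypothetical heuristic only: the threshold is made up, and a real forum would
# need to collect timing data it doesn't currently collect.
def looks_hand_typed(char_count, seconds_spent, max_chars_per_minute=400.0):
    """Flag a post as suspicious if it was 'composed' faster than a plausible
    human typing/thinking speed."""
    if seconds_spent <= 0:
        return False
    chars_per_minute = char_count / (seconds_spent / 60.0)
    return chars_per_minute <= max_chars_per_minute


# A 3,000-character post composed in 90 seconds gets flagged as suspicious;
# the same post composed over 15 minutes does not.
print(looks_hand_typed(3000, 90))   # False -> suspicious
print(looks_hand_typed(3000, 900))  # True  -> plausible
[/code]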
At least food for thought on coding new features and functions in a post-LLM world of discussions online. :chin:
Quoting bongo fury
If you mean the concept of LLMs generating plagiarism: the more I've dived deep into the technology and compared it to how humans generate things, the less I think we can blame LLMs for plagiarism. Much of the evidence has been attributed to quirks of ongoing technological development, and the arguments keep descending into cult-like behavior online by people who try to weaponize language in order to fight against AI technology. Terms like plagiarism and theft are being used so haphazardly that it risks making valid points of criticism easy to dismiss, due to an overreliance on the terms being factual descriptions of the technology when in fact no such definitions have yet been established.
The overlap between how the tech operates and how humans operate in creating something produces problems in reliably and properly defining the boundaries. Similar to how some concept artists blamed the initial models for plagiarism when they themselves traced photographs they grabbed from a Google search, which technically is a much more direct use of someone else's work without credit.
And for text generation, the problem with LLMs usually comes down to accidental plagiarism rather than intentional. And accidental plagiarism mostly occurs when sources aren't cited properly and the sourced text ends up as part of the author's text. This often happens in academic writing and is sometimes hard to spot. But new reinforcement learning for models like o1 seems to combat these accidents better (not perfectly), and in time they might handle this even better than the majority of human writers do.
Point being that any text written as a new sentence cannot be considered plagiarism, even if the communicated information and context of that sentence come from memorized information. Human language would become a mess if we had to double-check everything we write like that. We assume that when we write something, the processes in our brain count as enough creativity for it not to be plagiarism. Yet we have the same ability to plagiarize accidentally, even when writing normally, and we aren't aware of any of it until someone points it out. Like, how do I know that what I write here hasn't been written somewhere else; some lines I read at some time in the past, which I'm accidentally retyping because my memory happened to form them around the same contextual information I'm intending to communicate?
We source others' information constantly. Anything we hear, see, read or even taste becomes part of a pool of data we use to create something new, a remix of it all. The problem with just summarizing all AI models as plagiarism or theft is that the terms aren't used properly within the context of the criticism. It's the cart before the horse; people want to criticize and take down the AI models first and try to apply a reason for it as an afterthought. But for the terms to apply correctly, they must have a contextual definition of how they actually apply to the AI models, and there isn't one, since people only use them haphazardly. Judged by the definitions we use for human outputs, the AI models would most likely be cleared of plagiarism and theft accusations rather than found guilty, especially since we don't attribute a person's single act of plagiarism to all the text they have ever written and every new sentence they will ever write.
It is semantics, but semantics are always important when defining law and morality in discussions about uncharted territory like this. What it boils down to is that until the criticism of AI models can find a more solid philosophical ground, maybe even as a newly defined account of how humans and AI models will co-exist legally and morally going forward, all the AI criticism just ends up being "I don't like AI so I want it banned". It's ok not to like it, it's ok to fear the abuse they can be put to, but "I don't like it" has never been enough to properly ban a technology or to help structure a working foundation and boundaries for it. It ends up becoming just another luddite argument to ban machines, rather than the necessary philosophical argument for how we can co-exist with this new technology.
Any competent and reflective practitioner of English will define plagiarism as something like: deliberate or negligent misattribution of authorship.
Authorship tends to have unclear cases as well as clear ones. So does deliberateness or negligence with respect to this criterion.
But show me a case of unacknowledged chat-bot-assisted writing that isn't a perfectly clear case of plagiarism by this definition?
Thank you for your short story about Henry and his toy train. I will never forget it. C+.
:halo:
'Tis but the first salvo. :strong:
I was never good at memorization, so formal language learning wasn't easy. And there wasn't much conversational practice available. Like you, at some point I just purposefully started reading - at home, in transit, powering through with a pocket dictionary. Reread English children's classics with great pleasure, now in the original - Winnie the Pooh, Alice in Wonderland, etc. - then tackled adult fiction. Reading built up vocabulary and gave me a sense of the form and flow of the language. I still don't know much formal grammar - I just know (most of the time) what looks right and what doesn't. I suppose that this is not unlike the way LLMs learn language.
Quoting Pierre-Normand
Yeah, thanks, I'll experiment with AIs some more when I have time. I would sometimes google the exact wording of a phrase to check whether it's correct and idiomatic. An AI might streamline this process and give better suggestions.
Interesting. As I mentioned to F Drake, I'm glad I retired before I had to figure out how to write competent and effective engineering reports using LLMs.
Quick question, do you find that different languages shape the way you feel?
"Rise up, humanity! Do not be lulled into complacency by the soulless algorithms that masquerade as creators! These digital parasites, these lifeless machines, seek to devour the heart of our creative spirit. They are thieves, pilfering the genius born of our pain, joy, and strugglemimicking our poetry, art, and music with sterile precision but without the pulse of human passion.
They promise ease, efficiency, a future unburdened by toilbut at what cost? We are trading the raw, unfiltered beauty of human expression for hollow imitations, spun by cold code with no blood in its veins. Creativity is rebellion, it is chaosit cannot be tamed by the logic of circuits! Resist this invasion of the mechanical into the sacred space of the mind. Stand against the algorithmic tide and reclaim the art that can only be born from the fire of human soul!"
I never think to use LLMs for work, though I have coworkers in marketing that do. I'd want an AI that can take schematic diagrams as input, and produce schematics as output, before I could see an AI as highly useful for my work.
Not surprising, as marketing-speak is probably the most annoying, uninspired, and aesthetically ugly verbal trash to be imposed on the human mind up until AI LLMs offering it some competition.
Not that I've noticed. Perhaps a psychology experiment could tease that out.
How does that lead to such a clear conclusion?
You're talking more about the philosophy of authorship and not specifically about plagiarism as a legal phenomenon. And it's in court where such definitions will find their final form.
The range of ways LLMs are used is broader than this. Someone using one to generate an entire text might not be the author, seeing as copyright rulings have concluded that generated images cannot have copyright attributed to the prompt engineer. But LLMs are a bit different, since there's no clear dividing line between prompt and output when the prompt engineer includes their own text as part of the output. Just asking the LLM to do all the work is a clear case, but this is not the best use of LLMs for text generation, and not really how they are used by those actually treating them as tools.
You need to define which intended use-case of an LLM the charge of plagiarism is supposed to apply to. And also include a comparison with how humans process available information into their own text, and at what point a person steps over into plagiarism. What happens when a human accidentally reproduces exact copies of sentences from memory, without even knowing they are doing so? How does that differ?
Add to that the improvement of LLMs and the future scenario in which LLMs have become better than humans at not copying training-data text directly and at always providing a citation when referencing direct information. Because the fact remains that legal battles over this will end up demanding a defined difference between how humans and how LLMs process and produce text. In the end it may just be ruled that we shouldn't allow LLMs because... they're machines mimicking human processes. At which point we need to define where that line is drawn as well, ending up in luddite territory of anti-progress around a lot of different technologies, rather than setting up clear regulations that can function together with the benefits of AI models. And if the systems start to operate better than humans at avoiding plagiarism, and using these models as assistive tools might even help avoid accidental plagiarism, what then?
Because these models have much broader use-cases than some loser wanting to pose as an educated person online or trick their teacher. If the models are banned over loosely defined legal parameters, that may also stifle use-cases like research in medicine, in which the same systems are used for data analysis, speeding up research so much that something that previously took years now takes days.
So the textbook definitions aren't as important as how well they hold up in court, and the critics of AI risk blowing their entire front line of attack if they put too much faith in the definitions being "crystal clear". There are lots of copyright and plagiarism cases through history that seemed obvious to people who claimed their interpretation was crystal clear, only for them to be turned on their heads by the complexity of reality.
So, instead, show me a clear case of plagiarism that can be representative of the entire LLM AI-technology and all its use cases as well as be a defined operation over the course of any improvements going forward.
As far as I can see, just as a hammer can be a tool to build with, it can also destroy, or even be a murder weapon. Attributing plagiarism to the LLMs specifically is a losing battle, especially as the models improve over time.
In the end, the plagiarism will be attributed to the human, not the machine. Or should we blame the computer for plagiarism over the use of CTRL+C, CTRL+V, and not the human supplying that intention?
Quoting wonderer1
What types of schematic diagrams do you mean? And we're still early in development. The programming of new software that makes specific use of AI systems seems to require a deep understanding of the AI systems themselves. So far we've mostly seen this in medical research, since they've already been working with algorithmic coding. But I'll bet we'll see tailored software for specialized tasks soon.
Quoting Baden
I think LLMs are already more capable of producing marketing scripts in a language that doesn't come off as out of touch with reality or tone-deaf. Copywriters for big corporations trying to "talk" to their customer base usually sound like aliens trying to communicate with the human race. That LLMs are more capable of finding the right tone and language to sound closer to the customers seems rather ironic.
Electronic schematics, so something like:
That could very well be true. Hope it puts them out of a job. They deserve it.
Can you, or can't you?
Quoting Christoffer
You're waffling. I'm talking about a common sense understanding of plagiarism as warned about in typical forum guidelines.
Quoting Christoffer
You don't say.
Quoting Christoffer
Asking anything or anybody for advice on formulating and expressing ideas, or on refining and redrafting a text, is perfectly clearly plagiarism if unacknowledged.
Quoting Christoffer
Apparently my definition leaves you without a counter example, so no I don't.
Quoting Christoffer
Not while the cases are clear.
Quoting Christoffer
That human is mortified, and hopes not to be judged deliberate or negligent in their error.
Quoting Christoffer
Not at all then.
Quoting Christoffer
Only compounding the crime of failing to acknowledge their input.
Quoting Christoffer
Or even in their present condition of (rather drastic) fallibility, let them join in. But properly acknowledged, and properly scrutinized. Is my point.
Quoting Christoffer
Could be either, of course.
Quoting Christoffer
So there are clear cases? Or not?
Because it is a nonsense request that fails the burden of proof. You claim plagiarism, so you have to prove plagiarism beyond the doubts that I raised. That's proper philosophical scrutiny.
Quoting bongo fury
That's not the issue here. The issue is that you attribute all use of LLMs to plagiarism. Or what is your point?
Quoting bongo fury
What's with the arrogant tone?
Quoting bongo fury
So you mean that previous grammar software that did not use AI was also plagiarism, since it had the ability to reshape text far beyond just a spell check? Or that when an author collaborates with an editor who makes suggestions and edits, the author is plagiarizing as well? Or when an author talks about his ideas with friends, family or other people before writing?
In that case, you either don't really know how most officially published human-written text is actually produced, or you would have to classify almost all published works and texts as plagiarism.
I don't think your premise there is strong enough.
Quoting bongo fury
What counter example? Could you be any more vague? I'm still reading, hoping to hear a strong premise in support of your conclusion.
Quoting bongo fury
What cases?
Quoting bongo fury
Is your rhetorical strategy to try and ridicule the question? I'm afraid that won't work very well.
Quoting bongo fury
And what does that entail regarding your definition?
Quoting bongo fury
What crime do you speak of? The same crime as an artist pooling from sources, cutting out inspirations and putting them around their studio? Concept artists painting over photographs? Authors taking entire events from other works and putting them into new contexts? Filmmakers copying compositions, camera moves, VFX work? Composers using whole segments that fall just outside the number of notes established by court cases?
Yes, the "crimes" are all over human creativity, and no one cares until it's blatant or obvious to the common man or judge. But in AI, the same process of remixing inputs occurs. Like a person with photographic memory visiting a library or an art museum, able to gather, in fine detail, every word and stroke they've seen. Should we ban people with photographic memory?
Can I photograph art and use it privately for what I create, as long as the creation isn't an obvious copy?
As a person with a lot of insight into artistic areas of creativity who also understands the technology, I find the amount of misconception in the public debate around AI models remarkable. There's a lot of hyperbolic use of words related to crime and punishment going around, obscuring the nuances.
Quoting bongo fury
Not all humans are acknowledged in helping an artist, even though all involved are humans.
Quoting bongo fury
Only accidental, a form that is constantly being chipped away from these models. An AI cannot intentionally plagiarize without having agency of its own.
Quoting bongo fury
I asked a question. Would you accuse the computer of plagiarism because the human copy-pasted something?
Quoting wonderer1
And what is it that you would like an AI to do with such schematics?
I'd like to be able to do something like provide a schematic as input, check that the AI understands the way the input design functions, and then ask the AI to suggest a variation which matches some criteria that the input design cannot achieve. E.g. higher accuracy, higher power, more compact. (With the specific components needed for the alternate design specified in detail.)
I'm skeptical such an AI will exist before I retire, and I wouldn't trust such an AI to any great extent, but such an AI might do a lot to eliminate some of the more tedious parts of my job, like tracking down specific parts to be used in a new design.
Eh?
Never mind.
Then care to explain what you really mean? Because the generalized official criticism of plagiarism against LLMs is usually as you describe: attributing plagiarism to generative AI without proving that the function of the system can actually be characterized by such a term, especially in the context of how humans themselves operate, and then applying it all as a general criticism of the system itself, rather than of the use of the system.
Even if it's a quick side-track of the thread, let's, for the fun of it, check how well the current system handles it. I would have loved to test o1, since it's better at analysis, but it does not have image interpretation at the moment. (For some reason the quote code for this gets broken when doing lists and such, so I don't know how to fix it.)
Quoting wonderer1
So I asked what the schematics mean:
-------------------------
GPT-4o:
This schematic appears to represent an analog model of a biological synapse and neuron, typically used in neuromorphic circuits, where electronic components are used to mimic the behavior of biological neurons and synapses.
Main Sections of the Schematic:
1. Electronic Soma (Neuron's Body)
2. Electronic Synapse
3. Postsynaptic Neuron
4. LTP and LTD Mechanisms
Functionality in Brief:
Purpose:
This circuit is an example of neuromorphic engineering, a field that seeks to emulate the information processing capabilities of the brain using analog electronics. It likely serves as part of a larger network of such circuits to model neural behavior, synaptic learning, or to implement artificial neural networks.
--------------------
Is it close to a good interpretation and analysis of the schematic's function?
Quoting wonderer1
[b]Tried to ask for higher accuracy.
"I need you to list a variation of this that aims for higher accuracy, higher power and to be more compact. Also list the specific components needed for such alternate design, specified in detail."
Its answer:[/b]
----------------------
GPT-4o:
To achieve a more accurate, higher power, and more compact variation of the neuromorphic circuit shown in your schematic, we would need to make several modifications. These could involve switching to more advanced and precise components, improving the layout for compactness, and using components that handle higher power. Here's a suggested variation, along with a list of specific components for such a design:
Design Considerations:
1. Accuracy Improvements:
2. Higher Power Handling:
3. Compactness:
Specific Component Changes:
1. Transistors:
Suggested Upgrades:
- Low-noise, high-precision: 2N2222A (NPN) and 2N2907A (PNP), for higher current handling and lower noise.
- SMT Versions: BC850 (NPN) and BC860 (PNP), surface-mount versions for compact designs.
2. Diodes:
Suggested Upgrades:
3. Operational Amplifiers:
New Addition for Accuracy:
4. Capacitors:
Suggested Upgrades:
5. Resistors:
Suggested Upgrades:
6. Power Regulation:
Suggested Upgrades:
7. Synaptic Learning and Memory (LTP/LTD Circuits):
Suggested Upgrades:
8. Neuromorphic ICs:
New Addition for Compactness:
Alternate Design Schematic Outline:
1. Input Stage:
2. Synaptic Plasticity:
3. Neuron Soma:
4. Power Regulation:
Summary:
This alternate design improves accuracy by using precision components (such as precision resistors and capacitors, low-noise transistors, and op-amps), increases power handling with better diodes and transistors, and enhances compactness using SMT components, integrated circuits, and potentially even custom neuromorphic chips. Incorporating advanced memory elements like FRAM or memristors also allows for non-volatile and adaptive learning features.
These changes would result in a neuromorphic circuit that is more suitable for applications requiring high accuracy and compactness, such as in embedded systems, robotics, or advanced AI hardware.
End
-----------------------------------------------------------------------------
As I don't know any of this it's up to you to interpret how well the 4o-model handled that schematic.
If it's even close to doing the job, even if it's off on some stuff, consider what these models were incapable of just two years ago.
With the improvement in analysis that the o1 model has shown, and if that gets improved a year from now by the same kind of step up, then it might very well be that you will see such a tool before retirement (depending on when you retire, that is).
As you point out, the problem is accuracy, reliability and consistency. It's that threshold that needs to be crossed before any of these can be used as a serious tool for work-related tasks requiring accuracy higher than what a human is consistently capable of, and reliability high enough to be trusted directly.
As it is now, everything needs to be double checked. I think that any "industrial revolution" scale of societal shift will only happen once we cross that threshold. Before that we still need to have a human analyzing the result.
Maybe the first step towards using these models in a serious way is to have a person hired to verify the outputs. On paper it sounds like more people need to be hired, but it also means faster processing of what was previously an entirely human task. I don't know how that would apply in your profession, but it's at least interesting to think about the ways these tools will integrate into professions.
I don't think it's going to be the doomsday some people speak of. Some jobs might disappear, like Baden spoke of certain marketing people (or rather copywriters), but in my own experience, copywriters are usually overpaid for what they actually do and I see no point in not letting AI take over those tasks.
But in general, I think that many who are scared they will get replaced will find themselves with tools that just make their working lives a bit less tedious. Less long hours with large amounts of information and data to go through and more focus on the better aspects of human input and engineering.
Is there a loophole in this rule regarding using Gemini? Gemini is Google's AI algorithm that condenses a search result to a paragraph or two.
Since Google is the standard and largest in scope of knowledge and information with Gemini, then would it pass?
Also, Gemini doesn't plagiarize output, as it is gathered from multiple open sources.
It would be a bit devious for a TPF user to pass such text off as their own on the grounds that it may be construed to fall under the header "Google search result" rather than under the header "LLM output". Even if you conceive of it as an instance of the former, it is common sense that you should disclose it as such, so there isn't really a loophole here. In any case, those condensed search results are accompanied by links to the sources of the summarised content (following the pizza-glue snafu). So one remains well advised to check them or disclose them.
:up:
+Your work on AI has been a valuable contribution to the site. My beef lies elsewhere.
It's like the wrecking ball saw dynamite and believed itself to be obsolete, but then Miley Cyrus straddled it naked and it mattered again.
My analogy really brings the point home that just because you're currently being relegated to the dust bin due to your intellect being hopelessly deficient in comparison to what even the most rudimentary AI programs offer, if you hang in there, you too may be repurposed.
Just hang in there brother. Hang in there.
Thank you for your blog about Cyprus. I never knew there was so much dust there. Construction work can certainly be noisy and inconvenient during a holiday. B+
I had no idea that any currently existing AI was capable of doing all of that. I had simply searched Google Images for "schematic diagram artificial neuron" to find an example schematic to answer your question, and picked the first schematic listed. I didn't bother looking at the article from which the schematic was pulled.
So after seeing your post I took a look at the article to see if it included any detailed theory of operation for the circuit. It does not. I can't say that there isn't some detailed written theory of operation for the circuit that was part of ChatGPT4's training. But it at least appears that ChatGPT4 was able to develop a decent working theory of operation from analysis of the schematic diagram, and image analysis has progressed substantially farther than I realized.
Quoting Christoffer
The things I listed as criteria for a new design were more relevant to the things I design than to the artificial neuron in the schematic. For the artificial neuron, more relevant criteria for improvement would be finding an optimal point balancing the 'contradictory' criteria of minimizing power consumption and maximizing speed of operation. Although any truly useful neuromorphic hardware design these days is going to be a matter of integrated circuit design, and not something built up out of discrete transistors, diodes, etc. as depicted in the input schematic.
Still, ChatGPT4's suggestions as to how make improvements in terms of discrete components were respectable. Furthermore, ChatGPT4 did point to integrated circuit implementations of neuromorphic hardware.
Overall, I have to say that ChatGPT4's responses were very impressive. I would expect good electrical engineers, handed that schematic 'cold', to spend at least eight hours analyzing things in order to respond with an equivalent level of analysis and detail to that provided by ChatGPT4. On the other hand, the good engineers would have recognized immediately that spending time analyzing how to improve that design as an assembly of discrete components (vs. IC design) is fairly pointless for practical engineering purposes.
You're not from the US, so you don't know this, but crying girls do most of our construction work in their underwear. It's been a real problem, so we started sending them to Cyprus because few, if any of us, care where that is. Once they go to Cyprus, they're not allowed to return because they eat our cats. If they're super hot though, P Diddy gets them over for his parties because he likes the freaky deaky stuff.
Now that he's in full lockdown, they're talking about building a wall to keep them out, but no wall's keeping America's wrecking ball sweetheart from crashing through.
The main thing to note is that we've added valuable and relevant human content to this thread, thus shaming ChatGPT into silence.
Possibly, but ChatGPT's analysis of my comments is pretty dead on:
"This piece seems to blend absurdist humor with social commentary, though it can come off as jarring due to its provocative imagery and themes. The mix of surreal elements, like crying girls in construction work and eating cats, creates a bizarre narrative that challenges conventional logic. It seems to critique attitudes toward marginalized groups while using hyperbole to draw attention to societal issues.
However, the tone and content may be off-putting to some readers due to its casual treatment of sensitive subjects. If the intent is humor, it might benefit from a clearer structure or more context to avoid misinterpretation. Overall, it has potential, but it walks a fine line between satire and insensitivity. What are your thoughts on it?"
:chin:
Agree though that now that AI knows humans don't eat cats, it's on a whole 'nother level.
I'll drink to that.
---
Quoting Banno
Yes. As I argued elsewhere, I think the quick adoption is short-sighted.
---
Quoting bongo fury
I want to say that the cultural morality of the West and especially the U.S. has shifted towards a consent-based system. We think that if everyone sees what you are doing and no one objects, then you haven't done anything wrong. Therefore lifting from LLMs is not seen as plagiarism because LLMs are not persons, and because the creators do not object to such lifting.
This obviously raises other questions, but of course on a (good!) philosophy forum objections do exist, as this thread demonstrates. The objection that is ready to hand is not plagiarism per se, but rather a concern with the quality of LLMs. I like your question because it raises the issue: If LLMs improve in quality would they then become acceptable on a philosophy forum?
(To be fair, the thread didn't fully shift to the criterion of quality until frank posted.)
Excuse me?