
Source code: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
m Reverted 1 edit by 188.236.1.15 (talk) to last revision by ClueBot NG
rewrite from reliable sources
Tag: harv-error
Line 1: Line 1:
{{Short description|Well-structured, human-readable text that encodes behavior of a computer}}
{{Short description|Well-structured, human-readable text that encodes behavior of a computer}}
{{About|the software concept|the film|Source Code}}{{Use dmy dates|date=January 2016}}
{{About|the software concept|the film|Source Code}}{{Use dmy dates|date=January 2016}}
[[File:Hello world c.svg|thumb|Simple [[C (programming language)|C-language]] source code example, a [[procedural programming language]]. The resulting program prints "hello, world" on the computer screen. This first known "[["Hello, World!" program|Hello world]]" [[Snippet (programming)|snippet]] from the seminal book ''[[The C Programming Language]]'' originates from [[Brian Kernighan]] in the [[Bell Labs|Bell Laboratories]] in 1974.<ref name="ctutorial">{{cite web| url = http://cm.bell-labs.com/cm/cs/who/dmr/ctut.pdf| title = Programming in C: A Tutorial |first1=Brian W. |last1=Kernighan |publisher=Bell Laboratories, Murray Hill, N. J. | archive-url = https://web.archive.org/web/20150223025837/http://cm.bell-labs.com/cm/cs/who/dmr/ctut.pdf| archive-date = 23 February 2015| url-status = dead}}</ref><!-- See http://cm.bell-labs.com/cm/cs/who/dmr/ctut.pdf for original.-->]]
[[File:Hello world c.svg|thumb|Simple [[C (programming language)|C-language]] source code example, a [[procedural programming language]]. The resulting program prints "hello, world" on the computer screen. This first known "[["Hello, World!" program|Hello world]]" [[Snippet (programming)|snippet]] from the seminal book ''[[The C Programming Language]]'' originates from [[Brian Kernighan]] in the [[Bell Labs|Bell Laboratories]] in 1974.<ref name="ctutorial">{{cite web| url = http://cm.bell-labs.com/cm/cs/who/dmr/ctut.pdf| title = Programming in C: A Tutorial |first1=Brian W. |last1=Kernighan |publisher=Bell Laboratories, Murray Hill, N. J. | archive-url = https://web.archive.org/web/20150223025837/http://cm.bell-labs.com/cm/cs/who/dmr/ctut.pdf| archive-date = 23 February 2015| url-status = dead}}</ref><!-- See http://cm.bell-labs.com/cm/cs/who/dmr/ctut.pdf for original.-->]]
Line 12: Line 12:
Often, the source code of [[application software]] is not distributed or publicly available since the producer wants to protect their [[Intellectual property]] (IP). But, if the source code is available ([[open source]]), it can be useful to a [[user (computing)|user]], programmer or a [[system administrator]], any of whom might wish to study or modify the program.
Often, the source code of [[application software]] is not distributed or publicly available since the producer wants to protect their [[Intellectual property]] (IP). But, if the source code is available ([[open source]]), it can be useful to a [[user (computing)|user]], programmer or a [[system administrator]], any of whom might wish to study or modify the program.


== Definitions ==
====
The first programmable computers, which appeared at the end of the 1940s,{{sfn|Gabbrielli|Martini|2023|p=519}} were programmed in [[machine language]] (simple instructions that could be directly executed by the processor). Machine language was difficult to debug and was not [[portability (computing)|portable]] between different computer systems.{{sfn|Gabbrielli|Martini|2023|pp=520–521}} Initially, hardware resources were scarce and expensive, while [[human resources]] were cheaper.{{sfn|Gabbrielli|Martini|2023|p=522}} As programs grew more complex, [[programmer productivity]] became a bottleneck. This led to the introduction of [[high-level programming language]]s such as [[Fortran]] in the mid-1950s. These languages [[abstraction (computing)|abstracted]] away the details of the hardware, instead being designed to express algorithms that could be understood more easily by humans.{{sfn|Gabbrielli|Martini|2023|p=521}}{{sfn|Tracy|2021|p=1}} As instructions distinct from the underlying [[computer hardware]], software is therefore relatively recent, dating to these early high-level [[programming languages]] such as [[Fortran]], [[Lisp (programming language)|Lisp]], and [[Cobol]].{{sfn|Tracy|2021|p=1}} The invention of high-level programming languages was simultaneous with the [[compiler]]s needed to translate the source code automatically into machine code that can be directly executed on the [[computer hardware]].{{sfn|Tracy|2021|p=121}}


Source code is the form of code that is modified directly by humans, typically in a high-level programming language. [[Object code]] can be directly executed by the machine and is generated automatically from the source code, often via an intermediate step, [[assembly language]]. While object code will only work on a specific platform, source code can be ported to a different machine and recompiled there. For the same source code, object code can vary significantly—not only based on the machine for which it is compiled, but also based on performance optimization from the compiler.{{sfn|Lin ''et al.''|2001|pp=238-239}}{{sfn|Katyal|2019|p=1194}}
[[Richard Stallman|Richard Stallman's]] definition, formulated in his [[GPL|1989 seminal license]], proposed source code as whatever form in which software is modified:
<blockquote> The "source code" for a work means the preferred form of the work for making modifications to it.<ref>{{Cite web |date=29 June 2007 |title=The GNU General Public License v3.0 |url=https://www.gnu.org/licenses/gpl-3.0.html |url-status=live |archive-url=https://web.archive.org/web/20240115022708/http://www.gnu.org/licenses/gpl-3.0.html |archive-date=Jan 15, 2024 |website=GNU Project |publisher=Free Software Foundation}}</ref></blockquote>

Some classical sources define source code as the text form of programming languages, for example:
<blockquote>
Source code (also referred to as source or code) is the version of software as it is originally written (i.e., typed into a computer) by a human in [[plain text]] (i.e., human readable alphanumeric characters).<ref>{{cite web |url-status=live |website=The Linux Information Project |url=http://www.linfo.org/source_code.html |title=Source Code Definition |archive-url=https://web.archive.org/web/20171003223931/http://www.linfo.org/source_code.html |archive-date=3 October 2017 |orig-date=May 23, 2004 |date=February 14, 2006 }}</ref>
</blockquote>
This reflects the fact that, when [[program translation]] first appeared, the contemporary form of software production was textual programming languages; thus source code was text code while [[assembly language|machine code]] was target code. However, as programming pipelines started to incorporate more intermediate forms, some in languages like JavaScript that could be either source or target, text code stopped being synonymous with source code.

Stallman's definition thus contemplates JavaScript and HTML's source-target ambivalence, as well as contemplating possible future forms of software production, like visual programming languages, or datasets in Machine Learning.<ref>{{cite web|url=https://www.gnu.org/philosophy/free-sw.en.html|title=What is Free Software? |website=GNU |access-date=12 December 2015|archive-date=3 July 2017|archive-url=https://web.archive.org/web/20170703140224/http://www.gnu.org/philosophy/free-sw.en.html|url-status=live}}</ref><ref>{{cite web|last=Stallman |first=Richard |url=https://www.gnu.org/philosophy/javascript-trap.html |title=The JavaScript Trap |publisher=GNU Project |date=2017-11-15 |accessdate=2022-07-20}}</ref>

Other, broader interpretations, however, consider source code to include the machine code along with all the high-level languages that produce it. This definition undoes the original machine/text distinction by considering each step in the program translation to be source code.

<blockquote>For the purpose of clarity "source code" is taken to mean any fully executable description of a software system. It is therefore so construed as to include machine code, very high level languages and executable graphical representations of systems.

<ref>"[http://www.cs.ucl.ac.uk/staff/M.Harman/scam10.pdf Why Source Code Analysis and Manipulation Will Always Be Important]" by [[Mark Harman (computer scientist)|Mark Harman]], 10th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2010). [[Timișoara]], [[Romania]], 12–13 September 2010.</ref><ref>"[http://www.ieee-scam.org/ SCAM Working Conference]". {{Webarchive|url=https://web.archive.org/web/20170929020604/http://www.ieee-scam.org/ |date=29 September 2017 }}.</ref> </blockquote>

This definition allows for much more flexible system analysis, dispensing with the requirement for the designer to collaborate by publishing a convenient form for understanding and modification. It can also be applied to scenarios where a designer is not needed, like DNA.
However, this form of analysis does not take into account that machine-to-machine code analysis is costlier than human-to-machine code analysis.

The earliest programs for [[stored-program computer]]s were entered in binary through the [[front panel]] switches of the computer. This [[first-generation programming language]] had no distinction between source code and [[machine code]].

When IBM first offered software to work with its machines, the source code was provided at no additional charge. At that time, the cost of developing and supporting software was included in the price of the hardware. For decades, IBM distributed source code with its software product licenses, until 1983.<ref>{{cite magazine|magazine=[[Computerworld]]|volume=22|issue=6|date=February 8, 1988|author=Martin Goetz|page=59|title=Object-code only: Is IBM playing fair?|quote=It was in 1983 that IBM reversed its 20-year-old policy of distributing source code with its software product licenses.|url=https://books.google.com/books?id=hSBrPSYgjI4C}}</ref>

Most early computer magazines published source code as [[type-in program]]s.

Occasionally the entire source code to a large program is published as a hardback book, such as ''Computers and Typesetting'', vol. B: ''TeX, The Program'' by [[Donald Knuth]], ''PGP Source Code and Internals'' by [[Philip Zimmermann]], ''PC SpeedScript'' by [[Randy Thompson]], and ''µC/OS, The Real-Time Kernel'' by Jean Labrosse.


== Organization ==
== Organization ==
{{main|Software configuration management}}
Most programs do not contain all the resources needed to run them and rely on external [[software library|libraries]]. Part of the compiler's function is to link these files in such a way that the program can be executed by the hardware.{{sfn|Tracy|2021|pp=122-123}}


The source code which constitutes a [[computer program|program]] is usually in one or more [[text file]]s stored on a computer [[file system]]. A larger [[codebase]] may be organized in a [[Directory (file systems)|directory tree]] known as a ''source tree''. Source code can also be stored in a [[database]], as is common for [[stored procedure]]s, or elsewhere.
[[Image:CodeCmmt002.svg|thumb|right|A more complex [[Java (programming language)|Java]] source code example. Written in [[object-oriented programming]] style, it demonstrates [[boilerplate code]]. With prologue comments indicated in red, inline comments indicated in green, and program statements indicated in blue.]]
[[Image:CodeCmmt002.svg|thumb|right|A more complex [[Java (programming language)|Java]] source code example. Written in [[object-oriented programming]] style, it demonstrates [[boilerplate code]]. With prologue comments indicated in red, inline comments indicated in green, and program statements indicated in blue.]]


Software developers often use [[Software configuration management|configuration management]] to track changes to source code files ([[version control]]). The configuration management system also keeps track of which object code file corresponds to which version of the source code file.{{sfn|O'Regan|2022|pp=230-231, 233, 377}}
A program's source code can be written in multiple programming languages.<ref>{{cite web |title=Extending and Embedding the Python Interpreter<!--mdash is in title that now has "Python 2.7.11 documentation" not: " — Python v2.6 Documentation"--> |url=https://docs.python.org/extending/ |work=Python documentation |access-date=17 August 2014 |archive-date=3 October 2012 |archive-url=https://web.archive.org/web/20121003050336/http://docs.python.org/extending/ |url-status=live }}</ref> For example, it is not uncommon for a program written primarily in [[C (programming language)|C]] to have portions written in [[assembly language]] for optimization purposes.

Some languages allow multiple languages in the same file; for example, a block of assembly can be embedded in a C file.

[[Library (computing)#Linking|Library linking]] allows for components to be written and compiled separately, sometimes in multiple languages, and later integrated into a program. For example, with [[Java (programming language)|Java]], classes are compiled into separate files that are linked together by the interpreter at [[Runtime (program lifecycle phase)|runtime]]. [[Microsoft Windows]] supports programs built from [[Dynamic-link library|DLLs]], each of which can be written in any language that can be compiled to a DLL. Similarly, [[Microsoft .NET]] supports programs built from [[.NET assemblies]], each of which can be written in any [[.NET language]].

A program's [[entry point]] can be an interpreter. The interpreter could be designed for an application-specific, custom language or for a general-purpose language so that the interpreter can be used for multiple applications.
<ref>{{Cite web |title=Interpreter Method |url=http://www.techopedia.com/definition/7793/interpreter |access-date=2022-08-04 |website=Techopedia |date=12 August 2020 |first1=Margaret |last1=Rouse |language=en}}</ref>

Typically, source code is stored in a [[version control system]].

To produce runnable software, a complex [[codebase]] often requires building (compiling, assembling, ...) many source code files {{endash}} hundreds, thousands or more. Instructions for building, such as a [[Makefile]], are often controlled with the source code in the same version control [[Repository (version control)|repository]]. These build instruction files describe the relationships among the source code files and contain information about how they are to be built separately and then combined together.
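
The dependency relationships that build instruction files describe can be sketched as a small graph from which a valid build order is derived. The following is a minimal illustration only, not how any particular build tool works; the file names (<code>main.c</code>, <code>util.h</code>, <code>app</code> and so on) are hypothetical, and the ordering relies on Python's standard <code>graphlib</code> module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical build graph: each target maps to the files it depends on.
DEPENDENCIES = {
    "app": {"main.o", "util.o"},        # the final program links two object files
    "main.o": {"main.c", "util.h"},     # each object file is built from its sources
    "util.o": {"util.c", "util.h"},
}


def build_order(dependencies):
    """Return the files in an order where every dependency precedes its target."""
    return list(TopologicalSorter(dependencies).static_order())


order = build_order(DEPENDENCIES)
```

In any order returned, <code>main.c</code> comes before <code>main.o</code>, which in turn comes before <code>app</code>; files with no mutual dependency may appear in either order.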


== Purposes ==
== Purposes ==
===Estimation===

The number of lines of source code is often used as a metric when evaluating the productivity of computer programmers, the economic value of a code base, [[software development effort estimation|effort estimation]] for projects in development, and the ongoing cost of [[software maintenance]] after release.{{sfn|Foster|2014|pp=249, 274, 280, 305}}
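
As a rough illustration of how such a count might be taken, the sketch below classifies each line of a source text as code, comment or blank. It is a simplification under stated assumptions: the function name <code>count_lines</code> is hypothetical, only whole-line comments starting with a single prefix are recognised, and real line-counting tools handle block comments, strings and multiple languages.

```python
def count_lines(source, comment_prefix="#"):
    """Classify each line of `source` as code, comment or blank and count them."""
    counts = {"code": 0, "comment": 0, "blank": 0}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1
        elif stripped.startswith(comment_prefix):
            counts["comment"] += 1      # whole-line comment
        else:
            counts["code"] += 1         # a trailing inline comment still counts as code
    return counts


SAMPLE = "# greet the user\n\nprint('hi')\nx = 1  # inline note\n"
sample_counts = count_lines(SAMPLE)
```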
Source code is primarily input to a computer process that ultimately results in controlling computer behavior. In other words, it is input to a [[compiler]], [[Interpreter (computing)|interpreter]] or the like.
===Communication===

It is also used to communicate [[algorithm]]s between people {{endash}} e.g., [[code snippets]] online or in books.<ref name=Spinellis>Spinellis, D: ''Code Reading: The Open Source Perspective''. Addison-Wesley Professional, 2003. {{ISBN|0-201-79940-5}}</ref>
It is also used to communicate [[algorithm]]s between people {{endash}} e.g., [[code snippets]] online or in books.<ref name=Spinellis>Spinellis, D: ''Code Reading: The Open Source Perspective''. Addison-Wesley Professional, 2003. {{ISBN|0-201-79940-5}}</ref>


[[Programmer|Computer programmers]] may find it helpful to review existing source code to learn about programming techniques.<ref name=Spinellis/> The sharing of source code between developers is frequently cited as a contributing factor to the maturation of their programming skills.<ref name=Spinellis/> Some people consider source code an expressive [[Media (arts)|artistic medium]].<ref>"''Art and Computer Programming''" [http://www.onlamp.com/pub/a/onlamp/2005/06/30/artofprog.html ONLamp.com] {{Webarchive|url=https://web.archive.org/web/20180220045508/http://www.onlamp.com/pub/a/onlamp/2005/06/30/artofprog.html |date=20 February 2018 }}, (2005)</ref>
[[Programmer|Computer programmers]] may find it helpful to review existing source code to learn about programming techniques.<ref name=Spinellis/> The sharing of source code between developers is frequently cited as a contributing factor to the maturation of their programming skills.<ref name=Spinellis/> Some people consider source code an expressive [[Media (arts)|artistic medium]].<ref>"''Art and Computer Programming''" [http://www.onlamp.com/pub/a/onlamp/2005/06/30/artofprog.html ONLamp.com] {{Webarchive|url=https://web.archive.org/web/20180220045508/http://www.onlamp.com/pub/a/onlamp/2005/06/30/artofprog.html |date=20 February 2018 }}, (2005)</ref>


Source code often contains [[comment (programming)|comment]]s—blocks of text marked for the compiler to ignore. This content is not part of the program logic, but is instead intended to help readers understand the program.{{sfn|Kaczmarek ''et al.''|2018|p=68}}
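
That comments are not part of the program logic can be illustrated by compiling the same statement with and without a comment and comparing the results. This is a sketch specific to the CPython implementation of Python, whose built-in <code>compile</code> function produces bytecode rather than machine code; the helper name <code>bytecode_for</code> is hypothetical.

```python
def bytecode_for(source):
    """Compile Python source text and return the raw bytecode of the module."""
    return compile(source, "<example>", "exec").co_code


with_comment = "total = 2 + 3  # add the two operands\n"
without_comment = "total = 2 + 3\n"

# The comment is discarded during compilation, so both texts yield the same bytecode.
same_bytecode = bytecode_for(with_comment) == bytecode_for(without_comment)
```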
[[Porting]] software to another [[computer platform]] is usually prohibitively difficult and expensive without source code. One possible porting option without source code is [[binary translation]]. Another is emulation of the original platform, although this is often too computationally expensive and runs slowly.
<ref>{{Cite web |title=Software Portability – CodeProject |url=https://www.codeproject.com/Answers/803159/What-is-Portability |access-date=2022-08-04 |website=codeproject.com}}</ref>


Companies often keep the source code confidential in order to hide algorithms considered a [[trade secret]]. Proprietary, secret source code and algorithms are widely used for sensitive government applications such as [[criminal justice]], which results in [[black box]] behavior with a lack of [[Transparency (behavior)|transparency]] into the algorithm's methodology. The result is avoidance of public scrutiny of issues such as bias.{{sfn|Katyal|2019|pp=1186–1187}}
[[Decompilation]] is the process of converting machine code to a more usable form {{endash}} often to [[assembly code]] or [[High-level programming language|high-level language]] source code.
===Modification===
{{see also|Software development|Software maintenance}}
Access to the source code (not just the object code) is essential to modifying it.{{sfn|Katyal|2019|p=1195}} Understanding existing code is necessary to understand how it works{{sfn|Katyal|2019|p=1195}} and before modifying it.<ref name=Offutt>{{cite web |last1=Offutt |first1=Jeff |author1-link=Jeff Offutt |title=Overview of Software Maintenance and Evolution |url=https://cs.gmu.edu/~offutt/classes/437/maintessays/maintEvolutionOverview.html |website=[[George Mason University]] Department of Computer Science |access-date=5 May 2024 |date=January 2018}}</ref> The rate of understanding depends both on the code base as well as the skill of the programmer.{{sfn|Tripathy |Naik|2014|p=296}} Experienced programmers have an easier time understanding what the code does at a high level.{{sfn|Tripathy |Naik|2014|p=297}} [[Software visualization]] is sometimes used to speed up this process.{{sfn|Tripathy |Naik|2014|pp=318-319}}


Many software programmers use an [[integrated development environment]] (IDE) to improve their productivity. IDEs typically have several features built in, including a [[source-code editor]] that can alert the programmer to common errors.{{sfn|O'Regan|2022|p=375}} Modification often includes [[code refactoring]] (improving the structure without changing functionality) and restructuring (improving structure and functionality at the same time). {{sfn|Tripathy |Naik|2014|p=94}} Nearly every change to code will introduce new bugs or unexpected [[ripple effect]]s, which require another round of fixes.<ref name=Offutt/>
[[Software reusability]] describes the practice of using existing software in another software system via a [[software library]]. Some may consider reuse to include adapting source code from one piece of software to another.


[[Code review]]s by other developers are often used to scrutinize new code added to a project.{{sfn|Dooley|2017|p=272}} The purpose of this phase is often to verify that the code meets style and [[maintainability]] standards and that it is a correct implementation of the [[software design]].{{sfn|O'Regan|2022|pp=18, 21}} According to some estimates, code reviews dramatically reduce the number of bugs persisting after [[software testing]] is complete.{{sfn|Dooley|2017|p=272}} Along with software testing that works by executing the code, [[static program analysis]] uses automated tools to detect problems with the source code. Many IDEs support code analysis tools, which might provide metrics on the clarity and maintainability of the code.{{sfn|O'Regan|2022|p=133}} [[Debuggers]] are tools that often enable programmers to step through execution while keeping track of which source code corresponds to each change of state.{{sfn|Kaczmarek ''et al.''|2018|pp=348-349}}
== Legal aspects ==


===Compilation and execution===
{{see also|History of free and open-source software}}
Source code files in a high-level programming language must go through a stage of preprocessing into [[machine code]] before the instructions can be carried out.{{sfn|Tracy|2021|p=121}} After being compiled, the program can be saved as an [[object file]] and the [[Loader (computing)|loader]] (part of the operating system) can take this saved file and [[execution (computing)|execute]] it as a [[process]] on the computer hardware.{{sfn|Tracy|2021|pp=122-123}} Some programming languages use an [[Interpreter (computing)|interpreter]] instead of a compiler. An interpreter converts the program into machine code at [[execution (computing)|run time]], which makes interpreted programs 10 to 100 times slower than compiled ones.{{sfn|O'Regan|2022|p=375}}{{sfn|Sebesta|2012|p=28}}
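
The separation between translating source text and executing the result can be observed in a small sketch. It assumes the CPython implementation of Python, which translates source text into an intermediate bytecode that its virtual machine then executes; this is an analogy to, not an instance of, compilation into native machine code. The helper names <code>run_source</code> and <code>instruction_names</code> are hypothetical.

```python
import dis

SOURCE = "result = 6 * 7\n"


def run_source(source):
    """Translate source text into a code object, execute it, and return its variables."""
    code_object = compile(source, "<example>", "exec")   # translation step
    namespace = {}
    exec(code_object, namespace)                         # execution step
    return namespace


def instruction_names(source):
    """List the names of the bytecode instructions generated for the source text."""
    code_object = compile(source, "<example>", "exec")
    return [instruction.opname for instruction in dis.get_instructions(code_object)]


variables = run_source(SOURCE)
opnames = instruction_names(SOURCE)
```

After the run, <code>variables["result"]</code> holds 42, and <code>opnames</code> lists the low-level instructions (including a <code>STORE_NAME</code>) that stand in for the single line of source text; the exact instruction list varies between Python versions.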


==Quality==
The situation varies worldwide, but in the United States before 1974, software and its source code were not [[copyright]]able and therefore always [[public domain software]].<ref>{{Cite journal|last1=Liu |first1=Joseph P.|last2=Dogan |first2= Stacey L.|date=2005|title=Copyright Law and Subject Matter Specificity: The Case of Computer Software|url=https://lawdigitalcommons.bc.edu/lsfp/536/|journal=New York University Annual Survey of American Law|language=en|volume=61|issue=2 |url-status=dead |archive-url=https://web.archive.org/web/20210625073240/https://lawdigitalcommons.bc.edu/lsfp/536/ |archive-date=Jun 25, 2021}}</ref>
{{main article|Software quality}}


[[Software quality]] is an overarching term that can refer to a code's correct and efficient behavior, its reusability and [[porting|portability]], or the ease of modification.{{sfn|Galin|2018|p=26}} It is usually more cost-effective to build quality into the product from the beginning rather than try to add it later in the development process.{{sfn|O'Regan|2022|pp=68, 117}} Higher quality code will reduce lifetime cost to both suppliers and customers as it is more reliable and [[maintainability|easier to maintain]].{{sfn|O'Regan|2022|pp=3, 268}}{{sfn|Varga|2018|p=12}}
In 1974, the US Commission on New Technological Uses of Copyrighted Works (CONTU) decided that "computer programs, to the extent that they embody an author's original creation, are proper subject matter of copyright".<ref>[http://digitalcommons.law.ggu.edu/cgi/viewcontent.cgi?article=1344&context=ggulrev Apple Computer, Inc. v. Franklin Computer Corporation Puts the Byte Back into Copyright Protection for Computer Programs] {{Webarchive|url=https://web.archive.org/web/20170507231059/http://digitalcommons.law.ggu.edu/cgi/viewcontent.cgi?article=1344&context=ggulrev |date=7 May 2017 }} in Golden Gate University Law Review Volume 14, Issue 2, Article 3 by Jan L. Nussbaum (January 1984)</ref><ref name="sail_book">Lemley, Menell, Merges and Samuelson. ''Software and Internet Law'', p. 34.</ref>


Maintainability is the quality of software enabling it to be easily modified without breaking existing functionality.{{sfn|Varga|2018|p=5}} Following coding conventions such as using clear function and variable names that correspond to their purpose makes maintenance easier.{{sfn|Tripathy |Naik|2014|pp=296-297}} Use of [[conditional loop]] statements only if the code could execute more than once, and eliminating code that will never execute can also increase understandability.{{sfn|Tripathy |Naik|2014|p=309}} Many software development organizations neglect maintainability during the development phase, even though it will increase long-term costs.{{sfn|Varga|2018|p=12}} [[Technical debt]] is incurred when programmers, often out of laziness or urgency to meet a deadline, choose quick and dirty solutions rather than build maintainability into their code.{{sfn|Varga|2018|pp=6-7}} A common cause is underestimates in [[software development effort estimation]], leading to insufficient resources allocated to development.{{sfn|Varga|2018|p=7}} A challenge with maintainability is that many [[software engineering]] courses do not emphasize it.{{sfn|Varga|2018|pp=7-8}} Development engineers who know that they will not be responsible for maintaining the software do not have an incentive to build in maintainability.<ref name=Offutt/>
In the 1983 United States court case ''[[Apple v. Franklin]]'', it was ruled that the same applied to [[object code]] and that the Copyright Act gave computer programs the copyright status of literary works.


== Copyright and licensing ==
In 1999, in the United States court case ''[[Bernstein v. United States]]'' it was further ruled that source code could be considered a constitutionally protected form of [[free speech]]. Proponents of free speech argued that because source code conveys information to programmers, is written in a language, and can be used to share humor and other artistic pursuits, it is a protected form of communication.<ref>{{cite web |url=http://cr.yp.to/export/2002/08.02-bernstein-subst.pdf |title=Info |publisher=cr.yp.to |access-date=2019-12-27 |archive-date=7 June 2011 |archive-url=https://web.archive.org/web/20110607045922/http://cr.yp.to/export/2002/08.02-bernstein-subst.pdf |url-status=live }}</ref><ref>[https://www.eff.org/cases/bernstein-v-us-dept-justice Bernstein v. US Department of Justice] {{Webarchive|url=https://web.archive.org/web/20180404070256/https://www.eff.org/cases/bernstein-v-us-dept-justice |date=4 April 2018 }} on eff.org</ref><ref>[https://www.eff.org/deeplinks/2015/04/remembering-case-established-code-speech EFF at 25: Remembering the Case that established Code as Speech] {{Webarchive|url=https://web.archive.org/web/20180105193250/https://www.eff.org/deeplinks/2015/04/remembering-case-established-code-speech |date=5 January 2018 }} on EFF.org by Alison Dame-Boyle (16 April 2015)</ref>
{{main|Software copyright|Software license}}
{{see also|History of free and open-source software}}


The situation varies worldwide, but in the United States before 1974, software and its source code were not [[copyright]]able and therefore always [[public domain software]].<ref>{{Cite journal|last1=Liu |first1=Joseph P.|last2=Dogan |first2= Stacey L.|date=2005|title=Copyright Law and Subject Matter Specificity: The Case of Computer Software|url=https://lawdigitalcommons.bc.edu/lsfp/536/|journal=New York University Annual Survey of American Law|language=en|volume=61|issue=2 |url-status=dead |archive-url=https://web.archive.org/web/20210625073240/https://lawdigitalcommons.bc.edu/lsfp/536/ |archive-date=Jun 25, 2021}}</ref> In 1974, the US Commission on New Technological Uses of Copyrighted Works (CONTU) decided that "computer programs, to the extent that they embody an author's original creation, are proper subject matter of copyright".<ref>[http://digitalcommons.law.ggu.edu/cgi/viewcontent.cgi?article=1344&context=ggulrev Apple Computer, Inc. v. Franklin Computer Corporation Puts the Byte Back into Copyright Protection for Computer Programs] {{Webarchive|url=https://web.archive.org/web/20170507231059/http://digitalcommons.law.ggu.edu/cgi/viewcontent.cgi?article=1344&context=ggulrev |date=7 May 2017 }} in Golden Gate University Law Review Volume 14, Issue 2, Article 3 by Jan L. Nussbaum (January 1984)</ref><ref name="sail_book">Lemley, Menell, Merges and Samuelson. ''Software and Internet Law'', p. 34.</ref>
=== Licensing ===
{{main article|Software license}}
{{quotebox|title=Copyright notice example:<ref>{{cite web |url=https://www.apache.org/licenses/LICENSE-2.0 |title=License |publisher=www.apache.org |access-date=2019-12-27 |archive-date=23 September 2015 |archive-url=https://web.archive.org/web/20150923172828/http://www.apache.org/licenses/LICENSE-2.0 |url-status=live }}</ref>|
Copyright [yyyy] [name of copyright owner]


[[Proprietary software]] is rarely distributed as source code.{{sfn|Boyle|2003|p=45}} Although the term [[open-source software]] literally refers to [[source-available software|public access to the source code]],{{sfn|Morin ''et al.''|2012|loc=Open Source versus Closed Source}} open-source software has additional requirements: free redistribution, permission to modify the source code and release derivative works under the same license, and nondiscrimination between different uses—including commercial use.{{sfn|Sen ''et al.''|2008|p=209}}{{sfn|Morin ''et al.''|2012|loc=Free and Open Source Software (FOSS) Licensing}} The free [[software reuse|reusability]] of open-source software can speed up development.{{sfn|O'Regan|2022|p=106}}
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

: <nowiki>http://www.apache.org/licenses/LICENSE-2.0</nowiki>

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
|width=300px}}
<!-- FIXME: Since not a quote, different template should be used... -->

An author of a [[Threshold of originality|non-trivial work]] like software<ref name="sail_book"/> has several [[exclusive right]]s, among them the copyright for the source code and [[object code]].<ref name="fsm">{{cite web |url=http://www.freesoftwaremagazine.com/articles/what_if_copyright_didnt_apply_binary_executables |title=What if copyright didn't apply to binary executables? |date=2008-08-29 |first=Terry |last=Hancock |publisher=[[Free Software Magazine]] |access-date=2016-01-25 |archive-date=25 January 2016 |archive-url=https://web.archive.org/web/20160125013542/http://www.freesoftwaremagazine.com/articles/what_if_copyright_didnt_apply_binary_executables |url-status=dead }}</ref> The author has the right to grant customers and users of the software some of these exclusive rights in the form of [[software licensing]]. Software, and its accompanying source code, can be associated with several licensing paradigms; the most important distinction is [[free software]] vs [[proprietary software]]. This is done by including a [[copyright notice]] that declares licensing terms. If no notice is found, then the default of ''[[All rights reserved]]'' is implied.

Generally speaking, software is free software if its users are free to use it for any purpose, study and change its source code, give or sell exact copies of it, and give or sell modified copies of it. Software is ''proprietary'' if it is distributed while the source code is kept secret, or is privately owned and restricted. One of the first software licenses to be published and to explicitly grant these freedoms was the [[GNU General Public License]] in 1989; the [[BSD license]] is another early example from 1990.

For proprietary software, the provisions of the various copyright laws, [[trade secret|trade secrecy]] and [[patent]]s are used to keep the source code closed. Additionally, many pieces of [[retail software]] come with an [[end-user license agreement]] (EULA) which typically prohibits [[decompilation]], [[reverse engineering]], analysis, modification, or circumventing of [[copy protection]]. Types of source code protection—beyond traditional [[compiler|compilation]] to [[object code]]—include code encryption, [[code obfuscation]] or [[code morphing]].

== Quality ==
{{main article|Software quality}}
The way a program is written can have important consequences for its maintainers. [[Coding conventions]], which stress [[readability]] and some language-specific conventions, are aimed at the maintenance of the software source code, which involves [[debugging]] and [[Patch (computing)|updating]]. Other priorities, such as the speed of the program's execution, or the ability to compile the program for multiple architectures, often make code readability a less important consideration, since code ''quality'' generally depends on its ''purpose''.


== See also ==


=== Sources ===
{{refbegin|indent=yes}}
* (VEW04) "Using a Decompiler for Real-World Source Recovery", M. Van Emmerik and T. Waddington, the ''Working Conference on Reverse Engineering'', [[Delft]], [[Netherlands]], 9–12 November 2004. [https://web.archive.org/web/20060108153532/http://www.itee.uq.edu.au/~emmerik/experience_long.pdf Extended version of the paper].
*{{cite book |last1=Ablon |first1=Lillian |last2=Bogart |first2=Andy |title=Zero Days, Thousands of Nights: The Life and Times of Zero-Day Vulnerabilities and Their Exploits |date=2017 |publisher=Rand Corporation |isbn=978-0-8330-9761-3 |language=en|url=https://www.rand.org/content/dam/rand/pubs/research_reports/RR1700/RR1751/RAND_RR1751.pdf}}

*{{cite book |last1=Campbell-Kelly |first1=Martin |last2=Garcia-Swartz |first2=Daniel D. |title=From Mainframes to Smartphones: A History of the International Computer Industry |date=2015 |publisher=Harvard University Press |isbn=978-0-674-28655-9 |language=en}}
*{{cite book |last1=Daswani |first1=Neil|authorlink=Neil Daswani |last2=Elbayadi |first2=Moudy |title=Big Breaches: Cybersecurity Lessons for Everyone |date=2021 |publisher=Apress |isbn=978-1-4842-6654-0}}
*{{Cite book |last=Dooley |first=John F. |title=Software Development, Design and Coding: With Patterns, Debugging, Unit Testing, and Refactoring |date=2017 |publisher=Apress |isbn=978-1-4842-3153-1 |language=en}}
*{{cite book |last1=Foster |first1=Elvis C. |title=Software Engineering: A Methodical Approach |date=2014 |publisher=Apress |language=en|isbn= 978-1-4842-0847-2}}
*{{cite book |last1=Galin |first1=Daniel |title=Software Quality: Concepts and Practice |date=2018 |publisher=John Wiley & Sons |isbn=978-1-119-13449-7 |language=en}}
*{{cite book |last1=Haber |first1=Morey J. |last2=Hibbert |first2=Brad |title=Asset Attack Vectors: Building Effective Vulnerability Management Strategies to Protect Organizations |date=2018 |publisher=Apress |isbn=978-1-4842-3627-7 |language=en}}
*{{cite book |last1=Kaczmarek |first1=Stefan |last2=Lees |first2=Brad |last3=Bennett |first3=Gary |last4=Fisher |first4=Mitch |title=Objective-C for Absolute Beginners: iPhone, iPad and Mac Programming Made Easy |date=2018 |publisher=Apress |isbn=978-1-4842-3428-0 |ref={{sfnref|Kaczmarek ''et al.''|2018}} |language=en}}
*{{cite journal |last1=Katyal |first1=Sonia K. |title=The Paradox of Source Code Secrecy |journal=Cornell Law Review |date=2019 |volume=104 |pages=1183 |url=https://heinonline.org/HOL/LandingPage?handle=hein.journals/clqv104&div=32&id=&page=}}
*{{cite book |last1=Kitchin |first1=Rob |last2=Dodge |first2=Martin |title=Code/space: Software and Everyday Life |date=2011 |publisher=MIT Press |isbn=978-0-262-04248-2 |language=en}}
*{{cite journal |last1=Lin |first1=Daniel |last2=Sag |first2=Matthew |last3=Laurie |first3=Ronald S. |title=Source Code versus Object Code: Patent Implications for the Open Source Community |journal=Santa Clara Computer and High Technology Law Journal |date=2001 |volume=18 |pages=235 |url=https://heinonline.org/HOL/LandingPage?handle=hein.journals/sccj18&div=16&id=&page=|ref={{sfnref|Lin et al.|2001}}}}
*{{cite journal |last1=Morin |first1=Andrew |last2=Urban |first2=Jennifer |last3=Sliz |first3=Piotr |title=A Quick Guide to Software Licensing for the Scientist-Programmer |journal=PLOS Computational Biology |date=2012 |volume=8 |issue=7 |pages=e1002598 |doi=10.1371/journal.pcbi.1002598 |url=https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002598 |language=en |issn=1553-7358|ref={{sfnref|Morin et al.|2012}}}}
*{{cite book |last1=O'Regan |first1=Gerard |title=Concise Guide to Software Engineering: From Fundamentals to Application Methods |date=2022 |publisher=Springer Nature |isbn=978-3-031-07816-3 |language=en}}
*{{cite journal | last=Sen | first=Ravi | last2=Subramaniam | first2=Chandrasekar | last3=Nelson | first3=Matthew L. | title=Determinants of the Choice of Open Source Software License | journal=Journal of Management Information Systems | publisher=Informa UK Limited | volume=25 | issue=3 | year=2008 | issn=0742-1222 | doi=10.2753/mis0742-1222250306 | pages=207–240|ref={{sfnref|Sen et al.|2008}}}}
* {{cite book |last1=Sebesta |first1=Robert W. |title=Concepts of Programming Languages |date=2012 |publisher=Addison-Wesley |isbn=978-0-13-139531-2 |edition=10 |language=en}}
*{{cite book |last1=Tracy |first1=Kim W. |title=Software: A Technical History |date=2021 |publisher=Morgan & Claypool Publishers |isbn=978-1-4503-8724-8 |language=en}}
*{{cite book |last1=Tripathy |first1=Priyadarshi |last2=Naik |first2=Kshirasagar |title=Software Evolution and Maintenance: A Practitioner's Approach |date=2014 |publisher=John Wiley & Sons |isbn=978-0-470-60341-3 |language=en}}
*{{cite book |last1=Varga |first1=Ervin |title=Unraveling Software Maintenance and Evolution: Thinking Outside the Box |date=2018 |publisher=Springer |isbn=978-3-319-71303-8 |language=en}}
{{refend}}
== External links ==
{{Wiktionary|code|source code}}
{{Commons category}}
* [http://www.linfo.org/source_code.html Source Code Definition] by The Linux Information Project (LINFO)
* {{cite web| title=Obligatory accreditation system for IT security products
|date=22 September 2008
|quote=will introduce rules requiring foreign firms to disclose secret information about digital household appliances and other products from May next year, the [[Yomiuri Shimbun]] said, citing unnamed sources. If a company refuses to disclose information, China would ban it from exporting the product to the Chinese market or producing or selling it in China, the paper said. <!--|access-date=24 April 2009
Text was "may start from May 2009, reported by Yomiuri on 2009-04-24."-->|url=http://www.metafilter.com/75061/Obligatory-accreditation-system-for-IT-security-products |publisher=MetaFilter.com}}
* [http://rosettacode.org/wiki/Main_Page Same program written in multiple languages]


{{Authority control}}

Revision as of 13:59, 31 May 2024

Simple C-language source code example, a procedural programming language. The resulting program prints "hello, world" on the computer screen. This first known "Hello world" snippet from the seminal book The C Programming Language originates from Brian Kernighan in the Bell Laboratories in 1974.[1]

In computing, source code, or simply code or source, is text (usually plain text) that conforms to a human-readable programming language and specifies the behavior of a computer. A programmer writes code to produce a program that runs on a computer.

Since a computer, at base, only understands machine code, source code must be translated before the computer can use it, and this can be done in several ways depending on the available technology. Source code can be converted by a compiler or an assembler into machine code that can be directly executed. Alternatively, source code can be processed without conversion to machine code by an interpreter, whose own machine code performs the actions that the source code prescribes. Other technology (e.g. bytecode) combines both mechanisms: the source code is converted to an intermediate form that is often not human-readable but is also not machine code, and an interpreter executes that intermediate form.
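The bytecode approach can be illustrated with a minimal sketch in Python, whose reference implementation (CPython) happens to work this way. The variable names and the `"<example>"` file label are illustrative assumptions, and the exact bytecode listing differs between Python versions.

```python
import dis

# Source code: plain, human-readable text.
source = "total = 2 + 3 * 4\n"

# Step 1: translate the source text into a code object that holds bytecode,
# an intermediate form that is neither source code nor machine code.
code_object = compile(source, "<example>", "exec")

# Step 2: the interpreter executes the intermediate form.
# Names defined by the program end up in this dictionary.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])  # 14

# Optional: list the bytecode instructions in readable form.
dis.dis(code_object)
```

The raw bytes in `code_object.co_code` are what the interpreter actually processes; `dis.dis` only renders them for human inspection.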

Most languages allow for comments. The programmer can add comments to document the source code for themselves and for other programmers reading the code. Comments cannot be represented in machine code and are therefore ignored by compilers, interpreters and the like.
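As a hedged illustration of comments being ignored, the following Python sketch parses two versions of the same program, one with comments and one without, and compares the resulting syntax trees. The sample programs and the helper name `syntax_tree` are assumptions made up for this example.

```python
import ast

with_comments = """
# Add two numbers.
result = 1 + 2  # the sum
"""

without_comments = """
result = 1 + 2
"""

def syntax_tree(source):
    """Return a textual dump of the parsed program.

    ast.dump leaves out line and column positions by default,
    so only the program structure is compared.
    """
    return ast.dump(ast.parse(source))

# The parser discards comments, so both programs have the same structure.
same_tree = syntax_tree(with_comments) == syntax_tree(without_comments)
print(same_tree)  # True
```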

Often, the source code of application software is not distributed or publicly available since the producer wants to protect their intellectual property (IP). However, if the source code is available (open source), it can be useful to a user, programmer or system administrator, any of whom might wish to study or modify the program.

Background

The first programmable computers, which appeared at the end of the 1940s,[2] were programmed in machine language (simple instructions that could be directly executed by the processor). Machine language was difficult to debug and was not portable between different computer systems.[3] Initially, hardware resources were scarce and expensive, while human resources were cheaper.[4] As programs grew more complex, programmer productivity became a bottleneck. This led to the introduction of high-level programming languages such as Fortran in the mid-1950s. These languages abstracted away the details of the hardware, instead being designed to express algorithms that could be understood more easily by humans.[5][6] As instructions distinct from the underlying computer hardware, software is therefore relatively recent, dating to these early high-level programming languages such as Fortran, Lisp, and Cobol.[6] The invention of high-level programming languages was simultaneous with the compilers needed to translate the source code automatically into machine code that can be directly executed on the computer hardware.[7]

Source code is the form of code that is modified directly by humans, typically in a high-level programming language. Object code can be directly executed by the machine and is generated automatically from the source code, often via an intermediate step, assembly language. While object code will only work on a specific platform, source code can be ported to a different machine and recompiled there. For the same source code, object code can vary significantly—not only based on the machine for which it is compiled, but also based on performance optimization from the compiler.[8][9]

Organization

Most programs do not contain all the resources needed to run them and rely on external libraries. Part of the compiler's function is to link these files in such a way that the program can be executed by the hardware.[10]

A more complex Java source code example. Written in object-oriented programming style, it demonstrates boilerplate code. With prologue comments indicated in red, inline comments indicated in green, and program statements indicated in blue.

Software developers often use configuration management to track changes to source code files (version control). The configuration management system also keeps track of which object code file corresponds to which version of the source code file.[11]
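At its core, tracking changes means recording how one version of a file differs from the next. The sketch below uses Python's standard `difflib` module to produce such a difference in the unified format that many version control tools display; the file names and file contents are hypothetical.

```python
import difflib

# Two versions of the same (hypothetical) source file, as lists of lines.
old_version = ["def greet():\n", "    print('hello')\n"]
new_version = ["def greet():\n", "    print('hello, world')\n"]

# Compute a unified diff: lines starting with '-' were removed,
# lines starting with '+' were added.
diff = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="a/greet.py",
        tofile="b/greet.py",
    )
)
print("".join(diff))
```

A real configuration management system stores a history of such changes along with metadata such as author and date, which this sketch does not attempt to model.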

Purposes

Estimation

The number of lines of source code is often used as a metric when evaluating the productivity of computer programmers, the economic value of a code base, effort estimation for projects in development, and the ongoing cost of software maintenance after release.[12]
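A minimal sketch of one way to count lines of source code follows. It assumes a single convention (skip blank lines and whole-line comments beginning with a given prefix); real metrics tools use differing and more detailed rules, and the function name `count_source_lines` is made up for this example.

```python
def count_source_lines(source: str, comment_prefix: str = "#") -> int:
    """Count lines that are neither blank nor whole-line comments."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count

sample = '''# prologue comment

def greet(name):
    # inline comment
    return "hello, " + name
'''

print(count_source_lines(sample))  # 2
```

Because the result depends on the counting convention and on coding style, figures from different tools or code bases are not directly comparable.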

Communication

Source code is also used to communicate algorithms between people – e.g., code snippets online or in books.[13]

Computer programmers may find it helpful to review existing source code to learn about programming techniques.[13] The sharing of source code between developers is frequently cited as a contributing factor to the maturation of their programming skills.[13] Some people consider source code an expressive artistic medium.[14]

Source code often contains comments—blocks of text marked for the compiler to ignore. This content is not part of the program logic, but is instead intended to help readers understand the program.[15]

Companies often keep the source code confidential in order to hide algorithms considered a trade secret. Proprietary, secret source code and algorithms are widely used for sensitive government applications such as criminal justice, which results in black box behavior with a lack of transparency into the algorithm's methodology. The result is avoidance of public scrutiny of issues such as bias.[16]

Modification

Access to the source code (not just the object code) is essential to modifying it.[17] Understanding existing code is necessary to learn how it works[17] and is a prerequisite to modifying it.[18] The rate of understanding depends both on the code base and on the skill of the programmer.[19] Experienced programmers have an easier time understanding what the code does at a high level.[20] Software visualization is sometimes used to speed up this process.[21]

Many software programmers use an integrated development environment (IDE) to improve their productivity. IDEs typically have several features built in, including a source-code editor that can alert the programmer to common errors.[22] Modification often includes code refactoring (improving the structure without changing functionality) and restructuring (improving structure and functionality at the same time).[23] Nearly every change to code will introduce new bugs or unexpected ripple effects, which require another round of fixes.[18]

Code reviews by other developers are often used to scrutinize new code added to a project.[24] The purpose of this phase is often to verify that the code meets style and maintainability standards and that it is a correct implementation of the software design.[25] According to some estimates, code reviews dramatically reduce the number of bugs persisting after software testing is complete.[24] Along with software testing that works by executing the code, static program analysis uses automated tools to detect problems with the source code. Many IDEs support code analysis tools, which might provide metrics on the clarity and maintainability of the code.[26] Debuggers are tools that often enable programmers to step through execution while keeping track of which source code corresponds to each change of state.[27]
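Static analysis inspects source code without running it. As a small sketch of the idea, the Python function below parses a program and reports the line numbers of bare `except:` clauses, a pattern that many style guides discourage. The function name `find_bare_excepts` and the sample program are assumptions for illustration; production analysis tools perform far more checks.

```python
import ast

def find_bare_excepts(source: str) -> list[int]:
    """Return the line numbers of 'except:' clauses that name no exception type."""
    tree = ast.parse(source)  # parse only; the program is never executed
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )

sample = """\
try:
    risky()
except:
    pass
try:
    risky()
except ValueError:
    pass
"""

print(find_bare_excepts(sample))  # [3]
```

Note that `risky` is never defined: because the sample is only parsed, not executed, the check still works.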

Compilation and execution

Source code files in a high-level programming language must be translated into machine code before the instructions can be carried out.[7] After being compiled, the program can be saved as an object file, and the loader (part of the operating system) can take this saved file and execute it as a process on the computer hardware.[10] Some programming languages use an interpreter instead of a compiler. An interpreter converts the program into machine code at run time, which makes interpreted programs 10 to 100 times slower than compiled ones.[22][28]

Quality

Software quality is an overarching term that can refer to a code's correct and efficient behavior, its reusability and portability, or the ease of modification.[29] It is usually more cost-effective to build quality into the product from the beginning rather than try to add it later in the development process.[30] Higher quality code will reduce lifetime cost to both suppliers and customers as it is more reliable and easier to maintain.[31][32]

Maintainability is the quality of software enabling it to be easily modified without breaking existing functionality.[33] Following coding conventions, such as using clear function and variable names that correspond to their purpose, makes maintenance easier.[34] Using conditional loop statements only if the code could execute more than once, and eliminating code that will never execute, can also increase understandability.[35] Many software development organizations neglect maintainability during the development phase, even though this neglect will increase long-term costs.[32] Technical debt is incurred when programmers, often out of laziness or urgency to meet a deadline, choose quick and dirty solutions rather than build maintainability into their code.[36] A common cause is underestimation in software development effort estimation, leading to insufficient resources allocated to development.[37] A challenge with maintainability is that many software engineering courses do not emphasize it.[38] Development engineers who know that they will not be responsible for maintaining the software do not have an incentive to build in maintainability.[18]
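Two of the conventions mentioned above, descriptive names and removal of code that can never execute, can be sketched with a before-and-after pair in Python. Both functions are hypothetical and behave identically; only their readability differs.

```python
# Before: unclear names and a branch that can never run.
def f(a, b):
    if False:  # dead code: this condition is never true
        return 0
    x = a * b
    return x

# After: names describe their purpose and the dead branch is gone.
def rectangle_area(width, height):
    return width * height

# The cleanup preserved behavior for this input.
print(f(3, 4) == rectangle_area(3, 4))  # True
```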

The situation varies worldwide, but in the United States before 1974, software and its source code were not copyrightable and were therefore always public domain software.[39] In 1974, the US Commission on New Technological Uses of Copyrighted Works (CONTU) decided that "computer programs, to the extent that they embody an author's original creation, are proper subject matter of copyright".[40][41]

Proprietary software is rarely distributed as source code.[42] Although the term open-source software literally refers to public access to the source code,[43] open-source software has additional requirements: free redistribution, permission to modify the source code and release derivative works under the same license, and nondiscrimination between different uses—including commercial use.[44][45] The free reusability of open-source software can speed up development.[46]

See also

References

  1. ^ Kernighan, Brian W. "Programming in C: A Tutorial" (PDF). Bell Laboratories, Murray Hill, N. J. Archived from the original (PDF) on 23 February 2015.
  2. ^ Gabbrielli & Martini 2023, p. 519.
  3. ^ Gabbrielli & Martini 2023, pp. 520–521.
  4. ^ Gabbrielli & Martini 2023, p. 522.
  5. ^ Gabbrielli & Martini 2023, p. 521.
  6. ^ a b Tracy 2021, p. 1.
  7. ^ a b Tracy 2021, p. 121.
  8. ^ Lin et al. 2001, pp. 238–239.
  9. ^ Katyal 2019, p. 1194.
  10. ^ a b Tracy 2021, pp. 122–123.
  11. ^ O'Regan 2022, pp. 230–231, 233, 377.
  12. ^ Foster 2014, pp. 249, 274, 280, 305.
  13. ^ a b c Spinellis, D: Code Reading: The Open Source Perspective. Addison-Wesley Professional, 2003. ISBN 0-201-79940-5
  14. ^ "Art and Computer Programming" ONLamp.com Archived 20 February 2018 at the Wayback Machine, (2005)
  15. ^ Kaczmarek et al. 2018, p. 68.
  16. ^ Katyal 2019, pp. 1186–1187.
  17. ^ a b Katyal 2019, p. 1195.
  18. ^ a b c Offutt, Jeff (January 2018). "Overview of Software Maintenance and Evolution". George Mason University Department of Computer Science. Retrieved 5 May 2024.
  19. ^ Tripathy & Naik 2014, p. 296.
  20. ^ Tripathy & Naik 2014, p. 297.
  21. ^ Tripathy & Naik 2014, pp. 318–319.
  22. ^ a b O'Regan 2022, p. 375.
  23. ^ Tripathy & Naik 2014, p. 94.
  24. ^ a b Dooley 2017, p. 272.
  25. ^ O'Regan 2022, pp. 18, 21.
  26. ^ O'Regan 2022, p. 133.
  27. ^ Kaczmarek et al. 2018, pp. 348–349.
  28. ^ Sebesta 2012, p. 28.
  29. ^ Galin 2018, p. 26.
  30. ^ O'Regan 2022, pp. 68, 117.
  31. ^ O'Regan 2022, pp. 3, 268.
  32. ^ a b Varga 2018, p. 12.
  33. ^ Varga 2018, p. 5.
  34. ^ Tripathy & Naik 2014, pp. 296–297.
  35. ^ Tripathy & Naik 2014, p. 309.
  36. ^ Varga 2018, pp. 6–7.
  37. ^ Varga 2018, p. 7.
  38. ^ Varga 2018, pp. 7–8.
  39. ^ Liu, Joseph P.; Dogan, Stacey L. (2005). "Copyright Law and Subject Matter Specificity: The Case of Computer Software". New York University Annual Survey of American Law. 61 (2). Archived from the original on 25 June 2021.
  40. ^ Apple Computer, Inc. v. Franklin Computer Corporation Puts the Byte Back into Copyright Protection for Computer Programs Archived 7 May 2017 at the Wayback Machine in Golden Gate University Law Review Volume 14, Issue 2, Article 3 by Jan L. Nussbaum (January 1984)
  41. ^ Lemley, Menell, Merges and Samuelson. Software and Internet Law, p. 34.
  42. ^ Boyle 2003, p. 45.
  43. ^ Morin et al. 2012, Open Source versus Closed Source.
  44. ^ Sen et al. 2008, p. 209.
  45. ^ Morin et al. 2012, Free and Open Source Software (FOSS) Licensing.
  46. ^ O'Regan 2022, p. 106.

Sources