Nutritive Polypeptides and Formulations Thereof, and Methods of Production and Use Thereof

Nutritive polypeptides are provided herein. Also provided are various other embodiments including nucleic acids encoding the polypeptides, recombinant microorganisms that make the polypeptides, vectors for expressing the polypeptides, methods of making the polypeptides using recombinant microorganisms, compositions and formulations that comprise the polypeptides, and methods of using the polypeptides, compositions and formulations.

Description
CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 15/081,004, filed Mar. 25, 2016, which is a continuation in part of PCT/US2014/057528, filed Sep. 25, 2014, PCT/US2014/057527, filed Sep. 25, 2014, and PCT/US2014/057526, filed Sep. 25, 2014, which claim priority to U.S. Provisional Application No. 61/906,862, filed Nov. 20, 2013, U.S. Provisional Application No. 61/882,305, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,295, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,300, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,222, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,212, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,198, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,189, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,180, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,274, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,271, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,267, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,264, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,260, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,254, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,250, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,246, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,243, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,129, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,240, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,235, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,234, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,232, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,229, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,225, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,220, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,219, filed Sep. 25, 2013, U.S. Provisional Application No. 61/882,214, filed Sep. 25, 2013, and U.S. Provisional Application No. 61/882,211, filed Sep. 25, 2013; the entire disclosures of which are hereby incorporated by reference in their entirety for all purposes.

SEQUENCE LISTING

The instant application contains a “lengthy” Sequence Listing which has been submitted via CD-R in lieu of a printed paper copy, and is hereby incorporated by reference in its entirety. Said CD-Rs, recorded on Sep. 26, 2019, are labeled “CRF”, “Copy 1” and “Copy 2”, respectively, and each contains only one identical 174,382,592-byte file (AXC-088CPC1_SL.txt).

BACKGROUND

Dietary protein is an essential nutrient for human health and growth. The World Health Organization recommends that dietary protein should contribute approximately 10 to 15% of energy intake when in energy balance and weight stable. Average daily protein intakes in various countries indicate that these recommendations are consistent with the amount of protein being consumed worldwide. Meals with an average of 20 to 30% of energy from protein are representative of high-protein diets when consumed in energy balance. The body cannot synthesize certain amino acids that are necessary for health and growth, and instead must obtain them from food. These amino acids, called “essential amino acids”, are Histidine (H), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Threonine (T), Tryptophan (W), and Valine (V). Dietary protein sources that provide all the essential amino acids are referred to as “high quality” proteins. Animal foods such as meat, fish, poultry, eggs, and dairy products are generally regarded as high quality protein sources that provide a good balance of essential amino acids. Casein (a protein commonly found in mammalian milk, making up 80% of the proteins in cow milk) and whey (the protein in the liquid that remains after milk has been curdled and strained) are major sources of high quality dietary protein. Foods that do not provide a good balance of essential amino acids are referred to as “low quality” protein sources. Most fruits and vegetables are poor sources of protein. Some plant foods including beans, peas, lentils, nuts and grains (such as wheat) are better sources of protein but may have allergenicity issues. Soy, a vegetable protein manufactured from soybeans, is considered by some to be a high quality protein. Studies of high protein diets for weight loss have shown that protein positively affects energy expenditure and lean body mass. 
Further studies have shown that overeating produces significantly less weight gain in diets containing at least 5% of energy from protein, and that a high-protein diet decreases energy intake. Proteins commonly found in foods do not necessarily provide an amino acid composition that meets the amino acid requirements of a mammal, such as a human, in an efficient manner. The result is that, in order to attain the minimal requirements of each essential amino acid, a larger amount of total protein must be consumed in the diet than would be required if the quality of the dietary protein were higher. By increasing the quality of the protein in the diet it is possible to reduce the total amount of protein that must be consumed compared to diets that include lower quality proteins. Traditionally, desirable mixtures of amino acids, such as mixtures comprising essential amino acids, have been provided by hydrolyzing a protein with relatively high levels of essential amino acids, such as whey protein, and/or by combining free amino acids in a mixture that optionally also includes a hydrolyzed protein such as whey. Mixtures of this type may have a bitter taste and an undesirable mouthfeel, may be poorly soluble, and may be deemed unsuitable or undesirable for certain uses. As a result, such mixtures sometimes include flavoring agents to mask the taste of the free amino acids and/or hydrolyzed protein. In some cases, compositions in which a proportion of the amino acid content is provided by polypeptides or proteins are found to have a better taste than compositions with a high proportion of total amino acids provided as free amino acids and/or certain hydrolyzed proteins. The availability of such compositions has been limited, however, because nutritional formulations have traditionally been made from protein isolated from natural food products, such as whey isolated from milk, or soy protein isolated from soy. 
The amino acid profiles of those proteins do not necessarily meet the amino acid requirements for a mammal. In addition, commodity proteins typically consist of mixtures of proteins and/or protein hydrolysates which can vary in their protein composition, thus leading to unpredictability regarding their nutritional value. Moreover, the limited number of sources of such high quality proteins has meant that only certain combinations of amino acids are available on a large scale for ingestion in protein form. The agricultural methods required for the supply of high quality animal protein sources such as casein and whey, eggs, and meat, as well as plant proteins such as soy, also require significant energy inputs and have potentially deleterious environmental impacts.

Accordingly, it would be useful in certain situations to have alternative sources and methods of supplying proteins for mammalian consumption. One feature that can enhance the utility of a nutritive protein is its solubility. Nutritive proteins with higher solubility can exhibit desirable characteristics such as increased stability, resistance to aggregation, and desirable taste profiles. For example, a nutritive protein that exhibits enhanced solubility can be formulated into a beverage or liquid formulation that includes a high concentration of nutritive protein in a relatively low volume of solution, thus delivering a large dose of protein nutrition per unit volume. A soluble nutritive protein can be useful in sports drinks or recovery drinks wherein a user (e.g., an athlete) wants to ingest nutritive protein before, during or after physical activity. A nutritive protein that exhibits enhanced solubility can also be particularly useful in a clinical setting wherein a subject (e.g., a patient or an elderly person) is in need of protein nutrition but is unable to consume solid foods or large volumes of liquids.

SUMMARY OF THE INVENTION

In a first aspect, provided are methods of preventing or reducing loss of muscle mass and/or muscle function in a human subject, including the steps of: i) identifying a human subject at risk of protein malnourishment, and ii) administering to the human subject a nutritional formulation in an amount sufficient to prevent or reduce a loss of muscle mass and/or muscle function, wherein the nutritional formulation includes an isolated nutritive polypeptide including an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; wherein the formulation includes at least 1.0 g of the nutritive polypeptide; wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g; and wherein the formulation is substantially free of non-comestible products. In one embodiment, the human subject is identified as in need of a pharmaceutical composition, wherein administration of the pharmaceutical composition increases a risk of loss of muscle mass and/or muscle function. In one embodiment, the human subject is identified as suffering from a disease, disorder or condition and is in need of a pharmaceutical composition, wherein i) the disease, disorder or condition or ii) the administration of the pharmaceutical composition, or both i) and ii) increases a risk of loss of muscle mass and/or muscle function.
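
By way of illustration only, the sequence identity criterion recited above (at least about 90% identical over at least about 50 amino acids) can be sketched in code. The following is a minimal sketch, assuming the two sequences have already been aligned by some external tool and are supplied as equal-length strings with '-' marking gaps. The function names, the choice to skip gapped columns, and the treatment of "about" as an exact cutoff are assumptions made for illustration and are not taken from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str):
    """Return (percent identity, number of compared columns).

    Inputs are two pre-aligned sequences of equal length. Columns in which
    either sequence has a gap ('-') are skipped; this is one convention among
    several and is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gapped column: not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0, 0
    return 100.0 * matches / compared, compared


def meets_identity_criterion(aligned_a: str, aligned_b: str,
                             min_identity: float = 90.0,
                             min_length: int = 50) -> bool:
    """True if identity >= min_identity over >= min_length compared residues."""
    identity, compared = percent_identity(aligned_a, aligned_b)
    return compared >= min_length and identity >= min_identity
```

In this sketch, a 50-residue comparison with 45 matching positions satisfies the criterion, whereas a 40-residue comparison does not, regardless of how many positions match.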

In another aspect, the invention provides methods of treating a disease, disorder or condition characterized or exacerbated by protein malnourishment in a human subject in need thereof, including the step of administering to the human subject a nutritional formulation in an amount sufficient to treat such disease, disorder or condition, wherein the nutritional formulation includes an isolated nutritive polypeptide including an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; wherein the formulation includes at least 1.0 g of the nutritive polypeptide; wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g; and wherein the formulation is substantially free of non-comestible products. In one embodiment, the formulation includes an agriculturally-derived food product.

In another aspect, the invention provides methods of reducing the risk of a human subject developing a disease, disorder or condition characterized or exacerbated by protein malnourishment, including the steps of (i) identifying the human subject as being at risk of developing the disease, disorder or condition; and (ii) administering in one or more doses a nutritional formulation including an isolated nutritive polypeptide including an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; wherein the formulation includes at least 1.0 g of the nutritive polypeptide; wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g; and wherein the formulation is substantially free of non-comestible products. In one embodiment, the human subject is at risk of developing malnutrition or protein malnutrition. In one embodiment, the human subject is a pregnant subject or lactating female subject.

In another aspect, the invention provides methods of preventing or reducing the severity of physical or athletic performance-associated tissue damage, including administering to a subject in one or more doses a nutritional formulation including an isolated nutritive polypeptide including an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; wherein the formulation includes at least 1.0 g of the nutritive polypeptide; wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g; wherein the formulation is substantially free of non-comestible products, wherein the nutritive polypeptide is present in the nutritional formulation in an amount sufficient to prevent or reduce the severity of a physical or an athletic performance-associated tissue damage. In one embodiment, the nutritional formulation is administered within about thirty minutes of the cessation of the physical or the athletic performance. In one embodiment, the nutritional formulation is administered within about thirty minutes of the onset of the physical or the athletic performance. In another embodiment, the nutritional formulation is administered during the physical or the athletic performance.

In another aspect, the invention provides methods of increasing muscle anabolism in a human subject, including administering to a human subject in one or more doses a nutritional formulation including an isolated nutritive polypeptide including an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; wherein the formulation includes at least 1.0 g of the nutritive polypeptide; wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g; wherein the formulation is substantially free of non-comestible products, wherein the nutritive polypeptide is present in the nutritional formulation in an amount sufficient to increase muscle anabolism in the subject after the administration thereof. In one embodiment, the human subject is under 18 years of age. In one embodiment, the human subject is equal to or over 65 years of age.

In another aspect, the invention provides methods of improving the nutritional status of a human subject, including administering to the subject an effective amount of a nutritional formulation including an isolated nutritive polypeptide including an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; wherein the formulation includes at least 1.0 g of the nutritive polypeptide; wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g; wherein the formulation is substantially free of non-comestible products, wherein the nutritive polypeptide is present in the formulation in an amount sufficient to improve the nutritional status in the human subject after the administration thereof.

In another aspect, the invention provides methods of formulating a nutritional product, including the steps of providing a nutritive composition including an isolated nutritive polypeptide including an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; and combining the nutritive composition with at least one of a tastant, a nutritional carbohydrate and a nutritional lipid, thereby formulating a nutritional product, wherein the product includes at least 0.1 g of the nutritive polypeptide; and wherein the product is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g. In one embodiment, the product is substantially free of non-comestible products.

In another aspect, the invention provides methods of formulating a nutritional formulation, including the steps of (i) providing a nutritive polypeptide produced from a microorganism; (ii) combining the nutritive polypeptide with an agriculturally-derived food product, a tastant, a vitamin, a mineral, or a combination thereof in amounts of the nutritive polypeptide sufficient to formulate a nutritional formulation, wherein the nutritional formulation is formulated in a pharmaceutically acceptable carrier.

In another aspect, the invention provides methods for selecting an amino acid sequence of a nutritive polypeptide, including i) providing a library of amino acid sequences including a plurality of amino acid sequences, ii) identifying in the library one or more amino acid sequences including at least one amino acid of interest, and iii) selecting the one or more identified amino acid sequences, thereby selecting an amino acid sequence of a nutritive polypeptide.

In another aspect, the invention provides methods for selecting an amino acid sequence of a nutritive polypeptide, including i) providing a library of amino acid sequences including a plurality of amino acid sequences, ii) identifying in the library one or more amino acid sequences including a ratio of at least one amino acid residues of interest to total amino acid residues greater than or equal to a selected ratio, and iii) selecting the one or more identified amino acid sequences, thereby selecting an amino acid sequence of a nutritive polypeptide.

In another aspect, the invention provides methods for selecting an amino acid sequence of a nutritive polypeptide, including i) providing a library of amino acid sequences including a plurality of amino acid sequences, ii) identifying in the library one or more amino acid sequences including a ratio of at least one amino acid residues of interest to total amino acid residues less than or equal to a selected ratio, and iii) selecting the one or more identified amino acid sequences, thereby selecting an amino acid sequence of a nutritive polypeptide. In one embodiment, the methods further include providing a nucleic acid sequence encoding the selected one or more identified amino acid sequences. In one embodiment, the methods further include expressing the provided nucleic acid sequence under conditions such that the nutritive polypeptide is produced. In one embodiment, the methods further include expressing the provided nucleic acid sequence in a recombinant microorganism under conditions such that the nutritive polypeptide is produced.
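
The three selection methods described above (presence of an amino acid of interest, a ratio at or above a selected ratio, and a ratio at or below a selected ratio) can be illustrated with a short sketch. This is a hedged example only: the library is modeled as a dictionary mapping hypothetical identifiers to one-letter amino acid sequences, and the function names `residue_ratio` and `select_sequences` and the `mode` labels are assumptions introduced here, not terms from the specification.

```python
from typing import Dict, Iterable, List


def residue_ratio(sequence: str, residues_of_interest: Iterable[str]) -> float:
    """Ratio of residues of interest to total residues in one sequence."""
    wanted = {r.upper() for r in residues_of_interest}
    seq = sequence.upper()
    if not seq:
        return 0.0
    return sum(1 for aa in seq if aa in wanted) / len(seq)


def select_sequences(library: Dict[str, str],
                     residues_of_interest: Iterable[str],
                     selected_ratio: float = 0.0,
                     mode: str = "contains") -> List[str]:
    """Return identifiers of library sequences that satisfy the chosen test.

    mode="contains": at least one residue of interest is present.
    mode="at_least": ratio >= selected_ratio.
    mode="at_most":  ratio <= selected_ratio.
    """
    wanted = list(residues_of_interest)
    selected = []
    for name, seq in library.items():
        ratio = residue_ratio(seq, wanted)
        if mode == "contains":
            keep = ratio > 0.0
        elif mode == "at_least":
            keep = ratio >= selected_ratio
        elif mode == "at_most":
            keep = ratio <= selected_ratio
        else:
            raise ValueError(f"unknown mode: {mode}")
        if keep:
            selected.append(name)
    return selected
```

For example, calling `select_sequences(library, "L", 0.5, mode="at_least")` would keep only those entries in which leucine makes up at least half of the residues. The selected identifiers could then be passed to a downstream step that provides encoding nucleic acid sequences, which this sketch does not model.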

In another aspect, the invention provides nutritive formulations including an isolated nutritive polypeptide including an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; wherein the formulation includes at least 1.0 g of the nutritive polypeptide; wherein the formulation is present as i) a liquid, semi-liquid or gel in a volume not greater than about 500 ml or ii) a solid or semi-solid in a total mass not greater than about 200 g; and wherein the formulation is substantially free of non-comestible products. In one embodiment, the amino acid sequence includes a polypeptide nutritional domain including an N-terminal amino acid and a C-terminal amino acid, wherein: i) the N-terminal amino acid is not situated at the N-terminus of an amino acid sequence including a polypeptide that contains the polypeptide nutritional domain, or ii) the C-terminal amino acid is not situated at the C-terminus of an amino acid sequence including a polypeptide that contains the polypeptide nutritional domain, or iii) the N-terminal amino acid is not situated at the N-terminus of an amino acid sequence including a polypeptide that contains the polypeptide nutritional domain and the C-terminal amino acid is not situated at the C-terminus of an amino acid sequence including a polypeptide that contains the polypeptide nutritional domain. In one embodiment, the polypeptide nutritional domain consists of no more than about 99% of the amino acid sequence. In one embodiment, the nutritive polypeptide further includes a signal peptide sequence. In one embodiment, the nutritive polypeptide includes at least 50%, 60%, 70%, 80% or 90% of the polypeptides present in the formulation. In one embodiment, the nutritive polypeptide is present in an amount sufficient to provide a nutritional benefit to a human subject suffering from protein malnutrition or a disease, disorder or condition characterized by protein malnutrition. 
In one embodiment, the nutritive polypeptide is formulated in a pharmaceutically acceptable carrier. In one embodiment, the nutritive polypeptide is formulated in or as a food or a food ingredient, or as a medical food or as a medical food ingredient. In one embodiment, the nutritive polypeptide is formulated in or as a beverage or a beverage ingredient. In one embodiment, the amino acid sequence encodes an enzyme having a primary activity, and wherein the nutritive polypeptide substantially lacks the primary activity. In one embodiment, the isolated nutritive polypeptide has an aqueous solubility at pH 7 of at least 12.5 g/L. In one embodiment, the isolated nutritive polypeptide has a simulated gastric digestion half-life of less than 30 minutes. In one embodiment, the formulations further include a component selected from a tastant, a protein mixture, a polypeptide, a peptide, a free amino acid, a carbohydrate, a lipid, a mineral or mineral source, a vitamin, a supplement, an organism, a pharmaceutical, and an excipient. In one embodiment, the human subject is suffering from a muscle wasting disease, disorder or condition. In one embodiment, the amino acid sequence contains a density of branched chain amino acids about equal to or greater than the density of branched chain amino acids present in a full-length reference nutritional polypeptide or a reference polypeptide-containing mixture. In one embodiment, the reference nutritional polypeptide is bovine beta lactoglobulin or bovine type I collagen or wherein the reference polypeptide-containing mixture includes bovine whey. In one embodiment, the amino acid sequence contains a density of essential amino acids about equal to or greater than the density of essential amino acids present in a full-length reference nutritional polypeptide or a reference polypeptide-containing mixture. 
In one embodiment, the reference nutritional polypeptide is bovine beta lactoglobulin or bovine type I collagen or wherein the reference polypeptide-containing mixture includes bovine whey. In one embodiment, the amino acid sequence contains a density of at least one amino acid selected from the group consisting of leucine, arginine and glutamine about equal to or greater than the density of the selected amino acid present in a full-length reference nutritional polypeptide or a reference polypeptide-containing mixture. In one embodiment, the reference nutritional polypeptide is bovine beta lactoglobulin or bovine type I collagen or wherein the reference polypeptide-containing mixture includes bovine whey.
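
The density comparisons described above can be illustrated as follows. This is a minimal sketch under stated assumptions: density is computed as a fraction of residue count (a mass-based definition would give different values), "about equal to" is modeled by an optional `tolerance` parameter, and the reference is passed in as a sequence string by the caller. The function names are hypothetical. The essential amino acid set is the one listed in the Background; the branched chain set (leucine, isoleucine, valine) is the standard definition.

```python
# One-letter codes. Essential set as listed in the Background section.
ESSENTIAL = set("HILKMFTWV")
BRANCHED_CHAIN = set("LIV")


def density(sequence: str, residues: set) -> float:
    """Fraction of residues in `sequence` that belong to `residues`.

    Assumption of this sketch: density is by residue count, not by mass.
    """
    seq = sequence.upper()
    if not seq:
        return 0.0
    return sum(1 for aa in seq if aa in residues) / len(seq)


def meets_reference_density(candidate: str, reference: str,
                            residues: set, tolerance: float = 0.0) -> bool:
    """True if the candidate density is >= the reference density.

    `tolerance` loosely models "about equal to"; its value is a choice left
    to the caller and is not specified by the text.
    """
    return density(candidate, residues) >= density(reference, residues) - tolerance
```

The same function can be applied to a single amino acid, for example `meets_reference_density(candidate, reference, {"L"})` for leucine. A reference polypeptide-containing mixture such as whey would require a weighted average over its component proteins, which this sketch does not attempt.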

In another aspect, the invention provides formulations including at least one hundred milligrams of a nutritive polypeptide secreted and substantially isolated from a microorganism, wherein the formulation is substantially free of non-comestible products.

In another aspect, the invention provides formulations including at least one nutritive polypeptide including an amino acid sequence at least about 99% identical to a naturally occurring polypeptide capable of being secreted from a microorganism, wherein the nutritive polypeptide is present in the formulation in an amount sufficient to provide a nutritional benefit equivalent to or greater than at least about 2% of a reference daily intake value of protein or is otherwise present in an amount sufficient to provide a feeling of satiety when consumed by a human subject. In one embodiment, the nutritive polypeptide is secreted from a microorganism. In one embodiment, the nutritive polypeptide is secreted and substantially purified from the microorganism. In one embodiment, the nutritive polypeptide is substantially purified from one or more other polypeptides capable of being secreted by the microorganism. In one embodiment, the microorganism is selected from the group consisting of the genera Aspergillus, Trichoderma, Penicillium, Chrysosporium, Acremonium, Fusarium, Trametes, and Rhizopus. In one embodiment, the microorganism is selected from the group consisting of the genera Escherichia, Bacillus, Saccharomyces, Pichia, Corynebacterium, Synechocystis, Synechococcus and Streptomyces.
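
As a worked illustration of the threshold of at least about 2% of a reference daily intake value of protein, the arithmetic can be sketched as below. The reference daily intake is supplied by the caller because this passage does not fix a value; the figure of 50 g used in the usage note is an illustrative assumption only, and the function names are hypothetical.

```python
def fraction_of_reference_intake(protein_g: float,
                                 reference_daily_intake_g: float) -> float:
    """Protein amount expressed as a fraction of a reference daily intake."""
    if reference_daily_intake_g <= 0:
        raise ValueError("reference daily intake must be positive")
    return protein_g / reference_daily_intake_g


def meets_minimum_fraction(protein_g: float,
                           reference_daily_intake_g: float,
                           min_fraction: float = 0.02) -> bool:
    """True if the protein amount is at least `min_fraction` of the reference."""
    return fraction_of_reference_intake(protein_g, reference_daily_intake_g) >= min_fraction
```

Assuming a reference value of 50 g per day, 2% corresponds to 1 g of nutritive polypeptide, so `meets_minimum_fraction(1.0, 50.0)` holds while `meets_minimum_fraction(0.5, 50.0)` does not.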

In another aspect, the invention provides recombinant microorganisms including an exogenous nucleic acid sequence encoding a polypeptide at least 90% identical to SEQID-04129 to SEQID-44483, wherein the polypeptide is capable of being secreted from the microorganism.

In another aspect, the invention provides recombinant microorganisms including an exogenous nucleic acid sequence encoding a nutritive polypeptide including an amino acid sequence having a percentage content of one or more branched chain amino acids greater than bovine whey, wherein the polypeptide nutritional domain is capable of being secreted from the microorganism.

In another aspect, the invention provides recombinant microorganisms including an exogenous nucleic acid sequence encoding a nutritive polypeptide including an amino acid sequence having a percentage content of one or more essential amino acids greater than bovine whey, wherein the nutritive polypeptide is capable of being secreted from the microorganism. The invention also provides nutritive polypeptides produced by the organisms provided herein.

In another aspect, the invention provides nutritive formulations including nutritive polypeptides, wherein the formulation includes at least 1.0 g of the nutritive polypeptide; wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g; and wherein the formulation is substantially free of non-comestible products.
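
The quantitative limits recited for such formulations (at least 1.0 g of nutritive polypeptide; not greater than about 500 ml for a liquid, semi-liquid or gel; not greater than about 200 g for a solid or semi-solid) can be expressed as a simple check. The following is a sketch only: the `Formulation` record and its field names are hypothetical, "about" is treated as an exact bound, and the requirement that the formulation be substantially free of non-comestible products is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

FLUID_FORMS = {"liquid", "semi-liquid", "gel"}
SOLID_FORMS = {"solid", "semi-solid"}


@dataclass
class Formulation:
    polypeptide_g: float               # mass of nutritive polypeptide, grams
    form: str                          # one of FLUID_FORMS or SOLID_FORMS
    volume_ml: Optional[float] = None  # required for fluid forms
    mass_g: Optional[float] = None     # required for solid forms


def within_recited_limits(f: Formulation,
                          min_polypeptide_g: float = 1.0,
                          max_volume_ml: float = 500.0,
                          max_mass_g: float = 200.0) -> bool:
    """Check the polypeptide minimum and the volume or mass ceiling."""
    if f.polypeptide_g < min_polypeptide_g:
        return False
    if f.form in FLUID_FORMS:
        return f.volume_ml is not None and f.volume_ml <= max_volume_ml
    if f.form in SOLID_FORMS:
        return f.mass_g is not None and f.mass_g <= max_mass_g
    raise ValueError(f"unknown form: {f.form}")
```

The thresholds are parameters rather than constants so that the same check can be reused for the product embodiments that recite a different minimum, such as at least 0.1 g of the nutritive polypeptide.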

In another aspect, the invention provides libraries including a plurality of nucleic acid sequences encoding a plurality of nutritive polypeptides, wherein the plurality of nucleic acid sequences are obtained from one or more edible species. In one embodiment, at least two nucleic acid sequences encode two or more polypeptides at least 90% identical to SEQID-04129 to SEQID-44483.

In another aspect, the invention provides methods for selecting a nutritive polypeptide including an amino acid sequence from the libraries described herein, including expressing two or more nucleic acid sequences present in the library and purifying one or more nutritive polypeptides. In one embodiment, the purifying step includes an ion exchange step. In one embodiment, the purifying step includes an affinity purification step.

In another aspect, the invention provides methods of preventing or reducing loss of muscle mass and/or muscle function in a human subject, including the steps of: i) identifying a human subject at risk of protein malnourishment, and ii) administering to the human subject a nutritional formulation in an amount sufficient to prevent or reduce a loss of muscle mass and/or muscle function, wherein the nutritional formulation includes an isolated nutritive polypeptide including an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; wherein the formulation includes at least 1.0 g of the nutritive polypeptide; wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g; and wherein the formulation is substantially free of non-comestible products. In one embodiment, the human subject is identified as in need of a pharmaceutical composition, wherein administration of the pharmaceutical composition increases a risk of loss of muscle mass and/or muscle function. In one embodiment, the human subject is identified as suffering from a disease, disorder or condition and is in need of a pharmaceutical composition, wherein i) the disease, disorder or condition or ii) the administration of the pharmaceutical composition, or both i) and ii) increases a risk of loss of muscle mass and/or muscle function.

In another aspect, the invention provides methods of formulating a nutritional product, including the steps of providing a composition including an enzyme-class polypeptide having no substantial catalytic activity at a concentration of from about 1 milligram to 1000 milligrams of enzyme-class polypeptide per gram of the nutritional product, and combining the enzyme-class polypeptide with at least one of a tastant, a nutritional carbohydrate and a nutritional lipid, thereby formulating a nutritional product, wherein the nutritional product is comestible.

In another aspect, the invention provides methods of formulating a nutritional product, including the steps of providing a composition including an enzyme-class polypeptide having substantial catalytic activity at a concentration of from about 1 milligram to 1000 milligrams of enzyme-class polypeptide per gram of the nutritional product, and combining the enzyme-class polypeptide with at least one of a tastant, a nutritional carbohydrate and a nutritional lipid, thereby formulating a nutritional product, wherein the nutritional product is comestible.

In another aspect, the invention provides methods of formulating a nutritional formulation, including the steps of (i) providing a nutritive polypeptide produced from a microorganism, wherein the nutritive polypeptide includes an amino acid sequence at least about 95% identical to an enzyme-class polypeptide or fragment or domain thereof; (ii) combining the nutritive polypeptide with an agriculturally-derived food product, a tastant, a vitamin, a mineral, or a combination thereof in amounts of the nutritive polypeptide sufficient to formulate a nutritional formulation, wherein the nutritional formulation is comestible.

In another aspect, the invention provides methods of treating a disease, disorder or condition characterized or exacerbated by malnourishment or protein malnourishment in a human subject in need thereof, including the step of administering to the human subject a nutritional formulation in an amount sufficient to treat such disease, disorder or condition, wherein the nutritional formulation includes a nutritive polypeptide and optionally an agriculturally-derived food product, wherein the nutritive polypeptide includes an amino acid sequence at least about 95% identical to an enzyme-class polypeptide or fragment or domain thereof.

In another aspect, the invention provides methods of reducing the risk of a human subject developing a disease, disorder or condition characterized or exacerbated by protein malnourishment, including the steps of (i) identifying the human subject as being at risk of developing the disease, disorder or condition; and (ii) administering in one or more doses a nutritional formulation including a nutritive polypeptide and an agriculturally-derived food product, wherein the nutritive polypeptide includes an amino acid sequence at least about 95% identical to an enzyme-class polypeptide or fragment or domain thereof. In one embodiment, the human subject is at risk of developing malnutrition or protein malnutrition. In one embodiment, the human subject is a pregnant subject or lactating female subject.

In another aspect, the invention provides methods of reducing the severity of physical or athletic performance-associated tissue damage, including administering to a post-physical or athletic performance subject in one or more doses a nutritional formulation including a nutritive polypeptide and an agriculturally-derived food product, wherein the nutritive polypeptide includes an amino acid sequence at least about 95% identical to an enzyme-class polypeptide or fragment or domain thereof, wherein the nutritive polypeptide is present in the nutritional formulation in an amount sufficient to reduce the severity of a physical or athletic performance-associated tissue damage. In one embodiment, the nutritional formulation is administered within about 90 minutes of the cessation of the athletic performance. In one embodiment, the nutritional formulation is administered within about 90 minutes before the onset of the athletic performance. In one embodiment, the nutritional formulation is administered during the athletic performance.

In another aspect, the invention provides methods of improving the nutritional status of a human subject, including administering to the subject an effective amount of a nutritional formulation including a nutritive polypeptide and an agriculturally-derived food product, wherein the nutritive polypeptide includes an amino acid sequence at least about 95% identical to an enzyme-class polypeptide or fragment or domain thereof. In one embodiment, the nutritive polypeptide is at least 90% identical to a polypeptide selected from the group consisting of SEQID 00001-03909 and SEQID 04129-44483.

In another aspect, the invention provides nutritive formulations including an isolated enzyme-class polypeptide having substantially reduced or no substantial primary catalytic activity of a known enzyme, wherein the enzyme-class polypeptide includes an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; wherein the enzyme-class polypeptide is present in the formulation in an amount of at least about 0.5 g and at a concentration of at least about 10 g per kilogram of formulation; wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g; and wherein the formulation is substantially free of non-comestible products. In one embodiment, the enzyme-class polypeptide is formulated in a pharmaceutically acceptable carrier. In one embodiment, the enzyme-class polypeptide is formulated in or as a food or a food ingredient. In one embodiment, the enzyme-class polypeptide is formulated in or as a beverage or a beverage ingredient. In one embodiment, the amino acid sequence is homologous to an enzyme having a primary activity, and the enzyme-class polypeptide substantially lacks the primary activity. In one embodiment, the isolated enzyme-class polypeptide has an aqueous solubility at pH 7 of at least 12.5 g/L. In one embodiment, the isolated enzyme-class polypeptide has a simulated gastric digestion half-life of less than 30 minutes. In one embodiment, the formulations further include a component selected from a tastant, a protein mixture, a polypeptide, a peptide, a free amino acid, a carbohydrate, a lipid, a mineral or mineral source, a vitamin, a supplement, an organism, a pharmaceutical, and an excipient.
In one embodiment, the enzyme-class polypeptide is substantially thermostable at a pH of at least about 2. In one embodiment, the enzyme-class polypeptide is an enzyme-treated polypeptide, an acid-treated polypeptide, a base-treated polypeptide, a chemically-treated polypeptide, a heat-treated polypeptide, or a detergent-treated polypeptide, or a combination thereof.
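The amount, concentration, and size limits recited for the nutritive formulation above can be checked with simple arithmetic. The sketch below is illustrative only: it treats the "about" values (0.5 g, 10 g per kilogram, 500 ml, 200 g) as hard thresholds, which is an assumption, and the `Formulation` class and example products are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Formulation:
    polypeptide_g: float               # mass of enzyme-class polypeptide, in grams
    total_mass_g: float                # total mass of the formulation, in grams
    volume_ml: Optional[float] = None  # set only for liquid, semi-liquid or gel forms


def concentration_g_per_kg(f: Formulation) -> float:
    """Grams of polypeptide per kilogram of formulation."""
    return f.polypeptide_g / (f.total_mass_g / 1000.0)


def meets_limits(f: Formulation) -> bool:
    """Check the amount, concentration, and size limits (treated as hard cutoffs)."""
    amount_ok = f.polypeptide_g >= 0.5
    concentration_ok = concentration_g_per_kg(f) >= 10.0
    if f.volume_ml is not None:
        size_ok = f.volume_ml <= 500.0      # liquid, semi-liquid or gel
    else:
        size_ok = f.total_mass_g <= 200.0   # solid or semi-solid
    return amount_ok and concentration_ok and size_ok


# Hypothetical example products.
beverage = Formulation(polypeptide_g=5.0, total_mass_g=250.0, volume_ml=240.0)
dilute_bar = Formulation(polypeptide_g=1.0, total_mass_g=150.0)
```

Here the beverage works out to 20 g per kilogram and passes, while the bar works out to about 6.7 g per kilogram and fails the concentration limit.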

In another aspect, the invention provides formulations including an enzyme-class polypeptide having reduced catalytic activity, wherein the formulation is nutritional and is substantially free of non-comestible products.

In another aspect, the invention provides formulations including an enzyme-class polypeptide, wherein the enzyme-class polypeptide is provided in a nutritional amount and the formulation is substantially free of non-comestible products. In one embodiment, the enzyme-class polypeptide is produced in a genetically modified organism. In one embodiment, the enzyme-class polypeptide is produced from a recombinant nucleic acid sequence. In one embodiment, the formulation provides a nutritional benefit equivalent to or greater than at least about 2% of a reference daily intake value of protein or is otherwise present in an amount sufficient to provide a feeling of satiety when consumed by a human subject. In one embodiment, the enzyme-class polypeptide includes an amino acid variation as compared to a corresponding wild-type enzyme-class polypeptide having a catalytic activity. In one embodiment, the enzyme-class polypeptide is present at a concentration of from about 1 milligram to 1000 milligrams per gram of the formulation. In one embodiment, the formulation is substantially free of a surfactant, a polyvinyl alcohol, a propylene glycol, a polyvinyl acetate, a polyvinylpyrrolidone, a non-comestible polyacid or polyol other than glycerol or propylene glycol, a fatty alcohol, an alkylbenzyl sulfonate, an alkyl glucoside, or a methyl paraben. In one embodiment, the formulations further include a component selected from a protein mixture, a polypeptide, a peptide, a free amino acid, a carbohydrate, a lipid, a mineral or mineral source, a vitamin, a supplement, an organism, a pharmaceutical, and an excipient. In one embodiment, the formulation includes a plurality of enzyme-class polypeptides.
In one embodiment, the enzyme-class polypeptide includes an amylase, a cellobiohydrolase, a chloroperoxidase, an endoglucanase, a feruloyl-esterase, an alpha-galactosidase, a beta-galactosidase, a glucoamylase, a glucose oxidase, a laccase, a lignin-peroxidase, a lipase, a mannanase, a Mn-peroxidase, a phytase, a methyl esterase, a xylanase or a lysozyme. In one embodiment, the enzyme-class polypeptide is present at a concentration greater than 10 g/l.

In another aspect, the invention provides formulations including an enzyme-class polypeptide and a component selected from a protein mixture, a polypeptide, a peptide, a free amino acid, a carbohydrate, a lipid, a mineral or mineral source, a vitamin, a supplement, an organism, a pharmaceutical, and an excipient, wherein the formulation is nutritional and is substantially free of non-comestible products, wherein the formulation provides a nutritional benefit equivalent to or greater than at least about 2% of a reference daily intake value of protein or is otherwise present in an amount sufficient to provide a feeling of satiety when consumed by a human subject.

In another aspect, the invention provides recombinant microorganisms including a recombinant nucleic acid sequence encoding an enzyme-class polypeptide, wherein the enzyme-class polypeptide is at least 90% identical to a polypeptide provided herein.

In another aspect, the invention provides recombinant microorganisms including a recombinant nucleic acid sequence encoding an enzyme-class polypeptide having a percentage content of one or more branched chain amino acids (BCAAs) greater than the percentage content of one or more BCAAs present in a full-length reference nutritional polypeptide or a reference polypeptide-containing mixture, wherein the enzyme-class polypeptide is secreted from the microorganism. In one embodiment, the reference nutritional polypeptide is bovine beta lactoglobulin or bovine type I collagen or wherein the reference polypeptide-containing mixture includes bovine whey.

In another aspect, the invention provides recombinant microorganisms including a recombinant nucleic acid sequence encoding an enzyme-class polypeptide having a percentage content of one or more essential amino acids (EAAs) greater than about the percentage content of one or more EAAs present in a full-length reference nutritional polypeptide or a reference polypeptide-containing mixture wherein the reference nutritional polypeptide is bovine beta lactoglobulin or bovine type I collagen or wherein the reference polypeptide-containing mixture includes bovine whey, wherein the enzyme-class polypeptide is secreted from the microorganism. Also provided are the enzyme-class polypeptides produced by the organisms described herein.

In another aspect, the invention provides compositions including a dry mix for sports nutrition composition including a sports nutrition composition and a nutritive polypeptide. In one embodiment, the nutritive polypeptide provides a nutritional benefit equivalent to or greater than at least about 2% of a reference daily intake value of protein or is otherwise present in an amount sufficient to increase muscle anabolism in the subject after the administration thereof.

In another aspect, the invention provides formulations including a nutritive polypeptide component consisting essentially of an isolated single chain nutritive polypeptide having an amino acid sequence at least about 90% identical to an edible species polypeptide sequence or fragment thereof at least 50 amino acids in length, wherein the nutritive polypeptide is present in an amount sufficient to provide a nutritional benefit equivalent to or greater than at least about 2% of a reference daily intake value of protein, and wherein the formulation is substantially free of non-comestible products, and wherein the nutritive polypeptide has less than about 50% identity over at least 25 amino acids to a known allergen.
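The allergen criterion above (less than about 50% identity over at least 25 amino acids) can be illustrated with a sliding-window comparison. This is a simplified sketch and not a method stated in this disclosure: it compares only ungapped windows of exactly 25 residues, whereas a sequence alignment tool would normally be used and longer windows would also need to be considered. The function names and toy sequences are hypothetical.

```python
def max_window_identity(candidate: str, allergen: str, window: int = 25) -> float:
    """Highest fractional identity between any two ungapped windows of length `window`."""
    if len(candidate) < window or len(allergen) < window:
        raise ValueError("both sequences must be at least `window` residues long")
    best = 0.0
    for i in range(len(candidate) - window + 1):
        a = candidate[i:i + window]
        for j in range(len(allergen) - window + 1):
            b = allergen[j:j + window]
            matches = sum(1 for x, y in zip(a, b) if x == y)
            best = max(best, matches / window)
    return best


def passes_allergen_screen(candidate: str, allergen: str, cutoff: float = 0.5) -> bool:
    """True when no 25-residue window reaches the identity cutoff."""
    return max_window_identity(candidate, allergen) < cutoff


# Hypothetical toy sequences (assumption: not real allergens).
toy_candidate = "A" * 25
toy_allergen = "A" * 10 + "G" * 15   # shares 10 of 25 aligned positions
```

In practice the candidate would be compared against every entry in an allergen database, and it would pass only if every comparison falls below the cutoff.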

In another aspect, the invention provides a unit dose including an isolated nutritive polypeptide having an amino acid sequence at least about 90% identical to an edible species polypeptide or fragment thereof at least 50 amino acids in length, wherein the nutritive polypeptide is present in the unit dose at a concentration above about 0.5% by weight, wherein the formulation is substantially free of non-comestible products, and wherein the nutritive polypeptide has less than about 50% identity over at least 25 amino acids to a known allergen. In one embodiment, the nutritive polypeptide is present at a concentration between about 0.5% and about 90% by weight, wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml. In one embodiment, the nutritive polypeptide is present at a concentration between about 0.5% and about 99% by weight, wherein the formulation is present as a solid or semi-solid in a total mass not greater than about 200 g. In one embodiment, the edible species polypeptide includes an agriculturally-derived food protein. In one embodiment, the nutritive polypeptide is at least about 90% identical to a polypeptide provided herein. In one embodiment, the edible species polypeptide includes a nutritional domain of an agriculturally-derived food protein. In one embodiment, the edible species polypeptide is present in a nutritionally substantial quantity in the diet of a human population. In one embodiment, the edible species polypeptide is a bovine, ovine, caprine, porcine, piscine, or galline protein. In one embodiment, the edible species polypeptide is a leaf vegetable protein. In one embodiment, the edible species polypeptide is a seed or fruit protein. In one embodiment, the edible species polypeptide is an edible flower protein. In one embodiment, the edible species polypeptide is a legume protein. In one embodiment, the edible species polypeptide is a bulb or stem protein.
In one embodiment, the edible species polypeptide is a root or tuber protein. In one embodiment, the edible species polypeptide is a sea vegetable protein. In one embodiment, the edible species polypeptide is a fungal protein. In one embodiment, the edible species polypeptide is of microbial origin. In one embodiment, the nutritive polypeptide has a percentage content of one or more branched chain amino acids greater than about 23.67%. In one embodiment, the nutritive polypeptide has a percentage content of one or more essential amino acids greater than about 49.04%. In one embodiment, the nutritive polypeptide is present in an amount sufficient to provide a healthy adult male human subject with at least about 25% of a reference daily intake value of protein. In one embodiment, the nutritive polypeptide is present in an amount sufficient to provide a healthy adult female human subject with at least about 25% of a reference daily intake value of protein. In one embodiment, the nutritive polypeptide is present in an amount sufficient to provide a pregnant human subject with at least about 10% of a reference daily intake value of protein. In one embodiment, the nutritive polypeptide is produced by a microorganism. In one embodiment, the nutritive polypeptide is produced by a microorganism, and wherein the nutritive polypeptide is substantially separated from the microorganism. In one embodiment, the nutritive polypeptide is secreted by a microorganism. In one embodiment, the nutritive polypeptide is produced by a heterotrophic microorganism. In one embodiment, the nutritive polypeptide is produced by an autotrophic microorganism. In one embodiment, the nutritive polypeptide is produced by a mixotrophic microorganism. In one embodiment, the nutritive polypeptide is produced by a microorganism including an exogenous nucleic acid sequence.
In one embodiment, the nutritive polypeptide includes an amino acid sequence at least about 90% identical to an edible species polypeptide present in an amount of at least about 1% total protein weight in an edible species, the food species representing at least about 1% of the diet of a human population. In one embodiment, the nutritive polypeptide includes an amino acid sequence at least about 90% identical to an edible species polypeptide present in an amount of at least about 0.1% total weight in a dehydrated food species, the edible species representing at least about 0.1% of the diet of a human population. In one embodiment, the unit dose includes at least about 5 grams of the nutritive polypeptide. In one embodiment, the unit dose includes at least one agriculturally-derived carbohydrate and/or lipid. In one embodiment, the unit dose includes at least one agriculturally-derived protein. In one embodiment, the unit dose further includes a tastant or flavorant. In one embodiment, the unit dose further includes a vitamin or mineral. In one embodiment, the unit dose is in a liquid, solid, semi-solid, or gel state at room temperature. In one embodiment, the unit dose is formulated for administration as an infant formula, an elderly nutritional formula, a prenatal nutrition formula, an athletic performance formula, a ready-to-use therapeutic food formula, or an athletic recovery formula. In one embodiment, the unit dose is formulated for administration as a medical food. In one embodiment, the unit dose is formulated for administration to a human subject suffering from or under treatment for a disease, disorder or condition. In one embodiment, the unit dose is formulated for administration to a human subject recovering from an injury, illness, or medical treatment. In one embodiment, the formulation is consumable without consumption of potable water. In one embodiment, the formulation does not require heating or cooking. 
In one embodiment, an allergenic protein or a polypeptide at least 25% homologous to a known allergen is not substantially present in the formulation. In one embodiment, a polypeptide having an allergenic protein or a protein at least 90% identical to an allergenic protein is not substantially present in the formulation. In one embodiment, an anti-nutritive polypeptide or a polypeptide including an anti-nutritive domain is not substantially present in the formulation. In one embodiment, a wheat protein isolated from a wheat grain is not substantially present in the formulation. In one embodiment, a soy protein is not substantially present in the formulation. In one embodiment, a dairy protein is not substantially present in the formulation.

In another aspect, the invention provides methods of preventing or reducing loss of muscle mass and/or muscle function in a human subject, including the steps of: i) identifying a human subject at risk of protein malnourishment, and ii) administering to the human subject a nutritional formulation in an amount sufficient to prevent or reduce a loss of muscle mass and/or muscle function, wherein the nutritional formulation includes an isolated nutritive polypeptide including an amino acid sequence at least about 90% identical over at least about 50 amino acids to a polypeptide sequence provided herein; wherein the formulation includes at least 1.0 g of the nutritive polypeptide; wherein the formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a total mass not greater than about 200 g; and wherein the formulation is substantially free of non-comestible products. In one embodiment, the human subject is identified as in need of a pharmaceutical composition, wherein administration of the pharmaceutical composition increases a risk of loss of muscle mass and/or muscle function. In one embodiment, the human subject is identified as suffering from a disease, disorder or condition and is in need of a pharmaceutical composition, wherein i) the disease, disorder or condition or ii) the administration of the pharmaceutical composition, or both i) and ii) increases a risk of loss of muscle mass and/or muscle function.

In another aspect, the invention provides methods for selecting an amino acid sequence of a nutritive polypeptide, including i) providing a library of amino acid sequences including a plurality of amino acid sequences present in at least one edible species, ii) identifying in the library one or more amino acid sequences including at least one amino acid of interest, and iii) selecting the one or more identified amino acid sequences, thereby selecting an amino acid sequence of a nutritive polypeptide.

In another aspect, the invention provides methods for selecting an amino acid sequence of a nutritive polypeptide, including i) providing a library of amino acid sequences including a plurality of amino acid sequences present in at least one edible species, ii) identifying in the library one or more amino acid sequences including a ratio of at least one amino acid residue of interest to total amino acid residues greater than or equal to a selected ratio, and iii) selecting the one or more identified amino acid sequences, thereby selecting an amino acid sequence of a nutritive polypeptide.

In another aspect, the invention provides methods for selecting an amino acid sequence of a nutritive polypeptide, including i) providing a library of amino acid sequences including a plurality of amino acid sequences present in at least one edible species, ii) identifying in the library one or more amino acid sequences including a ratio of at least one amino acid residue of interest to total amino acid residues less than or equal to a selected ratio, and iii) selecting the one or more identified amino acid sequences, thereby selecting an amino acid sequence of a nutritive polypeptide.

In another aspect, the invention provides methods of producing a nutritive polypeptide, including the steps of (i) providing a microorganism including an exogenous nucleic acid encoding a nutritive polypeptide, wherein the nutritive polypeptide includes an amino acid sequence at least about 90% identical to an edible species polypeptide or fragment or domain thereof, under conditions such that the nutritive polypeptide is produced in the microorganism; and (ii) isolating the produced nutritive polypeptide from the microorganism.

In another aspect, the invention provides methods of formulating a nutritional formulation, including the steps of (i) providing a nutritive polypeptide produced from a microorganism, wherein the nutritive polypeptide includes an amino acid sequence at least about 90% identical to an edible species polypeptide or fragment or domain thereof; (ii) combining the nutritive polypeptide with an agriculturally-derived food product, in amounts of the nutritive polypeptide and agriculturally-derived food product sufficient to formulate a nutritional formulation.

In another aspect, the invention provides methods of treating a disease, disorder or condition characterized or exacerbated by malnourishment or protein malnourishment in a human subject in need thereof, including the step of administering to the human subject a nutritional formulation in an amount sufficient to support the treatment of such disease, disorder or condition, wherein the nutritional formulation includes a nutritive polypeptide and an agriculturally-derived food product, wherein the nutritive polypeptide includes an amino acid sequence at least about 90% identical to an edible species polypeptide or fragment or domain thereof. In one embodiment, the human subject is an elderly subject. In one embodiment, the human subject is a child under 18 years old. In one embodiment, the human subject is an embryonic subject or a fetal subject. In one embodiment, the nutritive polypeptide is at least about 90% identical to a polypeptide provided herein.

In another aspect, the invention provides methods of reducing the risk of a human subject developing a disease, disorder or condition characterized or exacerbated by protein malnourishment, including the steps of (i) identifying the human subject as being at risk of developing the disease, disorder or condition; and (ii) administering in one or more doses a nutritional formulation including a nutritive polypeptide and an agriculturally-derived food product, wherein the nutritive polypeptide includes an amino acid sequence at least about 90% identical to a polypeptide provided herein. In one embodiment, the human subject is at risk of developing kwashiorkor. In one embodiment, the human subject is a pregnant subject or lactating female subject.

In another aspect, the invention provides methods of reducing the severity of physical or athletic performance-associated tissue damage, including administering to a subject in one or more doses a nutritional formulation including a nutritive polypeptide and an agriculturally-derived food product, wherein the nutritive polypeptide includes an amino acid sequence at least about 90% identical to a naturally occurring human food protein or functional fragment or domain thereof, wherein the nutritive polypeptide is present in the nutritional formulation in an amount sufficient to reduce the severity of a physical or athletic performance-associated tissue damage. In one embodiment, the nutritional formulation is administered prior to initiation of the physical or athletic performance. In one embodiment, the nutritional formulation is administered during the physical or athletic performance. In one embodiment, the nutritional formulation is administered following cessation of the physical or athletic performance.

BRIEF DESCRIPTION OF THE FIGURES

These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, and accompanying drawings, where:

FIG. 1 is an image demonstrating SDS-PAGE analysis of the purification of SEQID-00105 by IMAC.

FIG. 2 is a chart demonstrating net charge per amino acid as a function of pH for nutritive polypeptides predicted to bind to either anion or cation exchange resin. (1) SEQID-00105, (2) SEQID-00008, (3) SEQID-00009, (4) SEQID-00475, (5) SEQID-00472, (6) SEQID-00640, (7) SEQID-00019.
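A curve of net charge per amino acid versus pH, of the kind described for FIG. 2, can be estimated from sequence with the Henderson-Hasselbalch relation. The sketch below is an illustration only: the pKa values are assumed textbook-style approximations (published tables differ), the function name and the toy sequence are hypothetical, and nothing here is taken from the method actually used to generate the figure.

```python
# Assumed approximate pKa values; different reference tables give different numbers.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}             # groups carrying +1 when protonated
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}    # groups carrying -1 when deprotonated
N_TERMINUS_PKA = 9.0
C_TERMINUS_PKA = 2.0


def net_charge_per_residue(sequence: str, ph: float) -> float:
    """Estimated net charge divided by sequence length at the given pH."""
    if not sequence:
        raise ValueError("sequence must be non-empty")

    def positive(pka: float) -> float:
        return 1.0 / (1.0 + 10.0 ** (ph - pka))

    def negative(pka: float) -> float:
        return -1.0 / (1.0 + 10.0 ** (pka - ph))

    charge = positive(N_TERMINUS_PKA) + negative(C_TERMINUS_PKA)
    for aa in sequence:
        if aa in POSITIVE_PKA:
            charge += positive(POSITIVE_PKA[aa])
        elif aa in NEGATIVE_PKA:
            charge += negative(NEGATIVE_PKA[aa])
    return charge / len(sequence)


# Hypothetical toy sequence, scanned from pH 1 to pH 13.
toy_sequence = "MKKDEHR"
curve = [(ph, net_charge_per_residue(toy_sequence, ph)) for ph in range(1, 14)]
```

Under these assumptions, a polypeptide with a positive net charge at the working pH would be a candidate for cation exchange resin, and one with a negative net charge a candidate for anion exchange resin.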

FIG. 3 is a chart demonstrating total charge per amino acid over a range of pHs for exemplary nutritive polypeptides. (1) SEQID-00475, (2) SEQID-00009, (3) SEQID-00478, (4) SEQID-00433, (5) SEQID-00472.

FIG. 4 is a chart demonstrating purity of SEQID-00009 as a function of ammonium sulfate concentration.

FIG. 5 is an image of an SDS-PAGE analysis demonstrating secretion of SEQID-00409 (left) and SEQID-00420 (right) with new signal peptide compared to native signal peptide.

FIG. 6 is a chart demonstrating the concentration of GLP-1 (7-36) detected in the supernatant following stimulation. Error bars are the standard deviation of the technical replicates.

FIG. 7 is a chart demonstrating average blood glucose values over time during OGTT of vehicle, SEQID-00105, Arginine, and SEQID-00338. The error bars shown are the standard errors of the mean.

FIG. 8A and FIG. 8B are charts demonstrating the area under curve for blood glucose integrated from 0-120 minutes (FIG. 8A) and from 0-60 minutes (FIG. 8B) after acute dosing of SEQID-00105, Arginine, and SEQID-00338.
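An area under the curve over a time window, as in FIG. 8A and FIG. 8B, is commonly computed with the trapezoidal rule. The sketch below assumes that rule and assumes no baseline subtraction; neither choice is stated in the text. The function name and the glucose readings are hypothetical and are not data from the study.

```python
from typing import Sequence


def auc_trapezoid(
    times_min: Sequence[float],
    values: Sequence[float],
    t_start: float,
    t_end: float,
) -> float:
    """Trapezoidal area between t_start and t_end.

    Assumes `times_min` is sorted ascending and that both window
    endpoints coincide with sampled time points.
    """
    points = [(t, v) for t, v in zip(times_min, values) if t_start <= t <= t_end]
    if len(points) < 2 or points[0][0] != t_start or points[-1][0] != t_end:
        raise ValueError("window endpoints must coincide with sampled time points")
    return sum(
        (t1 - t0) * (v0 + v1) / 2.0
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    )


# Hypothetical blood glucose readings (mg/dL) at sampling times in minutes.
times = [0, 15, 30, 60, 90, 120]
glucose = [100, 160, 180, 150, 120, 105]

auc_0_60 = auc_trapezoid(times, glucose, 0, 60)
auc_0_120 = auc_trapezoid(times, glucose, 0, 120)
```

The result carries units of concentration multiplied by time (here mg/dL·min).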

FIG. 9 is a chart demonstrating average plasma insulin concentration for n=6 rats per treatment group over time. The error bars show the standard error of the mean.
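The standard error of the mean used for the error bars is the sample standard deviation divided by the square root of the group size. A minimal sketch follows, assuming the sample (n-1) standard deviation is intended; the function name and the six readings are hypothetical stand-ins for one n=6 treatment group at one time point.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def mean_and_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) for one group at one time point."""
    if len(values) < 2:
        raise ValueError("at least two observations are required")
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical readings for n=6 animals (arbitrary units).
group = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
group_mean, group_sem = mean_and_sem(group)
```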

FIG. 10 is a chart demonstrating plasma insulin area under curve integrated between 0-240 and 0-60 minutes for all treatment groups. The error bars show the standard error of the mean.

FIG. 11 is a chart demonstrating average plasma GLP-1 concentration for n=6 rats per treatment group over time. The error bars shown here correspond to the standard error of the mean.

FIG. 12 is a chart demonstrating average blood glucose values over time. The error bars shown are the standard errors of the mean.

FIG. 13 is a chart demonstrating integrated AUC for each treatment group between the time of glucose challenge (0 min.) and 60 minutes, and between time 0 and 120 minutes. The error bars shown are the standard errors of the mean.

FIG. 14 is a chart demonstrating average plasma insulin concentration for n=6 rats per treatment group in vehicle & SEQID-00105 and n=5 rats per treatment group in the case of SEQID-00338 over the course of the experiment. The error bars shown are the standard errors of the mean.

FIG. 15 is a chart demonstrating integrated area under the curve for vehicle, SEQID-00105 and SEQID-00338 between 0 and 90 minutes and between 0 and 60 minutes. Error bars shown here correspond to the standard error of the mean.

FIG. 16 is a chart demonstrating average plasma GLP-1 concentration for n=6 rats per treatment group for vehicle and SEQID-00105 and n=5 rats for SEQID-00338 over the course of the experiment. Error bars shown here correspond to the standard error of the mean.

FIG. 17 is a chart demonstrating area under curve for GLP-1 (7-36) for each treatment group integrated over 0-90 and 0-60 minutes. Error bars shown here correspond to the standard error of the mean.

FIG. 18 is a chart demonstrating average blood glucose values during OGTT of vehicle, SEQID-00105, Alogliptin, and the combination for n=6 rats per treatment group. Error bars shown here correspond to the standard error of the mean.

FIG. 19 is a chart demonstrating AlphaLISA plasma insulin over time for vehicle and SEQID-00105 administered at three different doses. Error bars shown here are the standard error of the mean.

FIG. 20 is a chart demonstrating AlphaLISA plasma insulin over time for vehicle and SEQID-00426, SEQID-00338, SEQID-00341. Error bars shown here are the standard error of the mean.

FIG. 21 is a chart demonstrating integrated area under curves for plasma insulin concentrations for SEQID-00105 at three doses between 0 and 240 minutes and between 0 and 60 minutes. Error bars shown here are the standard error of the mean.

FIG. 22 is a chart demonstrating integrated area under curves for plasma insulin concentrations for vehicle, SEQID-00426, SEQID-00338, and SEQID-00341 between 0 and 240 minutes and between 0 and 60 minutes. Error bars shown here are the standard error of the mean.

FIG. 23 is a chart demonstrating AlphaLISA plasma insulin over time for SEQID-00423, SEQID-00587, SEQID-00105. Error bars shown here are the standard error of the mean.

FIG. 24 is a chart demonstrating AlphaLISA plasma insulin over time for vehicle, SEQID-00424, SEQID-00425, and SEQID-00429. Error bars shown here are the standard error of the mean.

FIG. 25 is a chart demonstrating integrated area under curves for plasma insulin concentrations for vehicle, SEQID-00423, SEQID-00587, and SEQID-00105 between 0 and 240 minutes and between 0 and 60 minutes. Error bars shown here are the standard error of the mean.

FIG. 26 is a chart demonstrating integrated area under curves for plasma insulin concentrations for vehicle, SEQID-00424, SEQID-00425, and SEQID-00429 between 0 and 240 minutes and between 0 and 60 minutes. Error bars shown here are the standard error of the mean.

FIG. 27 is a chart demonstrating ELISA plasma insulin over time for vehicle and SEQID-00105, SEQID-00240, and SEQID-00559. Error bars shown here are the standard error of the mean.

FIG. 28 is a chart demonstrating integrated area under curves for plasma insulin concentrations for vehicle, SEQID-00105, SEQID-00240, and SEQID-00559 between 0 and 240 minutes and 0 and 60 minutes. Error bars shown here are the standard error of the mean.

FIG. 29 is a chart demonstrating GLP-2 concentration over a 4 hour time course for vehicle and SEQID-00240, n=4 and n=5 rats, respectively. Error bars shown are the standard error of the mean.

FIG. 30 is a chart demonstrating integrated GLP-2 area under the curve over the first hour and the full 4 hours. Error bars shown are the 95% confidence interval.

FIG. 31 is a chart demonstrating average plasma insulin response to SEQID-00105 of all subjects over time.

FIG. 32 is a chart demonstrating average plasma insulin fold response to SEQID-00105 over baseline.
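A fold response over baseline can be computed by dividing each subject's readings by that subject's baseline reading and then averaging across subjects at each time point. The sketch below assumes the baseline is the first (time zero) sample and that folds are averaged per subject rather than computed from group means; the text does not specify either. The function names and insulin values are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def fold_over_baseline(series: Sequence[float]) -> List[float]:
    """Divide every reading by the first (baseline) reading."""
    baseline = series[0]
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return [value / baseline for value in series]


def average_fold_response(subjects: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-subject fold responses at each time point."""
    folds = [fold_over_baseline(series) for series in subjects]
    return [mean(column) for column in zip(*folds)]


# Hypothetical plasma insulin readings: one row per subject, one column per time point.
subjects = [
    [10.0, 20.0, 15.0],
    [8.0, 24.0, 12.0],
    [12.0, 18.0, 12.0],
]
average_fold = average_fold_response(subjects)
```

Normalizing per subject before averaging keeps subjects with high absolute baselines from dominating the group response.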

FIG. 33 is a chart demonstrating average plasma insulin response to SEQID-00426 of all subjects over time.

FIG. 34 is a chart demonstrating average plasma insulin fold response to SEQID-00426 over baseline.

FIG. 35 is a chart demonstrating average total Gastric Inhibitory Polypeptide (GIP) response of all patients to SEQID-00426.

FIG. 36 is a chart demonstrating a Gastric Inhibitory Polypeptide (GIP) fold response of all patients to SEQID-00426.

FIG. 37 is a chart demonstrating alphascreen signal (y-axis) measured at different Leucine concentrations. Error bars shown are the standard deviation of replicates.

FIG. 38 is a chart demonstrating Leucine Dose Response in Minimal Amino Acid Media in Primary RSKMC. Error bars shown are the standard deviation.

FIG. 39 is a chart demonstrating In vitro Leucine Dose Response of rps6 Phosphorylation in Isolated Soleus Muscle. Error bars shown are the standard deviation.

FIG. 40 is a chart demonstrating In vitro Leucine Dose Response of rps6 Phosphorylation in Isolated Gastrocnemius Muscle. Error bars shown are the standard deviation.

FIG. 41 is a chart demonstrating In vitro Leucine Dose Response of rps6 Phosphorylation in Isolated Extensor Digitorum Longus Muscle. Error bars shown are the standard deviation.

FIG. 42 is a chart demonstrating Combined Activity of Leu/Tyr/Arg on RPS6 Phosphorylation. Error bars shown are the standard deviation.

FIG. 43 is a chart demonstrating Arginine Stimulation of RPS6 in Leu/Tyr Background. Error bars shown are the standard deviation.

FIG. 44 is a chart demonstrating Leucine Stimulation of RPS6 in Arg/Tyr Background. Error bars shown are the standard deviation.

FIG. 45 is a chart demonstrating Tyrosine Stimulation of RPS6 in Arg/Leu Background. Error bars shown are the standard deviation.

FIG. 46 is a chart demonstrating a time-course of free Leu release during Pancreatin digest of SEQID-00105.

FIG. 47 is a chart demonstrating viscosity measured in centipoise for SEQID-00105 at 4° C. (closed circles) and 25° C. (open circles) and whey at 4° C. (closed squares) and 25° C. (open squares) over a range of protein concentrations.

FIG. 48 is a chart demonstrating (Left) Initial and final (after heating to 90° C. and then cooling to 20° C.) protein circular dichroism spectrum for SEQID-00105 and (Right) change in ellipticity at a given wavelength over the temperature range for SEQID-00105.

FIG. 49 is an image demonstrating Western blot analysis for mannose-containing glycans. A) Coomassie-stained gel. B) GNA blotted membrane. In both panels, lanes are as follows: 1) Pre-stained protein ladder, 2) SEQID-00363 (5 μg) from A. niger, 3) whole cell extract (5 μg) from E. coli transformed with an expression vector encoding SEQID-00363, 4) GNA positive control carboxypeptidase (5 μg), 5) soluble lysate (5 μg) from E. coli transformed with an expression vector encoding SEQID-00363.

FIG. 50 is an image demonstrating Western blot analysis for Neu5Gc. A) Coomassie-stained gel. B) anti-Neu5Gc probed membrane. In both panels, lanes are as follows: 1&10) Pre-stained protein ladder (New England Biolabs), 2&11) beef extract (30 μg), 3) pork extract (30 μg), 4) deer extract (30 μg), 5) lamb extract (30 μg), 6) turkey extract (30 μg), 7) chicken extract (30 μg), 8) cod extract (30 μg), 9) Protein Mixture 1 (10 μg), 12-15) 168 nutritive polypeptide library (30 μg) expressed in 12) E. coli (IMAC-purified lysate), 13) B. subtilis (supernatant), 14) B. subtilis (lysate), 15) B. subtilis (IMAC-purified lysate), 16-20) cDNA Library (30 μg) expressed in 16) B. subtilis (PH951 Grac lysate), 17) E. coli (Rosetta soluble lysate), 18) E. coli (Rosetta whole cell), 19) E. coli (GamiB lysate), and 20) E. coli (Gami2 lysate).

FIG. 51 is an image demonstrating Western blot analysis for xylose- and fucose-containing glycans in samples of protein extracted from plants and fungi or recombinantly expressed by E. coli and A. niger. A) Coomassie-stained gel. B) anti-Neu5Gc-blotted membrane. In both panels, lanes are as follows: 1&11) Pre-stained protein ladder (New England Biolabs), 2) yeast extract (30 μg), 3) flaxseed extract (30 μg), 4) chicken extract (30 μg), 5) corn extract (30 μg), 6) potato extract (30 μg), 7) mushroom extract (30 μg), 8) Protein Mixture 2 (30 μg), 9) HRP (2 μg), 10) fetuin (2 μg), 12) soy extract (30 μg), 13) rice extract (30 μg), 14) broccoli extract (30 μg), 15) tomato extract (30 μg), 16) blueberry extract (30 μg), 17) grape extract (30 μg), 18) Protein Mixture 2 (30 μg), 19) HRP (2 μg), 20) fetuin (2 μg).

FIG. 52A, FIG. 52B, FIG. 52C, and FIG. 52D are a series of charts demonstrating change in average area under the curve (AUC) (±SD) of plasma amino acid concentrations (μM·h) measured in blood samples collected from rats (n=2-4) over 4 h following oral administration of the indicated nutritive polypeptides at the doses listed in Table E33A. BCAA: branched chain amino acids, EAA: essential amino acids.

FIG. 53A, FIG. 53B, FIG. 53C, and FIG. 53D are a series of charts demonstrating average plasma amino acid concentration (±SD)-time curve for rats (n=4) orally administered SEQID-00105 at 2.85 g/kg. BCAA: branched chain amino acids, EAA: essential amino acids.

FIG. 54 is a series of charts demonstrating dose-response effect of SEQID-00105. (Left) Average plasma Leu concentration (±SD)-time curve. (Right) Average area under the curve (AUC) (±SD) of plasma amino acid concentrations (μM·h) measured in blood samples collected from rats (n=4) over 4 h following oral administration of SEQID-00105 at the doses listed in Table E33A.

FIG. 55 is a series of charts demonstrating plasma amino acid concentrations during rat pharmacokinetic studies of native and modified forms of SEQID-00363. Plasma amino acid profile of essential amino acids (EAAs) (A), Leucine (B), Serine (C), and Threonine (D) following oral administration of saline (circle (●), solid line) (n=4), native SEQID-00363 (square (▪), solid line) (n=4), deglycosylated SEQID-00363 (open circle (∘), dashed line) (n=2), and hydrolyzed SEQID-00363 (open square (□), dashed line) (n=4). Data represent the mean±the standard deviation of the mean for n=2-4 rats, as indicated above.

FIG. 56 is a series of charts demonstrating change in average FSR for WPI, SEQID-00105, and SEQID-00363.

FIG. 57A, FIG. 57B, and FIG. 57C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00105.

FIG. 58A, FIG. 58B, and FIG. 58C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00105.

FIG. 59A, FIG. 59B, and FIG. 59C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00105.

FIG. 60A, FIG. 60B, and FIG. 60C are a series of charts demonstrating human plasma time course of measured amino acids and the aggregate groups, essential amino acids (EAA), branched chain amino acids (BCAA), and total amino acids (TAA) for WPI and SEQID-00105.

FIG. 61 is a chart demonstrating integrated area under the curve (AUC) of measured amino acids, for WPI and SEQID-00105.

FIG. 62 is a chart demonstrating integrated area under the curve (AUC) of measured amino acids, for WPI and SEQID-00105.

FIG. 63 is a chart demonstrating integrated area under the curve (AUC) of aggregate groups, essential amino acids (EAA), branched chain amino acids (BCAA), and total amino acids (TAA), for WPI and SEQID-00105.

FIG. 64A, FIG. 64B, and FIG. 64C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00105.

FIG. 65A, FIG. 65B, and FIG. 65C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00105.

FIG. 66A, FIG. 66B, and FIG. 66C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00105.

FIG. 67A, FIG. 67B, and FIG. 67C are a series of charts demonstrating human plasma time course of measured amino acids and the aggregate groups, essential amino acids (EAA), branched chain amino acids (BCAA), and total amino acids (TAA) for WPI and SEQID-00105.

FIG. 68 is a chart demonstrating integrated area under the curve (AUC) of measured amino acids, for WPI and SEQID-00105.

FIG. 69 is a chart demonstrating integrated area under the curve (AUC) of measured amino acids, for WPI and SEQID-00105.

FIG. 70 is a chart demonstrating integrated area under the curve (AUC) of aggregate groups, essential amino acids (EAA), branched chain amino acids (BCAA), and total amino acids (TAA), for WPI and SEQID-00105.

FIG. 71A, FIG. 71B, and FIG. 71C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00363.

FIG. 72A, FIG. 72B, and FIG. 72C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00363.

FIG. 73A, FIG. 73B, and FIG. 73C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00363.

FIG. 74A, FIG. 74B, and FIG. 74C are a series of charts demonstrating human plasma time course of measured amino acids and the aggregate groups, essential amino acids (EAA), branched chain amino acids (BCAA), and total amino acids (TAA) for WPI and SEQID-00363.

FIG. 75A, FIG. 75B, and FIG. 75C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00426.

FIG. 76A, FIG. 76B, and FIG. 76C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00426.

FIG. 77A, FIG. 77B, and FIG. 77C are a series of charts demonstrating human plasma time course of measured amino acids for WPI and SEQID-00426.

FIG. 78A, FIG. 78B, and FIG. 78C are a series of charts demonstrating human plasma time course of measured amino acids and the aggregate groups, essential amino acids (EAA), branched chain amino acids (BCAA), and total amino acids (TAA) for WPI and SEQID-00426.

DETAILED DESCRIPTION

Terms used in the claims and specification are defined as set forth below unless otherwise specified.

It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise.

Definitions

An “agriculturally-derived food product” is a food product resulting from the cultivation of soil or rearing of animals.

The term “ameliorating” refers to any therapeutically beneficial result in the treatment of a disease state, e.g., including prophylaxis, lessening in the severity or progression, remission, or cure thereof.

As used herein, the term “autotrophic” refers to an organism that produces complex organic compounds (such as carbohydrates, fats, and proteins) from simple inorganic molecules using energy from light (by photosynthesis) or inorganic chemical reactions (chemosynthesis).

As used herein, a “body mass index” or “BMI” or “Quetelet index” is a subject's weight in kilograms divided by the square of the subject's height in meters (kg/m²). For adults, a frequent use of the BMI is to assess how much an individual's body weight departs from what is normal or desirable for a person of his or her height. The weight excess or deficiency may, in part, be accounted for by body fat, although other factors such as muscularity also affect BMI significantly. The World Health Organization regards a BMI of less than 18.5 as underweight, which may indicate malnutrition, an eating disorder, or other health problems, while a BMI greater than 25 is considered overweight and a BMI above 30 is considered obese. (World Health Organization. BMI classification).
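The BMI definition above can be illustrated with a minimal Python sketch. The function names `bmi` and `who_category` are hypothetical, and the handling of values exactly at 25 or 30 is an assumption based on a literal reading of "greater than 25" and "above 30"; the sketch is illustrative only and is not part of any method described herein.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / (height_m ** 2)


def who_category(bmi_value: float) -> str:
    """Classify a BMI using the cut-offs quoted in the text (18.5, 25, 30).

    Boundary handling (exactly 25 or exactly 30) is an assumption.
    """
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value > 30:
        return "obese"
    if bmi_value > 25:
        return "overweight"
    return "normal range"


if __name__ == "__main__":
    value = bmi(70.0, 1.75)  # 70 kg, 1.75 m
    print(round(value, 2), who_category(value))
```

For a 70 kg subject who is 1.75 m tall, the sketch gives a BMI of about 22.86, which falls in the normal range.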

As used herein, a “branched chain amino acid” is an amino acid selected from Leucine, Isoleucine, and Valine.

As used herein, “cachexia” refers to a multifaceted clinical syndrome that results in muscle wasting and weight loss. It is a complex condition where protein catabolism exceeds protein anabolism, which makes muscle wasting a primary feature of the condition. In addition to the metabolic derangements in protein metabolism, it is also characterized by anorexia and inflammation. These derangements plus impaired protein metabolism are responsive to nutrition therapy to varying degrees.

As used herein, “calorie control” and “calorie restriction” refer to the process of reducing a subject's calorie intake from food products, either relative to the subject's prior calorie intake or relative to an appropriate calorie intake standard.

Generally, the terms “cancer” and “cancerous” refer to or describe the physiological condition in mammals that is typically characterized by unregulated cell growth. More specifically, cancers that are treated using any one or more tyrosine kinase inhibitors, other drugs blocking the receptors or their ligands, or variants thereof, and in connection with the methods provided herein include, but are not limited to, carcinoma, lymphoma, blastoma, sarcoma, leukemia, mesothelioma, squamous cell cancer, lung cancer including small-cell lung cancer and non-small cell lung cancer (which includes large-cell carcinoma, adenocarcinoma of the lung, and squamous carcinoma of the lung), cancer of the peritoneum, hepatocellular cancer, gastric or stomach cancer (including gastrointestinal cancer and gastrointestinal stromal cancer), pancreatic cancer, glioblastoma, cervical cancer, ovarian cancer, liver cancer, bladder cancer, breast cancer, colon cancer, colorectal cancer, endometrial or uterine carcinoma, salivary gland carcinoma, kidney or renal cancer, prostate cancer, cervical cancer, vulval cancer, thyroid cancer, head and neck cancer, melanoma, superficial spreading melanoma, lentigo maligna melanoma, acral lentiginous melanomas, nodular melanomas, T-cell lymphomas, B-cell lymphomas (including low grade/follicular non-Hodgkin's lymphoma (NHL); small lymphocytic (SL) NHL; intermediate grade/follicular NHL; intermediate grade diffuse NHL; high grade immunoblastic NHL; high grade lymphoblastic NHL; high grade small non-cleaved cell NHL; bulky disease NHL; mantle cell lymphoma; AIDS-related lymphoma; and Waldenstrom's Macroglobulinemia); chronic lymphocytic leukemia (CLL); acute myeloid leukemia (AML); chronic myeloid leukemia (CML); acute lymphoblastic leukemia (ALL); Hairy cell leukemia; chronic myeloblastic leukemia; or post-transplant lymphoproliferative disorder (PTLD), as well as abnormal vascular proliferation associated with phakomatoses, edema (such as that associated with brain tumors), and Meigs' syndrome.

A “comestible product” includes an edible product, while a “non-comestible product” is generally an inedible product or contains an inedible product. To be “substantially free of non-comestible products” means a composition does not have an amount or level of non-comestible product sufficient to render the composition inedible, dangerous or otherwise unfit for consumption by its intended consumer. Alternatively, a polypeptide can be substantially free of non-comestible products, meaning the polypeptide does not contain or have associated therewith an amount or level of non-comestible product sufficient to render a composition containing the polypeptide inedible by, or unsafe or deleterious to, its intended consumer. In preferred embodiments a composition substantially free of non-comestible products can be consumed in a nutritional amount by an intended consumer who does not suffer or is not at increased risk of suffering a deleterious event from such consumption. For example, levels of lead and other metals are well-documented as having significant risk including toxicity to humans when present in food, particularly foods containing an agriculturally-derived product grown in soil contaminated with lead and/or other metals. Thus, products such as foods, beverages, and compounds containing industrially-produced polypeptides having metal content above a certain parts per million (ppm), are considered non-comestible products, such metal content depending upon the metal as recognized in the art. For example, inclusion of lead or cadmium in an industrially-produced polypeptide at levels such that the lead will have a deleterious biological effect when consumed by a mammal will generally render a composition containing the industrially-produced polypeptide non-comestible. 
Notwithstanding the above, some polypeptides have certain amounts of metals complexed to or incorporated therein (such as iron, zinc, calcium and magnesium) and such metals shall not necessarily render the polypeptides non-comestible.

The term “control sequences” is intended to encompass, at a minimum, any component whose presence is essential for expression, and can also encompass an additional component whose presence is advantageous, for example, leader sequences and fusion partner sequences.

As used herein, a patient is “critically-medically ill” if the patient, because of medical illness, experiences changes in at least one of body mass index and muscle mass (e.g., sarcopenia). In some embodiments the patient is confined to bed for at least 25%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of their waking time. In some embodiments the patient is unconscious. In some embodiments the patient has been confined to bed as described in this paragraph for at least 1 day, 2 days, 3 days, 4 days, 5 days, 10 days, 2 weeks, 3 weeks, 4 weeks, 5 weeks, 10 weeks or longer.

As used herein, the phrase “degenerate variant” of a reference nucleic acid sequence encompasses nucleic acid sequences that can be translated, according to the standard genetic code, to provide an amino acid sequence identical to that translated from the reference nucleic acid sequence. The term “degenerate oligonucleotide” or “degenerate primer” is used to signify an oligonucleotide capable of hybridizing with target nucleic acid sequences that are not necessarily identical in sequence but that are homologous to one another within one or more particular segments.
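As a minimal illustration of the “degenerate variant” definition above, the following Python sketch translates two nucleic acid sequences with the standard genetic code and checks whether they yield identical amino acid sequences. The function names and the example sequences are hypothetical and are not sequences of this disclosure; the sketch assumes in-frame coding sequences whose length is a multiple of three.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleic_acid: str) -> str:
    """Translate an in-frame coding sequence (DNA or RNA) into one-letter amino acids."""
    seq = nucleic_acid.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def is_degenerate_variant(reference: str, candidate: str) -> bool:
    """True if both sequences encode the same amino acid sequence."""
    return translate(reference) == translate(candidate)


if __name__ == "__main__":
    # Hypothetical sequences: both encode Met-Leu-Lys with different codons.
    print(is_degenerate_variant("ATGCTGAAA", "ATGTTAAAG"))
```

In this example the two sequences differ at three nucleotide positions yet both translate to Met-Leu-Lys, so the second is a degenerate variant of the first.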

As used herein a “desirable body mass index” is a body mass index of from about 18.5 to about 25. Thus, if a subject has a BMI below about 18.5, then an increase in the subject's BMI is an increase in the desirability of the subject's BMI. If instead a subject has a BMI above about 25, then a decrease in the subject's BMI is an increase in the desirability of the subject's BMI.
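The directional logic of the “desirable body mass index” definition can be sketched as follows. This is a simplified reading, not a definitive implementation: the helper names are hypothetical, the bounds of 18.5 and 25 are treated as exact although the text says “about”, and a change that overshoots the desirable range is scored by its distance from the range, which is an assumption not stated in the text.

```python
LOWER_BOUND = 18.5  # "about 18.5" in the text; treated as exact here (assumption)
UPPER_BOUND = 25.0  # "about 25" in the text; treated as exact here (assumption)


def distance_from_desirable(bmi_value: float) -> float:
    """Distance of a BMI from the desirable range; 0.0 when inside the range."""
    if bmi_value < LOWER_BOUND:
        return LOWER_BOUND - bmi_value
    if bmi_value > UPPER_BOUND:
        return bmi_value - UPPER_BOUND
    return 0.0


def desirability_increased(bmi_before: float, bmi_after: float) -> bool:
    """True if the change moved the BMI closer to (or into) the desirable range."""
    return distance_from_desirable(bmi_after) < distance_from_desirable(bmi_before)


if __name__ == "__main__":
    print(desirability_increased(17.0, 18.0))  # underweight subject gaining weight
    print(desirability_increased(28.0, 26.0))  # overweight subject losing weight
```

Under these assumptions, an increase from 17 to 18 and a decrease from 28 to 26 both count as increases in desirability, while a change that stays within the range counts as neither an increase nor a decrease.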

As used herein, the term “diabetes” includes any metabolic disease in which a subject is unable to produce any or a sufficient amount of insulin or is otherwise unable to regulate blood glucose level. The term “pre-diabetes,” also termed “impaired fasting glucose,” includes a condition in which fasting glucose is above an accepted normal limit.

As used herein, an “elderly” mammal is one who experiences age related changes in at least one of body mass index and muscle mass (e.g., age related sarcopenia). In some embodiments an “elderly” human is at least 50 years old, at least 60 years old, at least 65 years old, at least 70 years old, at least 75 years old, at least 80 years old, at least 85 years old, at least 90 years old, at least 95 years old, or at least 100 years old. In some embodiments an elderly animal, mammal, or human is one who has experienced a loss of muscle mass from peak lifetime muscle mass of at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, or at least 60%. Because age related changes to at least one of body mass index and muscle mass are known to correlate with increasing age, in some embodiments an elderly mammal is identified or defined simply on the basis of age. Thus, in some embodiments an “elderly” human is identified or defined simply by the fact that their age is at least 60 years old, at least 65 years old, at least 70 years old, at least 75 years old, at least 80 years old, at least 85 years old, at least 90 years old, at least 95 years old, or at least 100 years old, and without recourse to a measurement of at least one of body mass index and muscle mass.

As used herein, an “essential amino acid” is an amino acid selected from Histidine, Isoleucine, Leucine, Lysine, Methionine, Phenylalanine, Threonine, Tryptophan, and Valine. However, it should be understood that “essential amino acids” can vary through a typical lifespan, e.g., cysteine, tyrosine, and arginine are considered essential amino acids in infant humans. Imura K, Okada A (1998). “Amino acid metabolism in pediatric patients”. Nutrition 14 (1): 143-8. In addition, the amino acids arginine, cysteine, glycine, glutamine, histidine, proline, serine and tyrosine are considered “conditionally essential” in adults, meaning they are not normally required in the diet, but must be supplied exogenously to specific populations that do not synthesize them in adequate amounts. Furst P, Stehle P (1 Jun. 2004). “What are the essential elements needed for the determination of amino acid requirements in humans?”. Journal of Nutrition 134 (6 Suppl): 1558S-1565S; and Reeds P J (1 Jul. 2000). “Dispensable and indispensable amino acids for humans”. J. Nutr. 130 (7): 1835S-40S.

As used herein, “exercise” is, most broadly, any bodily activity that enhances or maintains physical fitness and overall health and wellness. Exercise is performed for various reasons including strengthening muscles and the cardiovascular system, honing athletic skills, weight loss or maintenance, as well as for the purpose of enjoyment.

As used herein, an “exercise regimen” includes any course of exercise for the promotion of health, or for the treatment or prevention of disease.

As used herein, an “expression control sequence” refers to polynucleotide sequences which are necessary to affect the expression of coding sequences to which they are operatively linked. Expression control sequences are sequences which control the transcription, post-transcriptional events and translation of nucleic acid sequences. Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., ribosome binding sites); sequences that enhance protein stability; and when desired, sequences that enhance protein secretion. The nature of such control sequences differs depending upon the host organism; in prokaryotes, such control sequences generally include promoter, ribosomal binding site, and transcription termination sequence.

As used herein, “function” and “functional performance” refer to a functional test that simulates daily activities. “Muscle function” or “functional performance” is measured by any suitable accepted test, including timed-step test (step up and down from a 4-inch bench as fast as possible 5 times), timed floor transfer test (go from a standing position to a supine position on the floor and thereafter up to a standing position again as fast as possible for one repetition), and physical performance battery test (static balance test, chair test, and a walking test) (Borsheim et al., “Effect of amino acid supplementation on muscle mass, strength and physical function in elderly,” Clin Nutr 2008; 27:189-195). As used herein, a “performance-associated” injury or damage, such as a tissue injury or tissue damage, results from a functional activity, such as a physical or athletic performance.

The term “fusion protein” refers to a polypeptide comprising a polypeptide or fragment coupled to heterologous amino acid sequences. Fusion proteins are useful because they can be constructed to contain two or more desired functional elements that can be from two or more different proteins. A fusion protein comprises at least 10 contiguous amino acids from a polypeptide of interest, or at least 20 or 30 amino acids, or at least 40, 50 or 60 amino acids, or at least 75, 100 or 125 amino acids. The heterologous polypeptide included within the fusion protein is usually at least 6 amino acids in length, or at least 8 amino acids in length, or at least 15, 20, or 25 amino acids in length. Fusions that include larger polypeptides, such as an IgG Fc region, and even entire proteins, such as the green fluorescent protein (“GFP”) chromophore-containing proteins, have particular utility. Fusion proteins can be produced recombinantly by constructing a nucleic acid sequence which encodes the polypeptide or a fragment thereof in frame with a nucleic acid sequence encoding a different protein or peptide and then expressing the fusion protein. Alternatively, a fusion protein can be produced chemically by crosslinking the polypeptide or a fragment thereof to another protein.

Sequence homology for polypeptides, which is also referred to as percent sequence identity, is typically measured using sequence analysis software. See, e.g., the Sequence Analysis Software Package of the Genetics Computer Group (GCG), University of Wisconsin Biotechnology Center, 910 University Avenue, Madison, Wis. 53705. Protein analysis software matches similar sequences using a measure of homology assigned to various substitutions, deletions and other modifications, including conservative amino acid substitutions. For instance, GCG contains programs such as “Gap” and “Bestfit” which can be used with default parameters to determine sequence homology or sequence identity between closely related polypeptides, such as homologous polypeptides from different species of organisms or between a wild-type polypeptide and a mutein thereof. See, e.g., GCG Version 6. An exemplary algorithm when comparing a particular polypeptide sequence to a database containing a large number of sequences from different organisms is the computer program BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990); Gish and States, Nature Genet. 3:266-272 (1993); Madden et al., Meth. Enzymol. 266:131-141 (1996); Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997); Zhang and Madden, Genome Res. 7:649-656 (1997)), especially blastp or tblastn (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)).

As used herein, a “gastrointestinal disorder” or a “gastrointestinal disease” includes any disorder or disease involving the gastrointestinal tract or region thereof, namely the esophagus, stomach, small intestine, large intestine or rectum, as well as organs and tissues associated with digestion, e.g., the pancreas, the gallbladder, and the liver.

As used herein, the term “heterotrophic” refers to an organism that cannot fix carbon and uses organic carbon for growth.

As used herein, a polypeptide has “homology” or is “homologous” to a second polypeptide if the nucleic acid sequence that encodes the polypeptide has a similar sequence to the nucleic acid sequence that encodes the second polypeptide. Alternatively, a polypeptide has homology to a second polypeptide if the two polypeptides have similar amino acid sequences. (Thus, the term “homologous polypeptides” is defined to mean that the two polypeptides have similar amino acid sequences.) When “homologous” is used in reference to polypeptides or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions. A “conservative amino acid substitution” is one in which an amino acid residue is substituted by another amino acid residue having a side chain (R group) with similar chemical properties (e.g., charge or hydrophobicity). In general, a conservative amino acid substitution will not substantially change the functional properties of a polypeptide. In cases where two or more amino acid sequences differ from each other by conservative substitutions, the percent sequence identity or degree of homology can be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. See, e.g., Pearson, 1994, Methods Mol. Biol. 24:307-31 and 25:365-89. The following six groups each contain amino acids that are conservative substitutions for one another: 1) Serine, Threonine; 2) Aspartic Acid, Glutamic Acid; 3) Asparagine, Glutamine; 4) Arginine, Lysine; 5) Isoleucine, Leucine, Methionine, Alanine, Valine, and 6) Phenylalanine, Tyrosine, Tryptophan. 
In some embodiments, polymeric molecules (e.g., a polypeptide sequence or nucleic acid sequence) are considered to be homologous to one another if their sequences are at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical. In some embodiments, polymeric molecules are considered to be “homologous” to one another if their sequences are at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% similar. The term “homologous” necessarily refers to a comparison between at least two sequences (nucleotide sequences or amino acid sequences). In some embodiments, two nucleotide sequences are considered to be homologous if the polypeptides they encode are at least about 50% identical, at least about 60% identical, at least about 70% identical, at least about 80% identical, or at least about 90% identical for at least one stretch of at least about 10, 15, 20, 25, 30, 35, 40, 45, 50 or over 50 amino acids. In some embodiments, homologous nucleotide sequences are characterized by the ability to encode a stretch of at least 4-5 uniquely specified amino acids. Both the identity and the approximate spacing of these amino acids relative to one another must be considered for nucleotide sequences to be considered homologous. In some embodiments of nucleotide sequences less than 60 nucleotides in length, homology is determined by the ability to encode a stretch of at least 4-5 uniquely specified amino acids. 
In some embodiments, two polypeptide sequences are considered to be homologous if the polypeptides are at least about 50% identical, at least about 60% identical, at least about 70% identical, at least about 80% identical, or at least about 90% identical for at least one stretch of at least about 20 amino acids. In other embodiments, two polypeptide sequences are considered to be homologous if the polypeptides are similar, such as at least about 50% similar, at least about 60% similar, at least about 70% similar, at least about 80% similar, or at least about 90% similar, or at least about 95% similar for at least one stretch of at least about 20 amino acids. In some embodiments similarity is demonstrated by fewer nucleotide changes that result in an amino acid change (e.g., a nucleic acid sequence having a single nucleotide change is more similar to a reference nucleic acid sequence than a nucleic acid sequence having two nucleotide changes, even if both changes result in an identical amino acid substitution).
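The percent identity and percent similarity criteria above can be illustrated with a minimal Python sketch. It assumes two polypeptide sequences that are already aligned, gap-free, and of equal length, and it counts a position as “similar” when the residues are identical or fall within the same one of the six conservative substitution groups listed above. The function names and example sequences are hypothetical, and this simple window scan is not the algorithm used by GCG or BLAST.

```python
# The six conservative substitution groups listed above, in one-letter codes.
CONSERVATIVE_GROUPS = [
    set("ST"),     # 1) Serine, Threonine
    set("DE"),     # 2) Aspartic Acid, Glutamic Acid
    set("NQ"),     # 3) Asparagine, Glutamine
    set("RK"),     # 4) Arginine, Lysine
    set("ILMAV"),  # 5) Isoleucine, Leucine, Methionine, Alanine, Valine
    set("FYW"),    # 6) Phenylalanine, Tyrosine, Tryptophan
]


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b belong to the same conservative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def best_window_percent(seq_a: str, seq_b: str, window: int = 20,
                        count_conservative: bool = False) -> float:
    """Highest percent identity (or similarity) over any stretch of `window` residues.

    Assumes pre-aligned, gap-free sequences of equal length (an assumption of
    this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if len(seq_a) < window:
        raise ValueError("sequences are shorter than the window")
    matches = [
        a == b or (count_conservative and is_conservative(a, b))
        for a, b in zip(seq_a, seq_b)
    ]
    best = max(sum(matches[i:i + window])
               for i in range(len(matches) - window + 1))
    return 100.0 * best / window


if __name__ == "__main__":
    # Hypothetical 20-residue sequences differing at three positions,
    # two of which (S/T and K/R) are conservative substitutions.
    ref = "MSTKDELIVRAFYWNQGHPC"
    alt = "MTTRDELIVRAFYWNQPHPC"
    print(best_window_percent(ref, alt))                           # identity
    print(best_window_percent(ref, alt, count_conservative=True))  # similarity
```

For the example pair the sketch reports 85% identity and 95% similarity over the 20-residue stretch, so the pair would satisfy, for instance, an “at least about 80% identical” or an “at least about 90% similar” criterion.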

The term “in situ” refers to processes that occur in a living cell growing separate from a living organism, e.g., growing in tissue culture.

As used herein, the term “in vitro” refers to events that occur in an artificial environment, e.g., in a test tube or reaction vessel, in cell culture, in a Petri dish, etc., rather than within an organism (e.g., animal, plant, or microbe). As used herein, the term “ex vivo” refers to experimentation done in or on tissue in an environment outside the organism.

The term “in vivo” refers to processes that occur in a living organism.

As used herein, a “modified derivative” refers to polypeptides or fragments thereof that are substantially homologous in primary structural sequence to a reference polypeptide sequence but which include, e.g., in vivo or in vitro chemical and biochemical modifications or which incorporate amino acids that are not found in the reference polypeptide. Such modifications include, for example, acetylation, carboxylation, phosphorylation, glycosylation, ubiquitination, labeling, e.g., with radionuclides, and various enzymatic modifications, as will be readily appreciated by those skilled in the art. A variety of methods for labeling polypeptides and of substituents or labels useful for such purposes are well known in the art, and include radioactive isotopes such as ¹²⁵I, ³²P, ³⁵S, and ³H, ligands that bind to labeled antiligands (e.g., antibodies), fluorophores, chemiluminescent agents, enzymes, and antiligands that can serve as specific binding pair members for a labeled ligand. The choice of label depends on the sensitivity required, ease of conjugation with the primer, stability requirements, and available instrumentation. Methods for labeling polypeptides are well known in the art. See, e.g., Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates (1992, and Supplements to 2002).

As used herein, “muscle strength” refers to the amount of force a muscle can produce with a single maximal effort. There are two types of muscle strength, static strength and dynamic strength. Static strength refers to isometric contraction of a muscle, where a muscle generates force while the muscle length remains constant and/or when there is no movement in a joint. Examples include holding or carrying an object, or pushing against a wall. Dynamic strength refers to a muscle generating force that results in movement. Dynamic strength can be isotonic contraction, where the muscle shortens under a constant load or isokinetic contraction, where the muscle contracts and shortens at a constant speed. Dynamic strength can also include isoinertial strength. In addition, the term “muscle strength” refers to maximum dynamic muscle strength, as described by the term “one repetition maximum” (1RM). This is a measurement of the greatest load (in kilograms) that can be fully moved (lifted, pushed or pulled) once without failure or injury. This value can be measured directly, but doing so requires that the weight is increased until the subject fails to carry out the activity to completion. Alternatively, 1RM is estimated by counting the maximum number of exercise repetitions a subject can make using a load that is less than the maximum amount the subject can move. Leg extension and leg flexion are often measured in clinical trials (Borsheim et al., “Effect of amino acid supplementation on muscle mass, strength and physical function in elderly,” Clin Nutr 2008; 27:189-195; Paddon-Jones, et al., “Essential amino acid and carbohydrate supplementation ameliorates muscle protein loss in humans during 28 days bed rest,” J Clin Endocrinol Metab 2004; 89:4351-4358).
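The indirect estimation of 1RM from a submaximal load can be sketched as follows. This specification does not state which estimation formula is used; the sketch assumes the widely used Epley equation, 1RM = load × (1 + repetitions / 30), and the function name is hypothetical.

```python
def estimate_one_rep_max(load_kg: float, repetitions: int) -> float:
    """Estimate the one repetition maximum (1RM) in kilograms.

    Uses the Epley equation as an assumed example; other published
    equations exist and give somewhat different estimates.
    """
    if load_kg <= 0:
        raise ValueError("load must be positive")
    if repetitions < 1:
        raise ValueError("at least one completed repetition is required")
    if repetitions == 1:
        # A single maximal lift is, by definition, the directly measured 1RM.
        return float(load_kg)
    return load_kg * (1.0 + repetitions / 30.0)
```

Under this assumption, a subject who completes 10 repetitions with a 60 kg load has an estimated 1RM of about 80 kg.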

As used herein, “muscle mass” refers to the weight of muscle in a subject's body. Similarly, “muscle anabolism” includes the synthesis of muscle proteins, and is a component of the process by which muscle mass is gained. Muscle mass includes the skeletal muscles, smooth muscles (such as cardiac and digestive muscles) and the water contained in these muscles. Muscle mass of specific muscles can be determined using dual energy x-ray absorptiometry (DEXA) (Paddon-Jones et al., 2004). Total lean body mass (minus the fat), total body mass, and bone mineral content can be measured by DEXA as well. In some embodiments a change in the muscle mass of a specific muscle of a subject is determined, for example by DEXA, and the change is used as a proxy for the total change in muscle mass of the subject. Thus, for example, if a subject consumes a nutritive protein as disclosed herein and experiences an increase over a period of time in muscle mass in a particular muscle or muscle group, it can be concluded that the subject has experienced an increase in muscle mass. Changes in muscle mass can be measured in a variety of ways including protein synthesis, fractional synthetic rate, and certain key activities such as mTOR/mTORC. In general, “lean muscle mass” refers to the mass of muscle tissue in the absence of other tissues such as fat.

The term “nucleic acid fragment” as used herein refers to a nucleic acid sequence that has a deletion, e.g., a 5′-terminal or 3′-terminal deletion compared to a full-length reference nucleotide sequence. In an embodiment, the nucleic acid fragment is a contiguous sequence in which the nucleotide sequence of the fragment is identical to the corresponding positions in the naturally-occurring sequence. In some embodiments, fragments are at least 10, 15, 20, or 25 nucleotides long, or at least 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, or 150 nucleotides long. In some embodiments a fragment of a nucleic acid sequence is a fragment of an open reading frame sequence. In some embodiments such a fragment encodes a polypeptide fragment (as defined herein) of the protein encoded by the open reading frame nucleotide sequence.

A composition, formulation or product is “nutritional” or “nutritive” if it provides an appreciable amount of nourishment to its intended consumer, meaning the consumer assimilates all or a portion of the composition or formulation into a cell, organ, and/or tissue. Generally such assimilation into a cell, organ and/or tissue provides a benefit or utility to the consumer, e.g., by maintaining or improving the health and/or natural function(s) of said cell, organ, and/or tissue. A nutritional composition or formulation that is assimilated as described herein is termed “nutrition.” By way of non-limiting example, a polypeptide is nutritional if it provides an appreciable amount of polypeptide nourishment to its intended consumer, meaning the consumer assimilates all or a portion of the protein, typically in the form of single amino acids or small peptides, into a cell, organ, and/or tissue. “Nutrition” also means the process of providing to a subject, such as a human or other mammal, a nutritional composition, formulation, product or other material. A nutritional product need not be “nutritionally complete,” meaning if consumed in sufficient quantity, the product provides all carbohydrates, lipids, essential fatty acids, essential amino acids, conditionally essential amino acids, vitamins, and minerals required for health of the consumer. Additionally, a “nutritionally complete protein” contains all protein nutrition required (meaning the amount required for physiological normalcy by the organism) but does not necessarily contain micronutrients such as vitamins and minerals, carbohydrates or lipids.

In preferred embodiments, a composition or formulation is nutritional in its provision of polypeptide capable of decomposition (i.e., the breaking of a peptide bond, often termed protein digestion) to single amino acids and/or small peptides (e.g., two amino acids, three amino acids, or four amino acids, possibly up to ten amino acids) in an amount sufficient to provide a “nutritional benefit.” In addition, in certain embodiments provided are nutritional polypeptides that transit across the gastrointestinal wall and are absorbed into the bloodstream as small peptides (e.g., larger than single amino acids but smaller than about ten amino acids) or larger peptides, oligopeptides or polypeptides (e.g., >11 amino acids). A nutritional benefit in a polypeptide-containing composition can be demonstrated and, optionally, quantified, by a number of metrics. For example, a nutritional benefit is the benefit to a consuming organism equivalent to or greater than at least about 0.5% of a reference daily intake value of protein, such as about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 100% or greater than about 100% of a reference daily intake value. Alternatively, a nutritional benefit is demonstrated by the feeling and/or recognition of satiety by the consumer. In other embodiments, a nutritional benefit is demonstrated by incorporation of a substantial amount of the polypeptide component of the composition or formulation into the cells, organs and/or tissues of the consumer, such incorporation generally meaning that single amino acids or short peptides are used to produce polypeptides de novo intracellularly. A “consumer” or a “consuming organism” means any animal capable of ingesting the product having the nutritional benefit. Typically, the consumer will be a mammal such as a healthy human, e.g., a healthy infant, child, adult, or older adult. 
Alternatively, the consumer will be a mammal such as a human (e.g., an infant, child, adult or older adult) at risk of developing or suffering from a disease, disorder or condition characterized by (i) the lack of adequate nutrition and/or (ii) the alleviation thereof by the nutritional products of the present invention. An “infant” is generally a human under about age 1 or 2, a “child” is generally a human under about age 18, and an “older adult” or “elderly” human is a human aged about 65 or older.
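The percentage-of-reference-daily-intake metric for a polypeptide-containing composition, described above, reduces to simple arithmetic, sketched below. The specification does not fix a reference daily intake value; the 50 g figure used here is a placeholder assumption, and the function names are hypothetical.

```python
# Assumed placeholder value in grams of protein per day; substitute the
# reference daily intake value applicable to the intended consumer.
ASSUMED_REFERENCE_DAILY_INTAKE_G = 50.0


def percent_of_reference_intake(
    protein_g: float,
    reference_g: float = ASSUMED_REFERENCE_DAILY_INTAKE_G,
) -> float:
    """Express a protein amount as a percentage of the reference daily intake."""
    if reference_g <= 0:
        raise ValueError("reference intake must be positive")
    return 100.0 * protein_g / reference_g


def provides_nutritional_benefit(
    protein_g: float,
    threshold_percent: float = 0.5,
    reference_g: float = ASSUMED_REFERENCE_DAILY_INTAKE_G,
) -> bool:
    """Check the 'at least about 0.5% of a reference daily intake' criterion."""
    return percent_of_reference_intake(protein_g, reference_g) >= threshold_percent
```

With the assumed 50 g reference, a serving that delivers 5 g of protein supplies 10% of the reference value, and 0.25 g corresponds to the 0.5% lower bound.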

In other preferred embodiments, a composition or formulation is nutritional in its provision of carbohydrate capable of hydrolysis by the intended consumer (termed a “nutritional carbohydrate”). A nutritional benefit in a carbohydrate-containing composition can be demonstrated and, optionally, quantified, by a number of metrics. For example, a nutritional benefit is the benefit to a consuming organism equivalent to or greater than at least about 2% of a reference daily intake value of carbohydrate.

A polypeptide “nutritional domain” as used herein means any domain of a polypeptide that is capable of providing nutrition. Preferably, a polypeptide nutritional domain provides one or more advantages over the full-length polypeptide containing the nutritional domain, such as providing more nutrition than the full-length polypeptide. For example, a polypeptide nutritional domain has a higher concentration of desirable amino acids, has a lower concentration of undesirable amino acids, contains a site for cleavage by a digestive protease, is easier to digest and/or is easier to produce from the digestion of a larger polypeptide, has improved storage characteristics, or a combination of these and/or other factors, in comparison to (i) a reference polypeptide or a reference polypeptide-containing mixture or composition, (ii) the protein(s) or polypeptide(s) present in an agriculturally-derived food product, and/or (iii) the protein or polypeptide products present in the diet of a mammalian subject. Other advantages of a polypeptide nutritional domain include easier and/or more efficient production, different or more advantageous physicochemical properties, and/or different or more advantageous safety properties (e.g., elimination of one or more allergy domains) relative to the full-length polypeptide. A reference polypeptide can be a naturally occurring polypeptide or a recombinantly produced polypeptide, which in turn may have an amino acid sequence identical to or different from a naturally occurring polypeptide. A reference polypeptide may also be a consensus amino acid sequence not present in a naturally-occurring polypeptide. Additionally, a reference polypeptide-containing mixture or composition can be a naturally-occurring mixture, such as a mixture of polypeptides present in a dairy product such as milk or whey, or can be a synthetic mixture of polypeptides (which, in turn, can be naturally-occurring or synthetic).
In certain embodiments the nutritional domain contains an amino acid sequence having an N-terminal amino acid and/or a C-terminal amino acid different from the N-terminal amino acid and/or a C-terminal amino acid of a reference secreted polypeptide, such as a full-length secreted polypeptide. For example, a nutritional domain has an N-terminal amino acid sequence that corresponds to an amino acid sequence internal to a larger secreted polypeptide that contains the nutritional domain. A nutritional domain may include or exclude a signal sequence of a larger secreted polypeptide. As used herein, a polypeptide that “contains” a polypeptide nutritional domain contains the entirety of the polypeptide nutritional domain as well as at least one additional amino acid, either N-terminal or C-terminal to the polypeptide nutritional domain. Generally polypeptide nutritional domains are secreted from the cell or organism containing a nucleic acid encoding the nutritional domain, and are termed “secreted polypeptide nutritional domains,” and, in circumstances wherein the nutritional domain is secreted from a unicellular (or single celled) organism, it is termed a “unicellular secreted polypeptide nutritional domain.”

In other preferred embodiments, a composition or formulation is nutritional in its provision of lipid capable of digestion, incorporation, conversion, or other cellular uses by the intended consumer (termed a “nutritional lipid”). A nutritional benefit in a lipid-containing composition can be demonstrated and, optionally, quantified, by a number of metrics. For example, a nutritional benefit is the benefit to a consuming organism equivalent to or greater than at least about 2% of a reference daily intake value of lipid (i.e., fat).

As used herein, an “obese” subject has a level of excess body fat that increases the likelihood of the subject suffering from diseases including heart disease, type II diabetes, osteoporosis and osteoarthritis, and cancer, while an “overweight” subject is above a weight recognized as normal, acceptable, or desirable, but not obese. In Western countries, a subject having a BMI value exceeding 30 is considered obese, while a subject having a BMI value between 25 and 30 is considered overweight.
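The BMI cut-offs above can be expressed as a short sketch. It assumes the standard definition of BMI as body weight in kilograms divided by the square of height in meters, which the text does not spell out; the function names and the label for values below 25 are hypothetical.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI under the standard definition: weight (kg) / height (m) squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def classify_bmi(bmi: float) -> str:
    """Apply the cut-offs stated in the text: above 30 obese, 25 to 30 overweight."""
    if bmi > 30:
        return "obese"
    if bmi >= 25:
        return "overweight"
    return "not overweight"
```

For a subject 1.75 m tall, a body weight of 95 kg gives a BMI of about 31 (obese) and 80 kg gives a BMI of about 26 (overweight).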

As used herein, “operatively linked” or “operably linked” expression control sequences refers to a linkage in which the expression control sequence is contiguous with the gene of interest to control the gene of interest, as well as expression control sequences that act in trans or at a distance to control the gene of interest.

The term “percent sequence identity” or “identical” in the context of nucleic acid sequences refers to the residues in the two sequences that are the same when aligned for maximum correspondence. There are a number of different algorithms known in the art that can be used to measure nucleotide sequence identity. For instance, polynucleotide sequences can be compared using FASTA, Gap or Bestfit, which are programs in Wisconsin Package Version 10.0, Genetics Computer Group (GCG), Madison, Wis. FASTA provides alignments and percent sequence identity of the regions of the best overlap between the query and search sequences. Pearson, Methods Enzymol. 183:63-98 (1990).

The term “polynucleotide,” “nucleic acid molecule,” “nucleic acid,” or “nucleic acid sequence” refers to a polymeric form of nucleotides of at least 10 bases in length. The term includes DNA molecules (e.g., cDNA or genomic or synthetic DNA) and RNA molecules (e.g., mRNA or synthetic RNA), as well as analogs of DNA or RNA containing non-natural nucleotide analogs, non-native internucleoside bonds, or both. The nucleic acid can be in any topological conformation. For instance, the nucleic acid can be single-stranded, double-stranded, triple-stranded, quadruplexed, partially double-stranded, branched, hairpinned, circular, or in a padlocked conformation. A “synthetic” RNA, DNA or a mixed polymer is one created outside of a cell, for example one synthesized chemically. The term “nucleic acid fragment” as used herein refers to a nucleic acid sequence that has a deletion, e.g., a 5′-terminal or 3′-terminal deletion of one or more nucleotides compared to a full-length reference nucleotide sequence. In an embodiment, the nucleic acid fragment is a contiguous sequence in which the nucleotide sequence of the fragment is identical to the corresponding positions in the naturally-occurring sequence. In some embodiments, fragments are at least 10, 15, 20, or 25 nucleotides long, or at least 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800 or greater than 1800 nucleotides long. In some embodiments a fragment of a nucleic acid sequence is a fragment of an open reading frame sequence. In some embodiments such a fragment encodes a polypeptide fragment (as defined herein) of the polypeptide encoded by the open reading frame nucleotide sequence.

The terms “polypeptide” and “protein” can be interchanged, and these terms encompass both naturally-occurring and non-naturally occurring polypeptides, and, as provided herein or as generally known in the art, fragments, mutants, derivatives and analogs thereof. A polypeptide can be monomeric, meaning it has a single chain, or polymeric, meaning it is composed of two or more chains, which can be covalently or non-covalently associated. Further, a polypeptide may comprise a number of different domains each of which has one or more distinct activities. For the avoidance of doubt, a polypeptide can be any length greater than or equal to two amino acids. The term “isolated polypeptide” is a polypeptide that by virtue of its origin or source of derivation (1) is not associated with naturally associated components that accompany it in any of its native states, (2) exists in a purity not found in nature, where purity can be adjudged with respect to the presence of other cellular material (e.g., is free of other polypeptides from the same species or from the host species in which the polypeptide was produced), (3) is expressed by a cell from a different species, (4) is recombinantly expressed by a cell (e.g., a polypeptide is an “isolated polypeptide” if it is produced from a recombinant nucleic acid present in a host cell and separated from the producing host cell), (5) does not occur in nature (e.g., it is a domain or other fragment of a polypeptide found in nature or it includes amino acid analogs or derivatives not found in nature or linkages other than standard peptide bonds), or (6) is otherwise produced, prepared, and/or manufactured by the hand of man. Thus, an “isolated polypeptide” includes a polypeptide that is produced in a host cell from a recombinant nucleic acid (such as a vector), regardless of whether the host cell naturally produces a polypeptide having an identical amino acid sequence.
A “polypeptide” includes a polypeptide that is produced by a host cell via overexpression, e.g., homologous overexpression of the polypeptide from the host cell such as by altering the promoter of the polypeptide to increase its expression to a level above its normal expression level in the host cell in the absence of the altered promoter. A polypeptide that is chemically synthesized or synthesized in a cellular system different from a cell from which it naturally originates will be “isolated” from its naturally associated components. A polypeptide may also be rendered substantially free of naturally associated components by isolation, using protein purification techniques well known in the art. As thus defined, “isolated” does not necessarily require that the protein, polypeptide, peptide or oligopeptide so described has been physically removed from a cell in which it was synthesized.

The term “polypeptide fragment” or “protein fragment” as used herein refers to a polypeptide or domain thereof that has fewer amino acids compared to a reference polypeptide, e.g., a full-length polypeptide or a polypeptide domain of a naturally occurring protein. A “naturally occurring protein” or “naturally occurring polypeptide” includes a polypeptide having an amino acid sequence produced by a non-recombinant cell or organism. In an embodiment, the polypeptide fragment is a contiguous sequence in which the amino acid sequence of the fragment is identical to the corresponding positions in the naturally-occurring sequence. Fragments typically are at least 5, 6, 7, 8, 9 or 10 amino acids long, or at least 12, 14, 16 or 18 amino acids long, or at least 20 amino acids long, or at least 25, 30, 35, 40 or 45 amino acids long, or at least 50, 60, 70, 80, 90 or 100 amino acids long, or at least 110, 120, 130, 140, 150, 160, 170, 180, 190 or 200 amino acids long, or 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 525, 550, 575, 600 or greater than 600 amino acids long. A fragment can be a portion of a larger polypeptide sequence that is digested inside or outside the cell. Thus, a polypeptide that is 50 amino acids in length can be produced intracellularly, but proteolyzed inside or outside the cell to produce a polypeptide less than 50 amino acids in length. This is of particular significance for polypeptides shorter than about 25 amino acids, which can be more difficult than larger polypeptides to produce recombinantly or to purify once produced recombinantly. The term “peptide” as used herein refers to a short polypeptide or oligopeptide, e.g., one that typically contains less than about 50 amino acids and more typically less than about 30 amino acids, or more typically less than about 15 amino acids, such as less than about 10, 9, 8, 7, 6, 5, 4, or 3 amino acids.
The term as used herein encompasses analogs and mimetics that mimic structural and thus biological function.

As used herein, “polypeptide mutant” or “mutein” refers to a polypeptide whose sequence contains an insertion, duplication, deletion, rearrangement or substitution of one or more amino acids compared to the amino acid sequence of a reference protein or polypeptide, such as a native or wild-type protein. A mutein may have one or more amino acid point substitutions, in which a single amino acid at a position has been changed to another amino acid, one or more insertions and/or deletions, in which one or more amino acids are inserted or deleted, respectively, in the sequence of the reference protein, and/or truncations of the amino acid sequence at either or both the amino or carboxy termini. A mutein may have the same or a different biological activity compared to the reference protein. In some embodiments, a mutein has, for example, at least 85% overall sequence homology to its counterpart reference protein. In some embodiments, a mutein has at least 90% overall sequence homology to the wild-type protein. In other embodiments, a mutein exhibits at least 95% sequence identity, or 98%, or 99%, or 99.5% or 99.9% overall sequence identity.

As used herein, a “polypeptide tag for affinity purification” is any polypeptide that has a binding partner that can be used to isolate or purify a second protein or polypeptide sequence of interest fused to the first “tag” polypeptide. Several examples are well known in the art and include a His-6 tag (SEQ ID NO: 44484), a FLAG epitope, a c-myc epitope, a Strep-TAGII, a biotin tag, a glutathione S-transferase (GST), a chitin binding protein (CBP), a maltose binding protein (MBP), or a metal affinity tag.

As used herein, “protein-energy malnutrition” refers to a form of malnutrition where there is inadequate protein intake. Types include Kwashiorkor (protein malnutrition predominant), Marasmus (deficiency in both calorie and protein nutrition), and Marasmic Kwashiorkor (marked protein deficiency and marked calorie insufficiency signs present, sometimes referred to as the most severe form of malnutrition). “Malnourishment” and “malnutrition” are used equivalently herein.

The terms “purify,” “purifying” and “purified” refer to a substance (or entity, composition, product or material) that has been separated from at least some of the components with which it was associated either when initially produced (whether in nature or in an experimental setting), or during any time after its initial production. A substance such as a nutritional polypeptide will be considered purified if it is isolated at production, or at any level or stage up to and including a final product, but a final product may contain other materials up to about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, or above about 90% and still be considered “isolated.” Purified substances or entities can be separated from at least about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, or more of the other components with which they were initially associated. In some embodiments, purified substances are more than about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than about 99% pure. In the instance of polypeptides and other polypeptides provided herein, such a polypeptide can be purified from one or more other polypeptides capable of being secreted from the unicellular organism that secretes the polypeptide. As used herein, a polypeptide substance is “pure” if it is substantially free of other components or other polypeptide components.

As used herein, “recombinant” refers to a biomolecule, e.g., a gene or polypeptide, that (1) has been removed from its naturally occurring environment, (2) is not associated with all or a portion of a polynucleotide in which the gene is found in nature, (3) is operatively linked to a polynucleotide which it is not linked to in nature, or (4) does not occur in nature. Also, “recombinant” refers to a cell or an organism, such as a unicellular organism, herein termed a “recombinant unicellular organism,” a “recombinant host” or a “recombinant cell” that contains, produces and/or secretes a biomolecule, which can be a recombinant biomolecule or a non-recombinant biomolecule. For example, a recombinant unicellular organism may contain a recombinant nucleic acid providing for enhanced production and/or secretion of a recombinant polypeptide or a non-recombinant polypeptide. A recombinant cell or organism is also intended to refer to a cell into which a recombinant nucleic acid such as a recombinant vector has been introduced. A “recombinant unicellular organism” includes a recombinant microorganism host cell and refers not only to the particular subject cell but to the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the terms herein. The term “recombinant” can be used in reference to cloned DNA isolates, chemically-synthesized polynucleotide analogs, or polynucleotide analogs that are biologically synthesized by heterologous systems, as well as polypeptides and/or mRNAs encoded by such nucleic acids. Thus, for example, a polypeptide synthesized by a microorganism is recombinant, for example, if it is produced from an mRNA transcribed from a recombinant gene or other nucleic acid sequence present in the cell.

As used herein, an endogenous nucleic acid sequence in the genome of an organism (or the encoded polypeptide product of that sequence) is deemed “recombinant” herein if a heterologous sequence is placed adjacent to the endogenous nucleic acid sequence, such that the expression of this endogenous nucleic acid sequence is altered. In this context, a heterologous sequence is a sequence that is not naturally adjacent to the endogenous nucleic acid sequence, whether or not the heterologous sequence is itself endogenous (originating from the same host cell or progeny thereof) or exogenous (originating from a different host cell or progeny thereof). By way of example, a promoter sequence can be substituted (e.g., by homologous recombination) for the native promoter of a gene in the genome of a host cell, such that this gene has an altered expression pattern. This gene would now become “recombinant” because it is separated from at least some of the sequences that naturally flank it. A nucleic acid is also considered “recombinant” if it contains any modifications that do not naturally occur to the corresponding nucleic acid in a genome. For instance, an endogenous coding sequence is considered “recombinant” if it contains an insertion, deletion or a point mutation introduced artificially, e.g., by human intervention. A “recombinant nucleic acid” also includes a nucleic acid integrated into a host cell chromosome at a heterologous site and a nucleic acid construct present as an episome.

The term “recombinant host cell” (or simply “recombinant cell” or “host cell”), as used herein, is intended to refer to a cell into which a recombinant nucleic acid such as a recombinant vector has been introduced. In some instances the word “cell” is replaced by a name specifying a type of cell. For example, a “recombinant microorganism” is a recombinant host cell that is a microorganism host cell and a “recombinant cyanobacteria” is a recombinant host cell that is a cyanobacteria host cell. It should be understood that such terms are intended to refer not only to the particular subject cell but to the progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term “recombinant host cell,” “recombinant cell,” and “host cell”, as used herein. A recombinant host cell can be an isolated cell or cell line grown in culture or can be a cell which resides in a living tissue or organism.

As used herein, “sarcopenia” refers to the degenerative loss of skeletal muscle mass (typically 0.5-1% loss per year after the age of 25), quality, and strength associated with aging. Sarcopenia is a component of the frailty syndrome. The European Working Group on Sarcopenia in Older People (EWGSOP) has developed a practical clinical definition and consensus diagnostic criteria for age-related sarcopenia. For the diagnosis of sarcopenia, the working group has proposed using the presence of both low muscle mass and low muscle function (strength or performance). Sarcopenia is characterized first by muscle atrophy (a decrease in the size of the muscle), along with a reduction in muscle tissue “quality,” caused by such factors as replacement of muscle fibres with fat, an increase in fibrosis, changes in muscle metabolism, oxidative stress, and degeneration of the neuromuscular junction. Combined, these changes lead to progressive loss of muscle function and eventually to frailty. Frailty is a common geriatric syndrome that embodies an elevated risk of catastrophic declines in health and function among older adults. Contributors to frailty can include sarcopenia, osteoporosis, and muscle weakness. Muscle weakness (also known as muscle fatigue or “lack of strength”) refers to the inability to exert force with one's skeletal muscles. Weakness often follows muscle atrophy and a decrease in activity, such as after a long bout of bedrest as a result of an illness. There is also a gradual onset of muscle weakness as a result of sarcopenia. Thus, sarcopenia is an exemplary condition associated with muscle wasting.

As used herein, “satiation” is the act of becoming full while eating or a reduced desire to eat. This halts or diminishes eating.

As used herein, “satiety” is the act of remaining full after a meal, which manifests as the period of no eating following the meal.

As used herein, “secrete,” “secretion” and “secreted” all refer to the act or process by which a polypeptide is relocated from the cytoplasm of a cell of a multicellular organism or unicellular organism into the extracellular milieu thereof. As provided herein, such secretion may occur actively or passively. Further, the terms “excrete,” “excretion” and “excreted” generally connote passive clearing of a material from a cell or unicellular organism; however, as appropriate such terms can be associated with the production and transfer of materials outwards from the cell or unicellular organism.

In general, “stringent hybridization” is performed at about 25° C. below the thermal melting point (Tm) for the specific DNA hybrid under a particular set of conditions. “Stringent washing” is performed at temperatures about 5° C. lower than the Tm for the specific DNA hybrid under a particular set of conditions. The Tm is the temperature at which 50% of the target sequence hybridizes to a perfectly matched probe. See Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989), page 9.51, hereby incorporated by reference. For purposes herein, “stringent conditions” are defined for solution phase hybridization as aqueous hybridization (i.e., free of formamide) in 6×SSC (where 20×SSC contains 3.0 M NaCl and 0.3 M sodium citrate), 1% SDS at 65° C. for 8-12 hours, followed by two washes in 0.2×SSC, 0.1% SDS at 65° C. for 20 minutes. It will be appreciated by the skilled worker that hybridization at 65° C. will occur at different rates depending on a number of factors including the length and percent identity of the sequences which are hybridizing.
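By way of non-limiting illustration, the buffer arithmetic and temperature offsets described above can be sketched in Python. The function names and the example Tm value are hypothetical and are introduced only for this sketch; the 20×SSC composition and the 25° C. and 5° C. offsets are taken from the text.

```python
# 20x SSC stock composition as stated in the text (molar concentrations).
SSC_20X = {"NaCl": 3.0, "sodium citrate": 0.3}


def ssc_molarity(fold):
    """Return the molar concentration of each SSC component at the given fold strength."""
    return {name: conc * fold / 20.0 for name, conc in SSC_20X.items()}


def stringent_temperatures(tm_celsius):
    """Return (hybridization, wash) temperatures in degrees C.

    Follows the general rule in the text: hybridization at about 25 C below Tm,
    washing at about 5 C below Tm.
    """
    return tm_celsius - 25.0, tm_celsius - 5.0


if __name__ == "__main__":
    print("6x SSC (hybridization):", ssc_molarity(6))
    print("0.2x SSC (wash):", ssc_molarity(0.2))
    # Hypothetical Tm, chosen only to show the offsets.
    print("Temperatures for Tm = 90 C:", stringent_temperatures(90.0))
```

Under these assumptions, 6×SSC corresponds to roughly 0.9 M NaCl and 0.09 M sodium citrate, and 0.2×SSC to roughly 0.03 M NaCl and 0.003 M sodium citrate.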

The term “substantial homology” or “substantial similarity,” when referring to a nucleic acid or fragment thereof, indicates that, when optimally aligned with appropriate nucleotide insertions or deletions with another nucleic acid (or its complementary strand), there is nucleotide sequence identity in at least about 76%, 80%, 85%, or at least about 90%, or at least about 95%, 96%, 97%, 98% or 99% of the nucleotide bases, as measured by any well-known algorithm of sequence identity, such as FASTA, BLAST or Gap, as discussed above.
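As a simplified illustration of the identity calculation, the following Python sketch computes percent identity over two sequences that have already been aligned (for example, by one of the algorithms named above). It does not perform the alignment itself, and the choice of denominator (all alignment columns that are not gaps in both sequences) is an assumption of this sketch; FASTA, BLAST and Gap may report identity on a different basis.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must be pre-aligned strings of equal length, with `gap` marking
    inserted or deleted positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(percent_identity("ACGT-CGT", "ACGTACGT"))
```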

The term “sufficient amount” means an amount sufficient to produce a desired effect, e.g., an amount sufficient to modulate protein aggregation in a cell.

A “synthetic” RNA, DNA or a mixed polymer is one created outside of a cell, for example one synthesized chemically.

The term “therapeutically effective amount” is an amount that is effective to ameliorate a symptom of a disease. A therapeutically effective amount can be a “prophylactically effective amount” as prophylaxis can be considered therapy.

As used herein, “thermogenesis” is the process of heat production in a mammal. Thermogenesis is accompanied by an increase in energy expenditure. Thermogenesis is specifically the energy burned following the metabolism of a food component (such as protein). This may also be referred to as the thermic effect of food. Total energy expenditure by an individual equals the sum of resting energy expenditure (energy consumed at rest in a fasting state to support basal metabolism), the thermic effect of food, and energy expenditure related to physical activity. Resting energy expenditure accounts for about 65-75% of total energy expenditure in humans. The amount and activity of muscle mass is one influencer of resting energy expenditure. Adequate protein consumption to support muscle also influences resting energy expenditure. The ingestion of protein tends to increase energy expenditure following a meal; this is the thermic effect of food. The thermic effect of food accounts for about 10% of total energy expenditure in humans. While this is a small proportion of total energy expenditure, small increases in this value can impact body weight. Protein has a higher thermic effect than fat or carbohydrate; this effect along with other metabolic influences of protein makes it a useful substrate for weight control, diabetes management and other conditions.
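The energy balance described above (total energy expenditure equals resting energy expenditure plus the thermic effect of food plus activity-related expenditure) can be illustrated with a short Python sketch. The 2000 kcal total and the 70% resting fraction are hypothetical example values chosen within the ranges stated in the text; the function name is likewise an assumption of this sketch.

```python
def partition_energy(total_kcal, ree_fraction=0.70, tef_fraction=0.10):
    """Split total energy expenditure into its three components.

    ree_fraction: resting energy expenditure share (text: about 65-75%).
    tef_fraction: thermic effect of food share (text: about 10%).
    Activity-related expenditure is taken as the remainder.
    """
    if not 0.0 <= ree_fraction + tef_fraction <= 1.0:
        raise ValueError("fractions must sum to a value between 0 and 1")
    resting = total_kcal * ree_fraction
    thermic = total_kcal * tef_fraction
    activity = total_kcal - resting - thermic
    return {"resting": resting, "thermic_effect_of_food": thermic, "activity": activity}


if __name__ == "__main__":
    # Hypothetical daily total of 2000 kcal.
    for component, kcal in partition_energy(2000.0).items():
        print(f"{component}: {kcal:.0f} kcal")
```

With these example inputs the components come to about 1400, 200 and 400 kcal, respectively.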

As used herein, a “vector” is intended to refer to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. One type of vector is a “plasmid,” which generally refers to a circular double stranded DNA loop into which additional DNA segments can be ligated, but also includes linear double-stranded molecules such as those resulting from amplification by the polymerase chain reaction (PCR) or from treatment of a circular plasmid with a restriction enzyme. Other vectors include cosmids, bacterial artificial chromosomes (BAC) and yeast artificial chromosomes (YAC). Another type of vector is a viral vector, wherein additional DNA segments can be ligated into the viral genome (discussed in more detail below). Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., vectors having an origin of replication which functions in the host cell). Other vectors can be integrated into the genome of a host cell upon introduction into the host cell, and are thereby replicated along with the host genome. Moreover, certain vectors are capable of directing the expression of genes to which they are operatively linked. Such vectors are referred to herein as “recombinant expression vectors” (or simply “expression vectors”).

Nutritive Polypeptides and Amino Acid Sequences

Proteins present in dietary food sources can vary greatly in their nutritive value. Provided are nutritive polypeptides that have enhanced nutritive value and physiological and pharmacological effects due to their amino acid content and digestibility. Provided are nutritive polypeptides that have enhanced levels of essential amino acids; the inadequate availability of such essential amino acids in a person negatively impacts general health and physiology through the perturbation of a network of cellular functions, and is associated with a wide array of health issues and diseases. Also provided are nutritive polypeptides that have reduced levels of certain amino acids; the presence or overabundance of such amino acids in the diet of an affected subject results in increased morbidity and mortality.

Traditionally, nutritionists and health researchers have utilized specific source ingredients (e.g., whey protein, egg whites, soya) or fractionates and isolates (e.g., soy protein isolates) to modulate the relative concentration of total protein in the diet, without the ability to modulate the specific amino acid constituents.

Herein provided are nutritive polypeptides capable of transforming health and treating, preventing and reducing the severity of a multitude of diseases, disorders and conditions associated with amino acid pathophysiology, as they are selected for specific physiologic benefits to improve health and address many nutrition-related conditions, including gastrointestinal malabsorption, muscle wasting, diabetes or pre-diabetes, obesity, oncology, metabolic diseases, and other cellular and systemic diseases. Also provided are the compositions and formulations that contain the nutritive polypeptides, as food, beverages, medical foods, supplements, and pharmaceuticals.

Herein are provided important elucidations in the genomics, proteomics, protein characterization and production of nutritive polypeptides. The present invention utilizes the synergistic advancements, described herein, of (a) the genomics of edible species, i.e., those human food source organisms, and human genomics, (b) substantial advances in protein identification and quantification in food protein and food nucleic acid libraries, (c) new correlations between protein physical chemistry, solubility, structure-digestibility relationships and amino acid absorption and metabolism in animals and humans, (d) physiology and pathophysiology information of how amino acids, the components of nutritive polypeptides, affect protein malnutrition, chronic disease, responses to acute injury, and aging, (e) recombinant nutritive polypeptide production utilizing a phylogenetically broad spectrum of host organisms, and (f) qualification of allergenicity and toxicogenicity and in vitro and in vivo tests to assess human safety of orally consumed nutritive polypeptides.

Identification and Selection of Amino Acid Sequences Encoding Nutritive Polypeptides.

In its broadest sense, a nutritive polypeptide encompasses a polypeptide capable of delivering amino acid and peptide nutrition to its intended consumer, who derives a benefit from such consumption. Each nutritive polypeptide contains one or more amino acid sequences, and the present invention provides methods by which an amino acid sequence is identified and utilized in production, formulation and administration of the nutritive polypeptide having such an amino acid sequence.

In some embodiments, the source of a nutritive polypeptide amino acid sequence encompasses any protein-containing material, e.g., a food, beverage, composition or other product, known to be eaten, or otherwise considered suitable for consumption, without deleterious effect by, e.g., a human or other organism, in particular a mammal.

Nutritive Polypeptide Amino Acid Sequences Derived from Edible Species.

In some embodiments a nutritive polypeptide comprises or consists of a protein or fragment of a protein that naturally occurs in an edible product, such as a food, or in the organism that generates biological material used in or as the food. In some embodiments an “edible species” is a species known to produce a protein that can be eaten by humans without deleterious effect. A protein or polypeptide present in an edible species, or encoded by a nucleic acid present in the edible species, is termed an “edible species protein” or “edible species polypeptide” or, if the edible species is a species consumed by a human, the term “naturally occurring human food protein” is used interchangeably herein. Some edible products are an infrequent but known component of the diet of only a small group of a type of mammal in a limited geographic location while others are a dietary staple throughout much of the world. In other embodiments an edible product is one not known to be previously eaten by any mammal, but that is demonstrated to be edible upon testing or analysis of the product or one or more proteins contained in the product.

Food organisms include but are not limited to those organisms of edible species disclosed in PCT/US2013/032232, filed Mar. 15, 2013, PCT/US2013/032180, filed Mar. 15, 2013, PCT/US2013/032225, filed Mar. 15, 2013, PCT/US2013/032218, filed Mar. 15, 2013, PCT/US2013/032212, filed Mar. 15, 2013, PCT/US2013/032206, filed Mar. 15, 2013, and PCT/US2013/038682, filed Apr. 29, 2013 and any phylogenetically related organisms.

In some embodiments a nutritive polypeptide amino acid sequence is identified in a protein that is present in a food source, such as an abundant protein in food, or is a derivative or mutein thereof, or is a fragment of an amino acid sequence of a protein in food or a derivative or mutein thereof. An abundant protein is a protein that is present in a higher concentration in a food relative to other proteins present in the food. Alternatively, a nutritive polypeptide amino acid sequence is identified from an edible species that produces a protein containing the amino acid sequence in relatively lower abundance, but the protein is detectable in a food product derived from the edible species, or from biological material produced by the edible species. In some embodiments a nucleic acid that encodes the protein is detectable in a food product derived from the edible species, or the nucleic acid is detectable from a biological material produced by the edible species. An edible species can produce a food that is a known component of the diet of only a small group of a type of mammal in a limited geographic location, or a dietary staple throughout much of the world.

Exemplary edible species include animals such as goats, cows, chickens, pigs and fish. In some embodiments the abundant protein in food is selected from chicken egg proteins such as ovalbumin, ovotransferrin, and ovomucoid; meat proteins such as myosin, actin, tropomyosin, collagen, and troponin; milk, soy, and cereal proteins such as casein, alpha1 casein, alpha2 casein, beta casein, kappa casein, beta-lactoglobulin, alpha-lactalbumin, glycinin, beta-conglycinin, glutelin, prolamine, gliadin, glutenin, albumin, globulin; chicken muscle proteins such as albumin, enolase, creatine kinase, phosphoglycerate mutase, triosephosphate isomerase, apolipoprotein, ovotransferrin, phosphoglucomutase, phosphoglycerate kinase, glycerol-3-phosphate dehydrogenase, glyceraldehyde 3-phosphate dehydrogenase, hemoglobin, cofilin, glycogen phosphorylase, fructose-1,6-bisphosphatase, actin, myosin, tropomyosin a-chain, casein kinase, glycogen phosphorylase, fructose-1,6-bisphosphatase, aldolase, tubulin, vimentin, endoplasmin, lactate dehydrogenase, destrin, transthyretin, fructose bisphosphate aldolase, carbonic anhydrase, aldehyde dehydrogenase, annexin, adenosyl homocysteinase; pork muscle proteins such as actin, myosin, enolase, titin, cofilin, phosphoglycerate kinase, enolase, pyruvate dehydrogenase, glycogen phosphorylase, triosephosphate isomerase, myokinase; and fish proteins such as parvalbumin, pyruvate dehydrogenase, desmin, and triosephosphate isomerase.

Nutritive polypeptides may contain amino acid sequences present in edible species polypeptides. In one embodiment, a biological material from an edible species is analyzed to determine the protein content in the biological material. An exemplary method of analysis is to use mass spectrometry analysis of the biological material, as provided in the Examples below. Another exemplary method of analysis is to generate a cDNA library of the biological material to create a library of edible species cDNAs, and then express the cDNA library in an appropriate recombinant expression host, as provided in the Examples below. Another exemplary method of analysis is to query a nucleic acid and/or protein sequence database, as provided in the Examples below.

Determination of Amino Acid Ratios and Amino Acid Density in a Nutritive Polypeptide.

In some instances herein the portion of amino acid(s) of a particular type within a polypeptide, protein or a composition is quantified based on the weight ratio of the type of amino acid(s) to the total weight of amino acids present in the polypeptide, protein or composition in question. This value is calculated by dividing the weight of the particular amino acid(s) in the polypeptide, protein or a composition by the weight of all amino acids present in the polypeptide, protein or a composition.

In other instances the ratio of a particular type of amino acid(s) residues present in a polypeptide or protein to the total number of amino acids present in the polypeptide or protein in question is used. This value is calculated by dividing the number of the amino acid(s) in question that is present in each molecule of the polypeptide or protein by the total number of amino acid residues present in each molecule of the polypeptide or protein. A skilled artisan appreciates that these two methods are interchangeable and that the weight proportion of a type of amino acid(s) present in a polypeptide or protein can be converted to a ratio of the particular type of amino acid residue(s), and vice versa.
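The two measures described above, the weight proportion and the residue-count ratio, can be compared with a minimal Python sketch. The sketch assumes that weights are computed from commonly tabulated average residue masses (free amino acid minus one water); using free amino acid masses instead would shift the weight proportion slightly. The function names and the example sequence are hypothetical.

```python
# Average residue masses in daltons (free amino acid minus one water),
# as commonly tabulated; used here only for illustration.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}


def residue_ratio(sequence, residues):
    """Number of residues of the given type(s) divided by total residues."""
    return sum(1 for aa in sequence if aa in residues) / len(sequence)


def weight_ratio(sequence, residues):
    """Weight of residues of the given type(s) divided by total residue weight."""
    total = sum(RESIDUE_MASS[aa] for aa in sequence)
    selected = sum(RESIDUE_MASS[aa] for aa in sequence if aa in residues)
    return selected / total


if __name__ == "__main__":
    seq = "LLGG"  # hypothetical four-residue sequence
    print("leucine by count: ", residue_ratio(seq, {"L"}))
    print("leucine by weight:", round(weight_ratio(seq, {"L"}), 4))
```

Because leucine is heavier than glycine, the weight proportion of leucine in this toy sequence is higher than its residue-count ratio, which shows why the two measures are convertible but not numerically identical.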

In some aspects the nutritive polypeptide is selected to have a desired density of one or more essential amino acids (EAA). Essential amino acid deficiency can be treated or prevented with the effective administration of the one or more essential amino acids otherwise absent or present in insufficient amounts in a subject's diet. For example, EAA density is about equal to or greater than the density of essential amino acids present in a full-length reference nutritional polypeptide, such as bovine lactoglobulin, bovine beta-casein or bovine type I collagen, e.g., EAA density in a nutritive polypeptide is at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 100%, 200%, 300%, 400%, 500% or above 500% greater than a reference nutritional polypeptide or the polypeptide present in an agriculturally-derived food product.

In some aspects the nutritive polypeptide is selected to have a desired density of aromatic amino acids (“AAA”, including phenylalanine, tryptophan, tyrosine, histidine, and thyroxine). AAAs are useful, e.g., in neurological development and prevention of exercise-induced fatigue. For example, AAA density is about equal to or greater than the density of aromatic amino acids present in a full-length reference nutritional polypeptide, such as bovine lactoglobulin, bovine beta-casein or bovine type I collagen, e.g., AAA density in a nutritive polypeptide is at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 100%, 200%, 300%, 400%, 500% or above 500% greater than a reference nutritional polypeptide or the polypeptide present in an agriculturally-derived food product.

In some aspects the nutritive polypeptide is selected to have a desired density of branched chain amino acids (BCAA). For example, BCAA density, either individual BCAAs or total BCAA content is about equal to or greater than the density of branched chain amino acids present in a full-length reference nutritional polypeptide, such as bovine lactoglobulin, bovine beta-casein or bovine type I collagen, e.g., BCAA density in a nutritive polypeptide is at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 100%, 200%, 300%, 400%, 500% or above 500% greater than a reference nutritional polypeptide or the polypeptide present in an agriculturally-derived food product. BCAA density in a nutritive polypeptide can also be selected for in combination with one or more attributes such as EAA density.

In some aspects the nutritive polypeptide is selected to have a desired density of the amino acids arginine, glutamine and/or leucine (RQL amino acids). For example, RQL amino acid density is about equal to or greater than the density of RQL amino acids present in a full-length reference nutritional polypeptide, such as bovine lactoglobulin, bovine beta-casein or bovine type I collagen, e.g., RQL amino acid density in a nutritive polypeptide is at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 100%, 200%, 300%, 400%, 500% or above 500% greater than a reference nutritional polypeptide or the polypeptide present in an agriculturally-derived food product.
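The “at least about X% greater than a reference” comparisons recited above for EAA, AAA, BCAA and RQL amino acid density can be expressed as a simple relative-difference calculation. The following Python sketch is illustrative only; the function names and the example densities are hypothetical and are not taken from any measured polypeptide.

```python
def percent_greater(candidate_density, reference_density):
    """Percent by which the candidate density exceeds the reference density."""
    if reference_density <= 0:
        raise ValueError("reference density must be positive")
    return (candidate_density - reference_density) / reference_density * 100.0


def meets_relative_threshold(candidate_density, reference_density, min_percent_greater):
    """True if the candidate exceeds the reference by at least the given percentage."""
    return percent_greater(candidate_density, reference_density) >= min_percent_greater


if __name__ == "__main__":
    # Hypothetical densities: candidate 0.30, reference 0.24.
    print(round(percent_greater(0.30, 0.24), 1))
    print(meets_relative_threshold(0.30, 0.24, 20))
    print(meets_relative_threshold(0.30, 0.24, 50))
```

The same calculation applies whether density is expressed by weight or by residue count, provided the candidate and the reference use the same basis.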

In some aspects the nutritive polypeptide is selected to have a desired density or distribution of post-translational modifications (PTMs). For example, PTMs include addition, removal or redistribution of biotinylation, pegylation, acylation, alkylation, butyrylation, glycosylation, hydroxylation, iodination, oxidation, propionylation, malonylation, myristoylation, palmitoylation, isoprenylation, succinylation, selenoylation, SUMOylation, ubiquitination, and glypiation, and removal or redistribution of disulfide bridges.

In certain embodiments herein the weight proportion of branched chain amino acids, leucine, and/or essential amino acids in whey, egg, or soy is used as a benchmark to measure the amino acid composition of a polypeptide, a protein, or a composition comprising at least one of a polypeptide and a protein. In those embodiments it is understood that the two measures are not completely equivalent, but it is also understood that the measures result in measurements that are similar enough to use for this purpose. For example, when a protein of interest is characterized as comprising a ratio of branched chain amino acid residues to total amino acid residues that is equal to or greater than 24% (the weight proportion of branched chain amino acid residues present in whey), that is a precise description of the branched chain amino acid content of the protein. At the same time, the weight proportion of branched chain amino acid residues present in that protein is not necessarily exactly equal to 24%. Even so, the skilled artisan understands that this is a useful comparison. If provided with the total number of amino acid residues present in the protein of interest the skilled artisan can also determine the weight proportion of branched chain amino acid residues in the protein of interest.

In some embodiments a protein according to this disclosure comprises a first polypeptide sequence comprising a fragment of an edible species polypeptide. In some embodiments of the nutritive protein, the protein consists of the first polypeptide sequence. In some embodiments of the nutritive protein, the protein consists of the fragment of an edible species polypeptide.

In some embodiments a protein according to this disclosure comprises a first polypeptide sequence that comprises a ratio of branched chain amino acid residues to total amino acid residues that is equal to or greater than the ratio of branched chain amino acid residues to total amino acid residues present in at least one of whey protein, egg protein, and soy protein. Thus, in such embodiments the protein comprises a first polypeptide sequence that comprises a ratio of branched chain amino acid residues to total amino acid residues that is equal to or greater than a ratio selected from 24%, 20%, and 18%. In other embodiments, the protein comprises a first polypeptide sequence that comprises a ratio of branched chain amino acid residues to total amino acid residues that is equal to or greater than a percentage ratio selected from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 75, 80, 85, 90, 95, or 100%.

In some embodiments a protein according to this disclosure comprises a first polypeptide sequence that comprises a ratio of L (leucine) residues to total amino acid residues that is equal to or greater than the ratio of L residues to total amino acid residues present in at least one of whey protein, egg protein, and soy protein. In other embodiments, the protein comprises a first polypeptide sequence that comprises a ratio of leucine residues to total amino acid residues that is equal to or greater than a percentage ratio selected from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or greater than 30%.

In some embodiments a protein according to this disclosure comprises a first polypeptide sequence that comprises a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than the ratio of essential amino acid residues to total amino acid residues present in at least one of whey protein, egg protein, and soy protein. In other embodiments, the protein comprises a first polypeptide sequence that comprises a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than a percentage ratio selected from 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 75, 80, 85, 90, 95, or 100%.

In some embodiments the protein comprises a first polypeptide sequence that comprises a ratio of branched chain amino acid residues to total amino acid residues that is equal to or greater than the ratio of branched chain amino acid residues to total amino acid residues present in at least one of whey protein, egg protein, and soy protein; and/or comprises a first polypeptide sequence that comprises a ratio of L (leucine) residues to total amino acid residues that is equal to or greater than the ratio of L residues to total amino acid residues present in at least one of whey protein, egg protein, and soy protein, and/or comprises a first polypeptide sequence that comprises a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than the ratio of essential amino acid residues to total amino acid residues present in at least one of whey protein, egg protein, and soy protein.

In some embodiments the protein comprises a first polypeptide sequence that comprises a ratio of branched chain amino acid residues to total amino acid residues that is equal to or greater than the ratio of branched chain amino acid residues to total amino acid residues present in at least one of whey protein, egg protein, and soy protein; and comprises a first polypeptide sequence that comprises a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than the ratio of essential amino acid residues to total amino acid residues present in at least one of whey protein, egg protein, and soy protein. In some embodiments the protein comprises a first polypeptide sequence that comprises a ratio of branched chain amino acid residues to total amino acid residues equal to or greater than 24% and a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than 49%. In some embodiments the protein comprises a first polypeptide sequence that comprises a ratio of branched chain amino acid residues to total amino acid residues equal to or greater than 20% and a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than 51%. In some embodiments the protein comprises a first polypeptide sequence that comprises a ratio of branched chain amino acid residues to total amino acid residues equal to or greater than 18% and a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than 40%.

In some embodiments the protein comprises a first polypeptide sequence that comprises a ratio of L (leucine) residues to total amino acid residues that is equal to or greater than the ratio of L residues to total amino acid residues present in at least one of whey protein, egg protein, and soy protein; and comprises a first polypeptide sequence that comprises a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than the ratio of essential amino acid residues to total amino acid residues present in at least one of whey protein, egg protein, and soy protein. In some embodiments the protein comprises a first polypeptide sequence that comprises a ratio of L (leucine) residues to total amino acid residues equal to or greater than 11% and a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than 49%. In some embodiments the protein comprises a first polypeptide sequence that comprises a ratio of L (leucine) amino acid residues to total amino acid residues equal to or greater than 9% and a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than 51%. In some embodiments the protein comprises a first polypeptide sequence that comprises a ratio of L (leucine) amino acid residues to total amino acid residues equal to or greater than 8% and a ratio of essential amino acid residues to total amino acid residues that is equal to or greater than 40%. In some embodiments of the protein, the first polypeptide sequence comprises a first polypeptide sequence comprising a ratio of branched chain amino acid residues to total amino acid residues equal to or greater than 24%, a ratio of L (leucine) residues to total amino acid residues that is equal to or greater than 11%, and comprises at least one of every essential amino acid. 
In some embodiments of the protein, the first polypeptide sequence comprises a first polypeptide sequence comprising a ratio of branched chain amino acid residues to total amino acid residues equal to or greater than 24% and a ratio of essential amino acid residues to total amino acid residues equal to or greater than 49%.
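The benchmark comparisons recited in the preceding embodiments can be sketched in Python as follows. The threshold triples are taken from the percentages stated above; the assignment of 24%/11%/49% to whey is supported by the text, while the assignment of 20%/9%/51% to egg and 18%/8%/40% to soy follows the order in which the sources are listed and is an assumption of this sketch. The nine-member essential amino acid set, the function names and the example sequences are likewise illustrative. The check shown requires all three ratios at once, which corresponds to only one of the combinations of criteria described.

```python
BCAA = set("LIV")        # leucine, isoleucine, valine
EAA = set("HIKLMFTWV")   # nine essential amino acids (assumed set)

# Residue-ratio benchmarks; egg and soy assignments are assumed from list order.
BENCHMARKS = {
    "whey": {"bcaa": 0.24, "leu": 0.11, "eaa": 0.49},
    "egg":  {"bcaa": 0.20, "leu": 0.09, "eaa": 0.51},
    "soy":  {"bcaa": 0.18, "leu": 0.08, "eaa": 0.40},
}


def residue_ratio(sequence, residues):
    """Number of residues of the given type(s) divided by total residues."""
    return sum(1 for aa in sequence if aa in residues) / len(sequence)


def meets_benchmark(sequence, source):
    """True if BCAA, leucine and EAA residue ratios all meet the source's benchmark."""
    b = BENCHMARKS[source]
    return (residue_ratio(sequence, BCAA) >= b["bcaa"]
            and residue_ratio(sequence, {"L"}) >= b["leu"]
            and residue_ratio(sequence, EAA) >= b["eaa"])


if __name__ == "__main__":
    candidate = "MLKVLIHTFWLEAG"  # hypothetical sequence, not from the sequence listing
    for source in BENCHMARKS:
        print(source, meets_benchmark(candidate, source))
```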

Provided are nutritive polypeptides that are nutritionally complete. In some embodiments of the protein, the first polypeptide sequence comprises a first polypeptide sequence that contains at least one of every essential amino acid.
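The completeness criterion above (at least one of every essential amino acid) lends itself to a simple membership check. The Python sketch below assumes the commonly cited set of nine essential amino acids in one-letter code; the function names and example sequences are hypothetical.

```python
# Nine essential amino acids in one-letter code (assumed set for this sketch):
# His, Ile, Lys, Leu, Met, Phe, Thr, Trp, Val.
ESSENTIAL = set("HIKLMFTWV")


def missing_essential(sequence):
    """Return a sorted list of essential amino acids absent from the sequence."""
    return sorted(ESSENTIAL - set(sequence))


def is_nutritionally_complete(sequence):
    """True if the sequence contains at least one of every essential amino acid."""
    return not missing_essential(sequence)


if __name__ == "__main__":
    print(is_nutritionally_complete("MLKVLIHTFWLEAG"))  # hypothetical sequence
    print(missing_essential("MLKV"))                    # hypothetical fragment
```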

Nutritive Glycoproteins and Nutritive Polypeptides with Modulated Glycosylation.

The term “glycan” or “glycoyl” refers to a polysaccharide or oligosaccharide which may be linked to a polypeptide, lipid, or proteoglycan. In some embodiments, a glycan is linked covalently or non-covalently to the polypeptide. In some embodiments the linkage occurs via a glycosidic bond. In some embodiments, the linkage is directly between the glycan (or glycoyl) and polypeptide or via an intermediary molecule. In some embodiments, the glycosidic bond is N-linked or O-linked. The term “polysaccharide” or “oligosaccharide” refers to one or more monosaccharide units joined together by glycosidic bonds. In some embodiments, the polysaccharide or oligosaccharide has a linear or branched structure. In some embodiments, the monosaccharide units comprise N-acetyl galactosamine, N-acetylglucosamine, galactose, neuraminic acid, fructose, mannose, fucose, glucose, xylose, N-acetylneuraminic acid, N-glycolylneuraminic acid, O-lactyl-N-acetylneuraminic acid, O-acetyl-N-acetylneuraminic acid, or O-methyl-N-acetylneuraminic acid. In some embodiments, the monosaccharide is modified by a phosphate, sulfate, or acetate group. The term “glycosylation acceptor site” refers to an amino acid along a polypeptide which carries a glycan or glycoyl in the native composition. In some embodiments the acceptor site consists of a nucleophilic acceptor of a glycosidic bond. In some embodiments, the nucleophilic acceptor site consists of an amino group. In some embodiments the amino acid consists of an asparagine, arginine, serine, threonine, hydroxyproline, hydroxylysine, tryptophan, phosphothreonine, serine, or phosphoserine. The term “exogenous glycosylation acceptor site” refers to a glycosylation acceptor site not present in the native composition of the polypeptide. In some embodiments the amino acid for the exogenous glycosylation acceptor site did not carry a glycan or glycoyl in the native composition. 
In some embodiments, the amino acid does not occur in the primary sequence of the polypeptide in the native composition. The term “exogenous glycan” or “exogenous glycoyl” refers to a glycan or glycoyl that occupies a glycosylation acceptor site, which was not present in the native composition on the same glycosylation acceptor site. In some embodiments, the glycosylation acceptor site is an exogenous glycosylation site or a native glycosylation site. The term “glycoprotein” refers to a polypeptide that is bound to at least one glycan or glycoyl.

Disclosed herein are formulations containing isolated nutritive polypeptides having at least one exogenous glycosylation acceptor site present on an amino acid of the nutritive polypeptide. In some aspects, the at least one exogenous glycosylation acceptor site is occupied by an exogenous glycoyl or glycan, or alternatively, is unoccupied or is occupied by a non-natively occupying glycoyl or glycan. In some embodiments, the nutritive polypeptide is a polypeptide having an amino acid sequence at least 90% identical to SEQID 00001-03909 and SEQID 04129-44483, or is an edible species polypeptide sequence or fragment thereof at least 50 amino acids in length, or is a polypeptide having substantial immunogenicity when the glycosylation acceptor site is not present or is unoccupied. The nutritive polypeptide is more thermostable, is more digestible, and/or has a lower aggregation score than a reference polypeptide that has an amino acid sequence identical to the nutritive polypeptide but the glycosylation acceptor site is not present or is unoccupied in the reference polypeptide. The amino acids, e.g., asparagine, arginine, serine, threonine, hydroxyproline, and hydroxylysine, containing an exogenous glycosylation acceptor site are resistant to proteolysis. Exemplary glycans are N-acetyl galactosamine, N-acetylglucosamine, galactose, neuraminic acid, fructose, mannose, fucose, glucose, xylose, N-acetylneuraminic acid, N-glycolylneuraminic acid, O-lactyl-N-acetylneuraminic acid, O-acetyl-N-acetylneuraminic acid, and O-methyl-N-acetylneuraminic acid.

In some embodiments provided are formulations containing a nutritive polypeptide that is identical to the amino acid sequence of a polypeptide in a reference edible species glycoprotein, but the carbohydrate component of the nutritive polypeptide differs from a carbohydrate component of the reference edible species glycoprotein. The nutritive polypeptide is produced, for example, by expressing the polypeptide of the reference glycoprotein in a non-native host such as Aspergillus, Bacillus, Saccharomyces or a mammalian cell. Also provided are variant nutritive polypeptides, where the amino acid sequence differs from the amino acid sequence of a polypeptide in a reference glycoprotein by <1%, <5%, <10%, or more than 10%, and the mass of the carbohydrate component of the nutritive polypeptide is different from the mass of the carbohydrate component of the reference glycoprotein. The nutritive polypeptide variant is created by the insertion, deletion, substitution, or replacement of amino acid residues in the amino acid sequence of the polypeptide of the reference glycoprotein. Preferably, the nutritive polypeptide has distinguishable chemical, biochemical, biophysical, biological, or immunological properties from the reference glycoprotein. For example, the nutritive polypeptide is more hygroscopic, hydrophilic, or soluble in aqueous solutions than the reference glycoprotein. Alternatively, the nutritive polypeptide is less hygroscopic, hydrophilic, or soluble in aqueous solutions than the reference glycoprotein.

In another example, the nutritive polypeptide is more antigenic, immunogenic, or allergenic than the reference glycoprotein, or alternatively, the nutritive polypeptide is less antigenic, immunogenic, or allergenic than the reference glycoprotein. The nutritive polypeptide is more stable or resistant to enzymatic degradation than the reference glycoprotein or the nutritive polypeptide is more unstable or susceptible to enzymatic degradation than the reference glycoprotein. The carbohydrate component of the nutritive polypeptide is substantially free of N-glycolylneuraminic acid or has reduced N-glycolylneuraminic acid in comparison to the reference glycoprotein. Alternatively, the carbohydrate component of the nutritive polypeptide has elevated N-glycolylneuraminic acid in comparison to the reference glycoprotein.

Also provided is a nutritive polypeptide that has at least one exogenous glycosylation acceptor site present on an amino acid of the nutritive polypeptide, and the at least one exogenous glycosylation acceptor site is occupied by an exogenous glycoyl or glycan, and the nutritive polypeptide includes a polypeptide having an amino acid sequence at least 90% identical to SEQID 00001-03909 and SEQID 04129-44483, where the nutritive polypeptide is present in at least 0.5 g at a concentration of at least 10% on a mass basis, and where the formulation is substantially free of non-comestible products.

Reference Nutritional Polypeptides and Reference Nutritional Polypeptide Mixtures.

Three natural sources of protein generally regarded as good sources of high quality amino acids are whey protein, egg protein, and soy protein. Each source comprises multiple proteins. Table RNP1 presents the weight proportional representation of each amino acid in the protein source (g AA/g protein) expressed as a percentage.

TABLE RNP1
Amino Acid       Whey     Egg      Soy
Isoleucine       6.5%     5.5%     5.0%
Leucine          11.0%    8.6%     8.0%
Lysine           9.1%     7.2%     6.3%
Methionine       2.1%     3.1%     1.3%
Phenylalanine    3.4%     5.3%     1.2%
Threonine        7.0%     4.8%     3.7%
Tryptophan       1.7%     1.2%     1.3%
Valine           6.2%     6.1%     4.9%
Histidine        2.0%     2.4%     2.7%
Other            51.7%    49.5%    60.4%

Table RNP2 presents the weight proportion of each protein source that is essential amino acids, branched chain amino acids (L, I, and V), and leucine (L) (alone).

TABLE RNP2
Protein Source    Essential Amino Acids    Branched Chain Amino Acids    Leucine
Whey              49.0%                    23.7%                         11.0%
Egg               50.5%                    20.1%                         8.6%
Soy               39.6%                    17.9%                         8.0%

The sources relied on to determine the amino acid content of whey are: Belitz H D., Grosch W., and Schieberle P. Food Chemistry (4th Ed). Springer-Verlag, Berlin Heidelberg 2009; gnc.com/product/index.jsp?productId=2986027; nutrabio.com/Products/wheyprotein concentrate.htm; and nutrabio.com/Products/wheyprotein isolate.htm. The amino acid content values from those sources were averaged to give the numbers presented in Tables RNP1 and RNP2. The source for egg protein is Egg, National Nutrient Database for Standard Reference, Release 24 (ndb.nal.usda.gov/ndb/foods/list). The source for soy protein is Self Nutrition Data (nutritiondata.self.com/facts/legumes-and-legume-products/4389/2).

According to the USDA nutritional database whey can include various non-protein components: water, lipids (such as fatty acids and cholesterol), carbohydrates and sugars, minerals (such as Ca, Fe, Mg, P, K, Na, and Zn), and vitamins (such as vitamin C, thiamin, riboflavin, niacin, vitamin B-6, folate, vitamin B-12, and vitamin A). According to the USDA nutritional database egg white can include various non-protein components: water, lipids, carbohydrates, minerals (such as Ca, Fe, Mg, P, K, Na, and Zn), and vitamins (such as thiamin, riboflavin, niacin, vitamin B-6, folate, and vitamin B-12). According to the USDA nutritional database soy can include various non-protein components: water, lipids (such as fatty acids), carbohydrates, minerals (such as Ca, Fe, Mg, P, K, Na, and Zn), and vitamins (such as thiamin, riboflavin, niacin, vitamin B-6, folate).

Engineered Nutritive Polypeptides.

In some embodiments a protein comprises or consists of a derivative or mutein of a protein or fragment of an edible species protein or a protein that naturally occurs in a food product. Such a protein can be referred to as an “engineered protein.” In such embodiments the natural protein or fragment thereof is a “reference” protein or polypeptide and the engineered protein or a first polypeptide sequence thereof comprises at least one sequence modification relative to the amino acid sequence of the reference protein or polypeptide. For example, in some embodiments the engineered protein or first polypeptide sequence thereof is at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 99.5% identical to at least one reference protein amino acid sequence. Typically the ratio of at least one of branched chain amino acid residues to total amino acid residues, essential amino acid residues to total amino acid residues, and leucine residues to total amino acid residues, present in the engineered protein or a first polypeptide sequence thereof is greater than the corresponding ratio of at least one of branched chain amino acid residues to total amino acid residues, essential amino acid residues to total amino acid residues, and leucine residues to total amino acid residues present in the reference protein or polypeptide sequence.
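By way of non-limiting illustration, the ratio comparison described above can be sketched in a few lines of Python. The function names and the two short sequences are hypothetical and are not taken from the sequence listing. The essential amino acid set follows the nine residues enumerated elsewhere herein (His, Ile, Leu, Lys, Met, Phe, Thr, Trp, Val), and ratios are computed on a residue-count basis rather than a mass basis.

```python
# Illustrative sketch only; names and example sequences are hypothetical.
ESSENTIAL = set("HILKMFTWV")  # His, Ile, Leu, Lys, Met, Phe, Thr, Trp, Val
BRANCHED = set("LIV")         # Leu, Ile, Val

def residue_ratios(seq):
    """Return BCAA, EAA, and Leu residue counts divided by total residues."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    return {
        "bcaa": sum(aa in BRANCHED for aa in seq) / n,
        "eaa": sum(aa in ESSENTIAL for aa in seq) / n,
        "leu": seq.count("L") / n,
    }

def improved_ratios(engineered, reference):
    """List the ratios that are greater in the engineered sequence."""
    e = residue_ratios(engineered)
    r = residue_ratios(reference)
    return [key for key in ("bcaa", "eaa", "leu") if e[key] > r[key]]

reference = "MKTAYIAKQR"   # made-up 10-residue reference sequence
engineered = "MKLAYIAKLR"  # same sequence with T3L and Q9L substitutions
print(improved_ratios(engineered, reference))
```

In this made-up example both substitutions introduce leucine, so all three ratios increase relative to the reference.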

Nutritive Polypeptides—Orthologs and Homologs.

In another aspect, provided are nutritive polypeptides that contain amino acid sequences homologous to edible species polypeptides, which are optionally secreted from unicellular organisms and purified therefrom. Such homologous polypeptides can be 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater than 99% similar, or can be 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater than 99% identical to an edible species polypeptide. Such nutritive polypeptides can be endogenous to the host cell or exogenous, can be naturally secreted in the host cell, or both, and can be engineered for secretion.

Also provided are orthologs of nutritive polypeptides. The disclosure of a nutritive polypeptide sequence encompasses the disclosure of all orthologs of such a nutritive polypeptide sequence, from phylogenetically related organisms or, alternatively, from a phylogenetically diverse organism that is homologous to the nutritive polypeptide, such as 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater than 99% similar, or can be 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or greater than 99% identical.
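As a minimal sketch of how a percent identity figure such as those recited above might be computed, the following Python function counts matching columns in a pairwise alignment. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it divides by the number of alignment columns; other conventions for the denominator exist, and this passage does not specify one. The function name and example sequences are hypothetical.

```python
# Illustrative sketch only; assumes pre-aligned, equal-length input strings.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences share a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(a == b and a != gap for a, b in columns)
    return 100.0 * matches / len(columns)

print(percent_identity("MKTAYIAKQR", "MKLAYIAKLR"))
print(percent_identity("MK-AY", "MKTAY"))
```

Percent similarity would additionally require a substitution matrix or residue grouping scheme, which is outside the scope of this sketch.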

Nutritive Polypeptide Fragments, Nutritive Polypeptide Length.

In some embodiments herein a nutritive polypeptide contains a fragment of an edible species polypeptide. In some embodiments the fragment comprises at least 25 amino acids. In some embodiments the fragment comprises at least 50 amino acids. In some embodiments the fragment consists of at least 25 amino acids. In some embodiments the fragment consists of at least 50 amino acids. In some embodiments an isolated recombinant protein is provided. In some embodiments the protein comprises a first polypeptide sequence, and the first polypeptide sequence comprises a fragment of at least 25 or at least 50 amino acids of an edible species protein. In some embodiments the protein is isolated. In some embodiments the proteins are recombinant. In some embodiments the proteins comprise a first polypeptide sequence comprising a fragment of at least 50 amino acids of an edible species protein. In some embodiments the proteins are isolated recombinant proteins. In some embodiments the isolated recombinant proteins disclosed herein are provided in a non-isolated and/or non-recombinant form.

In some embodiments the protein comprises from 10 to 5,000 amino acids, from 20-2,000 amino acids, from 20-1,000 amino acids, from 20-500 amino acids, from 20-250 amino acids, from 20-200 amino acids, from 20-150 amino acids, from 20-100 amino acids, from 20-40 amino acids, from 30-50 amino acids, from 40-60 amino acids, from 50-70 amino acids, from 60-80 amino acids, from 70-90 amino acids, from 80-100 amino acids, at least 10 amino acids, at least 11 amino acids, at least 12 amino acids, at least 13 amino acids, at least 14 amino acids, at least 15 amino acids, at least 16 amino acids, at least 17 amino acids, at least 18 amino acids, at least 19 amino acids, at least 20 amino acids, at least 21 amino acids, at least 22 amino acids, at least 23 amino acids, at least 24 amino acids, at least 25 amino acids, at least 30 amino acids, at least 35 amino acids, at least 40 amino acids, at least 45 amino acids, at least 50 amino acids, at least 55 amino acids, at least 60 amino acids, at least 65 amino acids, at least 70 amino acids, at least 75 amino acids, at least 80 amino acids, at least 85 amino acids, at least 90 amino acids, at least 95 amino acids, at least 100 amino acids, at least 105 amino acids, at least 110 amino acids, at least 115 amino acids, at least 120 amino acids, at least 125 amino acids, at least 130 amino acids, at least 135 amino acids, at least 140 amino acids, at least 145 amino acids, at least 150 amino acids, at least 155 amino acids, at least 160 amino acids, at least 165 amino acids, at least 170 amino acids, at least 175 amino acids, at least 180 amino acids, at least 185 amino acids, at least 190 amino acids, at least 195 amino acids, at least 200 amino acids, at least 205 amino acids, at least 210 amino acids, at least 215 amino acids, at least 220 amino acids, at least 225 amino acids, at least 230 amino acids, at least 235 amino acids, at least 240 amino acids, at least 245 amino acids, or at least 250 amino acids. 
In some embodiments the protein consists of from 20 to 5,000 amino acids, from 20-2,000 amino acids, from 20-1,000 amino acids, from 20-500 amino acids, from 20-250 amino acids, from 20-200 amino acids, from 20-150 amino acids, from 20-100 amino acids, from 20-40 amino acids, from 30-50 amino acids, from 40-60 amino acids, from 50-70 amino acids, from 60-80 amino acids, from 70-90 amino acids, from 80-100 amino acids, at least 25 amino acids, at least 30 amino acids, at least 35 amino acids, at least 40 amino acids, at least 45 amino acids, at least 50 amino acids, at least 55 amino acids, at least 60 amino acids, at least 65 amino acids, at least 70 amino acids, at least 75 amino acids, at least 80 amino acids, at least 85 amino acids, at least 90 amino acids, at least 95 amino acids, at least 100 amino acids, at least 105 amino acids, at least 110 amino acids, at least 115 amino acids, at least 120 amino acids, at least 125 amino acids, at least 130 amino acids, at least 135 amino acids, at least 140 amino acids, at least 145 amino acids, at least 150 amino acids, at least 155 amino acids, at least 160 amino acids, at least 165 amino acids, at least 170 amino acids, at least 175 amino acids, at least 180 amino acids, at least 185 amino acids, at least 190 amino acids, at least 195 amino acids, at least 200 amino acids, at least 205 amino acids, at least 210 amino acids, at least 215 amino acids, at least 220 amino acids, at least 225 amino acids, at least 230 amino acids, at least 235 amino acids, at least 240 amino acids, at least 245 amino acids, or at least 250 amino acids. In some aspects, a protein or fragment thereof includes at least two domains: a first domain and a second domain. One of the two domains can include a tag domain, which can be removed if desired. Each domain can be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, or greater than 25 amino acids in length. 
For example, the first domain can be a polypeptide of interest that is 18 amino acids in length and the second domain can be a tag domain that is 7 amino acids in length. As another example, the first domain can be a polypeptide of interest that is 17 amino acids in length and the second domain can be a tag domain that is 8 amino acids in length.

In some embodiments herein a fragment of an edible species polypeptide is selected and optionally isolated. In some embodiments the fragment comprises at least 25 amino acids. In some embodiments the fragment comprises at least 50 amino acids. In some embodiments the fragment consists of at least 25 amino acids. In some embodiments the fragment consists of at least 50 amino acids. In some embodiments an isolated recombinant protein is provided. In some embodiments the protein comprises a first polypeptide sequence, and the first polypeptide sequence comprises a fragment of at least 25 or at least 50 amino acids of an edible species protein. In some embodiments the protein is isolated. In some embodiments the proteins are recombinant. In some embodiments the proteins comprise a first polypeptide sequence comprising a fragment of at least 50 amino acids of an edible species protein. In some embodiments the proteins are isolated recombinant proteins. In some embodiments the isolated nutritive polypeptides disclosed herein are provided in a non-isolated and/or non-recombinant form.

Nutritive Polypeptide Physicochemical Properties.

Digestibility. In some aspects the nutritive polypeptide is substantially digestible upon consumption by a mammalian subject. Preferably, the nutritive polypeptide is easier to digest than at least a reference polypeptide or a reference mixture of polypeptides, or a portion of other polypeptides in the consuming subject's diet. As used herein, “substantially digestible” can be demonstrated by measuring half-life of the nutritive polypeptide upon consumption. For example, a nutritive polypeptide is easier to digest if it has a half-life in the gastrointestinal tract of a human subject of less than 60 minutes, or less than 50, 40, 30, 20, 15, 10, 5, 4, 3, 2 minutes or 1 minute. In certain embodiments the nutritive polypeptide is provided in a formulation that provides enhanced digestion; for example, the nutritive polypeptide is provided free from other polypeptides or other materials. In some embodiments, the nutritive polypeptide contains one or more recognition sites for one or more endopeptidases. In a specific embodiment, the nutritive polypeptide contains a secretion leader (or secretory leader) sequence, which is then cleaved from the nutritive polypeptide. As provided herein, a nutritive polypeptide encompasses polypeptides with or without signal peptides and/or secretory leader sequences. In some embodiments, the nutritive polypeptide is susceptible to cleavage by one or more exopeptidases.

Digestion Assays

Digestibility is a parameter relevant to the benefits and utility of proteins. Information relating to the relative completeness of digestion can serve as a predictor of peptide bioavailability (Daniel, H., 2003. Molecular and Integrative Physiology of Intestinal Peptide Transport. Annual Review of Physiology, Volume 66, pp. 361-384). In some embodiments proteins disclosed herein are screened to assess their digestibility. Digestibility of proteins can be assessed by any suitable method known in the art. In some embodiments digestibility is assessed by a physiologically relevant in vitro digestion reaction that includes one or both phases of protein digestion, simulated gastric digestion and simulated intestinal digestion (see, e.g., Moreno, et al., 2005. Stability of the major allergen Brazil nut 2S albumin (Ber e 1) to physiologically relevant in vitro gastrointestinal digestion. FEBS Journal, pp. 341-352; Martos, G., Contreras, P., Molina, E. & Lopez-Fandino, R., 2010. Egg White Ovalbumin Digestion Mimicking Physiological Conditions. Journal of Agricultural and Food Chemistry, pp. 5640-5648; Moreno, F. J., Mackie, A. R. & Clare Mills, E. N., 2005. Phospholipid interactions protect the milk allergen α-Lactalbumin from proteolysis during in vitro digestion. Journal of Agricultural and Food Chemistry, pp. 9810-9816). Briefly, test proteins are sequentially exposed to a simulated gastric fluid (SGF) for 120 minutes (the length of time it takes 90% of a liquid meal to pass from the stomach to the small intestine; see Kong, F. & Singh, R. P., 2008. Disintegration of Solid Foods in Human Stomach. Journal of Food Science, pp. 67-80) and then transferred to a simulated duodenal fluid (also referred to herein as simulated intestinal fluid, SIF) to digest for an additional 120 minutes. 
Samples at different stages of the digestion (e.g., 2, 5, 15, 30, 60 and 120 min) are analyzed by electrophoresis (e.g., chip electrophoresis or SDS-PAGE) to monitor the size and amount of intact protein as well as any large digestion fragments (e.g., larger than 4 kDa). The disappearance of protein over time indicates the rate at which the protein is digested in the assay. By monitoring the amount of intact protein observed over time, the half-life (τ½) of digestion is calculated for SGF and, if intact protein is detected after treatment with SGF, the τ½ of digestion is calculated for SIF. This assay can be used to assess comparative digestibility (i.e., against a benchmark protein such as whey) or to assess absolute digestibility. In some embodiments the digestibility of the protein is higher (i.e., the SGF τ½ and/or SIF τ½ is shorter) than whey protein. In some embodiments the protein has a SGF τ½ of 30 minutes or less, 20 minutes or less, 15 minutes or less, 10 minutes or less, 5 minutes or less, 4 minutes or less, 3 minutes or less, 2 minutes or less or 1 minute or less. In some embodiments the protein has a SIF τ½ of 30 minutes or less, 20 minutes or less, 15 minutes or less, 10 minutes or less, 5 minutes or less, 4 minutes or less, 3 minutes or less, 2 minutes or less or 1 minute or less. In some embodiments the protein is not detectable in one or both of the SGF and SIF assays by 2 minutes, 5 minutes, 15 minutes, 30 minutes, 60 minutes, or 120 minutes. In some embodiments the protein is digested at a constant rate and/or at a controlled rate in one or both of SGF and SIF. In such embodiments the rate of digestion of the protein may not be optimized for the highest possible rate of digestion. 
In such embodiments the rate of absorption of the protein following ingestion by a mammal can be slower and the total time period over which absorption occurs following ingestion can be longer than for proteins of similar amino acid composition that are digested at a faster initial rate in one or both of SGF and SIF. In some embodiments the protein is completely or substantially completely digested in SGF. In some embodiments the protein is substantially not digested or not digested by SGF; in most such embodiments the protein is digested in SIF.
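One way the half-life (τ½) could be estimated from the time course of intact protein is sketched below in Python. The sketch assumes that disappearance of intact protein follows first-order kinetics, which is a modeling assumption not stated in the assay description above, and fits ln(fraction intact) against time by ordinary least squares. The function name is hypothetical and the data are synthetic values generated for a 10 minute half-life at the sampling times mentioned above.

```python
import math

# Illustrative sketch only; assumes first-order (single-exponential) decay.
def half_life(times_min, fraction_intact):
    """Estimate the digestion half-life in minutes from a time course."""
    # Points where no intact protein is detected cannot be log-transformed.
    points = [
        (t, math.log(f))
        for t, f in zip(times_min, fraction_intact)
        if f > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two time points with detectable protein")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx  # equals minus the first-order rate constant
    if slope >= 0:
        raise ValueError("no decay observed")
    return math.log(2) / -slope

times = [2, 5, 15, 30, 60, 120]             # sampling times in minutes
fractions = [0.5 ** (t / 10) for t in times]  # synthetic data, 10 min half-life
print(round(half_life(times, fractions), 3))
```

With real densitometry data the fit would be noisy, and a protein digested at a constant rather than exponential rate would call for a different model.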

Assessing protein digestibility can also provide insight into a protein's potential allergenicity, as proteins or large fragments of proteins that are resistant to digestive proteases can have a higher risk of causing an allergenic reaction (Goodman, R. E. et al., 2008. Allergenicity assessment of genetically modified crops—what makes sense? Nature Biotechnology, pp. 73-81). To detect and identify peptides too small for chip electrophoresis analysis, liquid chromatography and mass spectrometry can be used. In SGF samples, peptides can be directly detected and identified by LC/MS. SIF protein digestions may require purification to remove bile acids before detection and identification by LC/MS.

In some embodiments digestibility of a protein is assessed by identification and quantification of digestive protease recognition sites in the protein amino acid sequence. In some embodiments the protein comprises at least one protease recognition site selected from a pepsin recognition site, a trypsin recognition site, and a chymotrypsin recognition site.

As used herein, a “pepsin recognition site” is any site in a polypeptide sequence that is experimentally shown to be cleaved by pepsin. In some embodiments it is a peptide bond after (i.e., downstream of) an amino acid residue selected from Phe, Trp, Tyr, Leu, Ala, Glu, and Gln, provided that the following residue is not an amino acid residue selected from Ala, Gly, and Val.

As used herein, a “trypsin recognition site” is any site in a polypeptide sequence that is experimentally shown to be cleaved by trypsin. In some embodiments it is a peptide bond after an amino acid residue selected from Lys or Arg, provided that the following residue is not a proline.

As used herein, a “chymotrypsin recognition site” is any site in a polypeptide sequence that is experimentally shown to be cleaved by chymotrypsin. In some embodiments it is a peptide bond after an amino acid residue selected from Phe, Trp, Tyr, and Leu.
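The sequence-based rules given above for pepsin, trypsin, and chymotrypsin recognition sites can be applied to a sequence as in the following Python sketch. The function name and the example sequence are hypothetical. The sketch encodes only the simplified residue rules recited herein and does not predict whether a site is actually cleaved experimentally.

```python
# Illustrative sketch only; encodes the simplified residue rules recited above.
PEPSIN_BEFORE = set("FWYLAEQ")  # Phe, Trp, Tyr, Leu, Ala, Glu, Gln
PEPSIN_BLOCKED_AFTER = set("AGV")  # following residue must not be Ala, Gly, Val

def recognition_sites(seq, enzyme):
    """Return 1-based positions i where the bond after residue i is a site."""
    seq = seq.upper()
    sites = []
    for i in range(len(seq) - 1):  # the last residue has no bond after it
        before, after = seq[i], seq[i + 1]
        if enzyme == "pepsin":
            hit = before in PEPSIN_BEFORE and after not in PEPSIN_BLOCKED_AFTER
        elif enzyme == "trypsin":
            hit = before in "KR" and after != "P"
        elif enzyme == "chymotrypsin":
            hit = before in "FWYL"
        else:
            raise ValueError("unknown enzyme: " + enzyme)
        if hit:
            sites.append(i + 1)
    return sites

example = "MKTAYIAKQRPLF"  # made-up 13-residue sequence
for name in ("pepsin", "trypsin", "chymotrypsin"):
    print(name, recognition_sites(example, name))
```

In the made-up example, the arginine at position 10 is not counted as a trypsin site because the following residue is proline.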

Disulfide bonded cysteine residues in a protein tend to reduce the rate of digestion of the protein compared to what it would be in the absence of the disulfide bond. For example, it has been shown that the rate of digestion of the protein β-lactoglobulin is increased when its disulfide bridges are cleaved (I. M. Reddy, N. K. D. Kella, and J. E. Kinsella. “Structural and Conformational Basis of the Resistance of β-Lactoglobulin to Peptic and Chymotryptic Digestion”. J. Agric. Food Chem. 1988, 36, 737-741). Accordingly, digestibility of a protein with fewer disulfide bonds tends to be higher than for a comparable protein with a greater number of disulfide bonds. In some embodiments the proteins disclosed herein are screened to identify the number of cysteine residues present in each and in particular to allow selection of a protein comprising a relatively low number of cysteine residues. For example, edible species proteins or fragments can be identified that comprise no Cys residues or that comprise a relatively low number of Cys residues, such as 10 or fewer Cys residues, 9 or fewer Cys residues, 8 or fewer Cys residues, 7 or fewer Cys residues, 6 or fewer Cys residues, 5 or fewer Cys residues, 4 or fewer Cys residues, 3 or fewer Cys residues, 2 or fewer Cys residues, 1 Cys residue, or no Cys residues. In some embodiments one or more Cys residues in an edible species protein or fragment thereof is removed by deletion and/or by substitution with another amino acid. 
In some embodiments 1 Cys residue is deleted or replaced, 1 or more Cys residues are deleted or replaced, 2 or more Cys residues are deleted or replaced, 3 or more Cys residues are deleted or replaced, 4 or more Cys residues are deleted or replaced, 5 or more Cys residues are deleted or replaced, 6 or more Cys residues are deleted or replaced, 7 or more Cys residues are deleted or replaced, 8 or more Cys residues are deleted or replaced, 9 or more Cys residues are deleted or replaced, or 10 or more Cys residues are deleted or replaced. In some embodiments the protein of this disclosure comprises a ratio of Cys residues to total amino acid residues equal to or lower than 5%, 4%, 3%, 2%, or 1%. In some embodiments the protein comprises 10 or fewer Cys residues, 9 or fewer Cys residues, 8 or fewer Cys residues, 7 or fewer Cys residues, 6 or fewer Cys residues, 5 or fewer Cys residues, 4 or fewer Cys residues, 3 or fewer Cys residues, 2 or fewer Cys residues, 1 Cys residue, or no Cys residues. In some embodiments, the protein comprises 1 or fewer Cys residues. In some embodiments, the protein comprises no Cys residues.
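A cysteine screen of the kind described above might be sketched as follows in Python. The function names, default thresholds, and example sequences are hypothetical. The defaults of 10 residues and 5% are simply the largest of the count and ratio limits recited above, and either may be tightened.

```python
# Illustrative sketch only; names, defaults, and sequences are hypothetical.
def cys_profile(seq):
    """Return (number of Cys residues, Cys fraction of total residues)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    count = seq.count("C")
    return count, count / len(seq)

def passes_cys_screen(seq, max_count=10, max_fraction=0.05):
    """True if the sequence meets both the count and the ratio limit."""
    count, fraction = cys_profile(seq)
    return count <= max_count and fraction <= max_fraction

two_cys = "MKCAYIAKQRCLFGSTDEHN"  # 20 residues, 2 Cys (10%)
one_cys = "MKCAYIAKQRSLFGSTDEHN"  # same sequence with a C11S substitution (5%)
print(cys_profile(two_cys), passes_cys_screen(two_cys))
print(cys_profile(one_cys), passes_cys_screen(one_cys))
```

The second sequence shows the effect of replacing one Cys residue by substitution, which brings the made-up sequence within the 5% ratio limit.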

Alternatively or in addition, disulfide bonds that are or can be present in a protein can be removed. Disulfides can be removed using chemical methods by reducing the disulfide to two thiol groups with reducing agents such as beta-mercaptoethanol, dithiothreitol (DTT), or tris(2-carboxyethyl)phosphine (TCEP). The thiols can then be covalently modified or “capped” with reagents such as iodoacetamide, N-ethylmaleimide, or sodium sulfite (see, e.g., Crankshaw, M. W. and Grant, G. A. 2001. Modification of Cysteine. Current Protocols in Protein Science. 15.1.1-15.1.18).

Nutritive Polypeptides and Nutritive Polypeptide Formulations with Modulated Viscosity.

Disclosed herein are compositions, formulations, and food products that contain viscosity-modulating nutritive polypeptides. In one aspect, provided are formulations substantially free of non-comestible products that contain nutritive polypeptides present in a nutritional amount, and the nutritive polypeptide decreases the viscosity of a food product. In some embodiments, the nutritive polypeptide is present at about 10 g/L and the viscosity of the formulation is from about 1,000 mPa·s to about 10,000 mPa·s at 25 degrees C., such as from about 2,500 mPa·s to about 5,000 mPa·s at 25 degrees C.

The formulations are incorporated into food products having advantages over similar food products lacking the nutritive polypeptides, or the formulations are incorporated into other products such as beverage products or animal feed products. For example, the food products have a reduced fat content, a reduced sugar content, and/or a reduced calorie content compared to a food product not having the nutritive polypeptide. Preferably, the nutritive polypeptide is present in the food product such that consumption of a nutritional amount of the food product is satiating. In an embodiment of the invention, gelatin, an animal-derived material, is replaced by a non-animal derived product, containing one or more nutritive polypeptides. Typically the nutritive polypeptide is present in an amount effective to replace gelatin in the product. The gelatin replacement is incorporated into a food product, a beverage product, or an animal feed product, and the formulation is substantially free of non-comestible products.

Also provided are formulations containing a nutritive polypeptide present in a functional and/or nutritional amount, which increases the viscosity of a food or beverage product, such as formulations containing viscosity-increasing nutritive polypeptides incorporated into food products having advantages over similar food products lacking the nutritive polypeptides. For example, the food products have a reduced fat content, a reduced sugar content, and/or a reduced calorie content compared to a food product not having the nutritive polypeptide. Viscous nutritive polypeptides can be used as a nutritionally favorable low calorie substitute for fat. Additionally, it may be desired to add to the compositions and products one or more polysaccharides or emulsifiers, resulting in a further improvement in the creamy mouthfeel.

In some embodiments, the viscosity of nutritive polypeptide-containing materials is enhanced by crosslinking the nutritive polypeptides or crosslinking nutritive polypeptides to other proteins present in the material. An example of an effective crosslinker is transglutaminase, which crosslinks proteins between an ε-amino group of a lysine residue and a γ-carboxamide group of a glutamine residue, forming a stable covalent bond. The resulting gel strength and emulsion strength of nutritive polypeptides identified and produced as described herein are examined by preparing a transglutaminase-coupled nutritive protein composition, followed by gel strength and emulsion strength assays. A suitable transglutaminase derived from microorganisms in accordance with the teachings of U.S. Pat. No. 5,156,956 is commercially available. These commercially available transglutaminases typically have an enzyme activity of about 100 units. The amount of transglutaminase (having an activity of about 100 units) added to isolated nutritive polypeptide is expressed as a transglutaminase concentration which is the units of transglutaminase per 100 grams of isolated nutritive polypeptide. The isolated nutritive polypeptide contains from 5 to 95%, preferably 20 to 80%, preferably 58% to 72% protein and also preferably from 62% to 68% protein. The transglutaminase concentration is at least 0.15, preferably 0.25 and most preferably 0.30 units transglutaminase per gram protein up to 0.80 and preferably 0.65 units transglutaminase per gram protein. Higher and lower amounts may be used. This enzyme treatment can also be followed by thermal processing to make a viscous solution containing a nutritive polypeptide. To generate nutritive polypeptide samples containing crosslinks, a sample is mixed with a transglutaminase solution at pH 7.0 to give an enzyme to protein weight ratio of 1:25. The enzyme-catalyzed cross-linking reaction is conducted at 40° C. in most of the experiments.
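The dosing arithmetic implied above can be illustrated with a short Python sketch. The function names and the 100 g batch size are hypothetical. The sketch takes the protein content of the isolated nutritive polypeptide as a fraction (here 65%, within the 62% to 68% range recited above), applies a transglutaminase concentration expressed in units per gram protein, and separately computes the enzyme mass corresponding to a 1:25 enzyme to protein weight ratio.

```python
# Illustrative sketch only; names and batch size are hypothetical.
def transglutaminase_dose(isolate_g, protein_fraction, units_per_g_protein):
    """Return (grams of protein in the isolate, enzyme units to add)."""
    protein_g = isolate_g * protein_fraction
    return protein_g, protein_g * units_per_g_protein

def enzyme_mass_for_ratio(protein_g, enzyme_parts=1, protein_parts=25):
    """Enzyme mass in grams for a given enzyme:protein weight ratio."""
    return protein_g * enzyme_parts / protein_parts

for concentration in (0.15, 0.30, 0.80):  # units per gram protein
    protein_g, units = transglutaminase_dose(100, 0.65, concentration)
    print(f"{concentration:.2f} U/g -> {units:.2f} units for {protein_g:.1f} g protein")

print(f"{enzyme_mass_for_ratio(65.0):.2f} g enzyme at a 1:25 weight ratio")
```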

Oscillatory shear measurements can be used to investigate the rheological properties of nutritive polypeptides. Also, to determine the viscosity of nutritive polypeptide solutions and gels, viscoelasticity is investigated by dynamic oscillatory rheometry. A 2 mL sample of nutritive polypeptide solution or nutritive polypeptide solution containing transglutaminase is poured into the Couette-type cylindrical cell (2.5 cm i.d., 2.75 cm o.d.) of the rheometer and covered with a thin layer of low-viscosity silicone oil to prevent evaporation. For samples with enzyme present, gelation is induced in situ by incubation at 40° C. For nutritive polypeptide samples without enzyme, gelation is induced by subjecting the sample to the following thermal treatment process: temperature increased at constant rate of 2 K min⁻¹ from 40 to 90° C., kept at 90° C. for 30 min, cooled at 1 K min⁻¹ from 90 to 30° C., and kept at 30° C. for 15 min. Some samples can be subjected to this thermal treatment after the enzyme treatment. Small deformation shear rheological properties are mostly determined in the linear viscoelastic regime (maximum strain amplitude 0.5%) with storage and loss moduli (G′ and G″) measured at a constant frequency of 1 Hz. In addition, some small deformation measurements are made as a function of frequency, e.g., 2×10⁻³ to 2 Hz, and some large deformation measurements are carried out at strains up to nearly 100%.
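For planning purposes, the duration of the thermal treatment described above can be worked out as in the following Python sketch, which simply converts each ramp into minutes and adds the hold times. The function and variable names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical.
def ramp_minutes(start_c, end_c, rate_k_per_min):
    """Time in minutes to move between two temperatures at a constant rate."""
    return abs(end_c - start_c) / rate_k_per_min

steps = [
    ("heat 40 to 90 C at 2 K/min", ramp_minutes(40, 90, 2)),
    ("hold at 90 C", 30),
    ("cool 90 to 30 C at 1 K/min", ramp_minutes(90, 30, 1)),
    ("hold at 30 C", 15),
]

for label, minutes in steps:
    print(f"{label}: {minutes:g} min")

total_minutes = sum(minutes for _, minutes in steps)
print(f"total: {total_minutes:g} min")
```

The heating ramp takes 25 minutes and the cooling ramp 60 minutes, so the full treatment runs for 130 minutes.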

Amino Acid Pharmacology.

Amino acids are organic molecules containing both amino and acid groups. All amino acids except glycine have an asymmetric carbon, and all protein amino acids except proline have an alpha-carbon bound to a carboxyl group and a primary amino group.

Amino acids exhibit a diverse range of biochemical properties and biological function due to their varying side chains. They are stable in solution at physiological pH, save for glutamine and cysteine. In the context of some proteins, conditional upon the host and translational machinery, amino acids can undergo post-translational modification. This can have significant effects on their bioavailability, metabolic function, and bioactivity in vivo. Sugar moieties appended to proteins post-translationally may reduce the usefulness of the nutritive proteins by affecting the gastrointestinal release of amino acids and embedded peptides. A comparison of digestion of glycosylated and non-glycosylated forms of the same proteins shows that the non-glycosylated forms are digested more quickly than the glycosylated forms (our data).

Although over 300 amino acids exist in nature, 20 serve as building blocks in protein. Non-protein alpha-AAs and non-alpha AAs are direct products of these 20 protein amino acids and play significant roles in cell metabolism. Due to the metabolic reactions of amino acid catabolism that drive the interconversion between amino acids, a subset of 11 of the 20 standard protein amino acids are considered non-essential for humans because they can be synthesized from other metabolites (amino acids, ketones, etc.) in the body: Alanine; Arginine; Asparagine; Aspartic acid; Cysteine; Glutamic acid; Glutamine; Glycine; Proline; Serine; and Tyrosine.

Arginine, cysteine, glycine, glutamine, histidine, proline, serine and tyrosine are considered conditionally essential, as they are not normally required in the diet, but are not synthesized in adequate amounts in specific populations to meet optimal needs where rates of utilization are higher than rates of synthesis. Functional needs such as reproduction, disease prevention, or metabolic abnormalities, however, can be taken into account when considering whether an amino acid is truly non-essential or can be conditionally essential in a population. The other 9 protein amino acids, termed essential amino acids, are taken as food because their carbon skeletons are not synthesized de novo by the body to meet optimal metabolic requirements: Histidine; Isoleucine; Leucine; Lysine; Methionine; Phenylalanine; Threonine; Tryptophan; and Valine.
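The three groupings given above can be captured as simple sets for lookup. This is an illustrative sketch only: the set and function names are hypothetical, and the membership is transcribed from the lists in the text, in which histidine appears in both the essential and the conditionally essential lists.

```python
# Illustrative sketch: amino acid groupings as listed in the text.
NON_ESSENTIAL = {
    "Alanine", "Arginine", "Asparagine", "Aspartic acid", "Cysteine",
    "Glutamic acid", "Glutamine", "Glycine", "Proline", "Serine", "Tyrosine",
}
ESSENTIAL = {
    "Histidine", "Isoleucine", "Leucine", "Lysine", "Methionine",
    "Phenylalanine", "Threonine", "Tryptophan", "Valine",
}
CONDITIONALLY_ESSENTIAL = {
    "Arginine", "Cysteine", "Glycine", "Glutamine",
    "Histidine", "Proline", "Serine", "Tyrosine",
}


def classify(name):
    """Return every grouping label that the text assigns to an amino acid."""
    labels = []
    if name in ESSENTIAL:
        labels.append("essential")
    if name in NON_ESSENTIAL:
        labels.append("non-essential")
    if name in CONDITIONALLY_ESSENTIAL:
        labels.append("conditionally essential")
    if not labels:
        raise KeyError(f"{name!r} is not one of the 20 protein amino acids")
    return labels


if __name__ == "__main__":
    for aa in ("Leucine", "Arginine", "Histidine"):
        print(aa, "->", ", ".join(classify(aa)))
```

The essential and non-essential sets are disjoint and together cover the 20 protein amino acids; the conditionally essential set overlaps both.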

All 20 protein amino acids (and non-protein metabolites) are used for normal cell functionality, and shifts in metabolism driven by changing availability of a single amino acid can affect whole body homeostasis and growth. Additionally, amino acids function as signaling molecules and regulators of key metabolic pathways used for maintenance, growth, reproduction, and immunity.

In the body, skeletal muscle represents the largest store of both free and protein-bound amino acids because it makes up a large share of body mass (around 40-45%). The small intestine is another important site for amino acid catabolism, governing the first pass metabolism and entry of dietary amino acids into the portal vein and into the peripheral plasma. 30-50% of EAA in the diet may be catabolized by the small intestine in first-pass metabolism. The high activity of BCAA transaminases in the intestinal mucosa leads to BCAA conversion to branched-chain alpha-ketoacids to provide energy for enterocytes, as also occurs in skeletal muscle. Differences in physiological state of muscle and small intestine metabolism have large implications on amino acid biology systemically across tissues in humans.

Amino acids can exist in both L- and D-isoforms, except for glycine (non-chiral). Almost all amino acids in proteins exist in the L-isoform, except for cysteine (D-cys) due to its sulfur atom at the second position of the side-chain, unless otherwise enzymatically post-translationally modified or chemically treated for storing or cooking purposes. Most D-amino acids, except for D-arg, D-cys, D-his, D-lys, and D-thr, can be converted into the L chirality by D-AA oxidases and transaminases. In order to be catabolized, these D enantiomers are transported across the plasma and other biological membranes and undergo either D-oxidation, which deaminates the amino acid to its alpha-ketoacid, or racemization, which converts the D-AA to its L-isoform. The transport of D-isomers is limited by a lower affinity of L-AA transporters to D-AAs. For this reason the efficiency of D-AA utilization, on a molar basis of the L-isomer, can range from 20-100% depending on the amino acid and the species.

Alanine:

Alanine is a glucogenic non-essential amino acid due to its ability to be synthesized in muscle cells from BCAAs and pyruvate as part of the glucose-alanine cycle. This involves a tightly regulated process by which skeletal muscle frees energy from protein stores for the generation of glucose distally in the liver for use by extrahepatic cells (including immunocytes) and tissues. The resulting stimulation of gluconeogenesis provides a source of energy in the form of glucose during periods of food deprivation. Alanine becomes a very sensitive intermediary to balance the utilization of BCAAs in the muscle for protein production and generation of available energy through gluconeogenesis in the liver. Furthermore, the alanine induction of gluconeogenesis is integral to support the function of many tissues, not limited to muscle, liver, and immunocytes. Beyond acting as simply an intermediate, however, it also directly regulates activity of a key enzyme in this energy balance, pyruvate kinase. Alanine has the ability to inhibit pyruvate kinase by facilitating its phosphorylation, slowing glycolysis and driving the reverse reaction of pyruvate to phosphoenolpyruvate (PEP) for initiation of gluconeogenesis.

High Alanine

A lack of ATP-producing substrates, as occurs in a fasted state, can lead to autophagy and the turnover of intracellular protein in the lysosome to provide an energy source. Low levels of the glucogenic amino acids, including alanine can stimulate hepatic autophagy, leading to degradation of liver function.

Beta-cells show increased autophagy when under high fat diet feeding as a response to increased demand for insulin production and protein turnover as the body reacts to rising plasma glucose concentrations. This progression towards increased insulin production in obesity is an early marker for pre-diabetes, an indicator of insulin resistance, and a risk factor for the deterioration of islet beta cell functionality which eventually leads to the onset of diabetes in overweight individuals. The ability to regulate alanine levels via nutrition may provide a powerful lever for shifting hepatic and beta cell autophagy to perturb impaired insulin metabolism in overweight individuals.

Alanine directly produces beta-alanine, important to the biosynthesis of pantothenic acid (vitamin B5), coenzyme A, and carnosine (of which it is the rate-limiting precursor). Carnosine, as well as the other beta-alanine derived di-peptides carcinine, anserine, and balenine (which do not incorporate into proteins), act as antioxidant buffers in the muscle tissue, constituting up to 20% of the buffer capacity in type I and II muscle fibres. This buffering is important for maintaining tissue pH in muscle during the breakdown of glycogen to lactic acid. In weight loss/gain trials in college athletes, supplementation with beta-alanine was shown to prevent loss of lean mass during weight loss and to produce larger increases in lean mass during weight gain compared to placebo. Beta-alanine is also implicated in decreasing fatigue and increasing muscular work done.

Carnosine is an antioxidant and transition metal ion-sequestering agent. It acts as an anti-glycating agent by inhibiting the formation of advanced glycation end products (AGEs). AGEs are prevalent in diabetic vasculature and contribute to the development of atherosclerosis. The presence of AGEs in various cell types affects both the extracellular and intracellular structure and function (Golden, A. et al. Advanced Glycosylation End Products, Circulation 2006). Also, the accumulation of AGEs in the brain is a characteristic of aging and degeneration, particularly in Alzheimer's disease. AGE accumulation explains many neuropathological and biochemical features of Alzheimer's disease such as protein crosslinking, oxidative stress, and neuronal cell death. Because of its combination of antioxidant and antiglycating properties, carnosine is able to diminish cellular oxidative stress and inhibit the intracellular formation of reactive oxygen species and reactive nitrogen species.

Low Alanine

In states of obesity and diabetes, animals have been shown to exhibit reduced hepatic autophagy, leading to increased insulin resistance. Autophagy is important for maintenance of the ER and cellular homeostasis, which when stressed can lead to impaired insulin sensitivity. High fat diet feeding in animal models stresses the ER, while leading to depressed hepatic autophagy through over-stimulation of mTORC1, which reinforces the progression towards impaired insulin sensitivity and impaired beta cell function in diabetes. Reducing the level of systemic Alanine provides an opportunity to lower mTORC1 activity and restore healthy levels of autophagy.

Arginine:

Arginine is a glucogenic non-essential amino acid, which can be synthesized via glutamate, aspartate, glutamine, and proline. It is produced by the mammalian small intestine via oxidation of glutamate, glutamine, and aspartate, which generates ornithine, citrulline, arginine, and alanine. It can also be produced (along with ornithine and citrulline) via the proline oxidase pathway from active degradation of proline in enterocytes. Citrulline released into circulation by the enterocytes is converted to arginine in the kidneys and some endothelial cells (leukocytes and smooth muscle). Newborns utilize most of the free citrulline locally in the small intestine for arginine synthesis rather than systemic release. Arginine and proline oxidation is constrained to the mucosa due to reduced activity of pyrroline-5-carboxylate dehydrogenase across the other tissues.

High Arginine

Citrulline is produced from arginine as a by-product of a reaction catalyzed by the NOS family. Dietary supplementation with either arginine or citrulline is known to reduce plasma levels of glucose, homocysteine, and asymmetric dimethylarginine, which are risk factors for metabolic syndrome. L-citrulline accelerates the removal of lactic acid from muscles, likely due to the effects on vascular tone and endothelial function. Recent studies have also shown that L-citrulline from watermelon juice provides greater recovery from exercise, and less soreness the next day. It also appears that delivery of L-citrulline as a free form results in less uptake into cells in vitro than in the context of watermelon juice (which contains high levels of L-citrulline). This suggests an opportunity to deliver peptide doses, which can traffic arginine into muscle tissue for conversion into citrulline by eNOS at the endothelial membrane for improved efficacy.

Arginine is a highly functional amino acid implicated in many signaling pathways and as a direct precursor of nitric oxide (NO), which facilitates systemic signaling between tissues and regulation of nutrient metabolism and immune function. NO is important for normal endothelial function and cardiovascular health (including vascular tone, hemodynamics, and angiogenesis). Arginine stimulates insulin secretion by directly depolarizing the plasma membrane of the β cell, leading to the influx of Ca2+ and subsequent insulin exocytosis.

Arginine supplementation was shown to improve endothelium-dependent relaxation, an indicator of cardiovascular function, in type I and type II models of diabetes mellitus. Notably, arginine supplementation reduced white adipose tissue but increased brown fat mass in Zucker diabetic rats and diet-induced obese rats. Arginine and/or its metabolites may enhance the proliferation, differentiation, and function of brown adipocytes. In addition, both skeletal muscle mass and whole body insulin sensitivity were enhanced in response to arginine supplementation via mechanisms involving increases in muscle mTOR and NO signaling. Surprisingly, long-term oral administration of arginine decreased fat mass in adult obese humans with type II diabetes (Lucotti et al 2006). Moreover, supplementation with arginine to a conventional corn- and soybean-based diet reduced fat accretion and promoted protein deposition in the whole body of growing-finishing pigs. In a small pilot trial in humans, data indicated that defective insulin-mediated vasodilatation in obesity and non-insulin dependent diabetics (NIDDM) can be normalized by intravenous L-arginine; L-arginine also improved insulin sensitivity in healthy subjects, obese patients and NIDDM patients, indicating a possible mechanism that is different from the restoration of insulin-mediated vasodilatation. In addition, chronic administration of L-arginine improved glucose levels, insulin-induced hepatic glucose production, and insulin sensitivity in type II diabetic patients (Piatti et al 2001). Arginine-rich peptides have not been isolated and tested.

Amino acid administration at high doses (10-20× that available in diet, or 0.1-0.3 g/kg body weight dosed over 20 minutes), via intravenous or oral routes, can stimulate hormone secretion from the gut via endocrine cells. Arginine is a well-studied secretagogue that can stimulate the systemic release of insulin, growth hormone, prolactin, glucagon, progesterone, and placental lactogen. This biology has direct implications on both digestive biology and the absorption of nutrients present in the intestine, as well as affecting energy balance by triggering satiety signals mediated by endocrine hormones. The ability to modulate these hormones provides a therapeutic opportunity for decreasing caloric intake in metabolic disorders such as obesity or alternatively triggering appetite in muscle wasting, sarcopenia, and cachexia, as well as by shifting insulin sensitivity in the onset of diabetes.
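The weight-based dosing range stated above (0.1-0.3 g/kg body weight over 20 minutes) converts to absolute amounts by simple multiplication. The sketch below is illustrative arithmetic only; the function names and the 70 kg example body weight are assumptions and do not come from the text.

```python
# Illustrative sketch: converting the stated g/kg range into absolute doses.
DOSE_RANGE_G_PER_KG = (0.1, 0.3)  # range stated in the text
DOSING_WINDOW_MIN = 20.0          # dosing duration stated in the text


def dose_range_g(body_weight_kg):
    """Return the (low, high) total dose in grams for a given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DOSE_RANGE_G_PER_KG
    return low * body_weight_kg, high * body_weight_kg


def mean_rate_g_per_min(dose_g):
    """Average delivery rate if the dose is spread evenly over the window."""
    return dose_g / DOSING_WINDOW_MIN


if __name__ == "__main__":
    weight_kg = 70.0  # hypothetical example subject
    low, high = dose_range_g(weight_kg)
    print(f"{weight_kg:.0f} kg: {low:.1f}-{high:.1f} g total, "
          f"{mean_rate_g_per_min(low):.2f}-{mean_rate_g_per_min(high):.2f} g/min")
```

For the assumed 70 kg subject, the stated range corresponds to roughly 7 to 21 g in total.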

Arginine is an important signaling molecule for stimulating mTOR1 phosphorylation in a cell-specific manner. This regulates cellular protein turnover (autophagy) and integrates insulin-like growth signals to protein synthesis initiation across tissues. This biology has been directly linked to biogenesis of lean tissue mass in skeletal muscle, metabolic shifts in disease states of obesity and insulin resistance, and aging. It is also a central signaling pathway which can be hijacked for the proliferation of fast-growing cancer cells.

There is evidence for Arginine increasing levels of protein synthesis in the small intestine under catabolic states such as viral infection and malnutrition, where amino acid levels are dramatically shifted from their normal post-absorptive states. Additionally, demonstrated mTOR activation in the intestinal epithelial cells by Arginine provides a mechanism to repair intestinal epithelium by stimulating protein synthesis and cell proliferation. Similar anabolic signaling has been observed in myocytes in response to rising plasma levels of Arginine, leading to increased whole body and skeletal muscle protein synthesis. Arginine is an amino acid maintained at sufficient levels to support the anabolic effects of EAAs. Lysine, Methionine, Threonine, Tryptophan, Leucine, Isoleucine, and Valine have been shown unable to support increased protein synthesis and whole-body growth when added to a 12.7% crude protein diet, indicating a deficiency in the anabolic mediating non-essential amino acids, including Arginine.

Arginine also up-regulates proteins and enzymes related to mitochondrial biogenesis and substrate oxidation, stimulating metabolism of fatty acid stores and reducing fat tissue mass. Supplementation of dietary Arginine provides a therapeutic benefit in obese and pre-diabetic populations who suffer from insulin resistance due to their increased caloric intake. Likewise, the ability to stimulate mitochondrial biogenesis has direct implications in aging and the ability to regenerate functional proteins and healthy cells subject to oxidative stress.

It is established that dietary deficiency of protein reduces the availability of most amino acids, including Arginine despite it not being considered essential. Arginine deficiency is known to cause decreases in sperm counts by 90% after 9 days, increasing the proportion of non-motile sperm by a factor of 10. Arginine supplementation has been demonstrated in animals to increase levels of Arginine, Proline, Ornithine, and other Arginine metabolites such as Polyamines in seminal fluid, corresponding with increased sperm counts and sperm motility. Changes in NO synthesis and polyamines (via), likewise are seen during gestation when placental growth rate peaks, indicating a role for Arginine in fetal development during pregnancy. In uterine fluids during early gestation, Arginine levels also decrease in response to expression of specific amino acid transporters at the embryo. Arginine supplementation to the diet of animals during early gestation has been shown to improve embryonic survival and increase litter size, indicating a significant potential for delivering high levels of arginine during pregnancy.

Arginine has an extensively studied effect on enhancing immune function, based on direct effects on NO production (which can potentiate a phagocyte's killing ability), hormonal secretagogue activity, and stimulation of mTOR. Proline catabolism by proline oxidase is known to have high levels of activity in the placentae and small intestine of mammals. This activity points to a crucial role for Arginine in gut and placentae immunity, both through generation of H2O2, which is cytotoxic to pathogenic bacteria, and synthesis of arginine. In critically injured patients, leukocyte count normalizes more quickly after 6 days of Arginine enriched diet, with recovery to normal TNF response after 10 days (100% improvement). A clinical study in 296 surgery, trauma, or sepsis patients examining Arginine enriched (12.5 g/L Arginine) formulation vs enteral formula indicates highly reduced hospital stay (8-10 days) and major reduction in frequency of acquired infections. A separate clinical study of 181 septic patients fed Arginine enriched (12.5 g/L Arginine) vs. enteral formula shows significantly reduced bacteremia (8% vs 22%) and nosocomial infection (6% vs 20%).

Arginine is also a key substrate for the synthesis of collagen. Oral supplementation of arginine enhances wound healing and lymphocyte immune response in healthy subjects. A 2.4× increase in collagen deposition was observed at wound sites (24 nmol/cm vs. 10.1 nmol/cm), along with increased lymphocyte proliferation vs. control.

Arginine is an allosteric activator of N-acetylglutamate synthase, an enzyme which converts glutamate and acetyl-CoA into N-acetylglutamate in the mitochondria. This pushes the hepatic urea cycle towards the active state, useful for ammonia detoxification. This means that dietary delivery of nutrients with low doses of arginine may be useful in the context of kidney disease, where patients struggle to clear urea from their circulation. Elimination of arginine to limit uremia from the available nitrogen sources, while being able to maintain a limited protein intake to prevent tissue catabolism, is a novel strategy against a disruptive nutritional consequence of kidney disease.

Arginine up-regulates the activity of GTP cyclohydrolase-I, freeing tetrahydrobiopterin (THB) for NO synthesis and the hydroxylation of aromatic amino acids (ArAAs) by aromatic amino acid hydroxylase (AAAH). For this reason, delivery of high levels of Arginine to raise cellular levels of THB directly stimulates the biosynthesis of many neurotransmitters in the CNS capillary endothelial cells. ArAAs serve as precursors for biosynthesis of monoamine neurotransmitters, including melatonin, dopamine, norepinephrine (noradrenaline), and epinephrine (adrenaline).

Low Arginine

Excessive arginine intake, stimulating production of high levels of NO in the blood can lead to oxidative injury and apoptosis of cells.

Arginine excess or depletion affects global gene expression in mammalian hepatocytes. Depletion leads to 1419 genes with significantly (p<0.05) altered expression using in-vitro models, of which 56 showed at least 2-fold variation using a 9-way bioinformatics analysis. The majority rise in expression, including multiple growth, survival, and stress-related genes such as GADD45, TA1/LAT1, and caspases 11 and 12. Many are relevant in luminal ER stress response. LDLr, a regulator of cholesterol and steroid biosynthesis, was also modulated in response to arginine depletion. Consistent with Arginine affecting gene expression, dietary arginine supplementation up-regulates anti-oxidative genes and lowers expression of proinflammatory genes in the adipose and small intestinal tissues.

Lower arginine levels inhibit neurotransmitter biosynthesis, which has shown clinical efficacy in indications such as mania, Parkinson's disease, and dyskinesia.

Asparagine:

Asparagine is a glucogenic nonessential amino acid, whose precursor is oxaloacetate (OAA) and which is synthesized via glutamine and aspartate by a transaminase enzyme. It is used for the function of some neoplastic cells such as lymphoblasts.

Asparagine is typically located at the ends of alpha helices of proteins and provides important sites for N-linked glycosylation to add carbohydrate chains, which affects immune response to amino acid ingestion.

Acrylamide is formed by heat-induced reactions between Asparagine and carbonyl groups of glucose and fructose in many plant-derived foods. Acrylamide is an oxidant that can be cytotoxic, cause gene mutations, and generally affect food quality. Compositions with low levels of asparagine are useful in making safer food products that may be subject to cooking or non-refrigerated storage conditions.

Aspartate:

Aspartate is a glucogenic nonessential amino acid synthesized via the oxaloacetate (OAA) precursor by a transaminase enzyme. As part of the urea cycle, it can also be produced from ornithine and citrulline (or arginine) as the released fumarate is converted to malate and subsequently recycled to OAA. Aspartate provides a nitrogen atom in the synthesis of inosine, which is the precursor in purine biosynthesis. It is also involved in the synthesis of beta-alanine. Aspartate oxidizes in enterocytes of the small intestine, leading to nitrogenous products ornithine, citrulline, arginine, and alanine.

Aspartate is an agonist of NMDA receptors (Glutamate receptors), releasing Ca2+ as a second messenger in many cellular signaling pathways. There are dopaminergic and glutamatergic abnormalities implicated in schizophrenia, with NMDA antagonists mimicking some positive and negative symptoms of schizophrenia, while carrying less risk of brain harm than do dopamine agonists. Ketamine and PCP, for example, produce phenotypes similar to those observed in schizophrenia, with PCP showing less representative symptomology yet similar brain structure changes. Glutamate receptors have increased function, contributing to the onset of schizophrenia. An increased proportion of post-synaptic glutamate receptors to pre-synaptic glutamate receptors results in increased glutamate signaling. Both agonizing and antagonizing NMDA receptors have shown some benefit in treating Alzheimer's dementia, depending on the MOA and receptor specificity. Thus delivering proteins with either high or low levels of Aspartate, which also has NMDA agonist activity, could be therapeutic for this patient population. Proteins with low levels of Aspartate would likely provide a synergistic benefit alongside NMDA antagonists, such as Memantine. Likewise, clinical trials on LY2140023 have demonstrated glutamate-based treatments as having potential for treating schizophrenia without the side effects seen with xenochemical anti-psychotics. Similar studies combining co-agonist glycine with anti-psychotics showed improved symptomology, suggesting that delivering high doses of Aspartate, also an NMDA agonist, will yield similar therapeutic benefits in this patient population.

Aspartate is an acidic amino acid, with a low pKa of 3.9. Aspartate in the di-peptide form with phenylalanine via a methyl ester yields aspartame, which is used as a commercial artificial sweetener.
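The practical meaning of a pKa of 3.9 can be illustrated with the Henderson-Hasselbalch relation, under which the deprotonated fraction of an acidic group is 1 / (1 + 10^(pKa − pH)). The sketch below is illustrative only; the function name and the example pH values (including 7.4 as a representative physiological pH) are assumptions rather than values from the text.

```python
# Illustrative sketch: ionization of an acidic group with pKa 3.9,
# using the Henderson-Hasselbalch relation.
ASPARTATE_SIDE_CHAIN_PKA = 3.9  # value stated in the text


def fraction_deprotonated(ph, pka=ASPARTATE_SIDE_CHAIN_PKA):
    """Fraction of the acidic group carrying a negative charge at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # Example pH values are assumptions chosen for illustration.
    for ph in (1.9, 3.9, 7.4):
        print(f"pH {ph}: {fraction_deprotonated(ph):.4f} deprotonated")
```

At a pH equal to the pKa the group is half ionized, while several pH units above the pKa it is almost entirely in the negatively charged form.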

Cysteine:

Cysteine is a nonessential amino acid, and is synthesized from homocysteine, which is itself synthesized from the metabolism of methionine. Serine is involved in cysteine's synthesis by condensing with homocysteine to form cystathionine. Cystathionine is then deaminated and hydrolyzed to form cysteine and alpha ketobutyrate. Cysteine's sulfur comes from homocysteine, but the rest of the molecule comes from the initial serine residue. The biosynthesis of cysteine occurs via a different mechanism in plants and prokaryotes. Cysteine is a vital amino acid because it plays an important role in protein folding. The disulfide linkages formed between cysteine residues help to stabilize the tertiary and quaternary structure of proteins, and these disulfide linkages are most common among the secreted proteins, where proteins are exposed to more oxidizing conditions than are found in the cellular interior. Despite the benefits of homocysteine, having high systemic levels is a risk factor for developing cardiovascular disease. Elevated homocysteine may be caused by a genetic deficiency of cystathionine beta-synthase, and excess methionine intake may be another explanation. Control of methionine intake and supplementation with folic acid and vitamin B12 in the diet have been used to lower homocysteine levels. Furthermore, because the availability of cysteine is a key component that limits the synthesis of glutathione, dietary supplementation with N-acetyl-cysteine, a precursor for cysteine, is highly effective in enhancing immunity under a wide range of disease states.

Cysteine undergoes rapid oxidation to cystine. It facilitates the biosynthesis of glutathione, a powerful antioxidant which can donate a reducing equivalent to unstable molecules such as reactive oxygen species (ROS) and free radicals. After reducing an oxidative species, it can form a glutathione disulfide with another reactive glutathione, providing a mechanism of depleting oxidative stress inducing molecules from cells (the liver can maintain concentrations of up to 5 mM). Glutathione is a powerful neutralizer of toxins in the liver, and helps to protect the liver from the damaging effects of toxins. Additionally, this detoxifying ability helps to diminish muscle weakness, prevents brittle hair, and protects against radiation associated with these toxins. As a result, it is beneficial for those suffering from chemical allergies or exposed to high levels of air pollution. Glutathione also is a cofactor for iNOS, allowing maximal synthesis of NO in the arg-NO pathway. NO is important for normal endothelial function and cardiovascular health (including vascular tone, hemodynamics, and angiogenesis).

In addition to being a precursor to glutathione, cysteine is a precursor for H2S, which can induce endothelial-dependent relaxation, and can be further converted to cysteine sulfinate. Cysteine sulfinate can be converted to taurine, which has the ability to decrease methionine uptake. An excess of methionine increases the risk of developing atherosclerosis by inducing hyperhomocysteinemia because homocysteine is an intermediate between methionine and cysteine. However, it is not known whether cysteine decreases homocysteine directly or through the reduction of methionine (Sebastiaan Wesseling, et al., Hypertension. 2009; 53: 909-911).

Furthermore, Cysteine is a precursor for Taurine, which modulates the arginine-NO pathway. Taurine has several potentially protective effects. First, taurine has the ability to reduce oxidative stress by binding to hypochlorite. It has been hypothesized that taurine conjugates to mitochondrial transfer RNA, and in so doing, prevents the formation of mitochondrial superoxide. Additionally, taurine inhibits homocysteine-induced stress of the endoplasmic reticulum of vascular smooth muscle cells and thus restores the expression and secretion of extracellular superoxide dismutase.

Glutamate:

Glutamate oxidizes in enterocytes of the small intestine, leading to nitrogenous products ornithine, citrulline, arginine, and alanine. Glutamate also modulates the arginine-NO pathway. NO is important for normal endothelial function and cardiovascular health (including vascular tone, hemodynamics, and angiogenesis).

High Glutamate

A lack of ATP-producing substrates, as occurs in a fasted state, can lead to autophagy and the turnover of intracellular protein in the lysosome to provide an energy source. Low levels of the glucogenic amino acids, including glutamate can stimulate hepatic autophagy, leading to degradation of liver function.

Citrulline is produced from Glutamate as a by-product of a reaction catalyzed by the NOS family. Dietary supplement of citrulline is known to reduce plasma levels of glucose, homocysteine, and asymmetric dimethylarginine, which are risk factors for metabolic syndrome. L-citrulline accelerates the removal of lactic acid from muscles, likely due to the effects on vascular tone and endothelial function. Recent studies have also shown that L-citrulline from watermelon juice provides greater recovery from exercise, and less soreness the next day. It also appears that delivery of L-citrulline as a free form results in less uptake into cells in vitro than in the context of watermelon juice (which contains high levels of L-citrulline). This suggests an opportunity to deliver peptide doses, which can traffic arginine into muscle tissue for conversion into citrulline by eNOS at the endothelial membrane for improved efficacy.

Glutamate facilitates the biosynthesis of glutathione, which can donate a reducing equivalent to unstable molecules such as reactive oxygen species (ROS) and free radicals. After reducing an oxidative species, it can form a glutathione disulfide with another reactive glutathione, providing a mechanism of depleting oxidative stress inducing molecules from cells (the liver maintains high concentrations of up to 5 mM). Glutathione also is a cofactor for iNOS, allowing maximal synthesis of NO in the arg-NO pathway.

Glutamate with co-agonists glycine or serine is an agonist of NMDA receptors, releasing Ca2+ as a second messenger in many cellular signaling pathways. There are dopaminergic and glutamatergic abnormalities implicated in schizophrenia, with NMDA antagonists mimicking some positive and negative symptoms of schizophrenia, while carrying less risk of brain harm than do dopamine agonists. Ketamine and PCP, for example, produce phenotypes similar to those observed in schizophrenia, with PCP showing less representative symptomology yet similar brain structure changes. Glutamate receptors have increased function, contributing to the onset of schizophrenia. An increased proportion of post-synaptic glutamate receptors to pre-synaptic glutamate receptors results in increased glutamate signaling. Both agonizing and antagonizing NMDA receptors have shown some benefit in treating Alzheimer's dementia, depending on the MOA and receptor specificity. Thus delivering proteins with either high or low levels of Glutamate, which also has NMDA agonist activity, could be therapeutic for this patient population. Proteins with low levels of Glutamate would likely provide a synergistic benefit alongside NMDA antagonists, such as Memantine. Likewise, clinical trials on LY2140023 have demonstrated Glutamate-based treatments as having potential for treating schizophrenia without the side effects seen with xenochemical anti-psychotics. Similar studies combining co-agonist Glycine with anti-psychotics showed improved symptomology, suggesting that delivering high doses of Aspartate, also an NMDA agonist, will yield similar therapeutic benefits in this patient population.

Low Glutamate

Glutamate and acetyl-CoA are converted into N-acetylglutamate in the mitochondria. This pushes the hepatic urea cycle towards the active state, useful for ammonia detoxification. This means that dietary delivery of nutrients with low doses of Glutamate may be useful in the context of kidney disease, where patients struggle to clear urea from their circulation. Elimination of Glutamate to limit uremia from the available nitrogen sources, while being able to maintain a limited protein intake to prevent tissue catabolism, is a novel strategy against a disruptive nutritional consequence of kidney disease.

Glutamine:

Glutamine is oxidized in enterocytes of the small intestine, leading to the nitrogenous products ornithine, citrulline, arginine, and alanine.

Citrulline is produced from Glutamine as a by-product of a reaction catalyzed by the NOS family. Dietary supplementation with citrulline is known to reduce plasma levels of glucose, homocysteine, and asymmetric dimethylarginine, which are risk factors for metabolic syndrome. L-citrulline accelerates the removal of lactic acid from muscles, likely due to its effects on vascular tone and endothelial function. Recent studies have also shown that L-citrulline from watermelon juice provides greater recovery from exercise, and less soreness the next day. It also appears that delivery of L-citrulline as a free form results in less uptake into cells in vitro than in the context of watermelon juice (which contains high levels of L-citrulline). This suggests an opportunity to deliver peptide doses, which can traffic arginine into muscle tissue for conversion into citrulline by eNOS at the endothelial membrane for improved efficacy.

High Glutamine

Glutamine is a well-studied secretagogue that can stimulate the systemic release of insulin from beta-cells, growth hormone, prolactin, glucagon, progesterone, and placental lactogen. It has also been shown to reduce circulating glucocorticoids and stress hormones. This biology has direct implications on both digestive biology and the absorption of nutrients present in the intestine, as well as affecting energy balance by triggering satiety signals mediated by endocrine hormones. The ability to modulate these hormones provides a therapeutic opportunity for decreasing caloric intake in metabolic disorders such as obesity or alternatively triggering appetite in muscle wasting, sarcopenia, and cachexia, as well as by shifting insulin sensitivity in the onset of diabetes.

Dietary Glutamine supplementation up-regulates anti-oxidative genes and lowers expression of proinflammatory genes in the adipose and small intestinal tissues.

Glutamine is an important signaling molecule for stimulating mTOR1 phosphorylation in a cell-specific manner. This regulates cellular protein turnover (autophagy) and integrates insulin-like growth signals to protein synthesis initiation across tissues. This biology has been directly linked to biogenesis of lean tissue mass in skeletal muscle, metabolic shifts in disease states of obesity and insulin resistance, and aging.

Glutamine is an amino acid that is maintained at sufficient levels to support the anabolic effects of EAAs. Lysine, Methionine, Threonine, Tryptophan, Leucine, Isoleucine, and Valine have been shown unable to support increased protein synthesis and whole-body growth when added to a 12.7% crude protein diet, indicating a deficiency in the anabolic mediating non-essential amino acids, including Glutamine.

Glutamine is slowly cyclized to pyroglutamate. Glutamine is the preferred source of fuel for rapidly dividing cells, including enterocytes, lymphocytes, macrophages, and tumors. Supplementation with glutamine in the diet has significant demonstrated benefits in gut integrity and immune function in surgery, critical illness, burn and infection. A 12-day burn injury study of Glutamine supplementation (0.35 g/kg) showed decreased intestinal permeability, lower endotoxin levels, and shorter length of hospital stay. It provided an 8.8× decrease vs a 5.5× decrease in Lactulose/Mannitol ratio after 3 days and a 6-day reduction in hospital stay. A 2-week study of Glutamine total parenteral nutrition (TPN) (0.23 g/kg) vs Glutamine-free TPN in malnourished patients waiting for surgery showed increased gut permeability in the Glutamine-free group. It provided a 3.6× vs 0.81× increase in Lactulose/Mannitol ratio after 2 weeks. These improvements point to an opportunity to deliver high levels of Glutamine in the clinic to improve intestinal immunity and reduce bacteraemia.
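
The weight-based dose levels quoted above (0.35 g/kg and 0.23 g/kg) can be converted into a gram amount for a given subject with simple arithmetic. The following is a minimal sketch only: the function name and the 70 kg body weight are hypothetical and chosen for illustration, and the source does not state the time basis of the g/kg figures, so none is assumed here.

```python
def weight_based_dose_g(body_weight_kg: float, dose_g_per_kg: float) -> float:
    """Return the Glutamine amount in grams for a given body weight.

    Hypothetical helper for illustration; not part of the studies cited.
    """
    if body_weight_kg <= 0 or dose_g_per_kg < 0:
        raise ValueError("body weight must be positive and dose non-negative")
    return body_weight_kg * dose_g_per_kg


# Dose levels quoted in the text (g of Glutamine per kg of body weight).
BURN_STUDY_DOSE_G_PER_KG = 0.35
TPN_STUDY_DOSE_G_PER_KG = 0.23

# Assumed body weight, used only to make the arithmetic concrete.
EXAMPLE_WEIGHT_KG = 70.0

if __name__ == "__main__":
    for label, dose in (
        ("burn study", BURN_STUDY_DOSE_G_PER_KG),
        ("TPN study", TPN_STUDY_DOSE_G_PER_KG),
    ):
        grams = weight_based_dose_g(EXAMPLE_WEIGHT_KG, dose)
        print(f"{label}: {grams:.1f} g for a {EXAMPLE_WEIGHT_KG:.0f} kg subject")
```

For the assumed 70 kg subject, the two dose levels correspond to roughly 24.5 g and 16.1 g of Glutamine, respectively.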

This also improves lymphocyte counts systemically and reduces infectious complications during a hospital stay. A study of glutamine supplementation (26 g/day until discharge) in patients with serious burn injury showed positive blood cultures 3× more frequently with standard total enteral nutrition (TEN) than with Glutamine-enriched nutrition, and a significantly reduced mortality rate with Glutamine enrichment. Additionally, Glutamine supplementation shows increased lymphocyte count and function, increased HGH, reduced infectious complications, reduced hospital stay, reduced morbidity, reduced mortality, and reduced gut permeability.

Intramuscular levels of Glutamine decrease under catabolic states such as stress, burn, injury, and sepsis. This decrease causes a net negative protein balance in lean tissue. Administration of Glutamine to the skeletal muscle has been shown to increase protein synthesis while inhibiting breakdown in vitro. Furthermore, dose dependence from physiological concentrations (1 mM Glutamine) up to 15-fold higher concentrations has been observed in skeletal muscle. The effect was further demonstrated in mucosal cells taken from the small intestine.

Branched chain amino acids are all metabolic substrates for glutamine synthesis, providing a source of Glutamine in the fetus, enhancing placental and fetal growth, suggesting a role for Glutamine in mediating their effects on anabolism in mammals. Moreover, it has been shown that Glutamine levels and timing of availability from the plasma affect the cellular uptake of Leucine, and the subsequent profile of mTOR activation. A buildup of intracellular Glutamine is used for uptake of Leucine via the Glutamine/Leucine antiporter, SLC7A5. Administration of glutamine at equal proportions to Leucine in vitro causes a more sustained stimulation of protein synthesis via mTOR, whereas priming the cells with Glutamine prior to Leucine administration leads to a more rapid, yet transient mTOR activation (Nicklin, P. et al. Cell 2009).

A lack of ATP-producing substrates, as occurs in a fasted state, can lead to autophagy and the turnover of intracellular protein in the lysosome to provide an energy source. Low levels of the glucogenic amino acids, including glutamine, can stimulate hepatic autophagy, leading to degradation of liver function.

Low Glutamine

mTOR is a central signaling pathway which can be hijacked for the proliferation of fast-growing cancer cells, as is evidenced by oncogenic cells' preferential uptake of Glutamine.

Glycine:

A lack of ATP-producing substrates, as occurs in a fasted state, can lead to autophagy and the turnover of intracellular protein in the lysosome to provide an energy source. Low levels of the glucogenic amino acids, including glycine, can stimulate hepatic autophagy, leading to degradation of liver function.

Glycine facilitates the biosynthesis of glutathione, which can donate a reducing equivalent to unstable molecules such as reactive oxygen species (ROS) and free radicals. After reducing an oxidative species, it can form a glutathione disulfide with another reactive glutathione, providing a mechanism of depleting oxidative stress inducing molecules from cells (maintains high concentrations of up to 5 mM in the liver). Glutathione also is a cofactor for iNOS, allowing maximal synthesis of NO in the arg-NO pathway.

Histidine:

Histidine is an essential amino acid, and is a precursor for carnosine. Carnosine is an antioxidant and transition metal ion-sequestering agent. It acts as an anti-glycating agent by inhibiting the formation of advanced glycation end products (AGEs). AGEs are prevalent in diabetic vasculature and contribute to the development of atherosclerosis. The presence of AGEs in various cell types affects both the extracellular and intracellular structure and function (Golden, A. et al. Advanced Glycosylation End Products, Circulation 2006). Also, the accumulation of AGEs in the brain is a characteristic of aging and degeneration, particularly in Alzheimer's disease. AGE accumulation explains many neuropathological and biochemical features of Alzheimer's disease such as protein crosslinking, oxidative stress, and neuronal cell death. Because of its combination of antioxidant and antiglycating properties, carnosine is able to diminish cellular oxidative stress and inhibit the intracellular formation of reactive oxygen species and reactive nitrogen species.

Histidine has antioxidant, anti-inflammatory, and anti-secretory properties. Histidine's imidazole rings have the ability to scavenge reactive oxygen species (ROS), which are made by cells during acute inflammatory response. Histidine administration inhibits cytokines and growth factors involved in cell and tissue damage. Histidine administration is instrumental in rheumatoid arthritis treatment, and administering 4.5 g daily has been used to effectively treat patients with severe rheumatoid arthritis. Rheumatoid arthritis patients have been found to have low serum histidine levels due to its very rapid removal from the blood. Low plasma Histidine levels have also been found in patients with chronic renal failure, obese women (where it also had negative impact on oxidative stress and inflammation), pediatric patients with pneumonia, and asthma patients. Histidine supplementation has been shown to diminish insulin resistance and to reduce BMI and fat mass. Histidine suppresses inflammation and oxidative stress in obese subjects with metabolic syndrome. Lastly, as a precursor to histamine, histidine increases levels of histamine in the blood and in the brain. Low blood histamine is found in some manic, schizophrenic, high copper and hyperactive groups of psychiatric patients.

Posttranslational modification of proteins involved in transcriptional regulation is a mechanism used to regulate genes. This modification can alter protein functions in specific ways. One form of modification is protein methylation, which is one of the most abundant protein modifications. Protein methylation carries important biological functions, including gene regulation and signal transduction. Histidine plays a role in protein modification, and ultimately gene regulation, in that it accepts a methyl group transferred from S-adenosylmethionine by protein methyltransferases (Young-Ho Lee and Michael R. Stallcup, Mol Endocrinol. 2009 April; 23(4): 425-433).

Histidine supplementation can be instrumental in the treatment of multiple diseases including: Alzheimer's disease, diabetes, atherosclerosis, metabolic syndrome in women, rheumatoid arthritis, and various psychiatric conditions (manic, schizophrenic, high copper, and hyperactive groups). Additionally, due to its role in protein modifications, Histidine provides an avenue to combat diseases resulting from gene deregulation, including cancer.

Low Histidine

There exists a mechanistic understanding of how uncharged tRNA allosterically activates GCN2, leading to downstream phosphorylation of transcription factors related to lipogenesis and protein synthesis, along with many biosynthetic pathways in eukaryotes (SREBP-1c, eIF2a, and GCN4p discussed below). Diets devoid of an essential amino acid remarkably trigger this signaling within minutes after diet introduction (Hao et al., Science 2005). Signaling through SREBP-1c has been shown in vivo to have dramatic effects on mobilizing lipid stores by repressing genes related to lipogenesis. SREBP-1c has been shown to specifically act on hepatic lipid synthesis, with an ability to cause a hepatic steatosis phenotype as well as increase in visceral fat mass (Knebel, B. et al. Liver-Specific Expression of Transcriptionally Active SREBP-1c Is Associated with Fatty Liver and Increased Visceral Fat Mass. PLoS, 2012). An unbalanced diet lacking Histidine has been shown to signal GCN2 in rats on a basal casein diet supplemented with 1-5.4% of an amino acid mixture lacking Histidine. Histidine deprivation, through its action on GCN2, has an effect on SREBP-1c and decreases physiologic measures of liver weight (and fatty liver phenotype), adipose tissue weight, cholesterol/triglyceride content, and food intake. Driving decreased fat mass, while maintaining lean mass, provides a therapeutic opportunity in areas such as obesity, diabetes, and cardiovascular health.

Isoleucine:

Isoleucine is an EAA, and is also a BCAA. Isoleucine is used in combination with other BCAAs to improve the nutritional status of patients suffering from hepatic disease. BCAAs, including isoleucine, serve as fuel sources for skeletal muscle during periods of metabolic stress, promote protein synthesis, suppress protein catabolism, and serve as substrates for gluconeogenesis. BCAAs, and specifically isoleucine, are catabolized in the skeletal muscle, and stimulate the production of L-alanine and L-glutamine.

BCAAs have been shown to have anabolic effects on protein metabolism by increasing the rate of protein synthesis and decreasing the rate of protein degradation in resting human muscle. Additionally, BCAAs are shown to have anabolic effects in human muscle during post endurance exercise recovery. These effects are mediated through the phosphorylation of mTOR and sequential activation of 70-kD S6 protein kinase (p70-kD S6), and eukaryotic initiation factor 4E-binding protein 1. P70-kD S6 is known for its role in modulating cell-cycle progression, cell size, and cell survival. P70-kD S6 activation in response to mitogen stimulation up-regulates ribosomal biosynthesis and enhances the translational capacity of the cell (W-L An, et al., Am J Pathol. 2003 August; 163(2): 591-607; E. Blomstrand, et al., J. Nutr. January 2006 136: 269S-273S). Eukaryotic initiation factor 4E-binding protein 1 is a limiting component of the multi-subunit complex that recruits 40S ribosomal subunits to the 5′ end of mRNAs. Activation of p70 S6 kinase, and subsequent phosphorylation of the ribosomal protein S6, is associated with enhanced translation of specific mRNAs.

When BCAAs were given to subjects during and after one session of quadriceps muscle resistance exercise, an increase in mTOR, p70 S6 kinase, and S6 phosphorylation was found in the recovery period after the exercise. However, there was no such effect of BCAAs on Akt or glycogen synthase kinase 3 (GSK-3). Exercise without BCAA intake leads to a partial phosphorylation of p70 S6 kinase without activating the enzyme, a decrease in Akt phosphorylation, and no change in GSK-3. BCAA infusion also increases p70 S6 kinase phosphorylation in an Akt-independent manner in resting subjects. This mTOR activity regulates cellular protein turnover (autophagy) and integrates insulin-like growth signals to protein synthesis initiation across tissues. This biology has been directly linked to biogenesis of lean tissue mass in skeletal muscle, metabolic shifts in disease states of obesity and insulin resistance, and aging.

Isoleucine supplementation can be used to improve athletic performance and muscle formation, prevent muscle loss that accompanies aging, aid those suffering from hepatic disease, support the growing bodies of children, and improve the nutritive quality of foods given to the starving populations. Additionally, as a precursor for L-alanine and L-glutamine, isoleucine mediates their significant metabolic signaling activities.

Low Isoleucine

In states of obesity and diabetes, animals have been shown to exhibit reduced hepatic autophagy, leading to increased insulin resistance. Autophagy is important for maintenance of the ER and cellular homeostasis, which when stressed can lead to impaired insulin sensitivity. High fat diet feeding in animal models stresses the ER, while leading to depressed hepatic autophagy through over-stimulation of mTORC1, which reinforces the progression towards impaired insulin sensitivity and impaired beta-cell function in diabetes. Reducing the level of systemic Isoleucine provides an opportunity to lower mTORC1 activity and restore healthy levels of autophagy.

There exists a mechanistic understanding of how uncharged tRNA allosterically activates GCN2, leading to downstream phosphorylation of transcription factors related to lipogenesis and protein synthesis, along with many biosynthetic pathways in eukaryotes (SREBP-1c, eIF2a, and GCN4p discussed below). Diets devoid of any EAAs remarkably trigger this signaling within minutes after diet introduction (Hao et al., Science 2005). Signaling through SREBP-1c has been shown in vivo to have dramatic effects on mobilizing lipid stores by repressing genes related to lipogenesis. SREBP-1c has been shown to specifically act on hepatic lipid synthesis, with an ability to cause a hepatic steatosis phenotype as well as increase in visceral fat mass (Knebel, B. et al. Liver-Specific Expression of Transcriptionally Active SREBP-1c Is Associated with Fatty Liver and Increased Visceral Fat Mass. PLoS, 2012). Isoleucine deprivation, through its action on GCN2, has an effect on SREBP-1c and decreases physiologic measures of liver weight (and fatty liver phenotype), adipose tissue weight, cholesterol/triglyceride content, and food intake. Driving decreased fat mass, while maintaining lean mass, provides a therapeutic opportunity in areas such as obesity, diabetes, and cardiovascular health.

Leucine:

Leucine is an essential amino acid and a branched chain amino acid. The branched chain amino acids, including Leucine, serve as fuel sources for skeletal muscle during periods of metabolic stress, promote protein synthesis, suppress protein catabolism, and serve as substrates for gluconeogenesis. BCAAs, including Leucine, are catabolized in the skeletal muscle, and stimulate the production of L-alanine and L-glutamine. Leucine plays a direct role in the regulation of protein turnover through cellular mTOR signaling and gene expression as well as serving to activate glutamate dehydrogenase.

BCAAs have been shown to have anabolic effects on protein metabolism by increasing the rate of protein synthesis and decreasing the rate of protein degradation in resting human muscle. Additionally, BCAAs are shown to have anabolic effects in human muscle during post endurance exercise recovery. These effects are mediated through the phosphorylation of mTOR and sequential activation of 70-kD S6 protein kinase (p70-kD S6), and eukaryotic initiation factor 4E-binding protein 1. P70-kD S6 is known for its role in modulating cell-cycle progression, cell size, and cell survival. P70-kD S6 activation in response to mitogen stimulation up-regulates ribosomal biosynthesis and enhances the translational capacity of the cell (W-L An, et al., Am J Pathol. 2003 August; 163(2): 591-607; E. Blomstrand, et al., J. Nutr. January 2006 136: 269S-273S). Eukaryotic initiation factor 4E-binding protein 1 is a limiting component of the multi-subunit complex that recruits 40S ribosomal subunits to the 5′ end of mRNAs. Activation of p70 S6 kinase, and subsequent phosphorylation of the ribosomal protein S6, is associated with enhanced translation of specific mRNAs.

When BCAAs were given to subjects during and after one session of quadriceps muscle resistance exercise, an increase in mTOR, p70 S6 kinase, and S6 phosphorylation was found in the recovery period after the exercise. However, there was no such effect of BCAAs on Akt or glycogen synthase kinase 3 (GSK-3). Exercise without BCAA intake leads to a partial phosphorylation of p70 S6 kinase without activating the enzyme, a decrease in Akt phosphorylation, and no change in GSK-3. BCAA infusion also increases p70 S6 kinase phosphorylation in an Akt-independent manner in resting subjects. Leucine is furthermore known to be the primary signaling molecule for stimulating mTOR1 phosphorylation in a cell-specific manner. This regulates cellular protein turnover (autophagy) and integrates insulin-like growth signals to protein synthesis initiation across tissues. This biology has been directly linked to biogenesis of lean tissue mass in skeletal muscle, metabolic shifts in disease states of obesity and insulin resistance, and aging.

Leucine is a well-studied secretagogue that can stimulate the systemic release of insulin from beta-cells, growth hormone, prolactin, glucagon, progesterone, and placental lactogen. This biology has direct implications on both digestive biology and the absorption of nutrients present in the intestine, as well as affecting energy balance by triggering satiety signals mediated by endocrine hormones. The ability to modulate these hormones provides a therapeutic opportunity for decreasing caloric intake in metabolic disorders such as obesity or alternatively triggering appetite in muscle wasting, sarcopenia, and cachexia, as well as by shifting insulin sensitivity in the onset of diabetes.

Leucine activates glutamate dehydrogenase, which is an enzyme that catalyzes the reversible interconversion between glutamate, α-ketoglutarate, and ammonia. In mammals, glutamate dehydrogenase has high levels of activity in the liver, kidney, brain, and pancreas. In the liver, glutamate dehydrogenase provides the appropriate ratio of ammonia and amino acids for urea synthesis in periportal hepatocytes, and the glutamate dehydrogenase reactions seem to be in a close-to-equilibrium state. Additionally, glutamate dehydrogenase has been shown to produce glutamate for glutamine synthesis in a small rim of pericentral hepatocytes, enabling it to serve as either a source for ammonia or an ammonia scavenger. In the kidney, glutamate dehydrogenase functions to produce ammonia from glutamate to control acidosis (C. Spanaki and A. Plaitakis, Neurotox Res. 2012 January; 21(1):117-27).

Leucine supplementation can be used to improve athletic performance and muscle formation, prevent muscle loss that accompanies aging, aid those suffering from hepatic disease, support the growing bodies of children, and improve the nutritive quality of foods given to the starving populations. Additionally, leucine plays an important role in urea synthesis in hepatocytes, and may be given to treat those who suffer from conditions that cause them to be hyperammonemic. Lastly, leucine may be used to treat acidosis.

Low Leucine

In states of obesity and diabetes, animals have been shown to exhibit reduced hepatic autophagy, leading to increased insulin resistance. Autophagy is important for maintenance of the ER and cellular homeostasis, which when stressed can lead to impaired insulin sensitivity. High fat diet feeding in animal models stresses the ER, while leading to depressed hepatic autophagy through over-stimulation of mTORC1, which reinforces the progression towards impaired insulin sensitivity and impaired beta-cell function in diabetes. Reducing the level of systemic Leucine provides an opportunity to lower mTORC1 activity and restore healthy levels of autophagy.

mTOR is a central signaling pathway which can be hijacked for the proliferation of fast-growing cancer cells. Depletion of Leucine may reduce a fast-growing cell's ability to sustain constitutive mTOR activation.

There exists a mechanistic understanding of how uncharged tRNA allosterically activates GCN2, leading to downstream phosphorylation of transcription factors related to lipogenesis and protein synthesis, along with many biosynthetic pathways in eukaryotes (SREBP-1c, eIF2a, and GCN4p discussed below). Diets devoid of an EAA remarkably trigger this signaling within minutes after diet introduction (Hao et al., Science 2005). Signaling through SREBP-1c has been shown in vivo to have dramatic effects on mobilizing lipid stores by repressing genes related to lipogenesis. SREBP-1c has been shown to specifically act on hepatic lipid synthesis, with an ability to cause a hepatic steatosis phenotype as well as increase in visceral fat mass (Knebel, B. et al. Liver-Specific Expression of Transcriptionally Active SREBP-1c Is Associated with Fatty Liver and Increased Visceral Fat Mass. PLoS, 2012). Leucine deprivation, through its action on GCN2, has an effect on SREBP-1c and decreases physiologic measures of liver weight (and fatty liver phenotype), adipose tissue weight, cholesterol/triglyceride content, and food intake. Driving decreased fat mass, while maintaining lean mass, provides a therapeutic opportunity in areas such as obesity, diabetes, and cardiovascular health.

Leucine deprivation, furthermore, has directly shown up-regulation of UCP1 in brown adipose tissue (BAT), a direct measure of thermogenesis, an increase in energy expenditure (presumably due to an increase in thermogenesis in BAT), and a corresponding decrease in fat mass by stimulation of lipolysis in the white adipose tissue (WAT). UCP1 up-regulation results in decreased food intake, body weight, abdominal fat mass, fat mass, and maintenance of lean mass (Guo, F. The GCN2 eIF2alpha kinase regulates fatty-acid homeostasis in the liver during deprivation of an essential amino acid. Cell Metab., 2007).

Lysine:

Lysine is an EAA that is important for proper growth, and plays a vital role in the production of carnitine. Carnitine is a quaternary amine that plays an important role in the production of energy in the myocardium. Carnitine transports free fatty acids into the mitochondria, and in so doing, increases the preferred substrate for oxidative metabolism in the heart. Additionally, carnitine prevents the fatty acid accumulation that occurs during ischemic events, which may lead to ventricular arrhythmias. As the myocardial carnitine levels are quickly diminished during an ischemic event, exogenous supplementation with carnitine replenishes the depleted myocardial carnitine levels and improves cardiac metabolic and left ventricular function. Additionally, an analysis of 4 studies demonstrated that supplementation with L-carnitine after an acute myocardial infarction (AMI), in comparison to a placebo, significantly reduces left ventricular dilation in the first year after the AMI. This is significant because the prevention of left ventricular dilation and the preservation of cardiac function after an AMI is a powerful predictor of the progression to heart failure and death. Additionally, carnitine aids in lowering cholesterol, which further supports heart health, and aids in the prevention of acute myocardial infarctions (James J. DiNicolantonio, et al., Mayo Clinic Proceedings, 2013; 88, 544-551).

Lysine supplementation is useful to support heart health and during ischemic events to prevent ventricular arrhythmia. In addition, Lysine supplementation may help heart attack patients recover effectively, and aid in the prevention of heart attacks in those with left ventricular dilation. Also, Lysine can be used to decrease cholesterol levels in patients with high cholesterol.

Lysine is instrumental in helping the body to absorb calcium and decreases the amount of calcium that is lost in urine. Due to calcium's role in bone health, Lysine supplementation is helpful in preventing the bone loss that is associated with osteoporosis. Furthermore, a combination of L-arginine and Lysine makes the bone building cells more active and enhances production of collagen, which is a substance that is important for bones and connective tissues including skin, tendon, and cartilage.

Lysine supplementation is useful for patients suffering from osteoporosis and those at risk for developing osteoporosis; for the elderly, menopausal women, and growing children; in cosmetics, due to its role in collagen production; and for athletes, for improved ligament integrity.

A lysine deficiency causes fatigue, nausea, dizziness, loss of appetite, agitation, bloodshot eyes, slow growth, anemia, and reproductive disorders.

Lysine helps to prevent and suppress outbreaks of cold sores and genital herpes when taken on a regular basis. When 45 patients with frequently recurring herpes infection were given 312-1200 mg of lysine daily in single or multiple doses, recovery from the infection and suppression of recurrence was evidenced (Griffith R. S., et al., Dermatologica 1978; 156:257-267). This is because lysine has antiviral effects, which act by blocking the activity of arginine, which promotes herpes simplex virus (HSV) replication. In tissue culture studies, herpes viral replication is enhanced when the arginine/lysine ratio favors arginine. However, when the arginine/lysine ratio favors lysine, viral replication is suppressed, and cytopathogenicity of HSV is inhibited (Griffith R. S., et al., Dermatologica 1978; 156:257-267). It has been shown that oral lysine is more effective for preventing an outbreak than it is at reducing the severity and duration of the outbreak.
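
As an illustration of the arginine/lysine ratio discussed above, the following minimal sketch computes the ratio of arginine to lysine residues in a polypeptide given as a one-letter amino acid sequence. This is an assumption-laden example: the function name and the sequence are hypothetical, and a residue-count ratio is used here only as a simple proxy, not as the measure used in the tissue culture studies cited.

```python
import math


def arginine_lysine_ratio(sequence: str) -> float:
    """Return the ratio of arginine (R) to lysine (K) residue counts.

    Hypothetical helper for illustration. Returns math.inf when the
    sequence has arginine but no lysine, and math.nan when it has neither.
    """
    seq = sequence.upper()
    arg_count = seq.count("R")
    lys_count = seq.count("K")
    if lys_count == 0:
        return math.inf if arg_count > 0 else math.nan
    return arg_count / lys_count


if __name__ == "__main__":
    # Made-up sequence, not a polypeptide from this disclosure.
    example_sequence = "MKKRAKLKEK"
    ratio = arginine_lysine_ratio(example_sequence)
    favors = "lysine" if ratio < 1 else "arginine"
    print(f"Arg/Lys residue ratio: {ratio:.2f} (favors {favors})")
```

A ratio below 1 indicates more lysine than arginine residues in the sequence, which corresponds to the lysine-favoring condition described above.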

Supplementing the diet with Lysine for those infected with HSV suppresses outbreaks of cold sores and genital herpes, and when taken on a regular basis is very beneficial in the prevention of outbreaks.

Lysine modulates the arginine-NO pathway. NO is important for normal endothelial function and cardiovascular health (including vascular tone, hemodynamics, angiogenesis). Lysine is a natural inhibitor of L-arginine transport, and competes with L-arginine for uptake through the system y+, which is the major transport system of cationic amino acids in mammalian cells. Excess nitric oxide contributes to refractory hypotension associated with sepsis, and can be combatted with administration of L-lysine because it inhibits Arginine, which is an important component of NO synthesis (K. G. Allman, et al., British Journal of Anaesthesia (1998) 81: 188-192). Moreover, an excess of NO may lead to diseases, due to its release from cerebral vasculature, brain tissue, and nerve endings, which are prime regions for neurodegeneration. Excess NO may lead to migraines, brain cell damage that can lead to neurodegenerative diseases like Parkinson disease, Alzheimer's disease, Huntington disease, and amyotrophic lateral sclerosis. Furthermore, NO that is produced by the pancreas may damage the beta-cells as occurs in type 1 diabetes.

Lysine supplementation is useful for the prevention of hypotension associated with sepsis by preventing vasodilation. Additionally, lysine may be used to prevent/treat migraines, and prevent/slow down the progression of neurodegenerative diseases like Alzheimer's disease, Parkinson's disease, Huntington's disease, and amyotrophic lateral sclerosis.

Low Lysine

There exists a mechanistic understanding of how uncharged tRNA allosterically activates GCN2, leading to downstream phosphorylation of transcription factors related to lipogenesis and protein synthesis, along with many biosynthetic pathways in eukaryotes (SREBP-1c, eIF2a, and GCN4p discussed below). Diets devoid of an EAA remarkably trigger this signaling within minutes after diet introduction (Hao et al., Science 2005). Signaling through SREBP-1c has been shown in vivo to have dramatic effects on mobilizing lipid stores by repressing genes related to lipogenesis. SREBP-1c has been shown to specifically act on hepatic lipid synthesis, with an ability to cause a hepatic steatosis phenotype as well as increase in visceral fat mass (Knebel, B. et al. Liver-Specific Expression of Transcriptionally Active SREBP-1c Is Associated with Fatty Liver and Increased Visceral Fat Mass. PLoS, 2012). Lysine deprivation, through its action on GCN2, has an effect on SREBP-1c and decreases physiologic measures of liver weight (and fatty liver phenotype), adipose tissue weight, cholesterol/triglyceride content, and food intake. Driving decreased fat mass, while maintaining lean mass, provides a therapeutic opportunity in areas such as obesity, diabetes, and cardiovascular health.

Methionine:

Methionine is an essential amino acid, and is the initiating amino acid in the synthesis of virtually all eukaryotic proteins. Methionine is one of the most hydrophobic AAs. Most of the methionine residues in globular proteins can be found in the interior of the hydrophobic core. Methionine is often found to interact with the lipid bilayer in membrane-spanning protein domains. Due to their location and powerful antioxidative properties, methionine residues have been regarded as endogenous antioxidants in proteins (John T. Brosnan and Margaret E. Brosnan, J. Nutr. June 2006 vol. 136 no. 6 1636S-1640S). Methionine residues have a high susceptibility to oxidation by oxidases, ozone, hydrogen peroxide, superoxide, γ-irradiation, metal-catalyzed oxidation, “leakage” from the electron transport chain, and auto-oxidation of flavins or xenobiotics. Once oxidized, the Methionine residue is converted to methionine sulfoxide, which can be converted back to Methionine through methionine sulfoxide reductases (Rodney L. Levine, et al., Proc Natl Acad Sci USA, 1996 Dec. 24; 93(26): 15036-15040). As an antioxidant, methionine supplementation can aid in the prevention of cancer, degenerative diseases, heart disease, and liver and kidney pathologies. It can also be used in cosmetics to fight the damage of UV rays to the skin.

Methionine is a lipotropic AA, and helps the liver process lipids, and thereby helps prevent the build-up of fat in the liver and arteries that may ultimately lead to an obstruction of blood flow to the brain, heart, and kidneys. Additionally, the build-up of fat in the liver drives a pathology known as hepatic steatosis, which may ultimately lead to cirrhosis of the liver. Methionine supplementation for individuals undergoing drug detoxification may improve the process, as well as for those taking medications which have toxic side effects.

In addition to being a lipotropic AA, Methionine promotes heart health by increasing the liver's production of lecithin, which is known to help reduce cholesterol levels. Methionine supplementation can prevent cirrhosis of the liver from fat deposition therein. Additionally, it can promote cardiovascular health by preventing the deposition of fat into the arteries, thereby preventing possible myocardial infarctions and strokes. Further, Methionine may help those with high cholesterol levels lower their cholesterol, reducing the risk of cardiovascular disease.

Methionine aids in the proper functioning of the immune system in that elevated levels of methionine increase the levels of taurine, homocysteine, and glutathione, which help improve immune function. The underlying mechanism for the immune functions may involve mTOR activation, NO and glutathione synthesis, H2S signaling, and cellular redox state. Methionine is a precursor for Taurine, which modulates the arginine-NO pathway. NO is important for normal endothelial function and cardiovascular health (including vascular tone, hemodynamics, angiogenesis).

Methionine is also converted into cysteine, which is a precursor for Glutathione. Glutathione is a powerful neutralizer of toxins in the liver, and helps to protect the liver from the damaging effects of toxins. Additionally, this detoxifying ability helps to diminish muscle weakness, prevents brittle hair, and protects against radiation associated with these toxins. As a result, it is beneficial for those suffering from chemical allergies or exposure to high levels of air pollution. Methionine can be helpful to patients with compromised immune systems, such as AIDS patients and cancer patients. Likewise, it can be a useful supplement during flu seasons, particularly to groups who are most susceptible, including the elderly, children, and pregnant women. Furthermore, it can be used for those travelling to countries where they will likely be susceptible to regional infections. Methionine levels are observed to be lower in patients with AIDS. This decreased level of methionine has been linked to deterioration in the nervous system that leads to symptoms like dementia and diminished memory recall. Supplementing with 6 grams of methionine per day can lead to improvements in memory recall in these patients. Likewise, Methionine can be beneficial to those who have diseases that involve nervous system degeneration, including Alzheimer's Disease, ALS, MS, and Huntington's.

Methionine participates in one-carbon metabolism, and thereby also participates in the methylation of proteins and DNA, which in turn helps regulate gene expression and the biological activity of proteins. Methionine supplementation for those at risk for related genetic disorders can be used to promote proper gene regulation in all individuals.

Low Methionine

Methionine is a precursor for the toxic homocysteine, which raises ADMA levels by down-regulating DDAH, the enzyme the body uses to metabolize ADMA, thereby interfering with the arginine-NO pathway. NO is important for normal endothelial function and cardiovascular health (including vascular tone, hemodynamics, angiogenesis).

There exists a mechanistic understanding of how uncharged tRNA allosterically activates GCN2, leading to downstream phosphorylation of transcription factors related to lipogenesis and protein synthesis, along with many biosynthetic pathways in eukaryotes (SREBP-1c, eIF2a, and GCN4p discussed below). Diets devoid of any EAAs remarkably trigger this signaling within minutes after diet introduction (Hao et al., Science 2005). Signaling through SREBP-1c has been shown in vivo to have dramatic effects on mobilizing lipid stores by repressing genes related to lipogenesis. SREBP-1c has been shown to specifically act on hepatic lipid synthesis, with an ability to cause a hepatic steatosis phenotype as well as an increase in visceral fat mass (Knebel, B. et al. Liver-Specific Expression of Transcriptionally Active SREBP-1c Is Associated with Fatty Liver and Increased Visceral Fat Mass. PLoS, 2012). Methionine deprivation, through its action on GCN2, has an effect on SREBP-1c and decreases physiologic measures of liver weight (and fatty liver phenotype), adipose tissue weight, cholesterol/triglyceride content, and food intake. Driving decreased fat mass, while maintaining lean mass, provides a therapeutic opportunity in areas such as obesity, diabetes, and cardiovascular health.

Phenylalanine:

Phenylalanine is an EAA, an ArAA, and a precursor for the synthesis of norepinephrine in the brain, as well as a metabolic precursor for tyrosine, which is another aromatic amino acid and a precursor for the synthesis of dopamine.

Norepinephrine (NE) is synthesized in the adrenal medulla and postganglionic neurons in the sympathetic nervous system by the β-hydroxylation of dopamine by dopamine β-hydroxylase along with the cofactor ascorbate. It works by being secreted into the synaptic cleft, where it stimulates adrenergic receptors and is then either degraded or taken up by surrounding cells. As a catecholamine, it does not cross the blood-brain barrier.

NE can be used to combat attention-deficit/hyperactivity disorders (ADHD), depression, and hypotension. In terms of attention disorders, like ADHD, medications prescribed tend to help increase levels of NE and dopamine. Furthermore, depression is typically treated with medications that inhibit the reuptake of serotonin and NE, thereby increasing the amount of serotonin and NE that is available to the postsynaptic cells in the brain. Recent evidence has suggested that serotonin-norepinephrine reuptake inhibitors (SNRIs) may also increase dopamine transmission because, if the norepinephrine transporter ordinarily recycles dopamine as well, then SNRIs will also enhance dopaminergic transmission. As a result, the antidepressant effects associated with increased NE levels may partly be due to the simultaneous increase in dopamine (in particular in the prefrontal cortex of the brain).

NE is used to treat patients with critical hypotension. NE is a vasopressor and acts on both α1 and α2 adrenergic receptors to cause vasoconstriction, thereby increasing the blood pressure.

As a precursor for NE, Phenylalanine can be used to treat attention disorders like ADHD and ADD. Additionally, it can be used to treat those suffering from depression or post-traumatic stress syndrome. Phenylalanine can also be used to treat depression or alter the function of neurotransmitter modulating drugs such as SSRIs. Additionally, due to its ability to increase blood pressure through the increase of vascular tone, it may be used to treat those with a hypotensive tendency. Furthermore, phenylalanine may be used as an upstream regulator of tyrosine levels, and thereby Tyrosine function.

Tyrosine supplementation can help in the treatment of Parkinson's disease due to its role as a precursor to L-DOPA and dopamine. Additionally, it can be used in the treatment of those with emotional/psychiatric disorders like depression and in the treatment of addiction. Furthermore, it can promote learning by increasing the reward/pleasure response while learning difficult or complex concepts or movements.

Dopamine, which is a monoamine catecholamine neurotransmitter, plays a regulatory role in the immune system. Neurotransmitters and neuropeptides that interact with specific receptors present in particular immune effector cells are released by the immune system to influence the functions of these cells in the host against disease and other environmental stress. The immunoregulatory actions of dopamine have been shown to be regulated via five different G protein-coupled receptors that are present in target cells. There are two broad classes of these receptors: D1 and D2, which encompass the varying subtypes. The D1 class of receptors includes the D1 and D5 subtypes, and increases intracellular cAMP upon activation. The D2 class of receptors consists of the D2, D3, and D4 subtypes, and has been reported to inhibit intracellular cAMP upon stimulation. Dopamine receptors have been found on normal human leukocytes. Likewise, the lymphoid tissues have dopaminergic innervations through sympathetic nerves, which suggests that dopamine may be able to regulate the immune system effector cells (Basu, Sujit & Sarkar, Chandrani, Dopamine and immune system. SciTopics 2010).

Dopamine affects T cells by activating the resting T cells and inhibiting the activation of stimulated T cells. In normal resting peripheral human T lymphocytes, dopamine activates the D2 and D3 subclass of receptors, which in turn activates integrins (α4β1 and α5β1). These integrins are heterodimeric transmembrane glycoproteins that attach cells to the extracellular matrix component, fibronectin. Fibronectin is used for the trafficking and extravasation of T cells across the tissue barriers and blood vessels. Furthermore, dopamine acts through the D3 receptors to selectively induce the migration and homing of CD8+ T cells. Moreover, dopamine affects T cells by influencing the secretion of cytokines by the T cells. When dopamine stimulates the D3 and D1/D5 receptors, the secretion of TNF-α (a pleiotropic inflammatory cytokine) is increased. When the D2 receptors are stimulated, secretion of IL-10 (an anti-inflammatory cytokine) is induced. Dopamine, however, can inhibit the activated T cell receptor induced cell proliferation and secretion of a number of cytokines like IL-2, IFN-γ and IL-4 through the down-regulation of the expression of non-receptor tyrosine kinases lck and fyn, which are important tyrosine kinases in the initiation of TCR activation (Basu, Sujit & Sarkar, Chandrani, Dopamine and immune system. SciTopics 2010).

The B cells have a very high expression of dopamine D2, D3, and D5 receptors. Dopamine has the ability to inhibit the proliferation of the resting and the malignant B lymphocytes. Dopamine acts by promoting apoptosis in cycling B cells through oxidative stress. However, this dopaminergic action has not been observed in resting lymphocytes, therefore suggesting a role in the prevention of cancer (Basu, Sujit & Sarkar, Chandrani, Dopamine and immune system. SciTopics 2010).

Tyrosine, as a precursor for Dopamine, can be used to improve immune responses and improve the overall immune system functionality. It can provide a benefit to the elderly, women who are pregnant, children, and those with compromised immune functions like AIDS patients, and cancer patients. It also can be given to teachers, those travelling, and anyone frequently exposed to germs.

Epinephrine, which is popularly known as adrenaline, is a hormone that is secreted by the medulla of the adrenal glands. Epinephrine is released in response to strong emotions such as fear or anger, which causes an increase in heart rate, muscle strength, blood pressure, and sugar metabolism. It is responsible for the fight or flight response that prepares the body for difficult or strenuous activity. Epinephrine is used as a stimulant during cardiac arrest, as a vasoconstrictor during shock to increase blood pressure, and as a bronchodilator and antispasmodic in bronchial asthma. Epinephrine is not found in large quantities in the body, but is nevertheless very important in the maintenance of cardiovascular homeostasis because it has the ability to divert blood to tissues under stress. Epinephrine has this effect by influencing muscle contraction. Contraction of the muscles occurs through the binding of calmodulin to calcium ions when the calcium concentration is 10× larger than normal in the cell. The calcium-calmodulin complex then goes on to activate the myosin light chain kinase, which then phosphorylates the LC2, causing the contraction. Epinephrine binds to the epinephrine receptors, which activates adenylyl cyclase and produces cyclic AMP from ATP. cAMP activates a protein kinase, which in turn phosphorylates the myosin light chain kinase. This phosphorylated myosin light chain kinase has a lower affinity for the calcium-calmodulin complex, and is thus inactive. As such, the smooth muscle tissue is relaxed. It is this action of epinephrine that makes it very useful in treating asthma, cardiac arrest, and anaphylactic shock. Tyrosine, as a precursor for Epinephrine, can be used for patients who are at risk for cardiac arrest, those suffering from asthma, and those who are at risk for anaphylactic shock.

Epinephrine is one of two main hormones that break down glycogen by binding to a receptor on the exterior of a liver cell. This binding causes a conformational change to take place, thereby allowing a G protein to bind and become active. The activation of the G-protein coupled receptor causes a conformational change in the molecule, which causes adenylate cyclase to bind. Once adenylate cyclase binds the complex, adenylate cyclase converts ATP into cAMP, which then becomes the second messenger in this process and activates protein kinase. The activated protein kinase activates phosphorylase, which is an enzyme that catalyzes the breakdown of glycogen to glucose. Tyrosine, as a precursor for Epinephrine, can be used to improve athletic performance by making glucose readily available to fuel exercise.

Melanin is a metabolite of Tyrosine, and is a powerful antioxidant. Additionally, it is influential in the inhibition of the production of inflammatory cytokines and superoxide. When pro-inflammatory cytokines are overproduced, they mediate the damaging effects of inflammation in pathologic conditions like rheumatoid arthritis, graft vs. host reactions, cachexia, and sepsis syndrome. It has been found that melanin inhibits ongoing cytokine synthesis, which strongly suggests that melanin may be useful as a superimposed therapy for conditions that involve proinflammatory cytokines (Mohagheghpour N., et al., Cell Immunol. 2000 Jan. 10; 199(1):25-36).

Tyrosine can be used in the treatment of rheumatoid arthritis, cachexia, sepsis syndrome, inflammation related to autoimmune disorders, and other inflammatory sequelae of pathologic conditions.

Phenylalanine up-regulates the activity of GTP cyclohydrolase-I, freeing tetrahydrobiopterin (THB) for NO synthesis and the hydroxylation of ArAAs by aromatic amino acid hydroxylase (AAAH). For this reason, delivery of high levels of Phenylalanine to raise cellular levels of THB directly stimulates the biosynthesis of many neurotransmitters in the CNS capillary endothelial cells. ArAAs serve as precursors for biosynthesis of monoamine neurotransmitters, including melatonin, dopamine, norepinephrine (noradrenaline), and epinephrine (adrenaline). In promoting NO synthesis, phenylalanine can be used to treat hypertension, to decrease blood pressure, and may be used in the context of diving, or those travelling to high altitudes to increase vasodilation.

Low Phenylalanine

There exists a mechanistic understanding of how uncharged tRNA allosterically activates GCN2, leading to downstream phosphorylation of transcription factors related to lipogenesis and protein synthesis, along with many biosynthetic pathways in eukaryotes (SREBP-1c, eIF2a, and GCN4p discussed below). Diets devoid of any EAAs remarkably trigger this signaling within minutes after diet introduction (Hao et al., Science 2005). Signaling through SREBP-1c has been shown in vivo to have dramatic effects on mobilizing lipid stores by repressing genes related to lipogenesis. SREBP-1c has been shown to specifically act on hepatic lipid synthesis, with an ability to cause a hepatic steatosis phenotype as well as an increase in visceral fat mass (Knebel, B. et al. Liver-Specific Expression of Transcriptionally Active SREBP-1c Is Associated with Fatty Liver and Increased Visceral Fat Mass. PLoS, 2012). Phenylalanine deprivation, through its action on GCN2, has an effect on SREBP-1c and decreases physiologic measures of liver weight (and fatty liver phenotype), adipose tissue weight, cholesterol/triglyceride content, and food intake. Driving decreased fat mass, while maintaining lean mass, provides a therapeutic opportunity in areas such as obesity, diabetes, and cardiovascular health.

Proline:

Citrulline is produced from Arginine as a by-product of a reaction catalyzed by the NOS family. Dietary supplementation with citrulline is known to reduce plasma levels of glucose, homocysteine, and asymmetric dimethylarginine, which are risk factors for metabolic syndrome. L-citrulline accelerates the removal of lactic acid from muscles, likely due to the effects on vascular tone and endothelial function. Recent studies have also shown that L-citrulline from watermelon juice provides greater recovery from exercise and less soreness the next day. It also appears that delivery of L-citrulline as a free form results in less uptake into cells in vitro than in the context of watermelon juice (which contains high levels of L-citrulline). This suggests an opportunity to deliver peptide doses, which can traffic arginine into muscle tissue for conversion into citrulline by eNOS at the endothelial membrane for improved efficacy.

Changes in NO synthesis and polyamines (via Proline) are seen during gestation when placental growth rate peaks, indicating a role for arginine in fetal development during pregnancy.

Serine:

Serine is a nonessential amino acid, and is biosynthesized from glycolysis via 3-phosphoglycerate. Serine plays a vital role in intermediary metabolism in that it contributes to phospholipid, sphingolipid, and cysteine biosynthesis, as well as tryptophan synthesis in bacteria, and is a primary source of glycine. The body has a need for glycine, which probably exceeds dietary intake by 10-50 fold. This demand is not only for the synthesis of protein, particularly collagen, but also for glycine being a precursor for 5 major metabolic biosynthetic pathways: creatine, porphyrins, purines, bile acids, and glutathione. Additionally, due to its role in glycine production, serine is also a major donor of folate-linked one-carbon units that are used in the biosynthesis of purines and 2′-deoxythymidine 5′-monophosphate and the remethylation of homocysteine to methionine. It is important to note that for every glycine molecule that is derived from serine, one one-carbon unit is formed. (Cook, R. Defining the steps of the folate one-carbon shuffle and homocysteine metabolism; Am J Clin Nutr; 2000)

In one-carbon metabolism, one-carbon units for biosynthesis are carried and chemically activated by a family of cofactors called tetrahydrofolate (THF) polyglutamates. THF-mediated one-carbon metabolism is a metabolic system of interdependent biosynthetic pathways compartmentalized in the cytoplasm, the mitochondria, and the nucleus. In the cytoplasm, one-carbon metabolism is used for the synthesis of purines and thymidylates and the remethylation of homocysteine to methionine (an overabundance of homocysteine may be harmful to the body). In the mitochondria, one-carbon metabolism is used for the synthesis of formylated methionyl-tRNA; the catabolism of choline, purines, and histidine; and the interconversion of serine and glycine. Additionally, the mitochondria are the primary source of one-carbon units for cytoplasmic metabolism. Disruption of folate-mediated one-carbon metabolism has been linked with many pathologies and developmental anomalies. (J. T. Fox and P. J. Stover, Chapter 1, Folate-Mediated One-Carbon Metabolism, In: Gerald Litwack, Editor(s), Vitamins & Hormones, Academic Press, 2008, Volume 79, Pages 1-44).

Serine hydroxymethyltransferase (SHMT) catalyzes the freely reversible interconversion of serine and glycine in a reaction that is both folate- and pyridoxal 5′-phosphate dependent. The conversion of serine to glycine involves the removal of C-3 of serine and the formation of 5,10-methylenetetrahydrofolate, which can be utilized in folate-dependent one-carbon metabolism or oxidized to carbon dioxide via 10-formyltetrahydrofolate (Robert J Cook, Am J Clin Nutr December 2000 vol. 72 no. 6 1419-1420).

Serine is a precursor for cysteine. Cysteine is synthesized from homocysteine, which is itself synthesized from the metabolism of methionine. Serine is involved in cysteine's synthesis by condensing with homocysteine to form cystathionine. Cystathionine is then deaminated and hydrolyzed to form cysteine and alpha ketobutyrate. Cysteine's sulfur comes from homocysteine, but the rest of the molecule comes from the initial serine residue. The biosynthesis of cysteine occurs via a different mechanism in plants and prokaryotes. Cysteine is a vital amino acid because it plays an important role in protein folding. The disulfide linkages formed between cysteine residues help to stabilize the tertiary and quaternary structure of proteins, and these disulfide linkages are most common among the secreted proteins, where proteins are exposed to more oxidizing conditions than are found in the cellular interior. Despite the benefits of homocysteine, high levels can be a risk factor for developing cardiovascular disease. Elevated homocysteine may be caused by a genetic deficiency of cystathionine beta-synthase, and excess methionine intake may be another explanation. Control of methionine intake and supplementing with folic acid and vitamin B12 in the diet have been used to lower homocysteine levels. Likewise, increased Serine levels to support homocysteine to cysteine conversion can be beneficial.

The N-methyl-D-aspartate (NMDA) receptor is one of the most fundamental neurotransmitter receptors in the brain. It is a glutamate receptor and is a vital molecular device for the control of synaptic plasticity and memory function. This receptor is an ionotropic receptor for glutamate and is characterized by high affinity for glutamate, a high unitary conductance, high calcium permeability, and a voltage-dependent block by magnesium ions. In order for the NMDA receptor to open, it must be bound by glutamate and by glycine or D-serine. D-serine is a neurotransmitter and a gliotransmitter that is biosynthesized in the brain by serine racemase from L-serine. It is a potent agonist at the glycine binding site of the NMDA receptor. (Jean-Pierre Mothet, et al., Proc Natl Acad Sci USA, 2000, 97 (9) 4926-4931; Zito K and Scheuss V. (2009) NMDA Receptor Function and Physiological Modulation. In: Encyclopedia of Neuroscience (Squire L R, ed), volume 6, pp. 1157-1164. Oxford: Academic Press).

Serine plays an important role in learning and synaptic plasticity; as a result, serine supplementation can be useful to the elderly, growing children, school age children, and those experiencing learning difficulties. Additionally, it can be given to anyone trying to learn a new task, be it an instrument, or athletes/dancers trying to improve or learn new exercises and movements. Furthermore, due to its role as a precursor for cysteine, serine may be given as an upstream regulator for the effects of cysteine. As a precursor for the synthesis of glycine, serine may be used in cosmetic products, to combat aging, and to promote proper growth because of its role in collagen synthesis. Furthermore, it can be used to improve athletic abilities because of its role in the creatine biosynthetic pathway. Moreover, it may be very useful in detoxification and immune health because of its role in the glutathione metabolic pathway.

Threonine:

Threonine is an EAA, and is one of the few AAs that is not converted into its L-isomer via transaminases and d-AA oxidases. Threonine is used for the synthesis of mucin protein, which is used for maintaining the integrity and function of the intestines. Mucus, which is composed of mucin and inorganic salts suspended in water, serves as a diffusion barrier against contact with noxious substances such as gastric acid and smoke. Mucus also acts as a lubricant to minimize shear stresses (G. K. Law, et al., Am J Physiol Gastrointest Liver Physiol 292:G1293-G1301, 2007).

90% of dietary threonine is used in the gut for mucus synthesis. Mucin is continuously synthesized and is very resistant to intestinal proteolysis, and is therefore not very easily recycled. As such, a substantial and consistent supply of threonine is used in order to effectively maintain gut function and structure. As a result, it is very important that the diet is rich with threonine in order to prevent mucus production from decreasing, which can lead to cancers in the gut, ulcers, etc. (G. K. Law, et al., Am J Physiol Gastrointest Liver Physiol 292:G1293-G1301, 2007; A. Hamard, et al., Journal of Nutritional Biochemistry, October 2010, Volume 21, Issue 10, Pages 914-921). Due to the importance of mucus to the integrity and structure of the gut, threonine supplementation can be useful in the prevention of gut disorder including cancers, ulcers, infections, and erosions.

Threonine plays a key role in humoral immunity because threonine is a major component of immunoglobulins, which are secreted by B lymphocytes in the blood. Once released, they reach the site of infection, where they recognize, bind, and inactivate their antigens. Because of the high threonine content of immunoglobulins, a threonine deficiency may negatively affect immunoglobulin production, and thereby decrease immune response. Threonine supplementation is essential for its role in the immune response and can support leukemia patients, AIDS patients, and individuals who have immunodeficiency. Additionally, it can support those susceptible to infection during the flu season, such as the elderly and small children, as well as throughout the year to strengthen immune response.

Low Threonine

There exists a mechanistic understanding of how uncharged tRNA allosterically activates GCN2, leading to downstream phosphorylation of transcription factors related to lipogenesis and protein synthesis, along with many biosynthetic pathways in eukaryotes (SREBP-1c, eIF2a, and GCN4p discussed below). Diets devoid of any EAAs remarkably trigger this signaling within minutes after diet introduction (Hao et al., Science 2005). Signaling through SREBP-1c has been shown in vivo to have dramatic effects on mobilizing lipid stores by repressing genes related to lipogenesis. SREBP-1c has been shown to specifically act on hepatic lipid synthesis, with an ability to cause a hepatic steatosis phenotype as well as an increase in visceral fat mass (Knebel, B. et al. Liver-Specific Expression of Transcriptionally Active SREBP-1c Is Associated with Fatty Liver and Increased Visceral Fat Mass. PLoS, 2012). An unbalanced diet lacking Threonine has been shown to signal GCN2 in rats on a basal casein diet supplemented with 1-5.4% of an amino acid mixture lacking Threonine. Threonine deprivation, through its action on GCN2, has an effect on SREBP-1c and decreases physiologic measures of liver weight (and fatty liver phenotype), adipose tissue weight, cholesterol/triglyceride content, and food intake. Driving decreased fat mass, while maintaining lean mass, provides a therapeutic opportunity in areas such as obesity, diabetes, and cardiovascular health.

Tryptophan:

Tryptophan is an EAA that plays an important role in immune functions. For example, concentrations of tryptophan progressively decline due to chronic lung inflammation. This suggests that catabolism of tryptophan via indoleamine 2,3-dioxygenase (IDO) is very important for the function of macrophages and lymphocytes. Thus, anthranilic acid (ANS) inhibits the production of proinflammatory T-helper 1 cytokines and prevents autoimmune neuroinflammation. Tryptophan can be used to treat the inflammatory effects of certain diseases, including arthritis and asthma or other autoimmune diseases.

It is also a precursor for serotonin (5-HT) synthesis, a neurotransmitter that affects appetite, sleep and is widely implicated in onset of depression. Abnormality in 5-HT activity in recovered depression patients (on SSRIs or other neurotransmitter re-uptake inhibitors) leads to an acute sensitivity to low levels of Tryptophan in the bloodstream. 5-HT production can be increased 2-fold by oral intake of free Tryptophan, indicating a role for Tryptophan administration in depression. Furthermore, Tryptophan can potentiate the effects of SSRIs due to the apparent dependence on 5-HT availability for improvement in patient outcome.

Tryptophan can furthermore be used to help in weight loss/maintenance and in recovery from travel and jet lag, and to benefit those suffering from sleep disorders as well as those suffering from mood disorders like depression or the effects of PMS.

Low Tryptophan

There exists a mechanistic understanding of how uncharged tRNA allosterically activates GCN2, leading to downstream phosphorylation of transcription factors related to lipogenesis and protein synthesis, along with many biosynthetic pathways in eukaryotes (SREBP-1c, eIF2a, and GCN4p discussed below). Diets devoid of any EAAs remarkably trigger this signaling within minutes after diet introduction (Hao et al., Science 2005). Signaling through SREBP-1c has been shown in vivo to have dramatic effects on mobilizing lipid stores by repressing genes related to lipogenesis. SREBP-1c has been shown to specifically act on hepatic lipid synthesis, with an ability to cause a hepatic steatosis phenotype as well as an increase in visceral fat mass (Knebel, B. et al. Liver-Specific Expression of Transcriptionally Active SREBP-1c Is Associated with Fatty Liver and Increased Visceral Fat Mass. PLoS, 2012). Tryptophan deprivation, through its action on GCN2, has an effect on SREBP-1c and decreases physiologic measures of liver weight (and fatty liver phenotype), adipose tissue weight, cholesterol/triglyceride content, and food intake. Driving decreased fat mass, while maintaining lean mass, provides a therapeutic opportunity in areas such as obesity, diabetes, and cardiovascular health.

Tyrosine:

Tyrosine is a nonessential amino acid that is synthesized from phenylalanine. It is used as a precursor for many important neurotransmitters, including epinephrine, norepinephrine, and dopamine. Tyrosine helps produce melanin, and helps the organs that make and regulate hormones, like the adrenal gland, thyroid gland, and pituitary gland. Additionally, tyrosine is involved in the structure of almost every protein in the body.

L-tyrosine is converted into Levodopa by tyrosine hydroxylase, using tetrahydropteridine as a cofactor, or by tyrosinase. The conversion that is mediated by tyrosinase specifically oxidizes Levodopa to Dopaquinone, and Levodopa is further decarboxylated to Dopamine by Dopa decarboxylase. Dopamine is a very important hormone and neurotransmitter, and plays a vital role in both mental and physical health. Dopamine helps to control the brain's reward and pleasure centers, helps to regulate movement and emotional responses, and enables one to see rewards and take action to move towards those rewards. The neurons that contain dopamine are clustered in the midbrain, in an area called the substantia nigra. In those afflicted with Parkinson's disease, the neurons that transmit dopamine in this area die, resulting in an inability to control bodily movement. In order to relieve the symptoms of Parkinson's disease, L-Dopa, which can be converted to dopamine, is given to the patients.

Tyrosine supplementation can help in the treatment of Parkinson's disease due to its role as a precursor to L-DOPA and dopamine. Additionally, it can be used in the treatment of those with emotional/psychiatric disorders like depression and in the treatment of addiction. Furthermore, it can promote learning by increasing the reward/pleasure response while learning difficult or complex concepts or movements.

Dopamine, which is a monoamine catecholamine neurotransmitter, plays a regulatory role in the immune system. Neurotransmitters and neuropeptides that interact with specific receptors present in particular immune effector cells are released by the immune system to influence the functions of these cells in the host against disease and other environmental stress. The immunoregulatory actions of dopamine have been shown to be regulated via five different G protein-coupled receptors that are present in target cells. There are two broad classes of these receptors, D1 and D2, which encompass the varying subtypes. The D1 class of receptors includes the D1 and D5 subtypes, which increase intracellular cAMP upon activation. The D2 class of receptors consists of the D2, D3, and D4 subtypes, and has been reported to inhibit intracellular cAMP upon stimulation. Dopamine receptors have been found on normal human leukocytes. Likewise, the lymphoid tissues have dopaminergic innervations through sympathetic nerves, which suggests that dopamine may be able to regulate the immune system effector cells (Basu, Sujit & Sarkar, Chandrani, Dopamine and immune system. SciTopics 2010).

Dopamine affects T cells by activating the resting T cells and inhibiting the activation of stimulated T cells. In normal resting peripheral human T lymphocytes, dopamine activates the D2 and D3 subclass of receptors, which in turn activates integrins (α4β1 and α5β1). These integrins are heterodimeric transmembrane glycoproteins that attach cells to the extracellular matrix component, fibronectin. Fibronectin is used for the trafficking and extravasation of T cells across the tissue barriers and blood vessels. Furthermore, dopamine acts through the D3 receptors to selectively induce the migration and homing of CD8+ T cells. Moreover, dopamine affects T cells by influencing their secretion of cytokines. When dopamine stimulates the D3 and D1/D5 receptors, the secretion of TNF-α (a pleiotropic inflammatory cytokine) is increased. When the D2 receptors are stimulated, secretion of IL-10 (an anti-inflammatory cytokine) is induced. Dopamine, however, can inhibit the activated T cell receptor induced cell proliferation and secretion of a number of cytokines like IL-2, IFN-γ and IL-4 through the down-regulation of the expression of non-receptor tyrosine kinases lck and fyn, which are important tyrosine kinases in the initiation of TCR activation (Basu, Sujit & Sarkar, Chandrani, Dopamine and immune system. SciTopics 2010).

The B cells have a very high expression of dopamine D2, D3, and D5 receptors. Dopamine has the ability to inhibit the proliferation of the resting and the malignant B lymphocytes. Dopamine acts by promoting apoptosis in cycling B cells through oxidative stress. However, this dopaminergic action has not been observed in resting lymphocytes, therefore suggesting a role in the prevention of cancer (Basu, Sujit & Sarkar, Chandrani, Dopamine and immune system. SciTopics 2010).

Tyrosine, as a precursor for Dopamine, can be used to improve immune responses and improve the overall immune system functionality. It can provide a benefit to the elderly, women who are pregnant, children, and those with compromised immune functions like AIDS patients and cancer patients. It also can be given to teachers, those travelling, and anyone frequently exposed to germs.

Norepinephrine (NE) is synthesized in the adrenal medulla and postganglionic neurons in the sympathetic nervous system by the β-hydroxylation of dopamine by dopamine β-hydroxylase along with the cofactor ascorbate. It works by being secreted into the synaptic cleft where it stimulates adrenergic receptors, and is then either degraded or taken up by surrounding cells. As a catecholamine, it does not cross the blood-brain barrier.

NE can be used to combat ADHD, depression, and hypotension. In terms of attention disorders, like ADHD, the medications prescribed tend to help increase levels of NE and dopamine. Furthermore, depression is typically treated with medications that inhibit the reuptake of serotonin and NE, thereby increasing the amount of serotonin and NE that is available to the postsynaptic cells in the brain. Recent evidence has suggested that SNRIs may also increase dopamine transmission: if the norepinephrine transporter ordinarily recycles dopamine as well, then SNRIs will also enhance dopaminergic transmission. As a result, the antidepressant effects associated with increased NE levels may partly be due to the simultaneous increase in dopamine (in particular in the prefrontal cortex of the brain).

NE is used to treat patients with critical hypotension. NE is a vasopressor and acts on both α1 and α2 adrenergic receptors to cause vasoconstriction, thereby increasing the blood pressure.

As a precursor for NE, Tyrosine can be used to treat attention disorders like ADHD and ADD. Additionally, it can be used to treat those suffering from depression, post-traumatic stress syndrome, and those with acute hypotension.

Epinephrine, which is popularly known as adrenaline, is a hormone that is secreted by the medulla of the adrenal glands. Epinephrine is released in response to strong emotions such as fear or anger, which causes an increase in heart rate, muscle strength, blood pressure, and sugar metabolism. It is responsible for the fight or flight response that prepares the body for difficult or strenuous activity. Epinephrine is used as a stimulant during cardiac arrest, as a vasoconstrictor during shock to increase blood pressure, and as a bronchodilator and antispasmodic in bronchial asthma. Epinephrine is not found in large quantities in the body, but is nevertheless very important in the maintenance of cardiovascular homeostasis because it has the ability to divert blood to tissues under stress. Epinephrine has this effect by influencing muscle contraction. Contraction of the muscles occurs through the binding of calmodulin to calcium ions when the calcium concentration is 10× higher than normal in the cell. The calcium-calmodulin complex then goes on to activate the myosin light chain kinase, which then phosphorylates the LC2, causing the contraction. Epinephrine binds to the epinephrine receptors, which activates adenylyl cyclase and produces cyclic AMP from ATP. cAMP activates a protein kinase, which in turn phosphorylates the myosin light chain kinase. This phosphorylated myosin light chain kinase has a lower affinity for the calcium-calmodulin complex, and is thus inactive. As such, the smooth muscle tissue is relaxed. It is this action of epinephrine that makes it very useful in treating asthma, cardiac arrest, and anaphylactic shock. Tyrosine, as a precursor for Epinephrine, can be used for patients who are at risk for cardiac arrest, those suffering from asthma, and those who are at risk for anaphylactic shock.

Epinephrine is one of two main hormones that break down glycogen by binding to a receptor on the exterior of a liver cell. This binding causes a conformational change to take place, thereby allowing G protein to bind and become active. The activation of the G-protein coupled receptor causes a conformational change on the molecule, which causes adenylate cyclase to bind. Once adenylate cyclase binds the complex, it converts ATP into cAMP, which then becomes the second messenger in this process and activates protein kinase. The activated protein kinase activates phosphorylase, which is an enzyme that catalyzes the breakdown of glycogen to glucose. Tyrosine, as a precursor for Epinephrine, can be used to improve athletic performance by making glucose readily available to fuel exercise.

Melanin is a metabolite of Tyrosine, and is a powerful antioxidant. Additionally, it is influential in the inhibition of the production of inflammatory cytokines and superoxide. When pro-inflammatory cytokines are overproduced, they mediate the damaging effects of inflammation in pathologic conditions like rheumatoid arthritis, graft vs. host reactions, cachexia, and sepsis syndrome. It has been found that melanin inhibits ongoing cytokine synthesis, which strongly suggests that melanin may be useful as a superimposed therapy for conditions that involve proinflammatory cytokines (Mohagheghpour N., et al., Cell Immunol. 2000 Jan. 10; 199(1):25-36).

Tyrosine can be used in the treatment of rheumatoid arthritis, cachexia, sepsis syndrome, inflammation related to autoimmune disorders, and other inflammatory sequelae of pathologic conditions.

Valine:

Valine is an EAA, and is also a BCAA. The BCAAs, including valine, serve as fuel sources for skeletal muscle during periods of metabolic stress by promoting protein synthesis, suppressing protein catabolism, and serving as substrates for gluconeogenesis. The BCAAs, including valine, are substrates for glutamine synthesis in animal tissues, and it has been shown that glutamine may play a role in mediating the anabolic effect of BCAAs in animals. Such an effect is likely to be important for the lactating mammary gland because it produces more glutamine than it takes up from arterial blood. Catabolism of BCAAs in the placenta results in glutamine synthesis and its release into the fetal circulation, which is a major source of the glutamine that circulates in the fetus. This suggests that supplementing a diet with Valine as well as the other BCAAs, or a combination thereof, may increase fetal growth in mammals. Additionally, Valine plays a direct role in the synthesis of alanine, and therefore has a regulatory function with regards to alanine.

BCAAs have been shown to have anabolic effects on protein metabolism by increasing the rate of protein synthesis and decreasing the rate of protein degradation in resting human muscle. Additionally, BCAAs are shown to have anabolic effects in human muscle during post endurance exercise recovery. These effects are mediated through the phosphorylation of mTOR and sequential activation of 70-kD S6 protein kinase (p70-kD S6), and eukaryotic initiation factor 4E-binding protein 1. P70-kD S6 is known for its role in modulating cell-cycle progression, cell size, and cell survival. P70-kD S6 activation in response to mitogen stimulation up-regulates ribosomal biosynthesis and enhances the translational capacity of the cell (W-L An, et al., Am J Pathol. 2003 August; 163(2): 591-607; E. Blomstrand, et al., J. Nutr. January 2006 136: 269S-273S). Eukaryotic initiation factor 4E-binding protein 1 is a limiting component of the multi-subunit complex that recruits 40S ribosomal subunits to the 5′ end of mRNAs. Activation of p70 S6 kinase, and subsequent phosphorylation of the ribosomal protein S6, is associated with enhanced translation of specific mRNAs.

When BCAAs were given to subjects during and after one session of quadriceps muscle resistance exercise, an increase in mTOR, p70 S6 kinase, and S6 phosphorylation was found in the recovery period after the exercise. However, there was no such effect of BCAAs on Akt or glycogen synthase kinase 3 (GSK-3). Exercise without BCAA intake leads to a partial phosphorylation of p70 S6 kinase without activating the enzyme, a decrease in Akt phosphorylation, and no change in GSK-3. BCAA infusion also increases p70 S6 kinase phosphorylation in an Akt-independent manner in resting subjects. This mTOR activity regulates cellular protein turnover (autophagy) and integrates insulin-like growth signals to protein synthesis initiation across tissues. This biology has been directly linked to biogenesis of lean tissue mass in skeletal muscle, metabolic shifts in disease states of obesity and insulin resistance, and aging.

Valine plays a key role in muscle metabolism, tissue repair, and the maintenance of proper nitrogen balance in the body. As one of the three BCAAs, it can be utilized as an energy source by muscle tissue. Valine is a glucogenic AA, and therefore provides glucose. Valine may be useful in the treatment of liver and gallbladder disease. Additionally, valine may be useful in correcting the type of severe AA deficiencies caused by drug addiction. Furthermore, Valine has been found to promote mental vigor, muscle coordination, and calm emotions. It may also be used to prevent muscle loss at high altitudes.

Valine supplementation can be used to improve athletic performance and muscle formation, aid in drug addiction rehabilitation, enhance mental vigor in the elderly and in growing children, prevent the muscle loss that accompanies aging, aid those suffering from hepatic disease, support the growing bodies of children, serve as a therapy for gallbladder and liver disease, increase lactation in mammals, increase fetal growth in mammals, and improve the nutritive quality of foods given to starving populations.

Low Valine

In states of obesity and diabetes, animals have been shown to exhibit reduced hepatic autophagy, leading to increased insulin resistance. Autophagy is important for maintenance of the ER and cellular homeostasis, which when stressed can lead to impaired insulin sensitivity. High fat diet feeding in animal models stresses the ER, while leading to depressed hepatic autophagy through over-stimulation of mTORC1, which reinforces the progression towards impaired insulin sensitivity and impaired beta cell function in diabetes. Reducing the level of systemic Valine provides an opportunity to lower mTORC1 activity and restore healthy levels of autophagy.

There exists a mechanistic understanding of how uncharged tRNA allosterically activates GCN2, leading to downstream phosphorylation of transcription factors related to lipogenesis and protein synthesis, along with many biosynthetic pathways in eukaryotes (SREBP-1c, eIF2a, and GCN4p discussed below). Diets devoid of any EAAs remarkably trigger this signaling within minutes after diet introduction (Hao et al., Science 2005). Signaling through SREBP-1c has been shown in vivo to have dramatic effects on mobilizing lipid stores by repressing genes related to lipogenesis. SREBP-1c has been shown to act specifically on hepatic lipid synthesis, and to be able to cause a hepatic steatosis phenotype as well as an increase in visceral fat mass (Knebel, B. et al. Liver-Specific Expression of Transcriptionally Active SREBP-1c Is Associated with Fatty Liver and Increased Visceral Fat Mass. PLoS, 2012). Valine deprivation, through its action on GCN2, has an effect on SREBP-1c and decreases physiologic measures of liver weight (and fatty liver phenotype), adipose tissue weight, cholesterol/triglyceride content, and food intake. Driving decreased fat mass, while maintaining lean mass, provides a therapeutic opportunity in areas such as obesity, diabetes, and cardiovascular health.

In Vitro Analyses of Amino Acid Pharmacology.

As provided herein, amino acids serve both as necessary substrates for the synthesis of new proteins and as signaling molecules. Analysis of the pharmacological properties of a given amino acid is dependent on the cell line and model system utilized. For example, the amino acid leucine has been shown to increase phosphorylation of the mammalian target of rapamycin complex 1 (mTORC1) and downstream targets involved in anabolism in skeletal muscle cells (Gran P & D Cameron-Smith. 2011. The actions of exogenous leucine on mTOR signaling and amino acid transporters in human myotubes. BMC Physiol. 11:10). In vitro assays of amino acid pharmacology can also reveal auxotrophies in certain types of cancer. Auxotrophies to methionine have been reported in multiple immortalized cancer cell lines (Cavuoto P & MF Fenech. 2012. A review of methionine dependency and the role of methionine restriction in cancer growth control and life-span extension. Cancer Treat Rev. 38: 726-736).

An in vitro assay may be designed utilizing amino acids, protein digests, or di- and tri-peptides as the independent or manipulated variable after identifying a relevant cell line. An appropriate cell line is selected based on its relevance as a model of cellular processes. For example, C2C12 (ATCC, CRL-1772) is a murine myoblast cell line that differentiates into myofibers and is used as a model of skeletal muscle fiber differentiation and development. Cells are maintained in a complete medium supplemented with fetal bovine serum up to 10% which supplies necessary growth factors, and penicillin and streptomycin. Adherent cell lines are grown in T75 flasks with phenolic caps for filtered gas exchange and incubated at 37° C. at 5% CO2 in a humidified environment. Table AA lists cell lines that are used to assay amino acid pharmacology. For an in vitro assay, cells are seeded in T75 flasks, 6-, 12-, 24-, 48- or 96-well plates at an appropriate cell density, determined empirically. Following an incubation period the complete growth medium is replaced with medium deficient in the test article. Following a period of medium depletion the test article is added in the appropriate medium. Following the treatment period, the relevant dependent variable is measured.

TABLE AA
List of exemplary cell lines utilized in in vitro assays of amino acid pharmacology.

| Cell Line | Species | Tissue or Cell Type | Systems Modeled |
| --- | --- | --- | --- |
| C2C12 | Mus musculus | Skeletal muscle | Skeletal muscle growth and differentiation |
| RSkMC | Rattus norvegicus | Skeletal muscle | Skeletal muscle growth and differentiation |
| 3T3-L1 | Mus musculus | Embryo | White adipose tissue development |
| CHO-K1 | Cricetulus griseus | Ovary | Heterologous protein expression |
| FHs 74 Int | Homo sapiens | Small intestine | Gastrointestinal and enteroendocrine systems |
| 293T | Homo sapiens | Embryonic kidney | Heterologous protein expression |
| IEC-6 | Rattus norvegicus | Small intestine/epithelium | Gastrointestinal and enteroendocrine systems |
| NCI-H716 | Homo sapiens | Cecum | Gastrointestinal and enteroendocrine systems |
| STC-1 | Mus musculus | Intestine | Gastrointestinal and enteroendocrine systems |
| MCF-7 | Homo sapiens | Lung adenocarcinoma | Breast cancer |
| LNCaP clone FGC | Homo sapiens | Prostate carcinoma | Prostate cancer |
| PC-3 | Homo sapiens | Prostate adenocarcinoma | Prostate cancer |

See, e.g., Wu, G. Amino acids: Metabolism, functions, and nutrition. Amino Acids 37(1):1-17 (2009); Wu, G. Functional amino acids in nutrition and health. Amino Acids 45(3):407-11 (2013); Schworer, C. Glucagon-induced autophagy and proteolysis in rat liver: Mediation by selective deprivation of intracellular amino acids. PNAS 76(7):3169-73 (1979); Codogno, P. Autophagy: A Potential Link between Obesity and Insulin Resistance. Cell Metabolism 11(6):449-51 (2010); Leong, H. et al. Short-term arginine deprivation results in large-scale modulation of hepatic gene expression in both normal and tumor cells: microarray bioinformatic analysis. Nutrition and Metabolism 3:37 (2006); Harbrecht, B. G. Glutathione regulates nitric oxide synthase in cultured hepatocytes. Annals of Surgery 225(1): 76-87 (1997); Watermelon juice: a potential functional drink for sore muscle relief in athletes. J. Agric. Food Chem. 61(31):7522-8 (2013).

Secreted Nutritive Polypeptides.

In another aspect, provided are nutritive polypeptides that contain the amino acid sequences of edible species polypeptides, which are engineered to be secreted from unicellular organisms and purified therefrom. Such nutritive polypeptides can be endogenous to the host cell or exogenous, and can be naturally secreted in either the polypeptide or the host cell, or both, and are engineered for secretion of the nutritive polypeptide.

Advantageous properties of a nutritive polypeptide include the ability to be expressed and secreted in a host cell, solubility in a wide variety of solvents, and, when consumed by an intended subject, nutritional benefit, reduced allergenicity or non-allergenicity, lack of toxicity, and digestibility. Such properties can be weighted based, at least in part, on the intended consumer and the reason(s) for consumption of the nutritive polypeptide (e.g., for general health, muscle anabolism, immune health, or treatment or prevention of a disease, disorder or condition). One or multiple nutritional criteria are satisfied, for example, by computing the mass fractions of all relevant amino acid(s) based on primary sequence.
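By way of illustration only, the computation of amino acid mass fractions from a primary sequence can be sketched as follows. This is a minimal sketch under stated assumptions, not a procedure taken from this disclosure: the residue masses are rounded average values (amino acid minus one water), terminal groups are ignored, and the function names `mass_fractions` and `satisfies` and the example criterion are hypothetical.

```python
# Approximate average residue masses in daltons (free amino acid minus one water).
# Rounded, illustrative values (an assumption); substitute a vetted table for real work.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}


def mass_fractions(sequence):
    """Return {one-letter code: fraction of total residue mass} for a primary sequence."""
    seq = sequence.upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if not seq or unknown:
        raise ValueError(f"empty sequence or unsupported residues: {sorted(unknown)}")
    totals = {}
    for aa in seq:
        # Accumulate the mass contributed by each kind of residue.
        totals[aa] = totals.get(aa, 0.0) + RESIDUE_MASS[aa]
    total_mass = sum(totals.values())
    return {aa: mass / total_mass for aa, mass in totals.items()}


def satisfies(sequence, minimums):
    """Check hypothetical criteria, e.g. {"L": 0.10} for at least 10% leucine by mass."""
    fractions = mass_fractions(sequence)
    return all(fractions.get(aa, 0.0) >= floor for aa, floor in minimums.items())
```

For example, `satisfies(candidate, {"L": 0.10})` would screen a candidate sequence against a single, purely illustrative leucine criterion; several criteria can be combined in the same dictionary and weighted by the caller as needed.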

By way of non-limiting examples, polypeptides of the present invention are provided in Table 1. The Predicted leader column shows the sequence indices of predicted leaders (if a leader exists). The Fragment Indices column shows the sequence indices of fragment sequences. The DBID column lists either the UniProt or GenBank Accession numbers for each sequence as available as of Sep. 24, 2014, each of which is herein incorporated by reference. DBIDs with only numerical characters are from a GenBank database, and those with mixed alphabetical/numerical characters are from a UniProt database.

Nucleic Acids

Also provided herein are nucleic acids encoding polypeptides or proteins. In some embodiments the nucleic acid is isolated. In some embodiments the nucleic acid is purified.

In some embodiments of the nucleic acid, the nucleic acid comprises a nucleic acid sequence that encodes a first polypeptide sequence disclosed herein. In some embodiments of the nucleic acid, the nucleic acid consists of a nucleic acid sequence that encodes a first polypeptide sequence disclosed herein. In some embodiments of the nucleic acid, the nucleic acid comprises a nucleic acid sequence that encodes a protein disclosed herein. In some embodiments of the nucleic acid, the nucleic acid consists of a nucleic acid sequence that encodes a protein disclosed herein. In some embodiments of the nucleic acid the nucleic acid sequence that encodes the first polypeptide sequence is operatively linked to at least one expression control sequence. For example, in some embodiments of the nucleic acid the nucleic acid sequence that encodes the first polypeptide sequence is operatively linked to a promoter such as a promoter described herein.

Accordingly, in some embodiments the nucleic acid molecule of this disclosure encodes a polypeptide or protein that itself is a polypeptide or protein. Such a nucleic acid molecule can be referred to as a “nucleic acid.” In some embodiments the nucleic acid encodes a polypeptide or protein that itself comprises at least one of: a) a ratio of branched chain amino acid residues to total amino acid residues of at least 24%; b) a ratio of Leu residues to total amino acid residues of at least 11%; and c) a ratio of essential amino acid residues to total amino acid residues of at least 49%. In some embodiments the nucleic acid comprises at least 10 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, at least 60 nucleotides, at least 70 nucleotides, at least 80 nucleotides, at least 90 nucleotides, at least 100 nucleotides, at least 200 nucleotides, at least 300 nucleotides, at least 400 nucleotides, at least 500 nucleotides, at least 600 nucleotides, at least 700 nucleotides, at least 800 nucleotides, at least 900 nucleotides, or at least 1,000 nucleotides. In some embodiments the nutritive nucleic acid comprises from 10 to 100 nucleotides, from 20 to 100 nucleotides, from 10 to 50 nucleotides, or from 20 to 40 nucleotides. In some embodiments the nucleic acid comprises all or part of an open reading frame that encodes an edible species polypeptide or protein. In some embodiments the nucleic acid consists of an open reading frame that encodes a fragment of an edible species protein, wherein the open reading frame does not encode the complete edible species protein.
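The three residue-ratio criteria recited above (branched chain amino acids at least 24%, Leu at least 11%, essential amino acids at least 49%) lend themselves to a simple residue count on the encoded polypeptide. The following is a minimal sketch only; it assumes one-letter amino acid codes, takes the essential amino acids to be the nine commonly listed for humans (His, Ile, Leu, Lys, Met, Phe, Thr, Trp, Val), and uses hypothetical function names.

```python
BCAA = set("LIV")        # branched chain amino acids: Leu, Ile, Val
EAA = set("HIKLMFTWV")   # assumed set of nine essential amino acids

# Minimum ratios of residues to total residues, as recited in the text above.
THRESHOLDS = {"bcaa": 0.24, "leu": 0.11, "eaa": 0.49}


def residue_ratios(sequence):
    """Return the BCAA, Leu, and EAA residue counts as fractions of total residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    return {
        "bcaa": sum(aa in BCAA for aa in seq) / n,
        "leu": seq.count("L") / n,
        "eaa": sum(aa in EAA for aa in seq) / n,
    }


def meets_at_least_one(sequence):
    """True if the sequence satisfies at least one of criteria a), b), or c)."""
    ratios = residue_ratios(sequence)
    return any(ratios[name] >= floor for name, floor in THRESHOLDS.items())
```

Because the criteria are stated as "at least one of," the sketch combines them with `any`; requiring all three at once would only need `all` in its place.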

In some embodiments the nucleic acid is a cDNA.

In some embodiments nucleic acid molecules are provided that comprise a sequence that is at least 50%, 60%, 70%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or 99.9% identical to an edible species nucleic acid. In some embodiments nucleic acids are provided that hybridize under stringent hybridization conditions with at least one reference nucleic acid.
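As an illustration of how a percent identity figure such as those listed above might be obtained, the sketch below scores two sequences that have already been aligned. It rests on assumptions not stated in this disclosure: the inputs are equal-length aligned strings with `-` marking gaps, and identity is taken as matching columns divided by all columns in which at least one sequence has a residue. Other conventions for the denominator exist, and the alignment step itself is outside the sketch.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Drop columns where both sequences have a gap; they carry no information.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no comparable columns")
    # A gap opposite a residue counts as a mismatch under this convention.
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def is_at_least(aligned_a, aligned_b, threshold_percent):
    """True if the aligned pair reaches the given identity threshold, e.g. 90."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent
```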

The nucleic acids and fragments thereof provided in this disclosure display utility in a variety of systems and methods. For example, the fragments can be used as probes in various hybridization techniques. Depending on the method, the target nucleic acid sequences can be either DNA or RNA. The target nucleic acid sequences can be fractionated (e.g., by gel electrophoresis) prior to the hybridization, or the hybridization can be performed on samples in situ. One of skill in the art will appreciate that nucleic acid probes of known sequence find utility in determining chromosomal structure (e.g., by Southern blotting) and in measuring gene expression (e.g., by Northern blotting). In such experiments, the sequence fragments are preferably detectably labeled, so that their specific hybridization to target sequences can be detected and optionally quantified. One of skill in the art will appreciate that the nucleic acid fragments of this disclosure can be used in a wide variety of blotting techniques not specifically described herein.

It should also be appreciated that the nucleic acid sequence fragments disclosed herein also find utility as probes when immobilized on microarrays. Methods for creating microarrays by deposition and fixation of nucleic acids onto support substrates are well known in the art. Such methods are reviewed in DNA Microarrays: A Practical Approach (Practical Approach Series), Schena (ed.), Oxford University Press (1999) (ISBN: 0199637768); Nature Genet. 21(1)(suppl):1-60 (1999); Microarray Biochip: Tools and Technology, Schena (ed.), Eaton Publishing Company/BioTechniques Books Division (2000) (ISBN: 1881299376), the disclosures of which are incorporated herein by reference in their entireties. Analysis of, for example, gene expression using microarrays comprising nucleic acid sequence fragments, such as the nucleic acid sequence fragments disclosed herein, is a well-established utility for sequence fragments in the field of cell and molecular biology. Other uses for sequence fragments immobilized on microarrays are described in Gerhold et al., Trends Biochem. Sci. 24:168-173 (1999) and Zweiger, Trends Biotechnol. 17:429-436 (1999), as well as in the references cited above.

Expression

Vectors

Also provided are one or more vectors, including expression vectors, which comprise at least one of the nucleic acid molecules disclosed herein, as described further herein. In some embodiments, the vectors comprise at least one isolated nucleic acid molecule encoding a protein as disclosed herein. In alternative embodiments, the vectors comprise such a nucleic acid molecule operably linked to one or more expression control sequences. The vectors can thus be used to express at least one recombinant protein in a recombinant microbial host cell. In some aspects, a vector or set of vectors can include a nucleic acid sequence coding for a signal peptide, e.g., to cause secretion of a protein disclosed herein. See below for further discussion of signal peptides and secretion.

Suitable vectors for expression of nucleic acids in microorganisms are well known to those of skill in the art. Suitable vectors for use in cyanobacteria are described, for example, in Heidorn et al., “Synthetic Biology in Cyanobacteria: Engineering and Analyzing Novel Functions,” Methods in Enzymology, Vol. 497, Ch. 24 (2011). Exemplary replicative vectors that can be used for engineering cyanobacteria as disclosed herein include pPMQAK1, pSL1211, pFC1, pSB2A, pSCR119/202, pSUN119/202, pRL2697, pRL25C, pRL1050, pSG111M, and pPBH201.

Other vectors such as pJB161 which are capable of receiving nucleic acid sequences disclosed herein may also be used. Vectors such as pJB161 comprise sequences which are homologous with sequences present in plasmids endogenous to certain photosynthetic microorganisms (e.g., plasmids pAQ1, pAQ3, and pAQ4 of certain Synechococcus species). Examples of such vectors and how to use them are known in the art and provided, for example, in Xu et al., “Expression of Genes in Cyanobacteria: Adaptation of Endogenous Plasmids as Platforms for High-Level Gene Expression in Synechococcus sp. PCC 7002,” Chapter 21 in Robert Carpentier (ed.), “Photosynthesis Research Protocols,” Methods in Molecular Biology, Vol. 684, 2011, which is hereby incorporated herein by reference. Recombination between pJB161 and the endogenous plasmids in vivo yields engineered microbes expressing the genes of interest from their endogenous plasmids. Alternatively, vectors can be engineered to recombine with the host cell chromosome, or the vector can be engineered to replicate and express genes of interest independent of the host cell chromosome or any of the host cell's endogenous plasmids.

A further example of a vector suitable for recombinant protein production is the pET system (Novagen®). This system has been extensively characterized for use in E. coli and other microorganisms. In this system, target genes are cloned in pET plasmids under control of strong bacteriophage T7 transcription and (optionally) translation signals; expression is induced by providing a source of T7 RNA polymerase in the host cell. T7 RNA polymerase is so selective and active that, when fully induced, almost all of the microorganism's resources are converted to target gene expression; the desired product can comprise more than 50% of the total cell protein a few hours after induction. It is also possible to attenuate the expression level simply by lowering the concentration of inducer. Decreasing the expression level may enhance the soluble yield of some target proteins. In some embodiments this system also allows for maintenance of target genes in a transcriptionally silent un-induced state.

In some embodiments of using this system, target genes are cloned using hosts that do not contain the T7 RNA polymerase gene, thus alleviating potential problems related to plasmid instability due to the production of proteins potentially toxic to the host cell. Once established in a non-expression host, target protein expression can be initiated either by infecting the host with λCE6, a phage that carries the T7 RNA polymerase gene under the control of the λ pL and pI promoters, or by transferring the plasmid into an expression host containing a chromosomal copy of the T7 RNA polymerase gene under lacUV5 control. In the second case, expression is induced by the addition of IPTG or lactose to the bacterial culture or by using an autoinduction medium. Other plasmid systems that are controlled by the lac operator, but do not require the T7 RNA polymerase gene and rely upon E. coli's native RNA polymerase, include the pTrc plasmid suite (Invitrogen) and the pQE plasmid suite (QIAGEN).

In other embodiments it is possible to clone directly into expression hosts. Two types of T7 promoters and several hosts that differ in their stringency of suppressing basal expression levels are available, providing great flexibility and the ability to optimize the expression of a wide variety of target genes.

Suitable vectors for expression of nucleic acids in mammalian cells typically comprise control functions provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma virus, Adenovirus 2, cytomegalovirus, or Simian Virus 40.

Promoters

Promoters useful for expressing the recombinant genes described herein include both constitutive and inducible/repressible promoters. Examples of inducible/repressible promoters include nickel-inducible promoters (e.g., PnrsA, PnrsB; see, e.g., Lopez-Maury et al., Cell (2002) v. 43: 247-256) and urea repressible promoters such as PnirA (described in, e.g., Qi et al., Applied and Environmental Microbiology (2005) v. 71: 5678-5684). Additional examples of inducible/repressible promoters include PnirA (promoter that drives expression of the nirA gene, induced by nitrate and repressed by urea) and Psuf (promoter that drives expression of the sufB gene, induced by iron stress). Examples of constitutive promoters include Pcpc (promoter that drives expression of the cpc operon), Prbc (promoter that drives expression of rubisco), PpsbAII (promoter that drives expression of psbAII), and Pcro (lambda phage promoter that drives expression of cro). In other embodiments, a PaphII and/or a lacIq-Ptrc promoter can be used to control expression. Where multiple recombinant genes are expressed in an engineered microorganism, the different genes can be controlled by different promoters or by identical promoters in separate operons, or the expression of two or more genes can be controlled by a single promoter as part of an operon.

Further non-limiting examples of inducible promoters may include, but are not limited to, those induced by expression of an exogenous protein (e.g., T7 RNA polymerase, SP6 RNA polymerase), by the presence of a small molecule (e.g., IPTG, galactose, tetracycline, steroid hormone, abscisic acid), by absence of small molecules (e.g., CO2, iron, nitrogen), by metals or metal ions (e.g., copper, zinc, cadmium, nickel), and by environmental factors (e.g., heat, cold, stress, light, darkness), and by growth phase. In some embodiments, the inducible promoter is tightly regulated such that in the absence of induction, substantially no transcription is initiated through the promoter. In some embodiments, induction of the promoter does not substantially alter transcription through other promoters. Also, generally speaking, the compound or condition that induces an inducible promoter is not naturally present in the organism or environment where expression is sought.

In some embodiments, the inducible promoter is induced by limitation of CO2 supply to a cyanobacterial culture. By way of non-limiting example, the inducible promoter can be the promoter sequence of Synechocystis PCC 6803 genes that are up-regulated under CO2-limitation conditions, such as the cmp genes, ntp genes, ndh genes, sbt genes, chp genes, and rbc genes, or a variant or fragment thereof.

In some embodiments, the inducible promoter is induced by iron starvation or by entering the stationary growth phase. In some embodiments, the inducible promoter can be variant sequences of the promoter sequence of cyanobacterial genes that are up-regulated under Fe-starvation conditions such as isiA, or when the culture enters the stationary growth phase, such as isiA, phrA, sigC, sigB, and sigH genes, or a variant or fragment thereof.

In some embodiments, the inducible promoter is induced by a metal or metal ion. By way of non-limiting example, the inducible promoter can be induced by copper, zinc, cadmium, mercury, nickel, gold, silver, cobalt, and bismuth or ions thereof. In some embodiments, the inducible promoter is induced by nickel or a nickel ion. In some embodiments, the inducible promoter is induced by a nickel ion, such as Ni2+. In another exemplary embodiment, the inducible promoter is the nickel inducible promoter from Synechocystis PCC 6803. In another embodiment, the inducible promoter can be induced by copper or a copper ion. In yet another embodiment, the inducible promoter can be induced by zinc or a zinc ion. In still another embodiment, the inducible promoter can be induced by cadmium or a cadmium ion. In yet still another embodiment, the inducible promoter can be induced by mercury or a mercury ion. In an alternative embodiment, the inducible promoter can be induced by gold or a gold ion. In another alternative embodiment, the inducible promoter can be induced by silver or a silver ion. In yet another alternative embodiment, the inducible promoter can be induced by cobalt or a cobalt ion. In still another alternative embodiment, the inducible promoter can be induced by bismuth or a bismuth ion.

In some embodiments, the promoter is induced by exposing a cell comprising the inducible promoter to a metal or metal ion. The cell can be exposed to the metal or metal ion by adding the metal to the microbial growth media. In certain embodiments, the metal or metal ion added to the microbial growth media can be efficiently recovered from the media. In other embodiments, the metal or metal ion remaining in the media after recovery does not substantially impede downstream processing of the media or of the bacterial gene products.

Further non-limiting examples of constitutive promoters include constitutive promoters from Gram-negative bacteria or a bacteriophage propagating in a Gram-negative bacterium. For instance, promoters for genes encoding highly expressed Gram-negative gene products can be used, such as the promoter for Lpp, OmpA, rRNA, and ribosomal proteins. Alternatively, regulatable promoters can be used in a strain that lacks the regulatory protein for that promoter. For instance, Plac, Ptac, and Ptrc can be used as constitutive promoters in strains that lack LacI. Similarly, P22 PR and PL can be used in strains that lack the lambda C2 repressor protein, and lambda PR and PL can be used in strains that lack the lambda C1 repressor protein. In one embodiment, the constitutive promoter is from a bacteriophage. In another embodiment, the constitutive promoter is from a Salmonella bacteriophage. In yet another embodiment, the constitutive promoter is from a cyanophage. In some embodiments, the constitutive promoter is a Synechocystis promoter. For instance, the constitutive promoter can be the PpsbAII promoter or its variant sequences, the Prbc promoter or its variant sequences, the Pcpc promoter or its variant sequences, and the PrnpB promoter or its variant sequences.

Hosts

Also provided are host cells transformed with the nucleic acid molecules or vectors disclosed herein, and descendants thereof. In some embodiments the host cells are microbial cells. In some embodiments, the host cells carry the nucleic acid sequences on vectors, which may but need not be freely replicating vectors. In other embodiments, the nucleic acids have been integrated into the genome of the host cells and/or into an endogenous plasmid of the host cells. The transformed host cells find use, e.g., in the production of recombinant proteins disclosed herein.

A variety of host microorganisms can be transformed with a nucleic acid sequence disclosed herein and can in some embodiments be used to produce a recombinant protein disclosed herein. Suitable host microorganisms include both autotrophic and heterotrophic microbes. In some applications the use of autotrophic microorganisms allows for a reduction in the fossil fuel and/or electricity inputs required to make a protein encoded by a recombinant nucleic acid sequence introduced into the host microorganism. This, in turn, in some applications reduces the cost and/or the environmental impact of producing the protein and/or reduces the cost and/or the environmental impact in comparison to the cost and/or environmental impact of manufacturing alternative proteins, such as whey, egg, and soy. For example, the cost and/or environmental impact of making a protein disclosed herein using a host microorganism as disclosed herein is in some embodiments lower than the cost and/or environmental impact of making whey protein in a form suitable for human consumption by processing of cow's milk.

Non-limiting examples of heterotrophs include Escherichia coli, Salmonella typhimurium, Bacillus subtilis, Bacillus megaterium, Corynebacterium glutamicum, Streptomyces coelicolor, Streptomyces lividans, Streptomyces venezuelae, Streptomyces roseosporus, Streptomyces fradiae, Streptomyces griseus, Streptomyces clavuligerus, Streptomyces hygroscopicus, Streptomyces platensis, Saccharopolyspora erythraea, Aspergillus niger, Aspergillus nidulans, Aspergillus oryzae, Aspergillus terreus, Aspergillus sojae, Penicillium chrysogenum, Trichoderma reesei, Clostridium acetobutylicum, Clostridium beijerinckii, Clostridium thermocellum, Fusibacter paucivorans, Saccharomyces cerevisiae, Saccharomyces boulardii, Pichia pastoris, and Pichia stipitis.

Photoautotrophic microorganisms include eukaryotic algae, as well as prokaryotic cyanobacteria, green-sulfur bacteria, green non-sulfur bacteria, purple sulfur bacteria, and purple non-sulfur bacteria. Extremophiles are also contemplated as suitable organisms. Such organisms are provided, e.g., in Mixotrophic organisms are also suitable organisms. Algae and cyanobacteria are contemplated as suitable organisms. See the organisms disclosed in, e.g., PCT/US2013/032232, filed Mar. 15, 2013, PCT/US2013/032180, filed Mar. 15, 2013, PCT/US2013/032225, filed Mar. 15, 2013, PCT/US2013/032218, filed Mar. 15, 2013, PCT/US2013/032212, filed Mar. 15, 2013, PCT/US2013/032206, filed Mar. 15, 2013, and PCT/US2013/038682, filed Apr. 29, 2013.

Yet other suitable organisms include synthetic cells or cells produced by synthetic genomes as described in Venter et al. US Pat. Pub. No. 2007/0264688, and cell-like systems or synthetic cells as described in Glass et al. US Pat. Pub. No. 2007/0269862.

Still other suitable organisms include Escherichia coli, Acetobacter aceti, Bacillus subtilis, yeast and fungi, Clostridium ljungdahlii, Clostridium thermocellum, Penicillium chrysogenum, Pichia pastoris, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Pseudomonas fluorescens, or Zymomonas mobilis. In some embodiments those organisms are engineered to fix carbon dioxide while in other embodiments they are not.

In some embodiments eukaryotic cells, such as insect cells or mammalian cells (e.g., human cells), are used as host cells. Vectors and expression control sequences including promoters and enhancers are well known for such cells. Examples of useful mammalian host cell lines for this purpose are monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651); human embryonic kidney line (293 or 293 cells subcloned for growth in suspension culture, Graham et al., J. Gen Virol. 36:59 (1977)); baby hamster kidney cells (BHK, ATCC CCL 10); Chinese hamster ovary cells/-DHFR (CHO, Urlaub et al., Proc. Natl. Acad. Sci. USA 77:4216 (1980)); mouse sertoli cells (TM4, Mather, Biol. Reprod. 23:243-251 (1980)); monkey kidney cells (CV1 ATCC CCL 70); African green monkey kidney cells (VERO-76, ATCC CRL-1587); human cervical carcinoma cells (HELA, ATCC CCL 2); canine kidney cells (MDCK, ATCC CCL 34); buffalo rat liver cells (BRL 3A, ATCC CRL 1442); human lung cells (WI-38, ATCC CCL 75); human liver cells (Hep G2, HB 8065); mouse mammary tumor (MMT 060562, ATCC CCL51); TRI cells (Mather et al., Annals N.Y. Acad. Sci. 383:44-68 (1982)); MRC 5 cells; FS4 cells; and a human hepatoma line (Hep G2).

Transfection

Proteins can be produced in a host cell using, for example, a combination of recombinant DNA techniques and gene transfection methods as is well known in the art (e.g., Morrison, S. (1985) Science 229:1202). For expression of the protein, the expression vector(s) encoding the protein is transfected into a host cell by standard techniques. The various forms of the term transfection are intended to encompass a wide variety of techniques commonly used for the introduction of exogenous DNA into a prokaryotic or eukaryotic host cell, e.g., electroporation, calcium-phosphate precipitation, DEAE-dextran transfection and the like.

Production

Skilled artisans are aware of many suitable methods available for culturing recombinant cells to produce (and optionally secrete) a protein as disclosed herein, as well as for purification and/or isolation of expressed proteins. The methods chosen for protein purification depend on many variables, including the properties of the protein of interest, its location and form within the cell, the vector, host strain background, and the intended application for the expressed protein. Culture conditions can also have an effect on solubility and localization of a given target protein. Many approaches can be used to purify target proteins expressed in recombinant microbial cells as disclosed herein, including without limitation ion exchange and gel filtration.

In some embodiments a peptide fusion tag is added to the recombinant protein making possible a variety of affinity purification methods that take advantage of the peptide fusion tag. In some embodiments, the use of an affinity method enables the purification of the target protein to near homogeneity in one step. Purification may include cleavage of part or all of the fusion tag with enterokinase, factor Xa, thrombin, or HRV 3C proteases, for example. In some embodiments, before purification or activity measurements of an expressed target protein, preliminary analysis of expression levels, cellular localization, and solubility of the target protein is performed. The target protein can be found in any or all of the following fractions: soluble or insoluble cytoplasmic fractions, periplasm, or medium. Depending on the intended application, preferential localization to inclusion bodies, medium, or the periplasmic space can be advantageous, in some embodiments, for rapid purification by relatively simple procedures.

While Escherichia coli is widely regarded as a robust host for heterologous protein expression, it is also widely known that many proteins over-expressed in this host are prone to aggregation in the form of insoluble inclusion bodies. One of the most commonly used methods, either for rescuing proteins from inclusion body formation or for improving the titer of the protein itself, is to include an amino-terminal maltose-binding protein (MBP) (Austin B P, Nallamsetty S, Waugh D S. Hexahistidine-tagged maltose-binding protein (‘Hexahistidine’ disclosed as SEQ ID NO: 44484) as a fusion partner for the production of soluble recombinant proteins in Escherichia coli. Methods Mol Biol. 2009; 498:157-72), or small ubiquitin-related modifier (SUMO) (Saitoh H, Uwada J, Azusa K. Strategies for the expression of SUMO-modified target proteins in Escherichia coli. Methods Mol Biol. 2009; 497:211-21; Malakhov M P, Mattern M R, Malakhova O A, Drinker M, Weeks S D, Butt T R. SUMO fusions and SUMO-specific protease for efficient expression and purification of proteins. J Struct Funct Genomics. 2004; 5(1-2):75-86; Panavas T, Sanders C, Butt T R. SUMO fusion technology for enhanced protein production in prokaryotic and eukaryotic expression systems. Methods Mol Biol. 2009; 497:303-17) fusion to the protein of interest. These two proteins are expressed extremely well, and in the soluble form, in Escherichia coli such that the protein of interest is also effectively produced in the soluble form. The protein of interest can be cleaved from the fusion partner by designing a site-specific protease recognition sequence (such as that of the tobacco etch virus (TEV) protease) in between the protein of interest and the fusion partner. In some embodiments, a protein of interest can be present in an inclusion body; in some aspects the inclusion body can be formulated for delivery to a subject. Formulation is discussed in further detail below.

In some embodiments the protein is initially not folded correctly or is insoluble. A variety of methods are well known for refolding of insoluble proteins. Most protocols comprise the isolation of insoluble inclusion bodies by centrifugation followed by solubilization under denaturing conditions. The protein is then dialyzed or diluted into a non-denaturing buffer where refolding occurs. Because every protein possesses unique folding properties, the optimal refolding protocol for any given protein can be empirically determined by a skilled artisan. Optimal refolding conditions can, for example, be rapidly determined on a small scale by a matrix approach, in which variables such as protein concentration, reducing agent, redox treatment, divalent cations, etc., are tested. Once the optimal concentrations are found, they can be applied to a larger scale solubilization and refolding of the target protein.
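By way of illustration only, the matrix approach described above can be sketched as a full-factorial grid of small-scale refolding conditions. In the following Python sketch the function name, the factor names, and every level shown (protein concentrations, DTT concentrations, redox treatments, and MgCl2 concentrations) are hypothetical placeholders chosen for the example and are not values taken from this disclosure.

```python
from itertools import product

# Hypothetical screening levels, for illustration only.
FACTORS = {
    "protein_mg_per_ml": [0.05, 0.1, 0.5],
    "dtt_mM": [0.0, 1.0, 5.0],
    "redox_treatment": ["none", "GSH/GSSG"],
    "mgcl2_mM": [0.0, 2.0],
}


def refolding_matrix(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, levels)) for levels in product(*level_lists)]


conditions = refolding_matrix(FACTORS)
```

Each dictionary in `conditions` describes one small-scale trial; the combination giving the best recovery of folded protein would then be carried forward to larger scale solubilization and refolding.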

In some embodiments the protein does not comprise a tertiary structure. In some embodiments less than half of the amino acids in the protein participate in a tertiary structure. In some embodiments the protein does not comprise a secondary structure. In some embodiments less than half of the amino acids in the protein participate in a secondary structure. Recombinant proteins can be isolated from a culture of cells expressing them in a state that comprises one or more of these structural features. In some embodiments the tertiary structure of a recombinant protein is reduced or eliminated after the protein is isolated from a culture producing it. In some embodiments the secondary structure of a recombinant protein is reduced or eliminated after the protein is isolated from a culture producing it.

In some embodiments a CAPS buffer at alkaline pH in combination with N-lauroylsarcosine is used to achieve solubility of the inclusion bodies, followed by dialysis in the presence of DTT to promote refolding. Depending on the target protein, expression conditions, and intended application, proteins solubilized from washed inclusion bodies can be >90% homogeneous and may not require further purification. Purification under fully denaturing conditions (before refolding) is possible using His•Tag® fusion proteins and His•Bind® immobilized metal affinity chromatography (Novagen®). In addition, S•Tag™, T7•Tag®, and Strep•Tag® II fusion proteins solubilized from inclusion bodies using 6 M urea can be purified under partially denaturing conditions by dilution to 2 M urea (S•Tag and T7•Tag) or 1 M urea (Strep•Tag II) prior to chromatography on the appropriate resin. Refolded fusion proteins can be affinity purified under native conditions using His•Tag, S•Tag, Strep•Tag II, and other appropriate affinity tags (e.g., GST•Tag™, and T7•Tag) (Novagen®).

In some embodiments the protein is an endogenous protein of the host cell used to express it. That is, the cellular genome of the host cell comprises an open reading frame that encodes the recombinant protein. In some embodiments regulatory sequences sufficient to increase expression of the protein are inserted into the host cell genome and operatively linked to the endogenous open reading frame such that the regulatory sequences drive overexpression of the recombinant protein from a recombinant nucleic acid. In some embodiments heterologous nucleic acid sequences are fused to the endogenous open reading frame of the protein and cause the protein to be synthesized comprising a heterologous amino acid sequence that changes the cellular trafficking of the recombinant protein, such as directing it to an organelle or to a secretion pathway. In some embodiments an open reading frame that encodes the endogenous host cell protein is introduced into the host cell on a plasmid that further comprises regulatory sequences operatively linked to the open reading frame. In some embodiments the recombinant host cell expresses at least 2 times, at least 3 times, at least 4 times, at least 5 times, at least 10 times, at least 20 times, at least 30 times, at least 40 times, at least 50 times, or at least 100 times more of the recombinant protein than the amount of the protein produced by a similar host cell grown under similar conditions.

Production of Recombinant Proteins in Plants

Nutritive polypeptides can be produced recombinantly from plants, including but not limited to those organisms and methods of production disclosed in PCT/US2013/032232, filed Mar. 15, 2013, PCT/US2013/032180, filed Mar. 15, 2013, PCT/US2013/032225, filed Mar. 15, 2013, PCT/US2013/032218, filed Mar. 15, 2013, PCT/US2013/032212, filed Mar. 15, 2013, PCT/US2013/032206, filed Mar. 15, 2013, and PCT/US2013/038682, filed Apr. 29, 2013 and any phylogenetically related organisms, and other methods of production known in the art.

Purification

Secreted

It is generally recognized that nearly all secreted bacterial proteins, and those proteins from other unicellular hosts, are synthesized as pre-proteins that contain N-terminal sequences known as signal peptides. These signal peptides influence the final destination of the protein and the mechanisms by which they are transported. Most signal peptides can be placed into one of four groups based on their translocation mechanism (e.g., Sec- or Tat-mediated) and the type of signal peptidase used to cleave the signal peptide from the preprotein. Also provided are N-terminal signal peptides containing a lipoprotein signal peptide. Although proteins carrying this type of signal are transported via the Sec translocase, their peptide signals tend to be shorter than normal Sec-signals and they contain a distinct sequence motif in the C-domain known as the lipo box (L(AS)(GA)C) at the −3 to +1 position. The cysteine at the +1 position is lipid modified following translocation whereupon the signal sequence is cleaved by a type II signal peptidase. Also provided are type IV or prepilin signal peptides, wherein type IV peptidase cleavage domains are localized between the N- and H-domain rather than in the C-domain common in other signal peptides.
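For illustration, the lipo box consensus described above, L(AS)(GA)C at the −3 to +1 position, can be expressed as a simple pattern search. The following Python sketch is a minimal example under stated assumptions: the function names, the 40-residue search window, and the example sequence are hypothetical, and a motif match alone does not establish that a protein is a lipoprotein.

```python
import re

# Lipo box consensus from the text: L at -3, A or S at -2, G or A at -1, C at +1.
LIPOBOX = re.compile(r"L[AS][GA]C")


def find_lipobox_cysteine(preprotein, max_search=40):
    """Return the 0-based index of the +1 cysteine, or None if no lipo box is found.

    Only the first `max_search` residues are scanned; this window is an
    assumed cutoff for the example, not a value from the disclosure.
    """
    match = LIPOBOX.search(preprotein[:max_search].upper())
    if match is None:
        return None
    return match.end() - 1


def predicted_mature_sequence(preprotein, max_search=40):
    """Return the sequence starting at the +1 cysteine, or None if no lipo box is found."""
    cys_index = find_lipobox_cysteine(preprotein, max_search)
    if cys_index is None:
        return None
    return preprotein[cys_index:]


# Hypothetical pre-protein used only to exercise the functions.
example = "MKKTLLALAVLSLLAGCSSNE"
cys_position = find_lipobox_cysteine(example)
mature = predicted_mature_sequence(example)
```

The returned index marks the cysteine that is lipid modified following translocation; the residues preceding it correspond to the signal sequence cleaved by the type II signal peptidase.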

As provided herein, the signal peptides can be attached to a heterologous polypeptide sequence (i.e., different than the protein the signal peptide is derived or obtained from) containing a nutritive polypeptide, in order to generate a recombinant nutritive polypeptide sequence. Alternatively, if a nutritive polypeptide is naturally secreted in the host organism, it can be sufficient to use the native signal sequence or a variety of signal sequences that directs secretion. In some embodiments of the nutritive polypeptides, the heterologous nutritive polypeptide sequence attached to the carboxyl terminus of the signal peptide is an edible species eukaryotic protein, a mutein or derivative thereof, or a polypeptide nutritional domain. In other embodiments of the polypeptide, the heterologous nutritive polypeptide sequence attached to the carboxyl terminus of the signal peptide is an edible species intracellular protein, a mutein or derivative thereof, or a polypeptide nutritional domain.

Purification of Nutritive Polypeptides

Also provided are methods for recovering the secreted nutritive polypeptide from the culture medium. In some embodiments the secreted nutritive polypeptide is recovered from the culture medium during the exponential growth phase or after the exponential growth phase (e.g., in pre-stationary phase or stationary phase). In some embodiments the secreted nutritive polypeptide is recovered from the culture medium during the stationary phase. In some embodiments the secreted nutritive polypeptide is recovered from the culture medium at a first time point, the culture is continued under conditions sufficient for production and secretion of the recombinant nutritive polypeptide by the microorganism, and the recombinant nutritive polypeptide is recovered from the culture medium at a second time point. In some embodiments the secreted nutritive polypeptide is recovered from the culture medium by a continuous process. In some embodiments the secreted nutritive polypeptide is recovered from the culture medium by a batch process. In some embodiments the secreted nutritive polypeptide is recovered from the culture medium by a semi-continuous process. In some embodiments the secreted nutritive polypeptide is recovered from the culture medium by a fed-batch process. Those skilled in the art are aware of many suitable methods available for culturing recombinant cells to produce (and optionally secrete) a recombinant nutritive polypeptide as disclosed herein, as well as for purification and/or isolation of expressed recombinant polypeptides. The methods chosen for polypeptide purification depend on many variables, including the properties of the polypeptide of interest. Various methods of purification are known in the art including diafiltration, precipitation, and chromatography.

Non-Secreted

In some aspects, proteins can be isolated in the absence of secretion. For example, a cell having the protein (e.g., on the cell surface or intracellularly) can be lysed and the protein can be purified using standard methods such as chromatography or antibody-based isolation of the protein from the lysate. In some aspects, a cell surface expressed protein can be enzymatically cleaved from the surface.

Isolation of Nutritive Polypeptides from Biological Materials from Edible Species

In some embodiments a nutritive polypeptide having a desired amino acid or plurality of amino acids, which are optionally present in a desired amino acid sequence, is isolated or purified from a food source, or from a biological material from an edible species. For example, a biological material of a plant includes nuts, seeds, leaves, and roots; a biological material of a mammal includes milk, muscle, sera, and liver. Isolation methods include solubilization, chromatography, and precipitation.

Nutritive polypeptides are isolated from biological materials by specific solubilization of the targeted nutritive polypeptide. The biological material is suspended and homogenized in a solubilization solution. The solubilization solution is selected based on the nutritive polypeptide's physicochemical properties. The solubilization solution is a mixture of water, detergent, salt, pH-adjusting agent, chaotrope, kosmotrope, and/or organic solvent. As an example, proteins high in proline are known to be soluble in ethanol solutions (Dickey, L. C., et al. Industrial Crops and Products 10.2 (1999): 137-143). A nutritive polypeptide with high proline content is selected and isolated by suspending the biological material in ethanol at a ratio (w/w) of liquid to biological material of 1:1, 2:1, 3:1, 4:1 or other ratio recognized in the art. The suspension is blended and insoluble material is removed by centrifugation. The ethanol-soluble nutritive polypeptide is recovered in soluble form in the ethanol fraction.

Nutritive polypeptides are isolated from biological materials by precipitation of the targeted nutritive polypeptide or precipitation of other proteins. Precipitating agents and conditions include salt, pH, heat, flocculants, chaotropes, kosmotropes, and organic solvents. The mode of precipitation is selected for a given nutritive polypeptide based on the protein's physicochemical properties. As an example, a nutritive polypeptide is selected to be thermally stable at pH 7 by low solvation score and low aggregation score as described herein. To purify this protein the biological material is suspended in a neutral pH aqueous solution and homogenized. Insoluble material is removed from solution by centrifugation. To purify the nutritive polypeptide from other proteins, the supernatant is heated to 90 degrees C. for 10 minutes. Insoluble material is removed by centrifugation. Small molecules are removed from the supernatant by dialyzing using a 3 kDa membrane, resulting in pure nutritive polypeptide.

Nutritive polypeptides are isolated from biological materials by various chromatographic methods. The mode of chromatography selected for use depends on the physicochemical properties of the target nutritive polypeptide. Charged nutritive polypeptides bind to ion exchange chromatography resin through electrostatic interactions. Hydrophobic nutritive polypeptides bind to hydrophobic interaction chromatography resin through hydrophobic association. Mixed-mode chromatography can be used for a variety of nutritive polypeptides, and can act through a variety of interactions. Metal affinity chromatography can be used for nutritive polypeptides that bind to metal ions. As an example, a nutritive polypeptide is selected to have a high charge per amino acid at pH 4 so that it binds tightly to a cation-exchange resin. The biological material is added to a low ionic strength pH 4 aqueous solution and homogenized. Insoluble material is removed by centrifugation. The soluble material is added to a cation exchange resin, such as POROS® XS Strong Cation Exchange Resin from Life Technologies, and washed with a low ionic strength pH 4 solution. The nutritive polypeptide is eluted from the resin by adding high ionic strength (e.g., 500 mM NaCl) pH 4 solution, resulting in purified nutritive polypeptide.
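By way of illustration only, the charge per amino acid relied upon in the cation-exchange example above can be estimated from a sequence using the Henderson-Hasselbalch relationship. The following Python sketch is not the scoring method of this disclosure; the function names are hypothetical, and the pKa values are assumed approximate figures (published pKa sets differ from one another and from the values in a folded protein).

```python
# Assumed approximate pKa values, for illustration only.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERMINUS_PKA = 8.0
C_TERMINUS_PKA = 3.1


def net_charge(sequence, ph):
    """Estimate the net charge of a polypeptide at a given pH."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    # Fraction protonated (charged) for basic groups, deprotonated for acidic groups.
    positive = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))
    negative = 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            positive += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            negative += 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return positive - negative


def charge_per_residue(sequence, ph=4.0):
    """Return the estimated net charge divided by the sequence length."""
    return net_charge(sequence, ph) / len(sequence)
```

Under these assumptions, a candidate with a larger positive `charge_per_residue` at pH 4 would be expected to bind a cation-exchange resin more tightly than a candidate with a value near zero or below.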

Synthetic Nutritive Polypeptide Amino Acid Compositions

In some embodiments compositions of this disclosure contain a plurality of free amino acids that represents the molar ratio of the plurality of amino acids present in a selected nutritive polypeptide, herein termed a “nutritive polypeptide blend”. The compositions in certain embodiments include both free amino acids and nutritive polypeptides. As used herein in these embodiments, disclosure of a nutritive polypeptide and compositions and formulations containing the nutritive polypeptide includes disclosure of a nutritive polypeptide blend and compositions and formulations containing the nutritive polypeptide blend, as well as a composition in which a first amount of amino acids are present in the form of a nutritive polypeptide and a second amount of amino acids are present in free amino acid form.
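For illustration, the molar ratio underlying a nutritive polypeptide blend can be derived by counting the residues of the selected nutritive polypeptide. The following Python sketch is a minimal example; the function names and the five-residue example sequence are hypothetical and are not taken from this disclosure.

```python
from collections import Counter


def blend_mole_fractions(sequence):
    """Return the mole fraction of each amino acid in a polypeptide sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in sorted(counts.items())}


def blend_millimoles(sequence, total_mmol):
    """Scale the mole fractions to a total amount of free amino acids, in mmol."""
    fractions = blend_mole_fractions(sequence)
    return {aa: fraction * total_mmol for aa, fraction in fractions.items()}


# Hypothetical sequence used only to exercise the functions.
fractions = blend_mole_fractions("MKLLV")
amounts = blend_millimoles("MKLLV", 10.0)
```

Here leucine accounts for two of the five residues, so a 10 mmol blend modeled on this sequence would contain about 4 mmol of free leucine and about 2 mmol of each of the other three amino acids.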

Synthetic Methods of Production

In some embodiments proteins of this disclosure are synthesized chemically without the use of a recombinant production system. Protein synthesis can be carried out in a liquid-phase system or in a solid-phase system using techniques known in the art (see, e.g., Atherton, E., Sheppard, R. C. (1989). Solid Phase peptide synthesis: a practical approach. Oxford, England: IRL Press; Stewart, J. M., Young, J. D. (1984). Solid phase peptide synthesis (2nd ed.). Rockford: Pierce Chemical Company).

Peptide chemistry and synthetic methods are well known in the art and a protein of this disclosure can be made using any method known in the art. A non-limiting example of such a method is the synthesis of a resin-bound peptide (including methods for de-protection of amino acids, methods for cleaving the peptide from the resin, and for its purification).

For example, Fmoc-protected amino acid derivatives that can be used to synthesize the peptides are the standard recommended derivatives: Fmoc-Ala-OH, Fmoc-Arg(Pbf)-OH, Fmoc-Asn(Trt)-OH, Fmoc-Asp(OtBu)-OH, Fmoc-Cys(Trt)-OH, Fmoc-Gln(Trt)-OH, Fmoc-Glu(OtBu)-OH, Fmoc-Gly-OH, Fmoc-His(Trt)-OH, Fmoc-Ile-OH, Fmoc-Leu-OH, Fmoc-Lys(BOC)-OH, Fmoc-Met-OH, Fmoc-Phe-OH, Fmoc-Pro-OH, Fmoc-Ser(tBu)-OH, Fmoc-Thr(tBu)-OH, Fmoc-Trp(BOC)-OH, Fmoc-Tyr(tBu)-OH and Fmoc-Val-OH (supplied from, e.g., Anaspec, Bachem, Iris Biotech, or NovabioChem). Resin bound peptide synthesis is performed, for example, using Fmoc based chemistry on a Prelude Solid Phase Peptide Synthesizer from Protein Technologies (Tucson, Ariz. 85714 U.S.A.). A suitable resin for the preparation of C-terminal carboxylic acids is a pre-loaded, low-load Wang resin available from NovabioChem (e.g. low load Fmoc-Thr(tBu)-Wang resin, LL, 0.27 mmol/g). A suitable resin for the synthesis of peptides with a C-terminal amide is PAL-ChemMatrix resin available from Matrix-Innovation. The N-terminal alpha amino group is protected with Boc.

Fmoc-deprotection can be achieved with 20% piperidine in NMP for 2×3 min. The coupling chemistry is DIC/HOAt/collidine in NMP. Amino acid/HOAt solutions (0.3 M/0.3 M in NMP at a molar excess of 3-10 fold) are added to the resin followed by the same molar equivalent of DIC (3 M in NMP) followed by collidine (3 M in NMP). For example, the following amounts of 0.3 M amino acid/HOAt solution are used per coupling for the following scale reactions: Scale/ml, 0.05 mmol/1.5 mL, 0.10 mmol/3.0 mL, 0.25 mmol/7.5 mL. Coupling time is either 2×30 min or 1×240 min. After synthesis the resin is washed with DCM, and the peptide is cleaved from the resin by a 2-3 hour treatment with TFA/TIS/water (95/2.5/2.5) followed by precipitation with diethylether. The precipitate is washed with diethylether. The crude peptide is dissolved in a suitable mixture of water and MeCN such as water/MeCN (4:1) and purified by reversed-phase preparative HPLC (Waters Deltaprep 4000 or Gilson) on a column containing C18-silica gel. Elution is performed with an increasing gradient of MeCN in water containing 0.1% TFA. Relevant fractions are checked by analytical HPLC or UPLC. Fractions containing the pure target peptide are mixed and concentrated under reduced pressure. The resulting solution is analyzed (HPLC, LCMS) and the product is quantified using a chemiluminescent nitrogen specific HPLC detector (Antek 8060 HPLC-CLND) or by measuring UV-absorption at 280 nm. The product is dispensed into glass vials. The vials are capped with Millipore glassfibre prefilters. Freeze-drying affords the peptide trifluoroacetate as a white solid. The resulting peptides can be detected and characterized using LCMS and/or UPLC, for example, using standard methods known in the art. LCMS can be performed on a setup consisting of Waters Acquity UPLC system and LCT Premier XE mass spectrometer from Micromass. 
The UPLC pump is connected to two eluent reservoirs containing: A) 0.1% Formic acid in water; and B) 0.1% Formic acid in acetonitrile. The analysis is performed at RT by injecting an appropriate volume of the sample (preferably 2-10 μl) onto the column which is eluted with a gradient of A and B. The UPLC conditions, detector settings and mass spectrometer settings are: Column: Waters Acquity UPLC BEH, C-18, 1.7 μm, 2.1 mm×50 mm. Gradient: Linear 5%-95% acetonitrile during 4.0 min (alternatively 8.0 min) at 0.4 ml/min. Detection: 214 nm (analogue output from TUV (Tunable UV detector)). MS ionisation mode: API-ES Scan: 100-2000 amu (alternatively 500-2000 amu), step 0.1 amu. UPLC methods are well known. Non-limiting examples of methods that can be used are described at pages 16-17 of US 2013/0053310 A1, published Feb. 28, 2013, for example.
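The reagent volumes quoted for the coupling step above follow from simple stoichiometry, which can be sketched as below. This is an illustrative calculation only: the function names are hypothetical, and the 9-fold molar excess is inferred from the quoted volumes (1.5 mL of 0.3 M solution per 0.05 mmol of resin-bound peptide), which falls within the stated 3-10 fold range.

```python
def solution_volume_ml(amount_mmol: float, stock_molar: float) -> float:
    """Volume of stock solution holding `amount_mmol` (1 M equals 1 mmol/mL)."""
    return amount_mmol / stock_molar


def coupling_volumes(scale_mmol: float, fold_excess: float = 9.0) -> dict:
    """Per-coupling volumes for a given synthesis scale.

    fold_excess=9.0 is an assumption inferred from the quoted volumes,
    not a value stated explicitly in the text.
    """
    reagent_mmol = scale_mmol * fold_excess
    return {
        # Amino acid/HOAt solution is 0.3 M/0.3 M in NMP.
        "amino_acid_hoat_ml": solution_volume_ml(reagent_mmol, 0.3),
        # DIC is added at the same molar equivalent from a 3 M stock.
        "dic_ml": solution_volume_ml(reagent_mmol, 3.0),
    }


for scale in (0.05, 0.10, 0.25):
    volumes = coupling_volumes(scale)
    print(scale, round(volumes["amino_acid_hoat_ml"], 2), round(volumes["dic_ml"], 3))
```

With the assumed 9-fold excess, the amino acid/HOAt volumes reproduce the 1.5 mL, 3.0 mL and 7.5 mL figures given for the 0.05 mmol, 0.10 mmol and 0.25 mmol scales.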

Inactivating Enzyme Activity

In some aspects, a protein is an enzyme or has enzymatic activity. In some aspects, it can be desirable to inactivate or reduce the enzymatic activity of the enzyme. Various methods are known in the art for enzyme inactivation including application of heat, application of one or more detergents, application of one or more metal chelators, reduction, oxidation, application of one or more chaotropes, covalent modification, altering post translational modifications, e.g., via enzymatic or chemical alteration, altering pH (acidic and basic), or altering the salt concentration. For example, heat inactivation is typically performed at a certain temperature for a certain amount of time, e.g., most endonucleases are inactivated by incubation at 65° C. for 20 minutes. In some aspects, enzymes can be mutated to eliminate or reduce enzymatic activity, e.g., by causing the enzyme to misfold. In addition, high pressure carbon dioxide (HPCD) has been demonstrated to be an effective non-thermal processing technique for inactivating enzymes. See Hu et al., Enzyme Inactivation in Food Processing using High Pressure Carbon Dioxide Technology; Critical Reviews in Food Science and Nutrition; Volume 52, Issue 2, 2013. Various other forms of enzyme inactivation are known in the art, the parameters of which can be adjusted as needed to alter enzyme activity accordingly. Methods for enzyme inactivation, and excipients used for them, include: oxidation, e.g., with bleach, H2O2, or ethylene oxide; reduction of disulphides, e.g., with DTT, BME, or TCEP; high pH using Na2CO3, Tris Base, or Na2HPO4; low pH using Citric Acid, Boric Acid, Acetic Acid, or Tris HCl; heat using temperatures of 30° C.-100° C. over a period of time; protein unfolding with chaotropes such as Thiocyanate, Urea, Guanidine HCl, or CaCl2; protein unfolding with surfactants (e.g., detergents) such as MPD, Triton (non-ionic), CHAPS (zwitterionic), or Tween (non-ionic); and metal chelation with EDTA or Citrate.

Properties of Nutritive Polypeptides and Assay Methods

Cell Proliferation Assays

Cell proliferation assays can be used to measure the relative importance of a protein or portion thereof to the proliferative process. In some aspects, cell proliferation can be measured under starvation conditions in the presence or absence of a protein of interest. For example, cells can be starved over a period of time (e.g., 48 hours) in a tissue culture incubator with a medium having or lacking each respective protein of interest. After the incubation, a detection agent such as AlamarBlue can be added and fluorescence measured as an output for proliferation. In some aspects, cell proliferation can be measured as part of a dose response to a protein of interest. For example, cells can be starved in a tissue culture incubator in medium having or lacking each respective protein of interest. After starvation, the cells can then be treated with varying concentrations (e.g., 0, 20, 100, or 1000 μM) of the protein that was lacking in the initial culture, in the same source medium lacking the respective protein. The cells can then be incubated again in a tissue culture incubator. After the incubation, a detection agent such as AlamarBlue can be added and fluorescence read.

Allergenicity Assays

For some embodiments it is preferred that the protein not exhibit inappropriately high allergenicity. Accordingly, in some embodiments the potential allergenicity of the protein is assessed. This can be done by any suitable method known in the art. In some embodiments an allergenicity score is calculated. The allergenicity score is a primary sequence based metric based on WHO recommendations (fao.org/ag/agn/food/pdf/allergygm.pdf) for assessing how similar a protein is to any known allergen, the primary prediction being that high percent identity between a target and a known allergen is likely indicative of cross reactivity. For a given protein, the likelihood of eliciting an allergic response can be assessed via one or both of a complementary pair of sequence homology based tests. The first test determines the protein's percent identity across the entire sequence via a global-global sequence alignment to a database of known allergens using the FASTA algorithm with the BLOSUM50 substitution matrix, a gap open penalty of 10, and a gap extension penalty of 2. It has been suggested that proteins with less than 50% global homology are unlikely to be allergenic (Goodman R. E. et al. Allergenicity assessment of genetically modified crops—what makes sense? Nat. Biotech. 26, 73-81 (2008); Aalberse R. C. Structural biology of allergens. J. Allergy Clin. Immunol. 106, 228-238 (2000)).

In some embodiments of a protein, the protein has less than 50% global homology to any known allergen in the database used for the analysis. In some embodiments a cutoff of less than 40% homology is used. In some embodiments a cutoff of less than 30% homology is used. In some embodiments a cutoff of less than 20% homology is used. In some embodiments a cutoff of less than 10% homology is used. In some embodiments a cutoff of from 40% to 50% is used. In some embodiments a cutoff of from 30% to 50% is used. In some embodiments a cutoff of from 20% to 50% is used. In some embodiments a cutoff of from 10% to 50% is used. In some embodiments a cutoff of from 5% to 50% is used. In some embodiments a cutoff of from 0% to 50% is used. In some embodiments a cutoff of greater than 50% global homology to any known allergen in the database used for the analysis is used. In some embodiments a cutoff of from 50% to 60% is used. In some embodiments a cutoff of from 50% to 70% is used. In some embodiments a cutoff of from 50% to 80% is used. In some embodiments a cutoff of from 50% to 90% is used. In some embodiments a cutoff of from 55% to 60% is used. In some embodiments a cutoff of from 65% to 70% is used. In some embodiments a cutoff of from 70% to 75% is used. In some embodiments a cutoff of from 75% to 80% is used.
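The cutoff logic of the first (global) test can be sketched as follows. This is a minimal illustration, not the method of the disclosure: it assumes the percent identities have already been computed elsewhere (for example by a FASTA global-global alignment with the parameters described above), and the function names and allergen labels are hypothetical.

```python
from typing import Mapping, Optional, Tuple


def closest_allergen(identities: Mapping[str, float]) -> Optional[Tuple[str, float]]:
    """Return the (allergen, percent identity) pair with the highest identity."""
    if not identities:
        return None
    name = max(identities, key=lambda key: identities[key])
    return name, identities[name]


def passes_global_test(identities: Mapping[str, float], cutoff: float = 50.0) -> bool:
    """True when every allergen in the database scores below the cutoff."""
    hit = closest_allergen(identities)
    return hit is None or hit[1] < cutoff


# Hypothetical precomputed percent identities against two database entries.
scores = {"allergen_A": 31.2, "allergen_B": 54.0}
print(closest_allergen(scores))
print(passes_global_test(scores), passes_global_test(scores, cutoff=60.0))
```

Keeping the cutoff as a parameter reflects the range of cutoffs contemplated above; 50% is used as the default only because it is the value suggested in the cited literature.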

The second test assesses the local allergenicity along the protein sequence by determining the local allergenicity of all possible contiguous 80 amino acid fragments via a global-local sequence alignment of each fragment to a database of known allergens using the FASTA algorithm with the BLOSUM50 substitution matrix, a gap open penalty of 10, and a gap extension penalty of 2. The highest percent identity of any 80 amino acid window with any allergen is taken as the final score for the protein of interest. The WHO guidelines suggest using a 35% identity cutoff with this fragment test. In some embodiments of a protein, all possible fragments of the protein have less than 35% local homology to any known allergen in the database used for the analysis using this test. In some embodiments a cutoff of less than 30% homology is used. In some embodiments a cutoff of from 30% to 35% homology is used. In some embodiments a cutoff of from 25% to 30% homology is used. In some embodiments a cutoff of from 20% to 25% homology is used. In some embodiments a cutoff of from 15% to 20% homology is used. In some embodiments a cutoff of from 10% to 15% homology is used. In some embodiments a cutoff of from 5% to 10% homology is used. In some embodiments a cutoff of from 0% to 5% homology is used. In some embodiments a cutoff of greater than 35% homology is used. In some embodiments a cutoff of from 35% to 40% homology is used. In some embodiments a cutoff of from 40% to 45% homology is used. In some embodiments a cutoff of from 45% to 50% homology is used. In some embodiments a cutoff of from 50% to 55% homology is used. In some embodiments a cutoff of from 55% to 60% homology is used. In some embodiments a cutoff of from 65% to 70% homology is used. In some embodiments a cutoff of from 70% to 75% homology is used. In some embodiments a cutoff of from 75% to 80% homology is used.
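The fragment test described above amounts to enumerating every contiguous 80 amino acid window and keeping the highest identity found against any allergen. The sketch below illustrates that bookkeeping only. The scorer `ungapped_identity` is a deliberately simple stand-in and is NOT the FASTA global-local alignment with BLOSUM50 described in the text; in practice a real aligner would be passed in as `scorer`. Returning the whole sequence as a single window when it is shorter than 80 residues is also an assumption.

```python
from typing import Callable, Iterable, Iterator


def windows(sequence: str, size: int = 80) -> Iterator[str]:
    """Yield every contiguous fragment of `size` residues.

    Assumption: a sequence shorter than `size` is yielded whole.
    """
    if len(sequence) <= size:
        yield sequence
        return
    for start in range(len(sequence) - size + 1):
        yield sequence[start:start + size]


def ungapped_identity(fragment: str, target: str) -> float:
    """Stand-in scorer: best percent identity over ungapped placements.

    This is a placeholder for a gapped global-local alignment.
    """
    if not fragment or len(target) < len(fragment):
        return 0.0
    best = 0
    for offset in range(len(target) - len(fragment) + 1):
        matches = sum(a == b for a, b in zip(fragment, target[offset:]))
        best = max(best, matches)
    return 100.0 * best / len(fragment)


def local_allergenicity_score(
    protein: str,
    allergens: Iterable[str],
    scorer: Callable[[str, str], float] = ungapped_identity,
    size: int = 80,
) -> float:
    """Highest identity of any window of the protein with any allergen."""
    allergen_list = list(allergens)
    return max(
        (scorer(frag, allergen)
         for frag in windows(protein, size)
         for allergen in allergen_list),
        default=0.0,
    )


# Toy sequences with a 4-residue window so the example stays readable.
print(local_allergenicity_score("MKTAYIAK", ["GGTAYIGG"], size=4))
```

The resulting score would then be compared with the chosen cutoff, for example the 35% identity suggested by the WHO guidelines.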

Skilled artisans are able to identify and use a suitable database of known allergens for this purpose. In some embodiments the database is custom made by selecting proteins from more than one database source. In some embodiments the custom database comprises pooled allergen lists collected by the Food Allergy Research and Resource Program (allergenonline.org/), UNIPROT annotations (uniprot.org/docs/allergen), and the Structural Database of Allergenic Proteins (SDAP, fermi.utmb.edu/SDAP/sdap_lnk.html). This database includes all allergens currently recognized by the International Union of Immunological Societies (IUIS, allergen.org/) as well as a large number of additional allergens not yet officially named. In some embodiments the database comprises a subset of known allergen proteins available in known databases; that is, the database is a custom selected subset of known allergen proteins. In some embodiments the database of known allergens comprises at least 10 proteins, at least 20 proteins, at least 30 proteins, at least 40 proteins, at least 50 proteins, at least 100 proteins, at least 200 proteins, at least 300 proteins, at least 400 proteins, at least 500 proteins, at least 600 proteins, at least 700 proteins, at least 800 proteins, at least 900 proteins, at least 1,000 proteins, at least 1,100 proteins, at least 1,200 proteins, at least 1,300 proteins, at least 1,400 proteins, at least 1,500 proteins, at least 1,600 proteins, at least 1,700 proteins, at least 1,800 proteins, at least 1,900 proteins, or at least 2,000 proteins. In some embodiments the database of known allergens comprises from 100 to 500 proteins, from 200 to 1,000 proteins, from 500 to 1,000 proteins, or from 1,000 to 2,000 proteins.

In some embodiments all (or a selected subset) of contiguous amino acid windows of different lengths (e.g., 70, 60, 50, 40, 30, 20, 10, 8 or 6 amino acid windows) of a protein are tested against the allergen database and peptide sequences that have 100% identity, 95% or higher identity, 90% or higher identity, 85% or higher identity, 80% or higher identity, 75% or higher identity, 70% or higher identity, 65% or higher identity, 60% or higher identity, 55% or higher identity, or 50% or higher identity matches are identified for further examination of potential allergenicity.
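For the 100% identity case, testing short windows against the allergen database reduces to finding shared substrings of a given length. The following is a minimal sketch of that special case only (lower identity thresholds would need an alignment-based scorer); the function name and the toy sequences are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence


def exact_window_hits(
    protein: str,
    allergens: Iterable[str],
    lengths: Sequence[int] = (8, 6),
) -> Dict[int, List[str]]:
    """Map each window length to the protein windows found verbatim in any allergen."""
    allergen_list = list(allergens)
    hits: Dict[int, List[str]] = {}
    for k in lengths:
        # Every window of length k present anywhere in the allergen database.
        allergen_windows = {
            allergen[i:i + k]
            for allergen in allergen_list
            for i in range(len(allergen) - k + 1)
        }
        protein_windows = {protein[i:i + k] for i in range(len(protein) - k + 1)}
        hits[k] = sorted(protein_windows & allergen_windows)
    return hits


# Toy example: one shared 6-residue window, no shared 8-residue window.
print(exact_window_hits("MKTAYIAKQR", ["GGTAYIAKGG"]))
```

Windows flagged in this way would be candidates for the further examination of potential allergenicity mentioned above.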

Another method of predicting the allergenicity of a protein is to assess the homology of the protein to a protein of human origin. The human immune system is exposed to a multitude of possible allergenic proteins on a regular basis and has the intrinsic ability to differentiate between the host body's proteins and exogenous proteins. The exact nature of this ability is not always clear, and there are many diseases that arise as a result of the failure of the body to differentiate self from non-self (e.g., arthritis). Nonetheless, the fundamental analysis is that proteins that share a degree of sequence homology to human proteins are less likely to elicit an immune response. In particular, it has been shown that for some protein families with known allergenic members (tropomyosins, parvalbumins, caseins), those proteins that bear more sequence homology to their human counterparts relative to known allergenic proteins are not thought to be allergenic (Jenkins J. A. et al. Evolutionary distance from human homologs reflects allergenicity of animal food proteins. J. Allergy Clin Immunol. 120 (2007): 1399-1405). For a given protein, a human homology score is measured by determining the maximum percent identity of the protein to a database of human proteins (e.g., the UNIPROT database) from a global-local alignment using the FASTA algorithm with the BLOSUM50 substitution matrix, a gap open penalty of 10, and a gap extension penalty of 2. According to Jenkins et al. (Jenkins J. A. et al. Evolutionary distance from human homologs reflects allergenicity of animal food proteins. J. Allergy Clin Immunol. 120 (2007): 1399-1405) proteins with a sequence identity to a human protein above about 62% are less likely to be allergenic. Skilled artisans are able to identify and use a suitable database of known human proteins for this purpose, for example, by searching the UNIPROT database (uniprot.org).
In some embodiments the database is custom made by selecting proteins from more than one database source. Of course the database may but need not be comprehensive. In some embodiments the database comprises a subset of human proteins; that is, the database is a custom selected subset of human proteins. In some embodiments the database of human proteins comprises at least 10 proteins, at least 20 proteins, at least 30 proteins, at least 40 proteins, at least 50 proteins, at least 100 proteins, at least 200 proteins, at least 300 proteins, at least 400 proteins, at least 500 proteins, at least 600 proteins, at least 700 proteins, at least 800 proteins, at least 900 proteins, at least 1,000 proteins, at least 2,000 proteins, at least 3,000 proteins, at least 4,000 proteins, at least 5,000 proteins, at least 6,000 proteins, at least 7,000 proteins, at least 8,000 proteins, at least 9,000 proteins, or at least 10,000 proteins. In some embodiments the database comprises from 100 to 500 proteins, from 200 to 1,000 proteins, from 500 to 1,000 proteins, from 1,000 to 2,000 proteins, from 1,000 to 5,000 proteins, or from 5,000 to 10,000 proteins. In some embodiments the database comprises at least 90%, at least 95%, or at least 99% of all known human proteins.

In some embodiments of a protein, the protein is at least 20% homologous to a human protein. In some embodiments a cutoff of at least 30% homology is used. In some embodiments a cutoff of at least 40% homology is used. In some embodiments a cutoff of at least 50% homology is used. In some embodiments a cutoff of at least 60% homology is used. In some embodiments a cutoff of at least 70% homology is used. In some embodiments a cutoff of at least 80% homology is used. In some embodiments a cutoff of at least 62% homology is used. In some embodiments a cutoff of from at least 20% homology to at least 30% homology is used. In some embodiments a cutoff of from at least 30% homology to at least 40% homology is used. In some embodiments a cutoff of from at least 50% homology to at least 60% homology is used. In some embodiments a cutoff of from at least 60% homology to at least 70% homology is used. In some embodiments a cutoff of from at least 70% homology to at least 80% homology is used.

Thermostability Assays

As used herein, a “stable” protein is one that resists changes (e.g., unfolding, oxidation, aggregation, hydrolysis, etc.) that alter the biophysical (e.g., solubility), biological (e.g., digestibility), or compositional (e.g. proportion of Leucine amino acids) traits of the protein of interest.

Protein stability can be measured using various assays known in the art, and proteins disclosed herein that have stability above a threshold can be selected. In some embodiments a protein is selected that displays thermal stability that is comparable to or better than that of whey protein. Thermal stability is a property that can help predict the shelf life of a protein. In some embodiments of the assay, the stability of protein samples is determined by monitoring aggregate formation using size exclusion chromatography (SEC) after exposure to extreme temperatures. Aqueous samples of the protein to be tested are placed in a heating block at 90° C. and samples are taken after 0, 1, 5, 10, 30 and 60 min for SEC analysis. Protein is detected by monitoring absorbance at 214 nm, and aggregates are characterized as peaks eluting faster than the protein of interest. No overall change in peak area indicates no precipitation of protein during the heat treatment. Whey protein has been shown to rapidly form 80% aggregates when exposed to 90° C. in such an assay.
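The two read-outs of the SEC assay, the fraction of aggregate and the overall recovery of peak area, can be computed from integrated peaks as in the sketch below. The peak representation (retention time, area), the retention-time tolerance and the function names are assumptions made for illustration; peak integration itself is left to the chromatography software.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (retention time in minutes, integrated area at 214 nm)


def aggregate_percent(peaks: List[Peak], monomer_rt: float, tolerance: float = 0.1) -> float:
    """Percent of total peak area eluting earlier than the protein of interest."""
    total = sum(area for _, area in peaks)
    if total <= 0:
        raise ValueError("total peak area must be positive")
    aggregated = sum(area for rt, area in peaks if rt < monomer_rt - tolerance)
    return 100.0 * aggregated / total


def recovery_percent(peaks: List[Peak], peaks_at_t0: List[Peak]) -> float:
    """Total peak area relative to the unheated (0 min) sample.

    A value near 100 suggests no protein was lost to precipitation.
    """
    reference = sum(area for _, area in peaks_at_t0)
    if reference <= 0:
        raise ValueError("reference peak area must be positive")
    return 100.0 * sum(area for _, area in peaks) / reference


# Hypothetical peak tables for the 0 min and a heated time point.
t0 = [(9.0, 100.0)]
heated = [(6.0, 80.0), (9.0, 20.0)]
print(aggregate_percent(heated, monomer_rt=9.0), recovery_percent(heated, t0))
```

In this made-up example the heated sample is 80% aggregate with full recovery of peak area, which is the behavior reported above for whey protein at 90° C.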

In some embodiments the thermal stability of a protein is determined by heating a sample slowly from 25° C. to 95° C. in the presence of a hydrophobic dye (e.g., ProteoStat® Thermal shift stability assay kit, Enzo Life Sciences) that binds to aggregated proteins that are formed as the protein denatures with increasing temperature (Niesen, F. H., Berglund, H. & Vedadi, M., 2007. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nature Protocols, Volume 2, pp. 2212-2221). Upon binding, the dye's fluorescence increases significantly, which is recorded by an rtPCR instrument and represented as the protein's melting curve (Lavinder, J. J., Hari, S. B., Sullivan, B. J. & Magliery, T. J., 2009. High-Throughput Thermal Scanning: A General, Rapid Dye-Binding Thermal Shift Screen for Protein Engineering. Journal of the American Chemical Society, pp. 3794-3795). After the thermal shift is complete, samples are examined for insoluble precipitates and further analyzed by analytical size exclusion chromatography (SEC).
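A melting curve of this kind is commonly summarized by a single transition temperature. The text does not specify how such a value is derived, so the sketch below rests on an assumption: it takes the midpoint of the temperature interval over which fluorescence rises most steeply, which is one common convention. The function name and the data points are hypothetical.

```python
from typing import Sequence


def melting_temperature(temps_c: Sequence[float], fluorescence: Sequence[float]) -> float:
    """Midpoint of the interval with the steepest fluorescence increase.

    Assumption: temperatures are strictly increasing and the transition is
    taken at the maximum of the first difference quotient.
    """
    if len(temps_c) != len(fluorescence) or len(temps_c) < 2:
        raise ValueError("need two equally long series with at least two points")
    best_slope = None
    best_midpoint = None
    for i in range(len(temps_c) - 1):
        step = temps_c[i + 1] - temps_c[i]
        if step <= 0:
            raise ValueError("temperatures must be strictly increasing")
        slope = (fluorescence[i + 1] - fluorescence[i]) / step
        if best_slope is None or slope > best_slope:
            best_slope = slope
            best_midpoint = (temps_c[i] + temps_c[i + 1]) / 2.0
    return best_midpoint


# Hypothetical melt curve sampled every 10 degrees from 25 to 95 deg C.
temps = [25, 35, 45, 55, 65, 75, 85, 95]
signal = [1.0, 1.0, 2.0, 3.0, 10.0, 11.0, 11.0, 11.0]
print(melting_temperature(temps, signal))
```

Real instrument data are sampled far more densely and are usually smoothed before differentiation; this sketch omits both steps.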

Solubility Assays

In some embodiments of the proteins disclosed herein the protein is soluble. Solubility can be measured by any method known in the art. In some embodiments solubility is examined by centrifuge concentration followed by protein concentration assays. Samples of proteins in 20 mM HEPES pH 7.5 are tested for protein concentration according to protocols using two methods, Coomassie Plus (Bradford) Protein Assay (Thermo Scientific) and Bicinchoninic Acid (BCA) Protein Assay (Sigma-Aldrich). Based on these measurements 10 mg of protein is added to an Amicon Ultra 3 kDa centrifugal filter (Millipore). Samples are concentrated by centrifugation at 10,000×g for 30 minutes. The final, now concentrated, samples are examined for precipitated protein and then tested for protein concentration as above using two methods, Bradford and BCA.

In some embodiments the proteins have a final solubility limit of at least 5 g/L, 10 g/L, 20 g/L, 30 g/L, 40 g/L, 50 g/L, or 100 g/L at physiological pH. In some embodiments the proteins are greater than 50%, greater than 60%, greater than 70%, greater than 80%, greater than 90%, greater than 95%, greater than 96%, greater than 97%, greater than 98%, greater than 99%, or greater than 99.5% soluble with no precipitated protein observed at a concentration of greater than 5 g/L, or 10 g/L, or 20 g/L, or 30 g/L, or 40 g/L, or 50 g/L, or 100 g/L at physiological pH. In some embodiments, the solubility of the protein is higher than those typically reported in studies examining the solubility limits of whey (12.5 g/L; Pelegrine et al., Lebensm.-Wiss. U.-Technol. 38 (2005) 77-80) and soy (10 g/L; Lee et al., JAOCS 80(1) (2003) 85-90).
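The comparison against the reported whey and soy solubility limits can be sketched as follows. Averaging the Bradford and BCA results into one value is an assumption made for illustration (the text only says both assays are run), and the function names are hypothetical. Concentrations are in g/L, which is numerically the same as mg/mL.

```python
WHEY_LIMIT_G_PER_L = 12.5  # whey solubility limit cited above (Pelegrine et al.)
SOY_LIMIT_G_PER_L = 10.0   # soy solubility limit cited above (Lee et al.)


def mean_concentration(bradford_g_per_l: float, bca_g_per_l: float) -> float:
    """Average of the two assay results (an assumed way of combining them)."""
    return (bradford_g_per_l + bca_g_per_l) / 2.0


def exceeds_reference_proteins(concentration_g_per_l: float) -> bool:
    """True when the soluble concentration is above both whey and soy limits."""
    return concentration_g_per_l > max(WHEY_LIMIT_G_PER_L, SOY_LIMIT_G_PER_L)


# Hypothetical assay readings for a concentrated sample with no precipitate.
concentration = mean_concentration(48.0, 52.0)
print(concentration, exceeds_reference_proteins(concentration))
```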

Eukaryotic proteins are often glycosylated, and the carbohydrate chains that are attached to proteins serve various functions. N-linked and O-linked glycosylation are the two most common forms of glycosylation occurring in proteins. N-linked glycosylation is the attachment of a sugar molecule to a nitrogen atom in an amino acid residue in a protein. N-linked glycosylation occurs at Asparagine and Arginine residues. O-linked glycosylation is the attachment of a sugar molecule to an oxygen atom in an amino acid residue in a protein. O-linked glycosylation occurs at Threonine and Serine residues.

Glycosylated proteins are often more soluble than their un-glycosylated forms. In terms of protein drugs, proper glycosylation usually confers high activity, proper antigen binding, better stability in the blood, etc. However, glycosylation necessarily means that a protein “carries with it” sugar moieties. Such sugar moieties may reduce the usefulness of the proteins of this disclosure including recombinant proteins. For example, as demonstrated in the examples, a comparison of digestion of glycosylated and non-glycosylated forms of the same proteins shows that the non-glycosylated forms are digested more quickly than the glycosylated forms. For these reasons, in some embodiments the nutritive proteins according to the disclosure comprise low or no glycosylation. For example, in some embodiments the proteins comprise a ratio of non-glycosylated to total amino acid residues of at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments the proteins do not comprise any glycosylation.
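The ratio of non-glycosylated to total amino acid residues is a simple proportion, illustrated below. The function names are hypothetical, and the number of glycosylated residues is assumed to come from a separate analysis that is not shown.

```python
def non_glycosylated_percent(total_residues: int, glycosylated_residues: int) -> float:
    """Percent of residues in the protein that carry no glycan."""
    if total_residues <= 0:
        raise ValueError("total_residues must be positive")
    if not 0 <= glycosylated_residues <= total_residues:
        raise ValueError("glycosylated_residues must lie between 0 and total_residues")
    return 100.0 * (total_residues - glycosylated_residues) / total_residues


def meets_glycosylation_threshold(total_residues: int, glycosylated_residues: int,
                                  minimum_percent: float = 95.0) -> bool:
    """Compare the ratio with one of the thresholds listed above (95% by default)."""
    return non_glycosylated_percent(total_residues, glycosylated_residues) >= minimum_percent


# A hypothetical 100-residue protein with 3 glycosylated residues.
print(non_glycosylated_percent(100, 3), meets_glycosylation_threshold(100, 3))
```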

In some embodiments, the protein according to the disclosure is de-glycosylated after it is produced or after it is isolated. Proteins of low or no glycosylation can be made by any method known in the art. For example, enzymatic and/or chemical methods can be used (Biochem. J. (2003) 376, p339-350). Enzymes are produced commercially at research scales for the removal of N-linked and O-linked oligosaccharides. Chemical methods include use of trifluoromethanesulfonic acid to selectively break N-linked and O-linked peptide-saccharide bonds. This method often results in a more complete deglycosylation than does the use of enzymatic methods.

In other embodiments, the protein according to the disclosure is produced with low or no glycosylation by a host organism. Most bacteria and other prokaryotes have very limited capabilities to glycosylate proteins, especially heterologous proteins. Accordingly, in some embodiments of this disclosure a protein is made recombinantly in a microorganism such that the recombinant protein has low or no glycosylation. In some embodiments the level of glycosylation of the recombinant protein is lower than the level of glycosylation of the protein as it occurs in the organism from which it is derived. Glycosylation of a protein can vary based on the host organism; in other words, some hosts will produce more glycosylation relative to one or more other hosts, while other hosts will produce less glycosylation relative to one or more other hosts. Differences in the amount of glycosylation can be measured based upon, e.g., the mass of glycosylation present and/or the total number of glycosylation sites present.

Toxicity and Anti-Nutricity Assays

For most embodiments it is preferred that the protein not exhibit inappropriately high toxicity. Accordingly, in some embodiments the potential toxicity of the protein is assessed. This can be done by any suitable method known in the art. In some embodiments a toxicity score is calculated by determining the protein's percent identity to databases of known toxic proteins (e.g., toxic proteins identified from the UNIPROT database). A global-global alignment of the protein of interest against the database of known toxins is performed using the FASTA algorithm with the BLOSUM50 substitution matrix, a gap open penalty of 10, and a gap extension penalty of 2. In some embodiments of a protein, the protein is less than 35% homologous to a known toxin. In some embodiments a cutoff of less than 35% homology is used. In some embodiments a cutoff of from 30% to 35% homology is used. In some embodiments a cutoff of from 25% to 35% homology is used. In some embodiments a cutoff of from 20% to 35% homology is used. In some embodiments a cutoff of from 15% to 35% homology is used. In some embodiments a cutoff of from 10% to 35% homology is used. In some embodiments a cutoff of from 5% to 35% homology is used. In some embodiments a cutoff of from 0% to 35% homology is used. In some embodiments a cutoff of greater than 35% homology is used. In some embodiments a cutoff of from 35% to 40% homology is used. In some embodiments a cutoff of from 35% to 45% homology is used. In some embodiments a cutoff of from 35% to 50% homology is used. In some embodiments a cutoff of from 35% to 55% homology is used. In some embodiments a cutoff of from 35% to 60% homology is used. In some embodiments a cutoff of from 35% to 70% homology is used. In some embodiments a cutoff of from 35% to 75% homology is used. In some embodiments a cutoff of from 35% to 80% homology is used. 
Skilled artisans are able to identify and use a suitable database of known toxins for this purpose, for example, by searching the UNIPROT database (uniprot.org). In some embodiments the database is custom made by selecting proteins identified as toxins from more than one database source. In some embodiments the database comprises a subset of known toxic proteins; that is, the database is a custom selected subset of known toxic proteins. In some embodiments the database of toxic proteins comprises at least 10 proteins, at least 20 proteins, at least 30 proteins, at least 40 proteins, at least 50 proteins, at least 100 proteins, at least 200 proteins, at least 300 proteins, at least 400 proteins, at least 500 proteins, at least 600 proteins, at least 700 proteins, at least 800 proteins, at least 900 proteins, at least 1,000 proteins, at least 2,000 proteins, at least 3,000 proteins, at least 4,000 proteins, at least 5,000 proteins, at least 6,000 proteins, at least 7,000 proteins, at least 8,000 proteins, at least 9,000 proteins, or at least 10,000 proteins. In some embodiments the database comprises from 100 to 500 proteins, from 200 to 1,000 proteins, from 500 to 1,000 proteins, from 1,000 to 2,000 proteins, from 1,000 to 5,000 proteins, or from 5,000 to 10,000 proteins.

Anti-Nutricity and Anti-Nutrients

For some embodiments it is preferred that the protein not exhibit anti-nutritional activity (“anti-nutricity”), i.e., proteins that have the potential to prevent the absorption of nutrients from food. Examples of anti-nutritive sequences causing such anti-nutricity include protease inhibitors, which inhibit the actions of trypsin, pepsin and other proteases in the gut, preventing the digestion and subsequent absorption of protein.

Disclosed herein are formulations containing isolated nutritive polypeptides that are substantially free of anti-nutritive sequences. In some embodiments the nutritive polypeptide has an anti-nutritive similarity score below about 1, below about 0.5, or below about 0.1. The nutritive polypeptide is present in the formulation in an amount greater than about 10 g, and the formulation is substantially free of anti-nutritive factors. The formulation is present as a liquid, semi-liquid or gel in a volume not greater than about 500 ml or as a solid or semi-solid in a mass not greater than about 200 g. The nutritive polypeptide may have low homology with a protease inhibitor, such as a member of the serpin family of polypeptides, e.g., it is less than 90% identical, or is less than 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, or less than 5% identical.

Accordingly, in some embodiments the potential anti-nutricity of the protein is assessed. This can be done by any suitable method known in the art. In some embodiments an anti-nutricity score is calculated by determining the protein's percent identity to databases of known protease inhibitors (e.g., protease inhibitors identified from the UNIPROT database). A global-global alignment of the protein of interest against the database of known protease inhibitors is performed using the FASTA algorithm with the BLOSUM50 substitution matrix, a gap open penalty of 10, and a gap extension penalty of 2, to identify whether the protein is homologous to a known anti-protein. In some embodiments of a protein, the protein has less than 35% global homology to any known anti-protein (e.g., any known protease inhibitor) in the database used for the analysis. In some embodiments a cutoff of less than 35% identity is used. In some embodiments a cutoff of from 30% to 35% is used. In some embodiments a cutoff of from 25% to 35% is used. In some embodiments a cutoff of from 20% to 35% is used. In some embodiments a cutoff of from 15% to 35% is used. In some embodiments a cutoff of from 10% to 35% is used. In some embodiments a cutoff of from 5% to 35% is used. In some embodiments a cutoff of from 0% to 35% is used. In some embodiments a cutoff of greater than 35% identity is used. In some embodiments a cutoff of from 35% to 40% is used. In some embodiments a cutoff of from 35% to 45% is used. In some embodiments a cutoff of from 35% to 50% is used. In some embodiments a cutoff of from 35% to 55% is used. In some embodiments a cutoff of from 35% to 60% is used. In some embodiments a cutoff of from 35% to 70% is used. In some embodiments a cutoff of from 35% to 75% is used. In some embodiments a cutoff of from 35% to 80% is used.
Skilled artisans are able to identify and use a suitable database of known protease inhibitors for this purpose, for example, by searching the UNIPROT database (uniprot.org). In some embodiments the database is custom made by selecting proteins identified as protease inhibitors from more than one database source. In some embodiments the database comprises a subset of known protease inhibitors available in databases; that is, the database is a custom selected subset of known protease inhibitor proteins. In some embodiments the database of known protease inhibitor proteins comprises at least 10 proteins, at least 20 proteins, at least 30 proteins, at least 40 proteins, at least 50 proteins, at least 100 proteins, at least 200 proteins, at least 300 proteins, at least 400 proteins, at least 500 proteins, at least 600 proteins, at least 700 proteins, at least 800 proteins, at least 900 proteins, at least 1,000 proteins, at least 1,100 proteins, at least 1,200 proteins, at least 1,300 proteins, at least 1,400 proteins, at least 1,500 proteins, at least 1,600 proteins, at least 1,700 proteins, at least 1,800 proteins, at least 1,900 proteins, or at least 2,000 proteins. In some embodiments the database of known protease inhibitor proteins comprises from 100 to 500 proteins, from 200 to 1,000 proteins, from 500 to 1,000 proteins, from 1,000 to 2,000 proteins, or from 2,000 to 3,000 proteins.

In other embodiments a protein that does exhibit some degree of protease inhibitor activity is used. For example, in some embodiments such a protein can be useful because it delays protease digestion when the nutritive protein is consumed such that the protein traverses a greater distance within the GI tract before it is digested, thus delaying absorption. For example, in some embodiments the protein inhibits gastric digestion but not intestinal digestion. Delaney B. et al. (Evaluation of protein safety in the context of agricultural biotechnology. Food. Chem. Toxicol. 46 (2008): S71-S97) suggests that one should avoid both known toxic and anti-proteins when assessing the safety of a possible food protein. In some embodiments of a protein, the protein has a favorably low level of global homology to a database of known toxic proteins and/or a favorably low level of global homology to a database of known anti-nutricity proteins (e.g., protease inhibitors), as defined herein.
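The combined toxicity and anti-nutricity criterion can be expressed as a pair of cutoff checks, sketched below. This is illustrative only: the class name and fields are hypothetical, and the maximum percent identities are assumed to have been computed beforehand by global-global alignment against the respective databases. The 35% defaults are the baseline cutoffs given above and can be changed to any of the other contemplated values.

```python
from dataclasses import dataclass


@dataclass
class HomologyScreen:
    """Highest percent identity of a candidate protein to each database."""
    max_toxin_identity: float
    max_inhibitor_identity: float

    def passes(self, toxin_cutoff: float = 35.0, inhibitor_cutoff: float = 35.0) -> bool:
        """True when both identities fall below their respective cutoffs."""
        return (self.max_toxin_identity < toxin_cutoff
                and self.max_inhibitor_identity < inhibitor_cutoff)


# Hypothetical scores for two candidate proteins.
print(HomologyScreen(22.0, 30.5).passes())
print(HomologyScreen(22.0, 41.0).passes())
```

Keeping the two cutoffs separate allows, for example, a relaxed protease inhibitor cutoff for embodiments in which some inhibitor activity is acceptable.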

Antinutrients.

Provided are nutritional compositions that lack anti-nutrients (or antinutrients). Antinutrients are compounds, usually other than proteins, which are typically found in plant foods and have been found to have both adverse effects and, in some situations, certain health benefits. For instance, phytic acid, lectins, phenolic compounds, saponins, and enzyme inhibitors have been shown to reduce the availability of nutrients and to cause the inhibition of growth, and phytoestrogens and lignans have been linked with infertility problems. On the other hand, phytic acid, lectins, phenolic compounds, amylase inhibitors, and saponins have been shown to reduce the blood glucose and insulin response to starch foods and/or the plasma cholesterol and triglycerides. Furthermore, phytic acid, phenolics, saponins, protease inhibitors, phytoestrogens, and lignans have been linked to reduced cancer risks.

Provided are methods for reducing the amount of anti-nutritional factors in a food product, by treating the food product with a thermal treatment comprising steam or hot air having a temperature greater than about 90 degrees C. for at least 1 minute, and combining the treated food product with a composition containing an isolated nutritive polypeptide. Optionally, the step of thermal treatment degrades at least one anti-nutritional factor such as a saponin, a lectin, a prolamin, a protease inhibitor, or phytic acid.

Anti-nutritional factors are detected in a protein composition as follows. Phytic acid: The procedure of Wheeler and Ferrel (Wheeler, E. L., Ferrel, R. E., Cereal Chem. 1971, 48, 312) is used for the determination of phytic acid extracted in 3% trichloroacetic acid. Raffinose family oligosaccharides: Protein samples are extracted with 70% ethanol using a Soxhlet apparatus for 6-8 h, and thin-layer chromatography is used for the quantitative determination of raffinose and stachyose in the extract according to the procedure of Tanaka et al. (Tanaka, M., Thananunkul, D., Lee, T. C., Chichester, C. O., J. Food Sci. 1975, 40, 1087-1088). Trypsin inhibitor: The method of Kakade et al. (Kakade, M. L., Rackis, J. J., McGhee, J. E., Puski, G., Cereal Chem. 1974, 51, 376-82) is used for determining the trypsin inhibitor activity in raw and treated samples. One trypsin inhibitor unit (TIU) is defined as a decrease in absorbance at 410 nm by 0.01 in 10 min, and data are expressed as TIU·mg⁻¹. Amylase inhibitor: The inhibitor is extracted in 0.15 M NaCl according to the procedure of Baker et al. (Baker, J. E., Woo, S. M., Throne, J. E., Finny, P. L., Environm. Entomol. 1991, 20, 53-60) and assayed by the method of Huesing et al. (Huesing, J. E., Shade, R. E., Chrispeels, M. J., Murdok, L. L., Plant Physiol. 1991, 96, 993-996). One amylase inhibitor unit (AIU) is defined as the amount that gives 50% inhibition of a portion of the amylase that produced one mg maltose monohydrate per min. Lectins: The procedure of Paredes-Lopez et al. (Paredes-Lopez, O., Schevenin, M. L., Guevara-Lara, F., Food Chem. 1989, 31, 129-137) is applied to the extraction of lectins using phosphate-buffered saline (PBS). The hemagglutinin activity (HA) of lectins in the sample extract is determined according to Kortt (Kortt, A. A. (Ed.), Eur. J. Biochem. 1984, 138, 519). Trypsinized human red blood cell (A, B and O) suspensions are prepared according to Lis and Sharon (Lis, H., Sharon, N., Methods Enzymol. 1972, 28, 360-368). HA is expressed as the reciprocal of the highest dilution giving positive agglutination. Tannins: The tannin contents are determined as tannic acid by Folin-Denis reagent according to the procedure of the AOAC (Helrich, K. (Ed.), AOAC, Official Methods of Analysis, Association of Official Analytical Chemists, Arlington, Va. 1990).
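By way of illustration only, the unit definition given above for trypsin inhibitor activity can be turned into a short calculation. The Python sketch below is not taken from the cited Kakade et al. procedure; it assumes a single 10-minute reading with and without the sample extract, and the function and parameter names are hypothetical.

```python
def trypsin_inhibitor_units_per_mg(a410_control, a410_sample, sample_mg,
                                   absorbance_per_unit=0.01):
    """Return trypsin inhibitor units (TIU) per mg of sample.

    Assumes one TIU corresponds to a 0.01 decrease in absorbance at 410 nm
    over the 10-minute assay, as stated in the text.

    a410_control: absorbance at 410 nm without inhibitor (hypothetical input)
    a410_sample:  absorbance at 410 nm with the sample extract present
    sample_mg:    mass of sample represented in the assay, in mg
    """
    if sample_mg <= 0:
        raise ValueError("sample_mg must be positive")
    # A sample that does not lower the absorbance contributes zero units.
    decrease = max(a410_control - a410_sample, 0.0)
    units = decrease / absorbance_per_unit
    return units / sample_mg


if __name__ == "__main__":
    # Illustrative numbers only, not measured data.
    print(trypsin_inhibitor_units_per_mg(0.50, 0.35, 1.5))
```

Dilution factors and blank corrections, which a real assay would need, are deliberately left out of this sketch.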

Charge Assays and Solvation Scoring

One feature that can enhance the utility of a protein is its charge (or per amino acid charge). Proteins with higher charge can in some embodiments exhibit desirable characteristics such as increased solubility, increased stability, resistance to aggregation, and desirable taste profiles. For example, a charged protein that exhibits enhanced solubility can be formulated into a beverage or liquid formulation that includes a high concentration of protein in a relatively low volume of solution, thus delivering a large dose of protein nutrition per unit volume. A charged protein that exhibits enhanced solubility can be useful, for example, in sports drinks or recovery drinks wherein a user (e.g., an athlete) wants to ingest protein before, during or after physical activity. A charged protein that exhibits enhanced solubility can also be particularly useful in a clinical setting wherein a subject (e.g., a patient or an elderly person) is in need of protein nutrition but is unable to ingest solid foods or large volumes of liquids.

For example, the net charge (ChargeP) of a polypeptide at pH 7 can be calculated using the following formula:


ChargeP=−0.002−(C)(0.045)−(D)(0.999)−(E)(0.998)+(H)(0.091)+(K)(1.0)+(R)(1.0)−(Y)(0.001)

where C is the number of cysteine residues, D is the number of aspartic acid residues, E is the number of glutamic acid residues, H is the number of histidine residues, K is the number of lysine residues, R is the number of arginine residues and Y is the number of tyrosine residues in the polypeptide. The per amino acid charge (ChargeA) of the polypeptide can be calculated by dividing the net charge (ChargeP) by the number of amino acid residues (N), i.e., ChargeA=ChargeP/N. (See Bassi S (2007), “A Primer on Python for Life Science Researchers.” PLoS Comput Biol 3(11): e199. doi:10.1371/journal.pcbi.0030199).
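The charge calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, residues other than C, D, E, H, K, R and Y are assumed to contribute no charge, and tyrosine is taken to contribute −0.001 per residue at pH 7 (an assumption about the intended sign of the tyrosine term, consistent with tyrosine being very weakly acidic at that pH).

```python
# Per-residue charge contributions at pH 7, taken from the coefficients of
# the ChargeP formula. The tyrosine sign is an assumption (see lead-in).
RESIDUE_CHARGE = {
    "C": -0.045,
    "D": -0.999,
    "E": -0.998,
    "H": 0.091,
    "K": 1.0,
    "R": 1.0,
    "Y": -0.001,
}
CONSTANT_TERM = -0.002  # sequence-independent term of the formula


def net_charge(sequence):
    """Net charge (ChargeP) of a polypeptide at pH 7."""
    seq = sequence.upper()
    # Residues not listed in RESIDUE_CHARGE are treated as uncharged.
    return CONSTANT_TERM + sum(RESIDUE_CHARGE.get(aa, 0.0) for aa in seq)


def charge_per_residue(sequence):
    """Per amino acid charge (ChargeA = ChargeP / N)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return net_charge(sequence) / len(sequence)


if __name__ == "__main__":
    example = "MKKDEHRY"  # arbitrary illustrative sequence
    print(net_charge(example), charge_per_residue(example))
```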

One metric for assessing the hydrophilicity and potential solubility of a given protein is the solvation score. Solvation score is defined as the total free energy of solvation (i.e. the free energy change associated with transfer from gas phase to a dilute solution) for all amino acid side chains if each residue were solvated independently, normalized by the total number of residues in the sequence. The side chain solvation free energies are found computationally by calculating the electrostatic energy difference between a vacuum dielectric of 1 and a water dielectric of 80 (by solving the Poisson-Boltzmann equation) as well as the non-polar, Van der Waals energy using a linear solvent accessible surface area model (D. Sitkoff, K. A. Sharp, B. Honig. “Accurate Calculation of Hydration Free Energies Using Macroscopic Solvent Models”. J. Phys. Chem. 98, 1994). For amino acids with ionizable sidechains (Arg, Asp, Cys, Glu, His, Lys and Tyr), an average solvation free energy is used based on the relative probabilities for each ionization state at the specified pH. Solvation scores start at 0 and continue into negative values, and the more negative the solvation score, the more hydrophilic and potentially soluble the protein is predicted to be. In some embodiments of a protein, the protein has a solvation score of −10 or less at pH 7. In some embodiments of a protein, the protein has a solvation score of −15 or less at pH 7. In some embodiments of a protein, the protein has a solvation score of −20 or less at pH 7. In some embodiments of a protein, the protein has a solvation score of −25 or less at pH 7. In some embodiments of a protein, the protein has a solvation score of −30 or less at pH 7. In some embodiments of a protein, the protein has a solvation score of −35 or less at pH 7. In some embodiments of a protein, the protein has a solvation score of −40 or less at pH 7.

The solvation score is a function of pH by virtue of the pH dependence of the molar ratio of undissociated weak acid ([HA]) to conjugate base ([A-]) as defined by the Henderson-Hasselbalch equation:

$$\mathrm{pH} = \mathrm{p}K_a + \log_{10}\left(\frac{[\mathrm{A}^-]}{[\mathrm{HA}]}\right)$$

All weak acids have different solvation free energies compared to their conjugate bases, and the solvation free energy used for a given residue when calculating the solvation score at a given pH is the weighted average of those two values.
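The pH weighting described above can be illustrated with a short Python sketch. It is a sketch under stated assumptions rather than the procedure used to generate any values in this disclosure: the function names are hypothetical, the side-chain table passed in by the caller is a placeholder (no solvation free energies from Sitkoff et al. are reproduced here), and each ionizable side chain is treated as a single two-state acid/base pair.

```python
def deprotonated_fraction(pH, pKa):
    """Fraction of a group present as conjugate base at the given pH.

    Follows from the Henderson-Hasselbalch equation:
    [A-]/[HA] = 10 ** (pH - pKa).
    """
    ratio = 10.0 ** (pH - pKa)
    return ratio / (1.0 + ratio)


def averaged_solvation_energy(pH, pKa, dg_protonated, dg_deprotonated):
    """Weighted average of the two ionization-state solvation free energies."""
    f = deprotonated_fraction(pH, pKa)
    return (1.0 - f) * dg_protonated + f * dg_deprotonated


def solvation_score(sequence, pH, side_chain_table):
    """Sum of per-residue side-chain solvation energies divided by length.

    side_chain_table (hypothetical input) maps a one-letter residue code to
    either a single free energy, or a tuple
    (pKa, dg_protonated, dg_deprotonated) for ionizable side chains.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    total = 0.0
    for aa in sequence.upper():
        entry = side_chain_table[aa]
        if isinstance(entry, tuple):
            pKa, dg_ha, dg_a = entry
            total += averaged_solvation_energy(pH, pKa, dg_ha, dg_a)
        else:
            total += entry
    return total / len(sequence)
```

When the pH equals the pKa, the two ionization states are weighted equally; far from the pKa, the score is dominated by a single state.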

Accordingly, in some embodiments of a protein, the protein has a solvation score of −10 or less at an acidic pH. In some embodiments of a protein, the protein has a solvation score of −15 or less at an acidic pH. In some embodiments of a protein, the protein has a solvation score of −20 or less at an acidic pH. In some embodiments of a protein, the protein has a solvation score of −25 or less at an acidic pH. In some embodiments of a protein, the protein has a solvation score of −30 or less at an acidic pH. In some embodiments of a protein, the protein has a solvation score of −35 or less at an acidic pH. In some embodiments of a protein, the protein has a solvation score of −40 or less at an acidic pH.

Accordingly, in some embodiments of a protein, the protein has a solvation score of −10 or less at a basic pH. In some embodiments of a protein, the protein has a solvation score of −15 or less at a basic pH. In some embodiments of a protein, the protein has a solvation score of −20 or less at a basic pH. In some embodiments of a protein, the protein has a solvation score of −25 or less at a basic pH. In some embodiments of a protein, the protein has a solvation score of −30 or less at a basic pH. In some embodiments of a protein, the protein has a solvation score of −35 or less at a basic pH. In some embodiments of a protein, the protein has a solvation score of −40 or less at a basic pH.

Accordingly, in some embodiments of a protein, the protein has a solvation score of −10 or less at a pH range selected from 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, 9-10, 10-11, and 11-12. In some embodiments of a protein, the protein has a solvation score of −15 or less at a pH range selected from 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, 9-10, 10-11, and 11-12. In some embodiments of a protein, the protein has a solvation score of −20 or less at a pH range selected from 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, 9-10, 10-11, and 11-12. In some embodiments of a protein, the protein has a solvation score of −25 or less at a pH range selected from 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, 9-10, 10-11, and 11-12. In some embodiments of a protein, the protein has a solvation score of −30 or less at a pH range selected from 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, 9-10, 10-11, and 11-12. In some embodiments of a protein, the protein has a solvation score of −35 or less at a pH range selected from 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, 9-10, 10-11, and 11-12. In some embodiments of a protein, the protein has a solvation score of −40 or less at a pH range selected from 2-3, 3-4, 4-5, 5-6, 6-7, 7-8, 8-9, 9-10, 10-11, and 11-12.

Aggregation Assays and Aggregation Scoring

In some embodiments a protein of this disclosure shows resistance to aggregation, exhibiting, for example, less than 80% aggregation, 10% aggregation, or no detectable aggregation at elevated temperatures (e.g., 50° C., 60° C., 70° C., 80° C., 85° C., 90° C., or 95° C.).

One benefit of stable proteins as disclosed herein is that they can be stored for an extended period of time before use, in some instances without the need for refrigeration or cooling. In some embodiments, proteins are processed into a dry form (e.g., by lyophilization). In some embodiments, proteins are stable upon lyophilization. In some embodiments, such lyophilized proteins maintain their stability upon reconstitution (e.g., liquid formulation).

The aggregation score is a primary sequence based metric for assessing the hydrophobicity and likelihood of aggregation of a given protein. Using the Kyte and Doolittle hydrophobicity scale (Kyte J, Doolittle R F (May 1982) “A simple method for displaying the hydropathic character of a protein”. J. Mol. Biol. 157 (1): 105-32), which gives hydrophobic residues positive values and hydrophilic residues negative values, the average hydrophobicity of a protein sequence is calculated using a moving average of five residues. The aggregation score is drawn from the resulting plot by determining the area under the curve for values greater than zero and normalizing by the total length of the protein. The underlying view is that aggregation is the result of two or more hydrophobic patches coming together to exclude water and reduce surface exposure, and the likelihood that a protein will aggregate is a function of how densely packed its hydrophobic (i.e., aggregation prone) residues are. Aggregation scores start at 0 and continue into positive values, and the smaller the aggregation score, the less hydrophobic and potentially less prone to aggregation the protein is predicted to be. In some embodiments of a protein, the protein has an aggregation score of 2 or less. In some embodiments of a protein, the protein has an aggregation score of 1.5 or less. In some embodiments of a protein, the protein has an aggregation score of 1 or less. In some embodiments of a protein, the protein has an aggregation score of 0.9 or less. In some embodiments of a protein, the protein has an aggregation score of 0.8 or less. In some embodiments of a protein, the protein has an aggregation score of 0.7 or less. In some embodiments of a protein, the protein has an aggregation score of 0.6 or less. In some embodiments of a protein, the protein has an aggregation score of 0.5 or less. In some embodiments of a protein, the protein has an aggregation score of 0.4 or less. In some embodiments of a protein, the protein has an aggregation score of 0.3 or less. In some embodiments of a protein, the protein has an aggregation score of 0.2 or less. In some embodiments of a protein, the protein has an aggregation score of 0.1 or less.
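The aggregation score described above lends itself to a compact illustration. The Python sketch below is one possible reading of that description, not the exact procedure used in this disclosure: it assumes that only windows lying entirely within the sequence are averaged, and that the area under the curve is approximated by summing the positive window averages at unit spacing. The function name is hypothetical; the scale values are those of the published Kyte and Doolittle scale.

```python
# Kyte-Doolittle hydropathy values (hydrophobic positive, hydrophilic negative).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aggregation_score(sequence, window=5):
    """Positive area under the moving-average hydropathy curve, per residue.

    Assumptions (see lead-in): full windows only, area taken as the sum of
    positive window averages. Raises KeyError for non-standard residues.
    """
    seq = sequence.upper()
    if len(seq) < window:
        raise ValueError("sequence is shorter than the averaging window")
    values = [KYTE_DOOLITTLE[aa] for aa in seq]
    area = 0.0
    for start in range(len(values) - window + 1):
        average = sum(values[start:start + window]) / window
        if average > 0:
            area += average  # only hydrophobic stretches contribute
    return area / len(seq)  # normalize by total protein length


if __name__ == "__main__":
    print(aggregation_score("MKTLLILAVVAAALADEEKKRR"))  # arbitrary example
```

Other conventions, such as padding the ends of the sequence or integrating with the trapezoidal rule, would shift the numerical values slightly, so thresholds should be compared only between scores computed the same way.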

In some cases, soluble expression is desirable because it can increase the amount and/or yield of the protein and facilitate one or more of the isolation and purification of the protein. In some embodiments, the proteins of this disclosure are solubly expressed in the host organism. Solvation score and aggregation score can be used to predict soluble expression of recombinant proteins in a host organism. As shown in Example 8, this disclosure provides evidence suggesting that proteins with solvation scores of ≤−20 and aggregation scores of ≤0.75 are more likely to be recombinantly expressed in a particular E. coli expression system. Moreover, the data also suggest that proteins with solvation scores of ≤−20 and aggregation scores of ≤0.5 are more likely to be solubly expressed in this system. Therefore, in some embodiments the protein of this disclosure has a solvation score of −20 or less. In some embodiments the nutritive protein has an aggregation score of 0.75 or less. In some embodiments the nutritive protein has an aggregation score of 0.5 or less. In some embodiments the protein has a solvation score of −20 or less and an aggregation score of 0.75 or less. In some embodiments the protein has a solvation score of −20 or less and an aggregation score of 0.5 or less.
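For illustration, the two threshold pairs mentioned above can be applied as a simple screen. The Python sketch below uses only the cutoffs stated in the text (solvation score ≤ −20 with aggregation score ≤ 0.75 or ≤ 0.5); the function name and the labels it returns are hypothetical, and the result is a heuristic flag rather than a guarantee of expression in any system.

```python
SOLVATION_CUTOFF = -20.0      # solvation score of -20 or less
EXPRESSION_AGG_CUTOFF = 0.75  # aggregation score for likelier expression
SOLUBLE_AGG_CUTOFF = 0.5      # aggregation score for likelier soluble expression


def screen_for_expression(solvation_score, aggregation_score):
    """Flag a protein against the two threshold pairs stated in the text."""
    meets_solvation = solvation_score <= SOLVATION_CUTOFF
    return {
        "more_likely_expressed":
            meets_solvation and aggregation_score <= EXPRESSION_AGG_CUTOFF,
        "more_likely_solubly_expressed":
            meets_solvation and aggregation_score <= SOLUBLE_AGG_CUTOFF,
    }


if __name__ == "__main__":
    # Illustrative scores only, not data from Example 8.
    print(screen_for_expression(-24.3, 0.62))
```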

Taste and Mouth Characteristics

Certain free amino acids and mixtures of free amino acids are known to have a bitter or otherwise unpleasant taste. In addition, hydrolysates of common proteins (e.g., whey and soy) often have a bitter or unpleasant taste. In some embodiments, proteins disclosed and described herein do not have a bitter or otherwise unpleasant taste. In some embodiments, proteins disclosed and described herein have a more acceptable taste as compared to at least one of free amino acids, mixtures of free amino acids, and protein hydrolysates. In some embodiments, proteins disclosed and described herein have a taste that is equal to or exceeds that of whey protein.

Proteins are known to have tastes covering the five established taste modalities: sweet, sour, bitter, salty, and umami. Fat can be considered a sixth taste. The taste of a particular protein (or lack thereof) can be attributed to several factors, including the primary structure, the presence of charged side chains, and the electronic and conformational features of the protein. In some embodiments, proteins disclosed and described herein are designed to have a desired taste (e.g., sweet, salty, umami) and/or not to have an undesired taste (e.g., bitter, sour). In this context “design” includes, for example, selecting edible species proteins embodying features that achieve the desired taste property, as well as creating muteins of edible species polypeptides that have desired taste properties. For example, proteins can be designed to interact with specific taste receptors, such as sweet receptors (T1R2-T1R3 heterodimer) or umami receptors (T1R1-T1R3 heterodimer, mGluR4, and/or mGluR1). Further, proteins can be designed not to interact, or to have diminished interaction, with other taste receptors, such as bitter receptors (T2R receptors).

Proteins disclosed and described herein can also elicit different physical sensations in the mouth when ingested, sometimes referred to as “mouth feel.” The mouth feel of the proteins can be due to one or more factors including primary structure, the presence of charged side chains, and the electronic and conformational features of the protein. In some embodiments, proteins elicit a buttery or fat-like mouth feel when ingested.

Nutritive Compositions and Formulations

At least one protein disclosed herein can be combined with at least one second component to form a composition. In some embodiments the only source of amino acid in the composition is the at least one protein disclosed herein. In such embodiments the amino acid composition of the composition will be the same as the amino acid composition of the at least one protein disclosed herein. In some embodiments the composition comprises at least one protein disclosed herein and at least one second protein. In some embodiments the at least one second protein is a second protein disclosed herein, while in other embodiments the at least one second protein is not a protein disclosed herein. In some embodiments the composition comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more proteins disclosed herein. In some embodiments the composition comprises 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more proteins that are not proteins disclosed herein. In some embodiments the composition comprises 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more proteins and the composition comprises 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more proteins that are not proteins disclosed herein.

Also provided are formulations containing the nutritive polypeptides described herein. In one aspect, provided is a formulation containing a unicellular organism secreted polypeptide nutritional domain. For example, the polypeptide nutritional domain contains an amino acid sequence having an N-terminal amino acid that does not correspond to the N-terminal amino acid of an amino acid sequence comprising a unicellular organism secreted polypeptide that contains the polypeptide nutritional domain. In some embodiments the amino acid sequence comprising the unicellular organism secreted polypeptide is an edible species polypeptide sequence, and the N-terminal amino acid is a common edible species amino acid. In addition or in the alternative, the polypeptide nutritional domain contains an amino acid sequence having a C-terminal amino acid that does not correspond to the C-terminal amino acid of an amino acid sequence comprising a unicellular organism secreted polypeptide that contains the polypeptide nutritional domain. In some embodiments the amino acid sequence comprising the unicellular organism secreted polypeptide is an edible species polypeptide sequence, and the C-terminal amino acid is a common edible species amino acid. Thus, in some embodiments the secreted polypeptide nutritional domain is at least one amino acid shorter than a homologous edible species polypeptide. The nutritional domain can be about 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5% or less than 5% the length of a homologous edible species peptide. In other embodiments, the polypeptide nutritional domain consists of from about 1% to about 99% of the unicellular organism secreted polypeptide that contains the polypeptide nutritional domain. As described herein, the polypeptide nutritional domain is generally preferred to the larger polypeptide containing the polypeptide nutritional domain. 
A polypeptide nutritional domain may contain, on a mass basis, more nutrition than the larger polypeptide that includes it. In some embodiments, a polypeptide nutritional domain may provide desirable features when compared to the larger polypeptide that includes it, such as increased solubility and better shelf-life stability.

In some embodiments the composition as described in the preceding paragraph, further comprises at least one of at least one polypeptide, at least one peptide, and at least one free amino acid. In some embodiments the composition comprises at least one polypeptide and at least one peptide. In some embodiments the composition comprises at least one polypeptide and at least one free amino acid. In some embodiments the composition comprises at least one peptide and at least one free amino acid. In some embodiments the at least one polypeptide, at least one peptide, and/or at least one free amino acid comprises amino acids selected from 1) branched chain amino acids, 2) leucine, and 3) essential amino acids. In some embodiments the at least one polypeptide, at least one peptide, and/or at least one free amino acid consists of amino acids selected from 1) branched chain amino acids, 2) leucine, and 3) essential amino acids. In some embodiments, the composition comprises at least one modified amino acid or a non-standard amino acid. Modified amino acids include amino acids that have modifications to one or more of the carboxy terminus, amino terminus, and/or side chain. Non-standard amino acids can be selected from those that are formed by post-translational modification of proteins, for example, carboxylated glutamate, hydroxyproline, or hypusine. Other non-standard amino acids are not found in proteins. Examples include lanthionine, 2-aminoisobutyric acid, dehydroalanine, gamma-aminobutyric acid, ornithine and citrulline. In some embodiments, the composition comprises one or more D-amino acids. In some embodiments, the composition comprises one or more L-amino acids. In some embodiments, the composition comprises a mixture of one or more D-amino acids and one or more L-amino acids.

By adding at least one of a polypeptide, a peptide, and a free amino acid to a composition the proportion of at least one of branched chain amino acids, leucine, and essential amino acids, to total amino acid, present in the composition can be increased.
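The effect described above can be illustrated numerically. The Python sketch below is an illustration under stated assumptions: proportions are computed on a residue-count basis rather than a mass basis, all polypeptides, peptides, and free amino acids in the composition are pooled into one string of one-letter codes, and the function name is hypothetical. The essential amino acid set used is histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine.

```python
BRANCHED_CHAIN = frozenset("LIV")
ESSENTIAL = frozenset("HILKMFTWV")


def amino_acid_proportions(residues):
    """Return count-based fractions of BCAA, leucine, and essential amino acids.

    residues: one-letter codes for every amino acid in the composition,
    pooled across polypeptides, peptides, and free amino acids.
    """
    pool = residues.upper()
    if not pool:
        raise ValueError("composition contains no amino acids")
    total = len(pool)
    return {
        "branched_chain": sum(aa in BRANCHED_CHAIN for aa in pool) / total,
        "leucine": pool.count("L") / total,
        "essential": sum(aa in ESSENTIAL for aa in pool) / total,
    }


if __name__ == "__main__":
    base = "LIVKAAGG"             # arbitrary illustrative composition
    supplemented = base + "LL"    # same composition plus free leucine
    print(amino_acid_proportions(base))
    print(amino_acid_proportions(supplemented))
```

In this example, adding two free leucine residues raises all three proportions; a mass-based calculation would weight each residue by its molecular mass instead.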

In some embodiments the composition comprises at least one carbohydrate. A “carbohydrate” refers to a sugar or polymer of sugars. The terms “saccharide,” “polysaccharide,” “carbohydrate,” and “oligosaccharide” can be used interchangeably. Most carbohydrates are aldehydes or ketones with many hydroxyl groups, usually one on each carbon atom of the molecule. Carbohydrates generally have the molecular formula CnH2nOn. A carbohydrate can be a monosaccharide, a disaccharide, trisaccharide, oligosaccharide, or polysaccharide. The most basic carbohydrate is a monosaccharide, such as glucose, galactose, mannose, ribose, arabinose, xylose, and fructose. Disaccharides are two joined monosaccharides. Exemplary disaccharides include sucrose, maltose, cellobiose, and lactose. Typically, an oligosaccharide includes between three and six monosaccharide units (e.g., raffinose, stachyose), and polysaccharides include six or more monosaccharide units. Exemplary polysaccharides include starch, glycogen, and cellulose. Carbohydrates may contain modified saccharide units such as 2′-deoxyribose wherein a hydroxyl group is removed, 2′-fluororibose wherein a hydroxyl group is replaced with a fluorine, or N-acetylglucosamine, a nitrogen-containing form of glucose (e.g., 2′-fluororibose, deoxyribose, and hexose). Carbohydrates may exist in many different forms, for example, conformers, cyclic forms, acyclic forms, stereoisomers, tautomers, anomers, and isomers.

In some embodiments the composition comprises at least one lipid. As used herein a “lipid” includes fats, oils, triglycerides, cholesterol, phospholipids, fatty acids in any form including free fatty acids. Fats, oils and fatty acids can be saturated, unsaturated (cis or trans) or partially unsaturated (cis or trans). In some embodiments the lipid comprises at least one fatty acid selected from lauric acid (12:0), myristic acid (14:0), palmitic acid (16:0), palmitoleic acid (16:1), margaric acid (17:0), heptadecenoic acid (17:1), stearic acid (18:0), oleic acid (18:1), linoleic acid (18:2), linolenic acid (18:3), octadecatetraenoic acid (18:4), arachidic acid (20:0), eicosenoic acid (20:1), eicosadienoic acid (20:2), eicosatetraenoic acid (20:4), eicosapentaenoic acid (20:5) (EPA), docosanoic acid (22:0), docosenoic acid (22:1), docosapentaenoic acid (22:5), docosahexaenoic acid (22:6) (DHA), and tetracosanoic acid (24:0). In some embodiments the composition comprises at least one modified lipid, for example a lipid that has been modified by cooking.

In some embodiments the composition comprises at least one supplemental mineral or mineral source. Examples of minerals include, without limitation: chloride, sodium, calcium, iron, chromium, copper, iodine, zinc, magnesium, manganese, molybdenum, phosphorus, potassium, and selenium. Suitable forms of any of the foregoing minerals include soluble mineral salts, slightly soluble mineral salts, insoluble mineral salts, chelated minerals, mineral complexes, non-reactive minerals such as carbonyl minerals, and reduced minerals, and combinations thereof.

In some embodiments the composition comprises at least one supplemental vitamin. The at least one vitamin can be a fat-soluble or water-soluble vitamin. Suitable vitamins include but are not limited to vitamin C, vitamin A, vitamin E, vitamin B12, vitamin K, riboflavin, niacin, vitamin D, vitamin B6, folic acid, pyridoxine, thiamine, pantothenic acid, and biotin. Suitable forms of any of the foregoing are salts of the vitamin, derivatives of the vitamin, compounds having the same or similar activity of the vitamin, and metabolites of the vitamin.

In some embodiments the composition comprises at least one organism. Suitable examples are well known in the art and include probiotics (e.g., species of Lactobacillus or Bifidobacterium), spirulina, chlorella, and porphyra.

In some embodiments the composition comprises at least one dietary supplement. Suitable examples are well known in the art and include herbs, botanicals, and certain hormones. Non-limiting examples include ginkgo, ginseng, and melatonin.

In some embodiments the composition comprises an excipient. Non-limiting examples of suitable excipients include a tastant, a flavorant, a buffering agent, a preservative, a stabilizer, a binder, a compaction agent, a lubricant, a dispersion enhancer, a disintegration agent, a flavoring agent, a sweetener, and a coloring agent.

In some embodiments the excipient is a buffering agent. Non-limiting examples of suitable buffering agents include sodium citrate, magnesium carbonate, magnesium bicarbonate, calcium carbonate, and calcium bicarbonate.

In some embodiments the excipient comprises a preservative. Non-limiting examples of suitable preservatives include antioxidants, such as alpha-tocopherol and ascorbate, and antimicrobials, such as parabens, chlorobutanol, and phenol.

In some embodiments the composition comprises a binder as an excipient. Non-limiting examples of suitable binders include starches, pregelatinized starches, gelatin, polyvinylpyrrolidone, cellulose, methylcellulose, sodium carboxymethylcellulose, ethylcellulose, polyacrylamides, polyvinyloxoazolidone, polyvinylalcohols, C12-C18 fatty acid alcohol, polyethylene glycol, polyols, saccharides, oligosaccharides, and combinations thereof.

In some embodiments the composition comprises a lubricant as an excipient. Non-limiting examples of suitable lubricants include magnesium stearate, calcium stearate, zinc stearate, hydrogenated vegetable oils, sterotex, polyoxyethylene monostearate, talc, polyethyleneglycol, sodium benzoate, sodium lauryl sulfate, magnesium lauryl sulfate, and light mineral oil.

In some embodiments the composition comprises a dispersion enhancer as an excipient. Non-limiting examples of suitable dispersants include starch, alginic acid, polyvinylpyrrolidones, guar gum, kaolin, bentonite, purified wood cellulose, sodium starch glycolate, isoamorphous silicate, and microcrystalline cellulose as high HLB emulsifier surfactants.

In some embodiments the composition comprises a disintegrant as an excipient. In some embodiments the disintegrant is a non-effervescent disintegrant. Non-limiting examples of suitable non-effervescent disintegrants include starches such as corn starch, potato starch, pregelatinized and modified starches thereof, sweeteners, clays, such as bentonite, micro-crystalline cellulose, alginates, sodium starch glycolate, and gums such as agar, guar, locust bean, karaya, pectin, and tragacanth. In some embodiments the disintegrant is an effervescent disintegrant. Non-limiting examples of suitable effervescent disintegrants include sodium bicarbonate in combination with citric acid, and sodium bicarbonate in combination with tartaric acid.

In some embodiments the excipient comprises a flavoring agent. Flavoring agents incorporated into the outer layer can be chosen from synthetic flavor oils and flavoring aromatics; natural oils; extracts from plants, leaves, flowers, and fruits; and combinations thereof. In some embodiments the flavoring agent is selected from cinnamon oils; oil of wintergreen; peppermint oils; clover oil; hay oil; anise oil; eucalyptus; vanilla; citrus oil such as lemon oil, orange oil, grape and grapefruit oil; and fruit essences including apple, peach, pear, strawberry, raspberry, cherry, plum, pineapple, and apricot.

In some embodiments the excipient comprises a sweetener. Non-limiting examples of suitable sweeteners include glucose (corn syrup), dextrose, invert sugar, fructose, and mixtures thereof (when not used as a carrier); saccharin and its various salts such as the sodium salt; dipeptide sweeteners such as aspartame; dihydrochalcone compounds, glycyrrhizin; Stevia rebaudiana (Stevioside); chloro derivatives of sucrose such as sucralose; and sugar alcohols such as sorbitol, mannitol, xylitol, and the like. Also contemplated are hydrogenated starch hydrolysates and the synthetic sweetener 3,6-dihydro-6-methyl-1,2,3-oxathiazin-4-one-2,2-dioxide, particularly the potassium salt (acesulfame-K), and sodium and calcium salts thereof.

In some embodiments the composition comprises a coloring agent. Non-limiting examples of suitable color agents include food, drug and cosmetic colors (FD&C), drug and cosmetic colors (D&C), and external drug and cosmetic colors (Ext. D&C). The coloring agents can be used as dyes or their corresponding lakes.

The weight fraction of the excipient or combination of excipients in the formulation is usually about 50% or less, about 45% or less, about 40% or less, about 35% or less, about 30% or less, about 25% or less, about 20% or less, about 15% or less, about 10% or less, about 5% or less, about 2% or less, or about 1% or less of the total weight of the amino acids in the composition.

The proteins and compositions disclosed herein can be formulated into a variety of forms and administered by a number of different means. The compositions can be administered orally, rectally, or parenterally, in formulations containing conventionally acceptable carriers, adjuvants, and vehicles as desired. The term “parenteral” as used herein includes subcutaneous, intravenous, intramuscular, or intrasternal injection and infusion techniques. In an exemplary embodiment, the protein or composition is administered orally.

Solid dosage forms for oral administration include capsules, tablets, caplets, pills, troches, lozenges, powders, and granules. A capsule typically comprises a core material comprising a protein or composition and a shell wall that encapsulates the core material. In some embodiments the core material comprises at least one of a solid, a liquid, and an emulsion. In some embodiments the shell wall material comprises at least one of a soft gelatin, a hard gelatin, and a polymer. Suitable polymers include, but are not limited to: cellulosic polymers such as hydroxypropyl cellulose, hydroxyethyl cellulose, hydroxypropyl methyl cellulose (HPMC), methyl cellulose, ethyl cellulose, cellulose acetate, cellulose acetate phthalate, cellulose acetate trimellitate, hydroxypropylmethyl cellulose phthalate, hydroxypropylmethyl cellulose succinate and carboxymethylcellulose sodium; acrylic acid polymers and copolymers, such as those formed from acrylic acid, methacrylic acid, methyl acrylate, ammonio methylacrylate, ethyl acrylate, methyl methacrylate and/or ethyl methacrylate (e.g., those copolymers sold under the trade name “Eudragit”); vinyl polymers and copolymers such as polyvinyl pyrrolidone, polyvinyl acetate, polyvinylacetate phthalate, vinylacetate crotonic acid copolymer, and ethylene-vinyl acetate copolymers; and shellac (purified lac). In some embodiments at least one polymer functions as a taste-masking agent.

Tablets, pills, and the like can be compressed, multiply compressed, multiply layered, and/or coated. The coating can be single or multiple. In one embodiment, the coating material comprises at least one of a saccharide, a polysaccharide, and glycoproteins extracted from at least one of a plant, a fungus, and a microbe. Non-limiting examples include corn starch, wheat starch, potato starch, tapioca starch, cellulose, hemicellulose, dextrans, maltodextrin, cyclodextrins, inulins, pectin, mannans, gum arabic, locust bean gum, mesquite gum, guar gum, gum karaya, gum ghatti, tragacanth gum, funori, carrageenans, agar, alginates, chitosans, or gellan gum. In some embodiments the coating material comprises a protein. In some embodiments the coating material comprises at least one of a fat and an oil. In some embodiments the at least one of a fat and an oil is high temperature melting. In some embodiments the at least one of a fat and an oil is hydrogenated or partially hydrogenated. In some embodiments the at least one of a fat and an oil is derived from a plant. In some embodiments the at least one of a fat and an oil comprises at least one of glycerides, free fatty acids, and fatty acid esters. In some embodiments the coating material comprises at least one edible wax. The edible wax can be derived from animals, insects, or plants. Non-limiting examples include beeswax, lanolin, bayberry wax, carnauba wax, and rice bran wax. Tablets and pills can additionally be prepared with enteric coatings.

Alternatively, powders or granules embodying the proteins and compositions disclosed herein can be incorporated into a food product. In some embodiments the food product is a drink for oral administration. Non-limiting examples of a suitable drink include fruit juice, a fruit drink, an artificially flavored drink, an artificially sweetened drink, a carbonated beverage, a sports drink, a liquid dairy product, a shake, an alcoholic beverage, a caffeinated beverage, infant formula and so forth. Other suitable means for oral administration include aqueous and nonaqueous solutions, creams, pastes, emulsions, suspensions and slurries, each of which may optionally also contain at least one of suitable solvents, preservatives, emulsifying agents, suspending agents, diluents, sweeteners, coloring agents, a tastant, a flavorant, and flavoring agents.

In some embodiments the food product is a solid foodstuff. Suitable examples of a solid foodstuff include without limitation a food bar, a snack bar, a cookie, a brownie, a muffin, a cracker, a biscuit, a cream or paste, an ice cream bar, a frozen yogurt bar, and the like.

In some embodiments, the proteins and compositions disclosed herein are incorporated into a therapeutic food. In some embodiments, the therapeutic food is a ready-to-use food that optionally contains some or all essential macronutrients and micronutrients. In some embodiments, the proteins and compositions disclosed herein are incorporated into a supplementary food that is designed to be blended into an existing meal. In some embodiments, the supplementary food contains some or all essential macronutrients and micronutrients. In some embodiments, the proteins and compositions disclosed herein are blended with or added to an existing food to fortify the food's protein nutrition. Examples include food staples (grain, salt, sugar, cooking oil, margarine), beverages (coffee, tea, soda, beer, liquor, sports drinks), snacks, sweets and other foods.

The compositions disclosed herein can be used, for example, in methods to increase at least one of muscle mass, strength and physical function, thermogenesis, metabolic expenditure, satiety, mitochondrial biogenesis, weight or fat loss, and lean body composition.

A formulation can contain a nutritive polypeptide up to about 25 g per 100 kilocalories (25 g/100 kcal) in the formulation, meaning that all or essentially all of the energy present in the formulation is in the form of the nutritive polypeptide. More typically, about 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5% or less than 5% of the energy present in the formulation is in the form of the nutritive polypeptide. In other formulations, the nutritive polypeptide is present in an amount sufficient to provide a nutritional benefit equivalent to or greater than at least about 0.1% of a reference daily intake value of polypeptide. Suitable reference daily intake values for protein are well known in the art. See, e.g., Dietary Reference Intakes for Energy, Carbohydrate, Fiber, Fat, Fatty Acids, Cholesterol, Protein and Amino Acids, Institute of Medicine of the National Academies, 2005, National Academies Press, Washington D.C. A reference daily intake value for protein is a range wherein 10-35% of daily calories are provided by protein and isolated amino acids. Another reference daily intake value based on age is provided as grams of protein per day: children ages 1-3: 13 g, children ages 4-8: 19 g, children ages 9-13: 34 g, girls ages 14-18: 46 g, boys ages 14-18: 52 g, women ages 19-70+: 46 g, and men ages 19-70+: 56 g. In other formulations, the nutritive polypeptide is present in an amount sufficient to provide a nutritional benefit to a human subject suffering from protein malnutrition or a disease, disorder or condition characterized by protein malnutrition. Protein malnutrition is commonly a prenatal or childhood condition. Protein malnutrition with adequate energy intake is termed kwashiorkor or hypoalbuminemic malnutrition, while inadequate energy intake in all forms, including inadequate protein intake, is termed marasmus. 
Adequately nourished individuals can develop sarcopenia from consumption of too little protein or consumption of proteins deficient in nutritive amino acids. Prenatal protein malnutrition can be prevented, treated or reduced by administration of the nutritive polypeptides described herein to pregnant mothers, and neonatal protein malnutrition can be prevented, treated or reduced by administration of the nutritive polypeptides described herein to the lactating mother. In adults, protein malnutrition commonly occurs secondary to cancer or chronic renal disease, and in the elderly. Additionally, protein malnutrition can be chronic or acute. Acute protein malnutrition occurs, for example, during an acute illness or disease such as sepsis, or during recovery from a traumatic injury, such as surgery, a thermal injury such as a burn, or similar events resulting in substantial tissue remodeling. Other acute illnesses treatable by the methods and compositions described herein include sarcopenia, cachexia, diabetes, insulin resistance, and obesity.
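The age-based reference daily intake values listed above can be used to express an amount of nutritive polypeptide as a percentage of the reference value, which is how the "at least about 0.1%" threshold would be checked. The following is a minimal illustrative sketch in Python, not part of the disclosed methods. The group labels, function names, and the default threshold argument are assumptions introduced for this example; only the gram values come from the text.

```python
# Reference daily intake values in grams of protein per day, as listed in the text.
# The dictionary keys are hypothetical labels chosen for this sketch.
REFERENCE_DAILY_INTAKE_G = {
    "children_1_3": 13,
    "children_4_8": 19,
    "children_9_13": 34,
    "girls_14_18": 46,
    "boys_14_18": 52,
    "women_19_plus": 46,
    "men_19_plus": 56,
}


def percent_of_reference_intake(protein_g: float, group: str) -> float:
    """Percentage of the group's reference daily protein intake supplied by protein_g grams."""
    if protein_g < 0:
        raise ValueError("protein_g must be non-negative")
    return 100.0 * protein_g / REFERENCE_DAILY_INTAKE_G[group]


def meets_minimum_benefit(protein_g: float, group: str,
                          threshold_percent: float = 0.1) -> bool:
    """True if protein_g supplies at least threshold_percent of the reference value."""
    return percent_of_reference_intake(protein_g, group) >= threshold_percent


if __name__ == "__main__":
    # A 28 g serving compared with the adult male reference value of 56 g.
    print(percent_of_reference_intake(28, "men_19_plus"))
```

Under these assumptions, a 28 g serving corresponds to half of the 56 g reference value for men aged 19 and over.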

A formulation can contain a nutritive polypeptide in an amount sufficient to provide a feeling of satiety when consumed by a human subject, meaning the subject feels a reduced sense or absence of hunger, or desire to eat. Such a formulation generally has a higher satiety index than carbohydrate-rich foods on an equivalent calorie basis.

A formulation can contain a nutritive polypeptide in an amount based on the concentration of the nutritive polypeptide (e.g., on a weight-to-weight basis), such that the nutritive polypeptide accounts for up to 100% of the weight of the formulation, meaning that all or essentially all of the matter present in the formulation is in the form of the nutritive polypeptide. More typically, about 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5% or less than 5% of the weight present in the formulation is in the form of the nutritive polypeptide. In some embodiments, the formulation contains 10 mg, 100 mg, 500 mg, 750 mg, 1 g, 2 g, 3 g, 4 g, 5 g, 6 g, 7 g, 8 g, 9 g, 10 g, 15 g, 20 g, 25 g, 30 g, 35 g, 40 g, 45 g, 50 g, 60 g, 70 g, 80 g, 90 g, 100 g or over 100 g of nutritive polypeptide.

Preferably, the formulations provided herein are substantially free of non-comestible products. Non-comestible products are often found in preparations of recombinant proteins of the prior art, produced from yeast, bacteria, algae, insect, mammalian or other expression systems. Exemplary non-comestible products include a surfactant, a polyvinyl alcohol, a propylene glycol, a polyvinyl acetate, a polyvinylpyrrolidone, a non-comestible polyacid or polyol, a fatty alcohol, an alkylbenzyl sulfonate, an alkyl glucoside, or a methyl paraben.

In some aspects, the provided formulations contain other materials, such as a tastant, a nutritional carbohydrate and/or a nutritional lipid. In addition, formulations may include bulking agents, texturizers, and fillers.

In preferred embodiments, the nutritive polypeptides provided herein are isolated and/or substantially purified. The nutritive polypeptides and the compositions and formulations provided herein are substantially free of non-protein components. Such non-protein components are generally present in protein preparations such as whey, casein, egg and soy preparations, which contain substantial amounts of carbohydrates and lipids that complex with the polypeptides and result in delayed and incomplete protein digestion in the gastrointestinal tract. Such non-protein components can also include DNA. Thus, the nutritive polypeptides, compositions and formulations are characterized by improved digestibility and decreased allergenicity as compared to food-derived polypeptides and polypeptide mixtures. Furthermore, these formulations and compositions are characterized by more reproducible digestibility on a time basis and/or on the basis of digestion products at a given unit time. In certain embodiments, a nutritive polypeptide is at least 10% reduced in lipids and/or carbohydrates, and optionally one or more other materials that decrease digestibility and/or increase allergenicity, relative to a reference polypeptide or reference polypeptide mixture, e.g., is reduced by 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or greater than 99%. In certain embodiments, the nutritive formulations contain a nutritional carbohydrate and/or nutritional lipid, which may be selected for digestibility and/or reduced allergenicity.

Methods of Use

In some embodiments the proteins and compositions disclosed herein are administered to a patient or a user (sometimes collectively referred to as a “subject”). As used herein “administer” and “administration” encompass embodiments in which one person directs another to consume a protein or composition in a certain manner and/or for a certain purpose, and also situations in which a user uses a protein or composition in a certain manner and/or for a certain purpose independently of or at variance with any instructions received from a second person. Non-limiting examples of embodiments in which one person directs another to consume a protein or composition in a certain manner and/or for a certain purpose include when a physician prescribes a course of conduct and/or treatment to a patient, when a trainer advises a user (such as an athlete) to follow a particular course of conduct and/or treatment, and when a manufacturer, distributor, or marketer recommends conditions of use to an end user, for example through advertisements or labeling on packaging or on other materials provided in association with the sale or marketing of a product.

In some embodiments the proteins or compositions are provided in a dosage form. In some embodiments the dosage form is designed for administration of at least one protein disclosed herein, wherein the total amount of protein administered is selected from 0.1 g to 1 g, from 1 g to 5 g, from 2 g to 10 g, from 5 g to 15 g, from 10 g to 20 g, from 15 g to 25 g, from 20 g to 40 g, from 25 g to 50 g, and from 30 g to 60 g. In some embodiments the dosage form is designed for administration of at least one protein disclosed herein, wherein the total amount of protein administered is selected from about 0.1 g, 0.1 g-1 g, 1 g, 2 g, 3 g, 4 g, 5 g, 6 g, 7 g, 8 g, 9 g, 10 g, 15 g, 20 g, 25 g, 30 g, 35 g, 40 g, 45 g, 50 g, 55 g, 60 g, 65 g, 70 g, 75 g, 80 g, 85 g, 90 g, 95 g, and 100 g.

In some embodiments the dosage form is designed for administration of at least one protein disclosed herein, wherein the total amount of essential amino acids administered is selected from 0.1 g to 1 g, from 1 g to 5 g, from 2 g to 10 g, from 5 g to 15 g, from 10 g to 20 g, and from 1 g to 30 g. In some embodiments the dosage form is designed for administration of at least one protein disclosed herein, wherein the total amount of protein administered is selected from about 0.1 g, 0.1 g-1 g, 1 g, 2 g, 3 g, 4 g, 5 g, 6 g, 7 g, 8 g, 9 g, 10 g, 15 g, 20 g, 25 g, 30 g, 35 g, 40 g, 45 g, 50 g, 55 g, 60 g, 65 g, 70 g, 75 g, 80 g, 85 g, 90 g, 95 g, and 100 g.

In some embodiments the protein or composition is consumed at a rate of from 0.1 g to 1 g a day, from 1 g to 5 g a day, from 2 g to 10 g a day, from 5 g to 15 g a day, from 10 g to 20 g a day, from 15 g to 30 g a day, from 20 g to 40 g a day, from 25 g to 50 g a day, from 40 g to 80 g a day, from 50 g to 100 g a day, or more.

In some embodiments, at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or about 100% of the total protein intake by the subject over a dietary period is made up of at least one protein according to this disclosure. In some embodiments, from 5% to 100%, from 5% to 90%, from 5% to 80%, from 5% to 70%, from 5% to 60%, from 5% to 50%, from 5% to 40%, from 5% to 30%, from 5% to 20%, from 5% to 10%, from 10% to 100%, from 20% to 100%, from 30% to 100%, from 40% to 100%, from 50% to 100%, from 60% to 100%, from 70% to 100%, from 80% to 100%, or from 90% to 100% of the total protein intake by the subject over a dietary period is made up of at least one protein according to this disclosure. 
In some embodiments the at least one protein of this disclosure accounts for at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, or at least 50% of the subject's calorie intake over a dietary period.
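The share of total protein intake described above is a simple ratio taken over all meals in the dietary period. The following Python sketch illustrates one way to compute it from a meal log. The function name and the log layout (one pair of gram amounts per meal) are hypothetical and are not specified by this disclosure.

```python
from typing import Iterable, Tuple


def disclosed_protein_share(meals: Iterable[Tuple[float, float]]) -> float:
    """Percentage of total protein intake supplied by protein of this disclosure.

    Each meal is a (disclosed_protein_g, other_protein_g) pair; the list of
    meals stands for one dietary period (hypothetical layout for this sketch).
    """
    disclosed_total = 0.0
    other_total = 0.0
    for disclosed_g, other_g in meals:
        if disclosed_g < 0 or other_g < 0:
            raise ValueError("gram amounts must be non-negative")
        disclosed_total += disclosed_g
        other_total += other_g
    total = disclosed_total + other_total
    if total == 0:
        raise ValueError("no protein was consumed in the dietary period")
    return 100.0 * disclosed_total / total


if __name__ == "__main__":
    # Two meals: 20 g + 10 g of disclosed protein out of 60 g of protein in total.
    print(disclosed_protein_share([(20, 10), (10, 20)]))
```

The resulting percentage can then be compared with any of the recited thresholds or ranges, for example "at least 25%" or "from 5% to 50%".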

In some embodiments the at least one protein according to this disclosure comprises at least 2 proteins of this disclosure, at least 3 proteins of this disclosure, at least 4 proteins of this disclosure, at least 5 proteins of this disclosure, at least 6 proteins of this disclosure, at least 7 proteins of this disclosure, at least 8 proteins of this disclosure, at least 9 proteins of this disclosure, at least 10 proteins of this disclosure, or more.

In some embodiments the dietary period is 1 meal, 2 meals, 3 meals, at least 1 day, at least 2 days, at least 3 days, at least 4 days, at least 5 days, at least 6 days, at least 1 week, at least 2 weeks, at least 3 weeks, at least 4 weeks, at least 1 month, at least 2 months, at least 3 months, at least 4 months, at least 5 months, at least 6 months, or at least 1 year. In some embodiments the dietary period is from 1 day to 1 week, from 1 week to 4 weeks, from 1 month to 3 months, from 3 months to 6 months, or from 6 months to 1 year.

Clinical studies provide evidence that protein prevents muscle loss due to aging or disuse, such as from immobility or prolonged bed rest. In particular, studies have shown that protein supplementation increases muscle fractional synthetic rate (FSR) during prolonged bed rest, maintains leg mass and strength during prolonged bed rest, increases lean body mass, improves functional measures of gait and balance, and may serve as a viable intervention for individuals at risk of sarcopenia due to immobility or prolonged bed rest. See, e.g., Paddon-Jones D, et al. J Clin Endocrinol Metab 2004, 89:4351-4358; Ferrando, A et al. Clinical Nutrition 2009 1-6; Katsanos C et al. Am J Physiol Endocrinol Metab. 2006, 291: 381-387.

Studies on increasing muscle protein anabolism in athletes have shown that protein provided following exercise promotes muscle hypertrophy to a greater extent than that achieved by exercise alone. It has also been shown that protein provided following exercise supports protein synthesis without any increase in protein breakdown, resulting in a net positive protein balance and muscle mass accretion. While muscle protein synthesis appears to respond in a dose-response fashion to essential amino acid supplementation, not all proteins are equal in building muscle. For example, the amino acid leucine is an important factor in stimulating muscle protein synthesis. See, e.g., Borsheim E et al. Am J Physiol Endocrinol Metab 2002, 283: E648-E657; Borsheim E et al. Clin Nutr. 2008, 27: 189-95; Esmarck B et al. J Physiol 2001, 535: 301-311; Moore D et al. Am J Clin Nutr 2009, 89: 161-8.

In another aspect this disclosure provides methods of maintaining or increasing at least one of muscle mass, muscle strength, and functional performance in a subject. In some embodiments the methods comprise providing to the subject a sufficient amount of a protein of this disclosure, a composition of this disclosure, or a composition made by a method of this disclosure. In some embodiments the subject is at least one of elderly, critically-medically ill, and suffering from protein-energy malnutrition. In some embodiments the sufficient amount of a protein of this disclosure, a composition of this disclosure, or a composition made by a method of this disclosure is consumed by the subject in coordination with performance of exercise. In some embodiments the protein of this disclosure, composition of this disclosure, or composition made by a method of this disclosure is consumed by the subject by an oral, enteral, or parenteral route. In some embodiments the protein of this disclosure, composition of this disclosure, or composition made by a method of this disclosure is consumed by the subject by an oral route. In some embodiments the protein of this disclosure, composition of this disclosure, or composition made by a method of this disclosure is consumed by the subject by an enteral route.

In another aspect this disclosure provides methods of maintaining or achieving a desirable body mass index in a subject. In some embodiments the methods comprise providing to the subject a sufficient amount of a protein of this disclosure, a composition of this disclosure, or a composition made by a method of this disclosure. In some embodiments the subject is at least one of elderly, critically-medically ill, and suffering from protein-energy malnutrition. In some embodiments the sufficient amount of a protein of this disclosure, a composition of this disclosure, or a composition made by a method of this disclosure is consumed by the subject in coordination with performance of exercise. In some embodiments the protein of this disclosure, composition of this disclosure, or composition made by a method of this disclosure is consumed by the subject by an oral, enteral, or parenteral route.

In another aspect this disclosure provides methods of providing protein to a subject with protein-energy malnutrition. In some embodiments the methods comprise providing to the subject a sufficient amount of a protein of this disclosure, a composition of this disclosure, or a composition made by a method of this disclosure. In some embodiments the protein of this disclosure, composition of this disclosure, or composition made by a method of this disclosure is consumed by the subject by an oral, enteral, or parenteral route.

The need for essential amino acid supplementation has been suggested in cancer patients and other patients suffering from muscle wasting and cachexia. Dietary studies in mice have shown survival and functional benefits to cachectic cancer-bearing mice through dietary intervention with essential amino acids. Beyond cancer, essential amino acid supplementation has also shown benefits, such as improved muscle function and muscle gain, in patients whose diseases, such as chronic obstructive pulmonary disease, chronic heart failure, HIV, and other disease states, make exercise difficult and therefore lead to muscular deterioration.

Studies have shown that specific amino acids have advantages in managing cachexia. A relatively high content of BCAAs and Leu in diets is thought to have a positive effect in cachexia by promoting total protein synthesis by signaling an increase in translation, enhancing insulin release, and inhibiting protein degradation. Thus, consuming increased dietary BCAAs in general and/or Leu in particular will contribute positively to reduce or reverse the effects of cachexia. Because nitrogen balance is important in countering the underlying cause of cachexia, it is thought that consuming increased dietary glutamine and/or arginine will contribute positively to reduce or reverse the effects of cachexia. See, e.g., Op den Kamp C, Langen R, Haegens A, Schols A. “Muscle atrophy in cachexia: can dietary protein tip the balance?” Current Opinion in Clinical Nutrition and Metabolic Care 2009, 12:611-616; Poon R T-P, Yu W-C, Fan S-T, et al. “Long-term oral branched chain amino acids in patients undergoing chemoembolization for hepatocellular carcinoma: a randomized trial.” Aliment Pharmacol Ther 2004; 19:779-788; Tayek J A, Bistrian B R, Hehir D J, Martin R, Moldawer L L, Blackburn G L. “Improved protein kinetics and albumin synthesis by branched chain amino acid-enriched total parenteral nutrition in cancer cachexia.” Cancer. 1986; 58:147-57; Xi P, Jiang Z, Zheng C, Lin Y, Wu G “Regulation of protein metabolism by glutamine: implications for nutrition and health.” Front Biosci. 2011 Jan. 1; 16:578-97.

Accordingly, also provided herein are methods of treating cachexia in a subject. In some embodiments a sufficient amount of a protein of this disclosure, a composition of this disclosure, or a composition made by a method of this disclosure for a subject with cachexia is an amount such that the amount of protein of this disclosure ingested by the subject meets or exceeds the subject's metabolic needs (which are often elevated). A protein intake of 1.5 g/kg of body weight per day or 15-20% of total caloric intake appears to be an appropriate target for persons with cachexia. In some embodiments all of the protein consumed by the subject is a protein according to this disclosure. In some embodiments protein according to this disclosure is combined with other sources of protein and/or free amino acids to provide the total protein intake of the subject. In some embodiments the subject is at least one of elderly, critically-medically ill, and suffering from protein-energy malnutrition. In some embodiments the subject suffers from a disease that makes exercise difficult and therefore causes muscular deterioration, such as chronic obstructive pulmonary disease, chronic heart failure, HIV, cancer, and other disease states. In some embodiments, the protein according to this disclosure, the composition according to this disclosure, or the composition made by a method according to this disclosure is consumed by the subject in coordination with performance of exercise. In some embodiments, the protein according to this disclosure, the composition according to this disclosure, or the composition made by a method according to this disclosure is consumed by the subject by an oral, enteral, or parenteral route.
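The two intake targets mentioned above (1.5 g of protein per kg of body weight per day, or 15-20% of total caloric intake) can be converted to grams of protein per day as in the following illustrative Python sketch. The function names are hypothetical, and the conversion assumes the general factor of 4 kilocalories per gram of protein that is stated elsewhere in this document. The sketch is an arithmetic illustration only and is not dosing guidance.

```python
# Assumption: 4 kcal per gram of protein (general factor stated in this document).
KCAL_PER_G_PROTEIN = 4.0


def protein_target_by_weight(body_weight_kg: float, g_per_kg: float = 1.5) -> float:
    """Daily protein target in grams from body weight (default 1.5 g/kg/day)."""
    if body_weight_kg <= 0:
        raise ValueError("body_weight_kg must be positive")
    return body_weight_kg * g_per_kg


def protein_target_by_energy(total_kcal_per_day: float,
                             low_percent: float = 15,
                             high_percent: float = 20) -> tuple:
    """Daily protein target range in grams from the share of total energy (default 15-20%)."""
    if total_kcal_per_day <= 0:
        raise ValueError("total_kcal_per_day must be positive")
    low_g = total_kcal_per_day * low_percent / 100 / KCAL_PER_G_PROTEIN
    high_g = total_kcal_per_day * high_percent / 100 / KCAL_PER_G_PROTEIN
    return (low_g, high_g)


if __name__ == "__main__":
    # Example subject: 70 kg body weight, 2000 kcal consumed per day.
    print(protein_target_by_weight(70))
    print(protein_target_by_energy(2000))
```

For a 70 kg subject consuming 2000 kcal per day, the weight-based target is 105 g per day and the energy-based range is 75 g to 100 g per day, so the two criteria need not coincide.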

Sarcopenia is the degenerative loss of skeletal muscle mass (typically 0.5-1% loss per year after the age of 25), quality, and strength associated with aging. Sarcopenia is a component of the frailty syndrome. The European Working Group on Sarcopenia in Older People (EWGSOP) has developed a practical clinical definition and consensus diagnostic criteria for age-related sarcopenia. For the diagnosis of sarcopenia, the working group has proposed using the presence of both low muscle mass and low muscle function (strength or performance). Sarcopenia is characterized first by muscle atrophy (a decrease in the size of the muscle), along with a reduction in muscle tissue “quality,” caused by such factors as replacement of muscle fibres with fat, an increase in fibrosis, changes in muscle metabolism, oxidative stress, and degeneration of the neuromuscular junction. Combined, these changes lead to progressive loss of muscle function and eventually to frailty. Frailty is a common geriatric syndrome that embodies an elevated risk of catastrophic declines in health and function among older adults. Contributors to frailty can include sarcopenia, osteoporosis, and muscle weakness. Muscle weakness, also known as muscle fatigue or “lack of strength,” refers to the inability to exert force with one's skeletal muscles. Weakness often follows muscle atrophy and a decrease in activity, such as after a long bout of bedrest as a result of an illness. There is also a gradual onset of muscle weakness as a result of sarcopenia.

The proteins of this disclosure are useful for treating sarcopenia or frailty once it develops in a subject or for preventing the onset of sarcopenia or frailty in a subject who is a member of an at-risk group. In some embodiments all of the protein consumed by the subject is a protein according to this disclosure. In some embodiments protein according to this disclosure is combined with other sources of protein and/or free amino acids to provide the total protein intake of the subject. In some embodiments the subject is at least one of elderly, critically-medically ill, and suffering from protein-energy malnutrition. In some embodiments, the protein according to this disclosure, the composition according to this disclosure, or the composition made by a method according to this disclosure is consumed by the subject in coordination with performance of exercise. In some embodiments, the protein according to this disclosure, the composition according to this disclosure, or the composition made by a method according to this disclosure is consumed by the subject by an oral, enteral, or parenteral route.

Obesity is a multifactorial disorder associated with a host of comorbidities including hypertension, type 2 diabetes, dyslipidemia, coronary heart disease, stroke, cancer (e.g., endometrial, breast, and colon), osteoarthritis, sleep apnea, and respiratory problems. The incidence of obesity, defined as a body mass index >30 kg/m², has increased dramatically in the United States, from 15% (1976-1980) to 33% (2003-2004), and it continues to grow. Although the mechanisms contributing to obesity are complex and involve the interplay of behavioral components with hormonal, genetic, and metabolic processes, obesity is largely viewed as a lifestyle-dependent condition with 2 primary causes: excessive energy intake and insufficient physical activity. With respect to energy intake, there is evidence that modestly increasing the proportion of protein in the diet, while controlling total energy intake, may improve body composition, facilitate fat loss, and improve body weight maintenance after weight loss. Positive outcomes associated with increased dietary protein are thought to be due primarily to lower energy intake associated with increased satiety, reduced energy efficiency and/or increased thermogenesis, positive effects on body composition (specifically lean muscle mass), and enhanced glycemic control.

Dietary proteins are more effective in increasing post-prandial energy expenditure than isocaloric intakes of carbohydrates or fat (see, e.g., Dauncey M, Bingham S. “Dependence of 24 h energy expenditure in man on composition of the nutrient intake.” Br J Nutr 1983, 50: 1-13; Karst H et al. “Diet-induced thermogenesis in man: thermic effects of single proteins, carbohydrates and fats depending on their energy amount.” Ann Nutr Metab. 1984, 28: 245-52; Tappy L et al. “Thermic effect of infused amino acids in healthy humans and in subjects with insulin resistance.” Am J Clin Nutr 1993, 57 (6): 912-6). This property, along with other properties (satiety induction; preservation of lean body mass), makes protein an attractive component of diets directed at weight management. The increase in energy expenditure caused by such diets may in part be due to the fact that the energy cost of digesting and metabolizing protein is higher than for other calorie sources. Protein turnover, including protein synthesis, is an energy-consuming process. In addition, high protein diets may also up-regulate uncoupling protein in liver and brown adipose tissue, which is positively correlated with increases in energy expenditure. It has been theorized that different proteins may have unique effects on energy expenditure.

Studies suggest that ingestion of protein, particularly proteins with high EAA and/or BCAA content, leads to distinct effects on thermogenesis and energy expenditure (see, e.g., Mikkelsen P. et al. “Effect of fat-reduced diets on 24 h energy expenditure: comparisons between animal protein, vegetable protein and carbohydrate.” Am J Clin Nutr 2000, 72:1135-41; Acheson K. et al. “Protein choices targeting thermogenesis and metabolism.” Am J Clin Nutr 2011, 93:525-34; Alfenas R. et al. “Effects of protein quality on appetite and energy metabolism in normal weight subjects” Arq Bras Endocrinol Metabol 2010, 54 (1): 45-51; Lorenzen J. et al. “The effect of milk proteins on appetite regulation and diet-induced thermogenesis.” Eur J Clin Nutr 2012, 66 (5): 622-7). Additionally, L-tyrosine has been identified as an amino acid that plays a role in thermogenesis (see, e.g., Belza A. et al. “The beta-adrenergic antagonist propranolol partly abolishes thermogenic response to bioactive food ingredients.” Metabolism 2009, 58 (8):1137-44). Further studies suggest that leucine and arginine supplementation alters energy metabolism by directing substrate to lean body mass rather than adipose tissue (Dulloo A. “The search for compounds that stimulate thermogenesis in obesity management: from pharmaceuticals to functional food ingredients.” Obes Rev 2011, 12: 866-83).

Collectively, the literature suggests that different protein types lead to distinct effects on thermogenesis. Because proteins or peptides rich in EAAs, BCAAs, and/or at least one of Tyr, Arg, and Leu are believed to have a stimulatory effect on thermogenesis, and because stimulation of thermogenesis is believed to lead to positive effects on weight management, this disclosure also provides products and methods useful to stimulate thermogenesis and/or to bring about positive effects on weight management in general.

More particularly, this disclosure provides methods of increasing thermogenesis in a subject. In some embodiments the methods comprise providing to the subject a sufficient amount of a protein of this disclosure, a composition of this disclosure, or a composition made by a method of this disclosure. In some embodiments the subject is obese. In some embodiments, the protein according to this disclosure, the composition according to this disclosure, or the composition made by a method according to this disclosure is consumed by the subject in coordination with performance of exercise. In some embodiments, the protein according to this disclosure, the composition according to this disclosure, or the composition made by a method according to this disclosure is consumed by the subject by an oral, enteral, or parenteral route.

At the most basic level, an overweight condition develops because of an imbalance between energy intake and energy expenditure. Attempts to reduce food intake at any particular eating occasion (satiation) and across eating occasions (satiety) have been a major focus of recent research. Reduced caloric intake as a consequence of feeling satisfied during a meal and feeling full after a meal results from a complex interaction of internal and external signals. Various nutritional studies have demonstrated that variation in food properties such as energy density, content, texture and taste influences both satiation and satiety.

There are three macronutrients that deliver energy: fat, carbohydrates and proteins. A gram of protein or carbohydrate provides 4 calories, while a gram of fat provides 9 calories. Protein generally increases satiety to a greater extent than carbohydrates or fat and therefore may facilitate a reduction in calorie intake. However, there is considerable evidence that indicates the type of protein matters in inducing satiety (see, e.g., W. L. Hall, et al. “Casein and whey exert different effects on plasma amino acid profiles, gastrointestinal hormone secretion and appetite.” Br J Nutr. 2003 February, 89(2):239-48; R. Abou-Samra, et al. “Effect of different protein sources on satiation and short-term satiety when consumed as a starter.” Nutr J. 2011 Dec. 23, 10:139; T. Akhavan, et al. “Effect of premeal consumption of whey protein and its hydrolysate on food intake and postmeal glycemia and insulin responses in young adults.” Am J Clin Nutr. 2010 April, 91(4):966-75, Epub 2010 Feb. 17; MA Veldhorst “Dose-dependent satiating effect of whey relative to casein or soy” Physiol Behav. 2009 Mar. 23, 96(4-5):675-82). Evidence indicates that protein rich in Leucine is particularly effective at inducing satiety (see, e.g., Fromentin G et al “Peripheral and central mechanisms involved in the control of food intake by dietary amino acids and proteins.” Nutr Res Rev 2012 25: 29-39).

In some embodiments a protein of this disclosure is consumed by a subject concurrently with at least one pharmaceutical or biologic drug product. In some embodiments the beneficial effects of the protein and the at least one pharmaceutical or biologic drug product have an additive effect while in some embodiments the beneficial effects of the protein and the at least one pharmaceutical or biologic drug product have a synergistic effect. Examples of pharmaceutical or biologic drug products that can be administered with the proteins of this disclosure are well known in the art. For example, when a protein of this disclosure is used to maintain or increase at least one of muscle mass, muscle strength, and functional performance in a subject, the protein can be consumed by a subject concurrently with a therapeutic dosage regime of at least one pharmaceutical or biologic drug product indicated to maintain or increase at least one of muscle mass, muscle strength, and functional performance in a subject, such as an anabolic steroid. When a protein of this disclosure is used to maintain or achieve a desirable body mass index in a subject, the protein can be consumed by a subject concurrently with a therapeutic dosage regime of at least one pharmaceutical or biologic drug product indicated to maintain or achieve a desirable body mass index in a subject, such as orlistat, lorcaserin, sibutramine, rimonabant, metformin, exenatide, or pramlintide. When a protein of this disclosure is used to induce at least one of a satiation response and a satiety response in a subject, the protein can be consumed by a subject concurrently with a therapeutic dosage regime of at least one pharmaceutical or biologic drug product indicated to induce at least one of a satiation response and a satiety response in a subject, such as rimonabant, exenatide, or pramlintide. 
When a protein of this disclosure is used to treat at least one of cachexia, sarcopenia and frailty in a subject, the protein can be consumed by a subject concurrently with a therapeutic dosage regime of at least one pharmaceutical or biologic drug product indicated to treat at least one of cachexia, sarcopenia and frailty, such as omega-3 fatty acids or anabolic steroids.

Because of the role of dietary protein in inducing satiation and satiety, the proteins and compositions disclosed herein can be used to induce at least one of a satiation response and a satiety response in a subject. In some embodiments the methods comprise providing to the subject a sufficient amount of a protein of this disclosure, a composition of this disclosure, or a composition made by a method of this disclosure. In some embodiments the subject is obese. In some embodiments, the protein according to this disclosure, the composition according to this disclosure, or the composition made by a method according to this disclosure is consumed by the subject in coordination with performance of exercise. In some embodiments, the protein according to this disclosure, the composition according to this disclosure, or the composition made by a method according to this disclosure is consumed by the subject by an oral, enteral, or parenteral route.

In some embodiments incorporating at least one protein or composition of this disclosure into the diet of a subject has at least one effect selected from inducing postprandial satiety (including by suppressing hunger), inducing thermogenesis, reducing glycemic response, positively affecting energy expenditure, positively affecting lean body mass, reducing the weight gain caused by overeating, and decreasing energy intake. In some embodiments incorporating at least one protein or composition of this disclosure into the diet of a subject has at least one effect selected from increasing loss of body fat, reducing lean tissue loss, improving lipid profile, and improving glucose tolerance and insulin sensitivity in the subject.

Examples of the techniques and protocols described herein can be found in Remington's Pharmaceutical Sciences, 16th edition, Osol, A. (ed), 1980.

Examples

Below are examples of specific embodiments for carrying out the present invention. The examples are offered for illustrative purposes only, and are not intended to limit the scope of the present invention in any way. Efforts have been made to ensure accuracy with respect to numbers used (e.g., amounts, temperatures, etc.), but some experimental error and deviation should, of course, be allowed for.

The practice of the present invention will employ, unless otherwise indicated, conventional methods of protein chemistry, biochemistry, recombinant DNA techniques and pharmacology, within the skill of the art. Such techniques are explained fully in the literature. See, e.g., T. E. Creighton, Proteins: Structures and Molecular Properties (W.H. Freeman and Company, 1993); A. L. Lehninger, Biochemistry (Worth Publishers, Inc., current edition); Sambrook, et al., Molecular Cloning: A Laboratory Manual (2nd Edition, 1989); Methods In Enzymology (S. Colowick and N. Kaplan eds., Academic Press, Inc.); Remington's Pharmaceutical Sciences, 18th Edition (Easton, Pa.: Mack Publishing Company, 1990); Carey and Sundberg Advanced Organic Chemistry 3rd Ed. (Plenum Press) Vols A and B (1992).

Example 1. Identification and Selection of Amino Acid Sequences of Nutritive Polypeptides of Edible Species Using Mass Spectrometric Analyses

Provided is a process for identifying one or a plurality of nutritive polypeptide amino acid sequences, such as from a polypeptide or nucleic acid library, or from a relevant database of protein sequences. Here, nutritive polypeptide amino acid sequences were identified by mass spectroscopy analysis of proteins extracted and purified from edible species.

Protein Isolation for Mass Spectroscopy.

Proteins were extracted from solid edible sources. Samples from the following species were included in the analysis: Actinidia deliciosa, Agaricus bisporus var. bisporus, Arthrospira platensis, Bos taurus, Brassica oleracea, Cannabis, Chenopodium quinoa, Chlorella regularis, Chlorella variabilis, Cicer arietinum, Cucurbita maxima, Fusarium graminearum, Gadus morhua, Gallus gallus, Glycine max, Lactobacillus acidophilus, Laminariales, Linum usitatissimum, Meleagris gallopavo, Odocoileus virginianus, Oreochromis niloticus, Oryza sativa, Ovis aries, Palmaria palmata, Persea americana, Prunus mume, Saccharomyces cerevisiae, Salmo salar, Solanum lycopersicum, Solanum tuberosum, Sus scrofa, Thunnus thynnus, Vaccinium corymbosum, Vitis vinifera, and Zea mays. Each sample was first frozen at −80° C. and then ground using a mortar and pestle before weighing 50 mg of material into a microcentrifuge tube. The 50 mg sample was then resuspended in 1 mL of extraction buffer (8.3 M urea, 2 M thiourea, 2% w/v CHAPS, 1% w/v DTT) and agitated for 30 minutes. Addition of 500 μL of 100-μm zirconium beads (Ops Diagnostics) was followed by continued agitation for an additional 30 minutes. Samples were run on a TissueLyser II (Qiagen) at 30 Hz for 3 minutes and then centrifuged for 10 minutes at 21,130 g in a benchtop microcentrifuge (Eppendorf). Supernatants were transferred to clean microcentrifuge tubes, aliquoted into 50 μL aliquots, and stored at −80° C. The amount of soluble protein extracted was measured by Coomassie Plus (Bradford) Protein Assay (Thermo Scientific). 20 μg of protein was run on a 10% 10-lane BisTris SDS-PAGE gel (Invitrogen) and then excised for analysis by LC/MS/MS.

Proteins were also isolated from liquid cultures of the following edible organisms: Aspergillus niger, Bacillus subtilis, Bacillus licheniformis, and Bacillus amyloliquefaciens. Aspergillus and Bacillus organisms were cultured as described herein. Clarified supernatants were isolated by centrifuging (10,000×g) cultures for 10 minutes, followed by filtering the supernatant using a 0.2 μm filter. The amount of soluble protein in the clarified supernatant was measured by Coomassie Plus (Bradford) Protein Assay (Thermo Scientific). Protein samples (20 μg) were run on a 10% Precast BisTris SDS-PAGE gel (Invitrogen) according to the manufacturer's protocol.

Mass Spectroscopy.

For LC/MS/MS analysis each gel was excised into five equally sized pieces. Trypsin digestion was performed using a robot (ProGest, DigiLab) with the following protocol: washed with 25 mM ammonium bicarbonate followed by acetonitrile, reduced with 10 mM dithiothreitol at 60° C. followed by alkylation with 50 mM iodoacetamide at RT, digested with trypsin (Promega) at 37° C. for 4 h, and quenched with formic acid; the supernatant was analyzed directly without further processing. The gel digests for each sample were pooled and analyzed by nano LC/MS/MS with a Waters NanoAcquity HPLC system interfaced to a ThermoFisher Q Exactive. Peptides were loaded on a trapping column and eluted over a 75 μm analytical column at 350 nL/min; both columns were packed with Jupiter Proteo resin (Phenomenex). The mass spectrometer was operated in data-dependent mode, with MS and MS/MS performed in the Orbitrap at 70,000 FWHM and 17,500 FWHM resolution, respectively. The fifteen most abundant ions were selected for MS/MS. Resulting data were searched against a Uniprot and/or NCBI protein database from the corresponding organism using Mascot with the following parameters: Enzyme: Trypsin/P; Fixed modification: Carbamidomethyl (C); Variable modifications: Oxidation (M), Acetyl (Protein N-term), Pyro-Glu (N-term Q), Deamidation (NQ); Mass values: Monoisotopic; Peptide Mass Tolerance: 10 ppm; Fragment Mass Tolerance: 0.015 Da; Max Missed Cleavages: 2. Mascot DAT files were parsed into the Scaffold software for validation, filtering and to create a non-redundant list per sample. Data were filtered using a minimum protein value of 90%, a minimum peptide value of 50% (Prophet scores) and requiring at least two unique peptides per protein. Relative abundance of detected proteins was determined by spectral counts, which is the number of spectra acquired for each protein. Spectral counting is a label-free quantification method commonly used by the protein mass spec field (Liu, Hongbin et al. Analytical chemistry 76.14 (2004): 4193-4201). To calculate the relative abundance of each protein in the protein isolate, the number of spectral counts for that protein is divided by the total protein spectral counts. SEQID 894-3415 were identified using this method.
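The relative-abundance calculation described above (spectral counts for one protein divided by the total spectral counts in the isolate) can be sketched as follows. This is a minimal illustration only; the function name and the protein names and counts are hypothetical and are not data from this disclosure.

```python
def relative_abundance(spectral_counts):
    """Return each protein's share of the total spectral counts.

    spectral_counts: dict mapping a protein identifier to the number of
    spectra assigned to that protein in one sample.
    """
    total = sum(spectral_counts.values())
    if total == 0:
        raise ValueError("no spectra were assigned to any protein")
    return {protein: count / total for protein, count in spectral_counts.items()}


# Hypothetical counts for three proteins detected in one isolate.
counts = {"protein_A": 120, "protein_B": 60, "protein_C": 20}
abundance = relative_abundance(counts)

# Print proteins from most to least abundant.
for protein, fraction in sorted(abundance.items(), key=lambda kv: -kv[1]):
    print(f"{protein}: {fraction:.1%}")
```

Because every value is divided by the same total, the fractions for one sample sum to 1, which makes samples with different total numbers of spectra comparable.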

Homolog Discovery.

For the nutritive polypeptide sequences identified as described, similar sequences are identified from other species. SEQID-00093, which was identified by this method, was used to search for homologs using the computer program BLAST as described herein. Example nutritive polypeptide homologs from the edible database identified in this way are shown in table E1A. Example nutritive polypeptide homologs from the expressed sequence database identified in this way are shown in table E1B.

TABLE E1A
Edible Sequences identified as homologs to SEQID-00093.

SEQID          EAA     Percent ID to SEQID-00093
SEQID-00094    0.45    98.8
SEQID-00092    0.42    89.9
SEQID-00075    0.46    67.9
SEQID-00072    0.46    67.3
SEQID-03712    0.46    66.0
SEQID-03763    0.44    50.3
SEQID-03708    0.45    51.6
SEQID-03798    0.46    51.6
SEQID-03860    0.45    50.9
SEQID-03651    0.46    50.9

TABLE E1B
Expressed Sequences identified as homologs to SEQID-00093

SEQID          EAA     Percent ID to SEQID-00093
SEQID-00074    0.42    91.2
SEQID-00092    0.42    89.9
SEQID-00075    0.46    66.9
SEQID-00078    0.46    65
SEQID-00106    0.46    50.3
SEQID-00104    0.45    50.3
SEQID-00864    0.45    50.3
SEQID-00870    0.45    43.8
SEQID-00867    0.47    44.5
SEQID-00105    0.44    41.1
SEQID-00103    0.45    40.5
SEQID-00866    0.40    39.8

Example 2. Identification and Selection of Amino Acid Sequences of Nutritive Polypeptides of Edible Species Using cDNA Libraries

Here, nutritive polypeptide amino acid sequences were identified by analysis of proteins produced from nucleic acid sequences extracted and purified from edible species.

Construction of cDNA Library.

A library of cDNA from twelve edible species was constructed. The twelve edible species were divided into five categories for RNA extraction. Animal tissues including ground beef, pork, lamb, chicken, turkey, and a portion of tilapia were combined with 50 mg from each edible species. Fruit tissues from grape and tomato including both the skin and the fruit were ground and combined with 2.5 g from each species. Seeds of rice and soybean were combined with 1 g from each species and ground into powder. 12 ml of Saccharomyces cerevisiae were grown overnight and spun down to obtain 110 mg of wet cell weight of yeast. 1 g of mushroom mycelium was ground and processed using fungi RNA extraction protocols. All five categories of samples were snap frozen with liquid nitrogen, thawed and lysed using category-specific RNA extraction protocols. The RNA from different food categories was extracted and combined as one pooled sample. The combined pool of RNA was reverse transcribed into cDNA using oligo-dT as primers, resulting in cDNA of length between 500 bp and 4 kb. Adaptors were ligated to each end of the cDNA and used as PCR primers for amplification of the cDNA library; the adaptors also included Sfi I restriction digestion sites for cloning the library into an expression vector. The cDNA library was denatured and re-annealed and the single-stranded DNA was selected using gel electrophoresis. This process removed extra cDNA from highly abundant RNA species to obtain a normalized cDNA library. The normalized cDNA library was precipitated using ethanol precipitation before PCR amplification and cloning into the expression vectors.

TABLE E2A
Primer and adapter sequences flanking the cDNA.

Adapter       SEQID    Sequence (underlined: Sfi I site)
5′ adapter    3910     CAGTGGTATCAACGCAGAGTGGCCATTACGGCC AAGTTACGGG
3′ adapter    3911     CAGTGGTATCAACGCAGAGTGGCCGAGGCGGCC TTTTTTTTTTTTTTT
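As an illustration of where the Sfi I sites fall within the adapter sequences of Table E2A, the following sketch scans a sequence for the Sfi I recognition pattern GGCCNNNNNGGCC (N being any base). The function name is hypothetical, and the internal spaces shown in the table are simply removed before searching.

```python
import re

# SfiI recognizes GGCCNNNNNGGCC; the lookahead lets overlapping sites be reported.
SFII_SITE = re.compile(r"(?=(GGCC[ACGT]{5}GGCC))")


def find_sfii_sites(sequence):
    """Return (0-based start index, matched site) for every SfiI site found."""
    seq = sequence.replace(" ", "").upper()
    return [(m.start(), m.group(1)) for m in SFII_SITE.finditer(seq)]


# Adapter sequences copied from Table E2A (SEQ ID NOs 3910 and 3911).
adapters = {
    "5' adapter": "CAGTGGTATCAACGCAGAGTGGCCATTACGGCC AAGTTACGGG",
    "3' adapter": "CAGTGGTATCAACGCAGAGTGGCCGAGGCGGCC TTTTTTTTTTTTTTT",
}

for name, seq in adapters.items():
    print(name, find_sfii_sites(seq))
```

The two adapters carry different five-base spacers (ATTAC and GAGGC), so Sfi I digestion leaves distinct overhangs at the two ends of each cDNA, which is consistent with directional cloning into a vector cut at matching sites.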

Cloning of cDNA Library into E. coli for Protein Expression.

The cDNA library was cloned into the pET15b backbone vector, which was amplified with primers with overhangs that contain the corresponding SfiI restriction sites (forward primer overhang: TACGTGTATGGCCGCCTCGGCC (SEQ ID NO: 3912); reverse primer overhang: TACGTGTATGGCCGTAATGGCC (SEQ ID NO: 3913)). pET15b contains a pBR322 origin of replication, lac-controlled T7 promoter, and a bla gene conferring resistance to carbenicillin. Both the cDNA library and PCR amplified backbone were cut with SfiI, PCR purified, and ligated. The ligation reaction was transformed into 10-Beta High Efficiency Competent Cells (New England Biolabs), and transformed cells were plated onto four LB agar plates containing 100 mg/L carbenicillin. Plates were incubated at 37° C. overnight. After colonies had grown, 2 mL of liquid LB medium was added to each plate. Cells were scraped into the liquid and mixed together, and the suspension was prepared for plasmid extraction to form the multiplex cDNA plasmid library.

E. coli cDNA Multiplex Expression Methods.

Four strains were used to express the cDNA libraries: T7 Express from New England Biolabs; and Rosetta 2(DE3), Rosetta-gami B(DE3), and Rosetta-gami 2(DE3) from EMD Millipore. T7 Express is an enhanced BL21 derivative which contains the T7 RNA polymerase in the lac operon, while lacking the Lon and OmpT proteases. The genotype of T7 Express is: fhuA2 lacZ::T7 gene1 [lon] ompT gal sulA11 R(mcr-73::miniTn10-TetS)2 [dcm] R(zgb-210::Tn10-TetS) endA1 Δ(mcrC-mrr)114::IS10. Rosetta 2(DE3) is a BL21 derivative that supplies tRNAs for 7 rare codons (AGA, AGG, AUA, CUA, GGA, CCC, CGG). The strain is a lysogen of λDE3, and carries the T7 RNA polymerase gene under the lacUV5 promoter. The genotype of Rosetta 2(DE3) is: F− ompT hsdSB(rB− mB−) gal dcm (DE3) pRARE2 (CamR). Rosetta-gami B(DE3) has the same properties as Rosetta 2(DE3) but includes characteristics that enhance the formation of protein disulfide bonds in the cytoplasm. The genotype of Rosetta-gami B(DE3) is: F− ompT hsdSB(rB− mB−) gal dcm lacY1 ahpC (DE3) gor522::Tn10 trxB pRARE (CamR, KanR, TetR). Rosetta-gami 2(DE3), similarly to Rosetta-gami B(DE3), alleviates codon bias, enhances disulfide bond formation, and has the T7 RNA polymerase gene under the lacUV5 promoter in the chromosome. The genotype of Rosetta-gami 2(DE3) is: Δ(ara-leu)7697 ΔlacX74 ΔphoA PvuII phoR araD139 ahpC galE galK rpsL (DE3) F′[lac+ lacIq pro] gor522::Tn10 trxB pRARE2 (CamR, StrR, TetR).

Roughly 200 ng of prepared cDNA libraries were transformed into the four background strains: T7 Express, Rosetta 2(DE3), Rosetta-gami B(DE3), and Rosetta-gami 2(DE3) competent cells. After transforming, 100 μL of each strain was plated onto four LB (10 g/l NaCl, 10 g/l tryptone, and 5 g/l yeast extract) 1.5% agar plates containing 100 mg/L carbenicillin and incubated at 37° C. for 16 hrs. After incubation, 2 mL of LB media with 100 mg/L carbenicillin was added to the surface of each plate containing several thousand transformants, and the cells were suspended in the surface medium by scraping with a cell spreader and mixing. Suspended cells from the four replicate plates from each background were combined to form the pre-inoculum cultures for the expression experiments.

The OD600 of the pre-inoculum cultures made from re-suspended cells was measured using a plate reader to be between 35 and 40 (T7 Express and Rosetta 2(DE3)) or between 15 and 20 (Rosetta-gami B(DE3) and Rosetta-gami 2(DE3)). For the four background strains, 125 mL baffled shake flasks containing 10 mL of LB medium with 100 mg/L carbenicillin were inoculated to OD600 0.2 to form the inoculum cultures, and incubated at 37° C. shaking at 250 rpm for roughly 6 hours. OD600 was measured and the inoculum cultures were used to inoculate expression cultures in 2 L baffled shake flasks containing 250 mL of BioSilta Enbase medium with 100 mg/L carbenicillin, 600 mU/L of glucoamylase and 0.01% Antifoam 204 to an OD600 of 0.1. Cultures were shaken at 30° C. and 250 rpm for 18 hours, and were induced with 1 mM IPTG and supplemented with additional EnBase media components and another 600 mU/L of glucoamylase. Heterologous expression was carried out for 24 hours at 30° C. and 250 rpm, at which point the cultures were terminated. The terminal cell density was measured and the cells were harvested by centrifugation (5000×g, 10 min, RT). Cells were stored at −80° C. before being lysed with B-PER (Pierce) according to the manufacturer's protocol. After cell lysis, the whole cell lysate was sampled for analysis. In the Rosetta (DE3) strain, the whole cell lysate was centrifuged (3000×g, 10 min, RT) and the supernatant was collected as the soluble fraction of the lysate. Cell lysates were run on SDS-PAGE gels, separated into ten fractions, and then analyzed using MS-MS.
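The inoculation steps above amount to a simple dilution calculation (C1·V1 = C2·V2). The sketch below is illustrative only: the function name is hypothetical, it assumes that OD600 scales linearly with cell density over the range used, and it treats the stated flask volume as the final culture volume.

```python
def inoculum_volume_ml(stock_od600, target_od600, final_volume_ml):
    """Volume of stock culture needed so the final culture starts at target_od600.

    Solves stock_od600 * V = target_od600 * final_volume_ml for V.
    """
    if stock_od600 <= 0 or target_od600 <= 0 or final_volume_ml <= 0:
        raise ValueError("OD600 values and volume must be positive")
    if stock_od600 < target_od600:
        raise ValueError("stock culture is more dilute than the target")
    return target_od600 * final_volume_ml / stock_od600


# Pre-inoculum at OD600 40 diluted into 10 mL of LB to reach OD600 0.2.
volume = inoculum_volume_ml(stock_od600=40.0, target_od600=0.2, final_volume_ml=10.0)
print(f"Add {volume * 1000:.0f} uL of pre-inoculum")
```

The same function applies to the expression cultures, for example seeding 250 mL of medium to an OD600 of 0.1 from an inoculum culture whose OD600 has just been measured.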

Cloning of cDNA Library into Bacillus for Protein Secretion.

The cDNA library was cloned into the pHT43 vector for protein secretion assay in Bacillus subtilis. The unmodified pHT43 vector from MoBiTec contains the Pgrac promoter, the SamyQ signal peptide, Amp and Cm resistance genes, a lacI region, a repA region, and the ColE1 origin of replication. The SamyQ signal peptide was removed. The pHT43 backbone vector with no signal peptide as well as a modified version with the aprE promoter substituted for the grac promoter and with the lacI region removed were amplified with primers with overhangs that contain the corresponding SfiI restriction sites (forward primer overhang: TACGTGTATGGCCGCCTCGGCC (SEQ ID NO: 3912); reverse primer overhang: TACGTGTATGGCCGTAATGGCC (SEQ ID NO: 3913)). Both the cDNA library and the two PCR amplified backbones were cut with SfiI and PCR purified. The cDNA library inserts were ligated into each background. The ligation reactions were transformed into 10-Beta High Efficiency Competent Cells (New England Biolabs), and cells from each ligation were plated onto four LB agar plates containing 100 mg/L carbenicillin. Plates were incubated at 37° C. overnight. After colonies had grown, 2 mL of liquid LB medium was added to each plate. For each ligation, cells were scraped into the liquid and mixed together, and the suspensions were prepped for plasmid extraction to form the multiplex cDNA plasmid libraries (henceforth referred to as the multiplex Grac-cDNA and AprE-cDNA libraries).

The expression strains used in this experiment are based on the WB800N strain (MoBiTec). The WB800N strain has the following genotype: nprE aprE epr bpr mpr::ble nprB::bsr vpr wprA::hyg cm::neo; NeoR. Strain cDNA-1 contains a mutation that synergizes with the paprE promoter and has these alterations in addition to the WB800N genotype: pXylA-comK::Erm, degU32(Hy), sigF::Str. Strain cDNA-2 has these alterations to WB800N: pXylA-comK::Erm.

Roughly 1 μg of the multiplex Grac-cDNA library was transformed into both Strain cDNA-1 and Strain cDNA-2, and 1 μg of the multiplex AprE-cDNA library was transformed into Strain cDNA-1. After transforming, 100 μL of each strain was plated onto four LB (10 g/l NaCl, 10 g/l tryptone, and 5 g/l yeast extract) 1.5% agar plates containing 5 mg/L chloramphenicol and incubated at 37° C. for 16 hrs. After incubation, 2 mL of LB media with 5 mg/L chloramphenicol was added to the surface of each plate containing several thousand transformants, and the cells were suspended in the surface medium by scraping with a cell spreader and mixing. Suspended cells from the four replicate plates from each transformation were combined to form the preinoculum cultures for the expression experiments.

The OD600 of the preinoculum cultures made from resuspended cells was measured using a plate reader to be roughly 20-25. For the three strains (strain cDNA-1+multiplex Grac-cDNA, strain cDNA-1+multiplex AprE-cDNA, strain cDNA-2+multiplex Grac-cDNA), 500 mL baffled shake flasks containing 50 mL of 2×Mal medium (20 g/L NaCl, 20 g/L Tryptone, 10 g/L yeast extract, 75 g/L D-Maltose) with 5 mg/L chloramphenicol were inoculated to OD600≈0.2 to form the inoculum cultures, and incubated at 30° C. shaking at 250 rpm for roughly 6 hours. OD600 was measured and the inoculum cultures were used to inoculate expression cultures in 2 L baffled shake flasks containing 2×Mal medium with 5 mg/L chloramphenicol, 1× Teknova Trace Metals, and 0.01% Antifoam 204 to an OD600 of 0.1. The strain cDNA-1+multiplex AprE-cDNA culture was shaken at 30° C. and 250 rpm for 18 hours, at which point the culture was harvested. The terminal cell density was measured and the cells were harvested by centrifugation (5000×g, 30 min, RT). The strain cDNA-1+multiplex Grac-cDNA and strain cDNA-2+multiplex Grac-cDNA cultures were shaken at 37° C. and 250 rpm for 4 hours, and were induced with 1 mM IPTG. Heterologous expression was carried out for 4 hours at 37° C. and 250 rpm, at which point the cultures were harvested. Again, the terminal cell density was measured and the cells were harvested by centrifugation (5000×g, 30 min, RT). The supernatant was collected and run on SDS-PAGE gels, separated into ten fractions, and then analyzed using LC-MS/MS to identify secreted proteins.

Mass Spectrometry Analysis.

Whole cell lysate and soluble lysate samples were analyzed for protein expression using LC-MS/MS. To analyze samples, 10 μg of sample was loaded onto a 10% SDS-PAGE gel (Invitrogen) and separated approximately 5 cm. The gel was excised into ten segments and the gel slices were processed by washing with 25 mM ammonium bicarbonate, followed by acetonitrile. Gel slices were then reduced with 10 mM dithiothreitol at 60° C., followed by alkylation with 50 mM iodoacetamide at room temperature. Finally, the samples were digested with trypsin (Promega) at 37° C. for 4 h and the digestions were quenched with the addition of formic acid. The supernatant samples were then analyzed by nano LC/MS/MS with a Waters NanoAcquity HPLC system interfaced to a ThermoFisher Q Exactive. Peptides were loaded on a trapping column and eluted over a 75 μm analytical column at 350 nL/min; both columns were packed with Jupiter Proteo resin (Phenomenex). A 1 h gradient was employed. The mass spectrometer was operated in data-dependent mode, with MS and MS/MS performed in the Orbitrap at 70,000 FWHM resolution and 17,500 FWHM resolution, respectively. The fifteen most abundant ions were selected for MS/MS. Data were searched against a database using Mascot to identify peptides. The database was constructed by combining the complete proteome sequences from all twelve species: Bos taurus, Gallus gallus, Vitis vinifera, Ovis aries, Sus scrofa, Oryza sativa, Glycine max, Oreochromis niloticus, Solanum lycopersicum, Agaricus bisporus var. bisporus, Saccharomyces cerevisiae, and Meleagris gallopavo. Mascot DAT files were parsed into the Scaffold software for validation, filtering and to create a nonredundant list per sample. Data were filtered at 1% protein and peptide false discovery rate (FDR), requiring at least two unique peptides per protein.

Expressed Proteins Identified.

Mass spectrometry analysis identified a total of 125 proteins across expression strains. Spectrum counts, which are related to the protein abundance, are reported to confirm protein expression or secretion. Fifty-three proteins were identified in whole cell lysate of the Rosetta (DE3) strain, 46 in the soluble fraction of the Rosetta (DE3) strain, 36 in Rosetta-Gami B (DE3), 10 in Rosetta-Gami 2 (DE3), and 15 in the secreted supernatant of Bacillus subtilis.

The nutritive polypeptides detected in the secreted supernatant of Bacillus subtilis are SEQID-00718, SEQID-00762, SEQID-00763, SEQID-00764, SEQID-00765, SEQID-00766, SEQID-00767, SEQID-00768, SEQID-00769, SEQID-00770, SEQID-00771, SEQID-00772, SEQID-00773, SEQID-00774, SEQID-00775.

The nutritive polypeptides detected in the whole cell lysate of the E. coli Rosetta (DE3) strain are SEQID-00716, SEQID-00718, SEQID-00720, SEQID-00723, SEQID-00724, SEQID-00725, SEQID-00729, SEQID-00732, SEQID-00737, SEQID-00751, SEQID-00776, SEQID-00790, SEQID-00797, SEQID-00798, SEQID-00799, SEQID-00800, SEQID-00801, SEQID-00802, SEQID-00803, SEQID-00804, SEQID-00805, SEQID-00806, SEQID-00807, SEQID-00808, SEQID-00809, SEQID-00810, SEQID-00811, SEQID-00812, SEQID-00813, SEQID-00814, SEQID-00815, SEQID-00816, SEQID-00817, SEQID-00818, SEQID-00819, SEQID-00820, SEQID-00821, SEQID-00822, SEQID-00823, SEQID-00824, SEQID-00825, SEQID-00826, SEQID-00827, SEQID-00828, SEQID-00829, SEQID-00830, SEQID-00831, SEQID-00832, SEQID-00833, SEQID-00834, SEQID-00835, SEQID-00836, SEQID-00837.

The nutritive polypeptides detected in the soluble lysate of the E. coli Rosetta (DE3) strain are SEQID-00716, SEQID-00717, SEQID-00718, SEQID-00719, SEQID-00720, SEQID-00721, SEQID-00722, SEQID-00724, SEQID-00725, SEQID-00726, SEQID-00727, SEQID-00728, SEQID-00729, SEQID-00730, SEQID-00731, SEQID-00732, SEQID-00733, SEQID-00734, SEQID-00735, SEQID-00736, SEQID-00737, SEQID-00738, SEQID-00739, SEQID-00740, SEQID-00741, SEQID-00742, SEQID-00743, SEQID-00744, SEQID-00745, SEQID-00746, SEQID-00747, SEQID-00748, SEQID-00749, SEQID-00750, SEQID-00751, SEQID-00752, SEQID-00753, SEQID-00754, SEQID-00755, SEQID-00756, SEQID-00757, SEQID-00758, SEQID-00759, SEQID-00760, SEQID-00761.

The nutritive polypeptides detected in the E. coli Rosetta-Gami B (DE3) strain are SEQID-00003, SEQID-00004, SEQID-00005, SEQID-00716, SEQID-00718, SEQID-00719, SEQID-00720, SEQID-00729, SEQID-00730, SEQID-00731, SEQID-00732, SEQID-00734, SEQID-00736, SEQID-00740, SEQID-00743, SEQID-00752, SEQID-00760, SEQID-00763, SEQID-00764, SEQID-00776, SEQID-00777, SEQID-00778, SEQID-00779, SEQID-00780, SEQID-00781, SEQID-00782, SEQID-00783, SEQID-00784, SEQID-00785, SEQID-00786, SEQID-00787, SEQID-00788, SEQID-00789, SEQID-00790, SEQID-00791, SEQID-00792.

The nutritive polypeptides detected in the E. coli Rosetta-Gami 2 (DE3) strain are SEQID-00716, SEQID-00737, SEQID-00747, SEQID-00763, SEQID-00789, SEQID-00790, SEQID-00793, SEQID-00794, SEQID-00795, SEQID-00796.

Example 3. Identification and Selection of Amino Acid Sequences of Nutritive Polypeptides of Edible Species Using Annotated Protein Sequence Databases

Construction of Protein Databases. The UniProtKB/Swiss-Prot (a collaboration between the European Bioinformatics Institute and the Swiss Institute of Bioinformatics) is a manually curated and reviewed protein database, and was used as the starting point for constructing a protein database. To construct a protein database of edible species, a search was performed on the UniProt database for proteins from edible species as disclosed in, e.g., PCT/US2013/032232, filed Mar. 15, 2013, PCT/US2013/032180, filed Mar. 15, 2013, PCT/US2013/032225, filed Mar. 15, 2013, PCT/US2013/032218, filed Mar. 15, 2013, PCT/US2013/032212, filed Mar. 15, 2013, and PCT/US2013/032206, filed Mar. 15, 2013. To identify proteins that are secreted from microorganisms, the UniProt database was searched for proteins from the microorganism species disclosed herein that are annotated with keywords or annotations that include secreted, extracellular, cell wall, and outer membrane. To identify proteins that are abundant in the human diet, the reference proteomes of edible species were assembled from genome databases. As provided herein, mass spectrometry was performed on proteins extracted from each edible species. The peptides identified by mass spectrometry were mapped to the reference proteomes and the spectrum counts of the peptides associated with the reference protein sequences were converted to a measure for the abundance of the corresponding protein in food. All proteins that were detected above a cutoff spectrum count with high confidence were assembled into a database. These databases are used for identifying proteins that are derived from edible species, are secreted, and/or are abundant in the human diet.

Processes for Selection of Amino Acid Sequences.

A process for picking a protein or group of proteins can include identifying a set of constraints that define the class of protein one is interested in finding, selecting the database of proteins to search, and performing the actual search.

The protein class criteria can be defined by nutritional literature (i.e., what has been previously identified as efficacious), desired physicochemical traits (e.g., expressible, soluble, nonallergenic, nontoxic, digestible, etc.), and other characteristics. A relevant database of proteins that can be used for searching purposes can be derived from the sequences disclosed herein.

One example of proteins that can be searched is a highly soluble class of proteins for muscle anabolism/immune health/diabetes treatment. These proteins are generally solubly expressible, highly soluble upon purification/isolation, non-allergenic, non-toxic, fast digesting, and meet some basic nutritional criteria (e.g., [EAA]>0.3, [BCAA]>0.15, [Leucine or Glutamine or Arginine]>0.08, eaa complete).

A search is conducted for expressible, soluble proteins using a binary classification model based on two parameters related to the hydrophilicity and hydrophobicity of the protein sequence: solvation score and aggregation score (see examples below for various descriptions of these two metrics and measures of efficacy of the model). Alternatively, a search can be conducted for highly charged proteins with high (or low) net charge per amino acid, which is indicative of a net excess of negative or positive charges per amino acid (see example below for additional description).
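By way of non-limiting illustration, the net charge per amino acid can be approximated directly from the primary sequence. The sketch below assumes that, near pH 7, lysine and arginine each carry +1, aspartate and glutamate each carry −1, and histidine and the termini are neglected; the function names and the 0.1 threshold are hypothetical and are not taken from the disclosure.

```python
def net_charge_per_residue(sequence):
    """Approximate net charge per amino acid near pH 7.

    Assumption: K and R count as +1, D and E as -1; histidine and the
    chain termini are ignored.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    positive = sum(sequence.count(aa) for aa in "KR")
    negative = sum(sequence.count(aa) for aa in "DE")
    return (positive - negative) / len(sequence)

def is_highly_charged(sequence, threshold=0.1):
    """Flag sequences with a large net excess of either charge sign.

    The threshold is a placeholder value chosen for illustration only.
    """
    return abs(net_charge_per_residue(sequence)) >= threshold
```

A positive result indicates a net excess of positive charges per amino acid and a negative result a net excess of negative charges.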

The nutritional criteria are satisfied by computing the mass fractions of all relevant amino acids based on primary sequence. For cases in which it is desired to match a known, clinically efficacious amino acid blend, a weighted Euclidean distance method can be used (see example below).
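By way of non-limiting illustration, the mass-fraction calculation, the basic nutritional criteria listed above, and a weighted Euclidean distance to a target blend can be sketched as follows. The sketch assumes standard average residue masses (free amino acid minus one water), an essential amino acid set of H, I, L, K, M, F, T, W and V, and a reading of the leucine/glutamine/arginine criterion as "any one of the three exceeds 0.08". The function names and the per-residue weights are hypothetical.

```python
from math import sqrt

# Average residue masses in daltons (assumed basis for "mass fraction").
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
ESSENTIAL = "HILKMFTWV"  # assumed essential amino acid set
BRANCHED = "ILV"         # branched-chain amino acids

def mass_fractions(sequence):
    """Return {amino acid: fraction of total residue mass} for a sequence."""
    total = sum(RESIDUE_MASS[aa] for aa in sequence)
    fractions = dict.fromkeys(RESIDUE_MASS, 0.0)
    for aa in sequence:
        fractions[aa] += RESIDUE_MASS[aa] / total
    return fractions

def meets_nutritional_criteria(sequence):
    """Check [EAA]>0.3, [BCAA]>0.15, [L or Q or R]>0.08, and EAA completeness."""
    f = mass_fractions(sequence)
    eaa = sum(f[aa] for aa in ESSENTIAL)
    bcaa = sum(f[aa] for aa in BRANCHED)
    eaa_complete = all(f[aa] > 0 for aa in ESSENTIAL)
    return (eaa > 0.3 and bcaa > 0.15
            and max(f["L"], f["Q"], f["R"]) > 0.08
            and eaa_complete)

def weighted_distance(fractions, target, weights):
    """Weighted Euclidean distance between a sequence and a target blend.

    target: {amino acid: desired mass fraction}
    weights: {amino acid: weight}; residues without a weight default to 1.0.
    """
    return sqrt(sum(
        weights.get(aa, 1.0) * (fractions.get(aa, 0.0) - wanted) ** 2
        for aa, wanted in target.items()
    ))
```

A smaller distance indicates a sequence whose amino acid profile is closer to the target blend; the weights allow selected amino acids to count more heavily in the comparison.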

As provided herein, allergenicity/toxicity/nonallergenicity/antinutricity criteria are searched for using sequence-based homology assessments in which each candidate sequence is compared to libraries of known allergens, toxins, nonallergens, or antinutritive (e.g., protease inhibitory) proteins (see examples herein). In general, cutoffs of <50% global or <35% local (over any given 80aa window) homology (percent ID) can be used for the allergenicity screens, and <35% global for the toxicity and antinutricity screens. In all cases, smaller implies less allergenic/toxic/antinutritive. The nonallergenicity screen is less typically used as a cutoff, but >62% can be used as a cutoff (greater implies more nonallergenic). These screens reduce the list to a smaller subset of proteins enriched in the criteria of interest. This list is then ranked using a variety of aggregate objective functions, and selections are made from this rank ordered list.
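By way of non-limiting illustration, the homology cutoffs can be applied as a simple filter once the percent-identity scores have been computed by an alignment tool of choice. The class and field names below are hypothetical, and the scores are assumed to be precomputed; the sketch does not perform any alignment itself.

```python
from dataclasses import dataclass

@dataclass
class HomologyScores:
    """Best percent identity of a candidate against each reference library."""
    allergen_global: float       # global alignment to known allergens
    allergen_local: float        # best 80-residue window against known allergens
    toxin_global: float          # global alignment to known toxins
    antinutritive_global: float  # global alignment to antinutritive proteins

def passes_screens(scores):
    """Apply the <50% / <35% / <35% / <35% cutoffs described above."""
    return (scores.allergen_global < 50.0
            and scores.allergen_local < 35.0
            and scores.toxin_global < 35.0
            and scores.antinutritive_global < 35.0)

def screen(candidates):
    """Keep the sequence identifiers whose scores pass every screen.

    candidates: {sequence identifier: HomologyScores}
    """
    return [seq_id for seq_id, scores in candidates.items()
            if passes_screens(scores)]
```

The optional nonallergenicity cutoff (>62%) is omitted from this sketch because the text describes it as less typically used.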

Example 4. Selection of Amino Acid Sequences to Demonstrate Amino Acid Pharmacology of Nutritive Polypeptides

Identification of Proteins Enriched in Leucine and Essential Amino Acids for the Treatment of Sarcopenia.

As described herein, sarcopenia is the degenerative loss of skeletal muscle mass (typically 0.5-1% loss per year after the age of 25), quality, and strength associated with aging. Sarcopenia is characterized first by muscle atrophy (a decrease in the size of the muscle), along with a reduction in muscle tissue “quality,” caused by such factors as replacement of muscle fibres with fat, an increase in fibrosis, changes in muscle metabolism, oxidative stress, and degeneration of the neuromuscular junction. Combined, these changes lead to progressive loss of muscle function and eventually to frailty. It has been shown that essential amino acid supplementation in elderly, sarcopenic individuals can have an anabolic and/or sparing effect on muscle mass. Furthermore, this supplementation can also translate to improvements in patient strength and muscle quality. See, e.g., Paddon-Jones D, et al. J Clin Endocrinol Metab 2004, 89:4351-4358; Ferrando, A et al. Clinical Nutrition 2009 1-6; Katsanos C et al. Am J Physiol Endocrinol Metab. 2006, 291: 381-387. It has also been shown that the essential amino acid leucine is a particularly important factor in stimulating muscle protein synthesis. See, e.g., Borsheim E et al. Am J Physiol Endocrinol Metab 2002, 283: E648-E657; Borsheim E et al. Clin Nutr. 2008, 27: 189-95; Esmarck B et al J Physiol 2001, 535: 301-311; Moore D et al. Am J Clin Nutr 2009, 89: 161-8. One can identify beneficial nutritive polypeptides for individuals that suffer from sarcopenia by selecting proteins that are enriched by mass in leucine and the other essential amino acids.

Using a database of all protein sequences derived from edible species as described herein, candidate sequences that are enriched in leucine (≥15% by mass) and essential amino acids (≥40% by mass) were identified and rank ordered by their total leucine plus essential amino acid mass relative to total amino acid mass. In order to increase the probability that these proteins are solubly expressed, as well as highly soluble at pH 7 with reduced aggregation propensity, solvation score and aggregation score upper bounds of −20 kcal/mol/AA and 0.5 were applied. In order to reduce the likelihood that these proteins would elicit an allergenic response, upper bounds of 50% and 35% were set for the global allergen homology and allergenicity scores, respectively. In order to reduce the likelihood that these proteins would have toxic effects upon ingestion, an upper bound of 35% was set for the toxicity score. In order to reduce the likelihood that these proteins would act as inhibitors of digestive proteases, an upper bound of 35% was set for the anti-nutricity score.
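By way of non-limiting illustration, the selection described in this example can be sketched as a filter followed by a sort on the sum of the essential amino acid and leucine mass fractions. The class and function names are hypothetical, all scores are assumed to be precomputed, and the upper bounds are treated as inclusive, which is an assumption of this sketch.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    seq_id: str
    eaa: float              # essential amino acid mass fraction
    leu: float              # leucine mass fraction
    solvation: float        # solvation score, kcal/mol/AA
    aggregation: float      # aggregation score
    allergen_global: float  # global allergen homology, percent identity
    allergenicity: float    # allergenicity score, percent identity
    toxicity: float         # toxicity score, percent identity
    antinutricity: float    # anti-nutricity score, percent identity

def passes_cutoffs(c):
    """Enrichment floors plus the upper bounds stated in this example."""
    return (c.leu >= 0.15 and c.eaa >= 0.40
            and c.solvation <= -20.0 and c.aggregation <= 0.5
            and c.allergen_global <= 50.0 and c.allergenicity <= 35.0
            and c.toxicity <= 35.0 and c.antinutricity <= 35.0)

def select_top(candidates, n=10):
    """Rank passing candidates by EAA plus leucine mass fraction, descending."""
    kept = [c for c in candidates if passes_cutoffs(c)]
    kept.sort(key=lambda c: c.eaa + c.leu, reverse=True)
    return kept[:n]
```

Sorting on the sum of the two fractions mirrors the rank ordering by total leucine plus essential amino acid mass relative to total amino acid mass.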

An exemplary list of the top 10 nutritive polypeptide sequences that are enriched in leucine (≥15% by mass) and essential amino acids (≥40% by mass), and meet the aforementioned cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in Table E4A.

TABLE E4A
SEQID        EAA   L
SEQID-03552  0.56  0.21
SEQID-03581  0.60  0.15
SEQID-03532  0.58  0.16
SEQID-03475  0.56  0.17
SEQID-03499  0.58  0.15
SEQID-03494  0.54  0.18
SEQID-03460  0.54  0.17
SEQID-03485  0.54  0.16
SEQID-03513  0.54  0.15
SEQID-03491  0.52  0.17

An exemplary list of the top 10 nutritive polypeptide sequences from the expressed protein database that are enriched in leucine (≥15% by mass) and essential amino acids (≥40% by mass) is shown in Table E4B.

TABLE E4B
SEQID        EAA   L
SEQID-00162  0.64  0.34
SEQID-00132  0.60  0.32
SEQID-00166  0.65  0.26
SEQID-00169  0.60  0.25
SEQID-00137  0.64  0.20
SEQID-00134  0.58  0.24
SEQID-00175  0.63  0.19
SEQID-00194  0.54  0.26
SEQID-00193  0.53  0.26
SEQID-00195  0.52  0.26

Example 5. Selection of Amino Acid Sequences of Nutritive Polypeptides Enriched in Essential Amino Acids and Enriched or Reduced in Various Individual Amino Acids of Interest

Using a database of all protein sequences derived from edible species as described herein, candidate sequences enriched in essential amino acids with elevated or reduced amounts of each amino acid were identified. In order to increase the probability that these proteins would be solubly expressed, as well as highly soluble at pH 7 with reduced aggregation propensity, solvation score and aggregation score upper bounds of −20 kcal/mol/AA and 0.5 were applied. In order to reduce the likelihood that these proteins would elicit an allergenic response, upper bounds of 50% and 35% were set for the global allergen homology and allergenicity scores, respectively. In order to reduce the likelihood that these proteins would have toxic effects upon ingestion, an upper bound of 35% was set for the toxicity score. In order to reduce the likelihood that these proteins would act as inhibitors of digestive proteases, an upper bound of 35% was set for the anti-nutricity score. When searching for proteins enriched or reduced in a given amino acid, the cutoffs described above were applied, and proteins were rank ordered by their calculated amino acid mass fraction of the desired amino acid and then by their essential amino acid content.
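By way of non-limiting illustration, the two-level rank ordering can be sketched as a sort on the mass fraction of the amino acid of interest, with ties broken by essential amino acid content. The function name and the tuple layout are hypothetical, and the candidates are assumed to have already passed the cutoffs described above.

```python
def rank_by_amino_acid(candidates, enriched=True):
    """Rank candidates for enrichment or reduction in one amino acid.

    candidates: list of (seq_id, eaa_fraction, target_fraction) tuples.
    enriched=True sorts the target fraction high to low; enriched=False
    sorts it low to high. Ties are broken by EAA fraction, high to low.
    """
    sign = -1.0 if enriched else 1.0
    return sorted(candidates, key=lambda c: (sign * c[2], -c[1]))
```

Because the tabulated fractions are rounded to two decimal places, rows that appear tied in a table may differ in their unrounded values, so the tabulated order need not follow the rounded essential amino acid column exactly.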

An exemplary list of the top 10 nutritive polypeptide sequences enriched in alanine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in Table E5A. The top 10 nutritive polypeptide sequences reduced in alanine are shown in Table E5B.

TABLE E5A
SEQID        EAA   A
SEQID-03678  0.46  0.18
SEQID-03682  0.44  0.18
SEQID-03646  0.46  0.18
SEQID-03653  0.39  0.16
SEQID-03717  0.38  0.16
SEQID-03686  0.41  0.15
SEQID-03807  0.46  0.15
SEQID-03864  0.44  0.15
SEQID-03663  0.35  0.15
SEQID-03777  0.46  0.14

TABLE E5B
SEQID        EAA   A
SEQID-03874  0.62  0.00
SEQID-03552  0.56  0.00
SEQID-03880  0.52  0.00
SEQID-03673  0.52  0.00
SEQID-03667  0.50  0.00
SEQID-03657  0.50  0.00
SEQID-03842  0.49  0.00
SEQID-03623  0.49  0.00
SEQID-03817  0.48  0.00
SEQID-03875  0.48  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in arginine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in Table E5C. The top 10 nutritive polypeptide sequences reduced in arginine are shown in Table E5D.

TABLE E5C
SEQID        EAA   R
SEQID-03473  0.06  0.72
SEQID-03855  0.09  0.65
SEQID-03727  0.10  0.63
SEQID-03767  0.17  0.62
SEQID-03704  0.17  0.60
SEQID-03459  0.10  0.60
SEQID-03731  0.16  0.48
SEQID-03698  0.47  0.40
SEQID-03687  0.37  0.38
SEQID-03732  0.19  0.38

TABLE E5D
SEQID        EAA   R
SEQID-03744  0.60  0.00
SEQID-03770  0.56  0.00
SEQID-03880  0.52  0.00
SEQID-03736  0.49  0.00
SEQID-03706  0.49  0.00
SEQID-03881  0.49  0.00
SEQID-03668  0.49  0.00
SEQID-03733  0.46  0.00
SEQID-03764  0.46  0.00
SEQID-03832  0.44  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in asparagine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5E. The top 10 nutritive polypeptide sequences reduced in asparagine are shown in table E5F.

TABLE E5E
SEQID        EAA   N
SEQID-03723  0.39  0.15
SEQID-03721  0.39  0.15
SEQID-03803  0.42  0.15
SEQID-03801  0.38  0.15
SEQID-03714  0.40  0.14
SEQID-03742  0.41  0.14
SEQID-03724  0.38  0.14
SEQID-03884  0.42  0.14
SEQID-03778  0.39  0.13
SEQID-03746  0.41  0.13

TABLE E5F
SEQID        EAA   N
SEQID-03874  0.62  0.00
SEQID-03793  0.59  0.00
SEQID-03789  0.57  0.00
SEQID-03869  0.57  0.00
SEQID-03809  0.57  0.00
SEQID-03662  0.56  0.00
SEQID-03850  0.55  0.00
SEQID-03783  0.55  0.00
SEQID-03753  0.54  0.00
SEQID-03677  0.53  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in aspartic acid that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5G. The top 10 nutritive polypeptide sequences reduced in aspartic acid are shown in table E5H.

TABLE E5G
SEQID        EAA   D
SEQID-03630  0.33  0.28
SEQID-03425  0.34  0.26
SEQID-03564  0.33  0.25
SEQID-03543  0.34  0.25
SEQID-03607  0.32  0.24
SEQID-03621  0.35  0.23
SEQID-03604  0.37  0.21
SEQID-03827  0.42  0.20
SEQID-03540  0.37  0.20
SEQID-03624  0.39  0.19

TABLE E5H
SEQID        EAA   D
SEQID-03795  0.62  0.00
SEQID-03468  0.62  0.00
SEQID-03672  0.62  0.00
SEQID-03656  0.61  0.00
SEQID-03517  0.60  0.00
SEQID-03493  0.60  0.00
SEQID-03816  0.60  0.00
SEQID-03796  0.59  0.00
SEQID-03868  0.59  0.00
SEQID-03740  0.59  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in cysteine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in Table E5I. The top 10 nutritive polypeptide sequences reduced in cysteine are shown in Table E5J.

TABLE E5I
SEQID        EAA   C
SEQID-03495  0.24  0.30
SEQID-03514  0.24  0.30
SEQID-03571  0.24  0.28
SEQID-03430  0.36  0.27
SEQID-03419  0.37  0.16
SEQID-03478  0.29  0.16
SEQID-03523  0.33  0.16
SEQID-03504  0.35  0.16
SEQID-03477  0.28  0.16
SEQID-03459  0.10  0.16

TABLE E5J
SEQID        EAA   C
SEQID-03636  0.65  0.00
SEQID-03492  0.63  0.00
SEQID-03484  0.62  0.00
SEQID-03442  0.61  0.00
SEQID-03417  0.61  0.00
SEQID-03563  0.61  0.00
SEQID-03512  0.61  0.00
SEQID-03517  0.60  0.00
SEQID-03606  0.60  0.00
SEQID-03493  0.60  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in glutamine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5K. The top 10 nutritive polypeptide sequences reduced in glutamine are shown in table E5L.

TABLE E5K
SEQID        EAA   Q
SEQID-03676  0.29  0.19
SEQID-03720  0.29  0.19
SEQID-03683  0.33  0.19
SEQID-03782  0.46  0.18
SEQID-03681  0.46  0.18
SEQID-03852  0.46  0.18
SEQID-03671  0.43  0.17
SEQID-00515  0.25  0.17
SEQID-03866  0.40  0.17
SEQID-03824  0.36  0.16

TABLE E5L
SEQID        EAA   Q
SEQID-03636  0.65  0.00
SEQID-03795  0.62  0.00
SEQID-03468  0.62  0.00
SEQID-03484  0.62  0.00
SEQID-03570  0.59  0.00
SEQID-03422  0.58  0.00
SEQID-03432  0.58  0.00
SEQID-03590  0.58  0.00
SEQID-03515  0.58  0.00
SEQID-03499  0.58  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in histidine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5M. The top 10 nutritive polypeptide sequences reduced in histidine are shown in table E5N.

TABLE E5M
SEQID        EAA   H
SEQID-03744  0.60  0.25
SEQID-03551  0.48  0.24
SEQID-03745  0.56  0.19
SEQID-03793  0.59  0.19
SEQID-03468  0.62  0.15
SEQID-03743  0.38  0.13
SEQID-03711  0.56  0.12
SEQID-03847  0.58  0.12
SEQID-03637  0.43  0.12
SEQID-03739  0.52  0.12

TABLE E5N
SEQID        EAA   H
SEQID-03795  0.62  0.00
SEQID-03874  0.62  0.00
SEQID-03656  0.61  0.00
SEQID-03517  0.60  0.00
SEQID-03493  0.60  0.00
SEQID-03816  0.60  0.00
SEQID-03796  0.59  0.00
SEQID-03740  0.59  0.00
SEQID-03814  0.58  0.00
SEQID-03837  0.57  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in isoleucine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in Table E5O. The top 10 nutritive polypeptide sequences reduced in isoleucine are shown in Table E5P.

TABLE E5O
SEQID        EAA   I
SEQID-03722  0.40  0.15
SEQID-03805  0.38  0.15
SEQID-03435  0.40  0.15
SEQID-03838  0.42  0.15
SEQID-03655  0.54  0.15
SEQID-03828  0.49  0.15
SEQID-03593  0.39  0.15
SEQID-03818  0.51  0.15
SEQID-03841  0.49  0.15
SEQID-03843  0.48  0.14

TABLE E5P
SEQID        EAA   I
SEQID-03581  0.60  0.00
SEQID-03685  0.57  0.00
SEQID-03705  0.56  0.00
SEQID-03660  0.55  0.00
SEQID-03779  0.53  0.00
SEQID-03781  0.52  0.00
SEQID-03647  0.51  0.00
SEQID-03785  0.50  0.00
SEQID-03865  0.50  0.00
SEQID-03802  0.49  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in leucine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5Q. The top 10 nutritive polypeptide sequences reduced in leucine are shown in table E5R.

TABLE E5Q
SEQID        EAA   L
SEQID-03552  0.56  0.21
SEQID-03428  0.49  0.18
SEQID-03623  0.49  0.18
SEQID-03702  0.47  0.18
SEQID-03701  0.49  0.18
SEQID-03703  0.49  0.18
SEQID-03599  0.51  0.18
SEQID-03494  0.54  0.18
SEQID-03632  0.45  0.18
SEQID-03423  0.44  0.18

TABLE E5R
SEQID        EAA   L
SEQID-03661  0.53  0.00
SEQID-03849  0.52  0.00
SEQID-03644  0.42  0.00
SEQID-03878  0.39  0.00
SEQID-03652  0.38  0.00
SEQID-03419  0.37  0.00
SEQID-03654  0.36  0.00
SEQID-03804  0.36  0.00
SEQID-03504  0.35  0.00
SEQID-03477  0.28  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in lysine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5S. The top 10 nutritive polypeptide sequences reduced in lysine are shown in table E5T.

TABLE E5S
SEQID        EAA   K
SEQID-03648  0.52  0.36
SEQID-03797  0.56  0.34
SEQID-03830  0.52  0.31
SEQID-03829  0.54  0.30
SEQID-03581  0.60  0.30
SEQID-03520  0.36  0.30
SEQID-03457  0.37  0.30
SEQID-03471  0.36  0.29
SEQID-03859  0.53  0.29
SEQID-03456  0.34  0.29

TABLE E5T
SEQID        EAA   K
SEQID-03583  0.42  0.00
SEQID-03684  0.40  0.00
SEQID-03813  0.36  0.00
SEQID-03771  0.28  0.00
SEQID-03873  0.26  0.00
SEQID-03585  0.25  0.00
SEQID-03704  0.17  0.00
SEQID-03767  0.17  0.00
SEQID-03731  0.16  0.00
SEQID-03459  0.10  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in methionine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in Table E5U. The top 10 nutritive polypeptide sequences reduced in methionine are shown in Table E5V.

TABLE E5U
SEQID        EAA   M
SEQID-00552  0.45  0.16
SEQID-03870  0.52  0.15
SEQID-03680  0.49  0.15
SEQID-03888  0.39  0.13
SEQID-03738  0.37  0.13
SEQID-03698  0.47  0.13
SEQID-03584  0.53  0.12
SEQID-03487  0.53  0.11
SEQID-03858  0.49  0.11
SEQID-03787  0.46  0.11

TABLE E5V
SEQID        EAA   M
SEQID-03701  0.49  0.00
SEQID-03861  0.49  0.00
SEQID-03703  0.49  0.00
SEQID-03702  0.47  0.00
SEQID-03773  0.46  0.00
SEQID-03707  0.45  0.00
SEQID-03726  0.41  0.00
SEQID-03725  0.39  0.00
SEQID-03734  0.39  0.00
SEQID-03700  0.37  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in phenylalanine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5W. The top 10 nutritive polypeptide sequences reduced in phenylalanine are shown in table E5X.

TABLE E5W
SEQID        EAA   F
SEQID-03761  0.48  0.14
SEQID-03831  0.56  0.14
SEQID-03836  0.55  0.13
SEQID-03437  0.41  0.13
SEQID-03749  0.41  0.13
SEQID-03558  0.44  0.13
SEQID-03791  0.49  0.13
SEQID-03729  0.50  0.12
SEQID-03846  0.40  0.12
SEQID-03862  0.50  0.12

TABLE E5X
SEQID        EAA   F
SEQID-03581  0.60  0.00
SEQID-03441  0.57  0.00
SEQID-03685  0.57  0.00
SEQID-03573  0.55  0.00
SEQID-03661  0.53  0.00
SEQID-03859  0.53  0.00
SEQID-03688  0.53  0.00
SEQID-03675  0.53  0.00
SEQID-03609  0.53  0.00
SEQID-03584  0.53  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in proline that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5Y. The top 10 nutritive polypeptide sequences reduced in proline are shown in table E5Z.

TABLE E5Y
SEQID        EAA   P
SEQID-03800  0.33  0.16
SEQID-03756  0.32  0.14
SEQID-03839  0.35  0.14
SEQID-03810  0.33  0.13
SEQID-03888  0.39  0.13
SEQID-03845  0.41  0.13
SEQID-03834  0.39  0.13
SEQID-03658  0.35  0.13
SEQID-03856  0.44  0.12
SEQID-03799  0.37  0.12

TABLE E5Z
SEQID        EAA   P
SEQID-03636  0.65  0.00
SEQID-03468  0.62  0.00
SEQID-03790  0.57  0.00
SEQID-03486  0.57  0.00
SEQID-03665  0.56  0.00
SEQID-03833  0.56  0.00
SEQID-03588  0.56  0.00
SEQID-03808  0.56  0.00
SEQID-03719  0.56  0.00
SEQID-03815  0.55  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in serine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5AA. The top 10 nutritive polypeptide sequences reduced in serine are shown in table E5AB.

TABLE E5AA
SEQID        EAA   S
SEQID-03747  0.45  0.16
SEQID-03863  0.27  0.15
SEQID-03737  0.32  0.15
SEQID-03759  0.40  0.15
SEQID-03882  0.39  0.15
SEQID-03748  0.41  0.14
SEQID-03792  0.41  0.14
SEQID-03844  0.37  0.14
SEQID-03751  0.47  0.14
SEQID-03822  0.45  0.14

TABLE E5AB
SEQID        EAA   S
SEQID-03441  0.57  0.00
SEQID-03867  0.55  0.00
SEQID-03645  0.43  0.00
SEQID-03455  0.35  0.00
SEQID-03775  0.30  0.00
SEQID-03771  0.28  0.00
SEQID-03772  0.28  0.00
SEQID-03716  0.26  0.00
SEQID-03873  0.26  0.00
SEQID-03508  0.41  0.01

An exemplary list of the top 10 nutritive polypeptide sequences enriched in threonine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5AC. The top 10 nutritive polypeptide sequences reduced in threonine are shown in table E5AD.

TABLE E5AC
SEQID        EAA   T
SEQID-03718  0.42  0.16
SEQID-03777  0.46  0.14
SEQID-03713  0.42  0.12
SEQID-03871  0.44  0.12
SEQID-03867  0.55  0.12
SEQID-03819  0.48  0.12
SEQID-03820  0.41  0.11
SEQID-03653  0.39  0.11
SEQID-03717  0.38  0.11
SEQID-03877  0.45  0.11

TABLE E5AD
SEQID        EAA   T
SEQID-03744  0.60  0.00
SEQID-03745  0.56  0.00
SEQID-03661  0.53  0.00
SEQID-03830  0.52  0.00
SEQID-03849  0.52  0.00
SEQID-03887  0.51  0.00
SEQID-03886  0.50  0.00
SEQID-03670  0.48  0.00
SEQID-03551  0.48  0.00
SEQID-03780  0.47  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in tryptophan that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5AE. The top 10 nutritive polypeptide sequences reduced in tryptophan are shown in table E5AF.

TABLE E5AE
SEQID        EAA   W
SEQID-03583  0.42  0.15
SEQID-03635  0.40  0.13
SEQID-03555  0.50  0.11
SEQID-03679  0.48  0.09
SEQID-03440  0.44  0.09
SEQID-03439  0.45  0.09
SEQID-03468  0.62  0.08
SEQID-01546  0.51  0.08
SEQID-03576  0.42  0.08
SEQID-03821  0.44  0.08

TABLE E5AF
SEQID        EAA   W
SEQID-03672  0.62  0.00
SEQID-03512  0.61  0.00
SEQID-03606  0.60  0.00
SEQID-03744  0.60  0.00
SEQID-03581  0.60  0.00
SEQID-03868  0.59  0.00
SEQID-03762  0.59  0.00
SEQID-03857  0.59  0.00
SEQID-03793  0.59  0.00
SEQID-03769  0.59  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in tyrosine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5AG. The top 10 nutritive polypeptide sequences reduced in tyrosine are shown in table E5AH.

TABLE E5AG
SEQID        EAA   Y
SEQID-03848  0.32  0.16
SEQID-03831  0.56  0.15
SEQID-03876  0.26  0.15
SEQID-00325  0.42  0.14
SEQID-03794  0.43  0.14
SEQID-03826  0.38  0.14
SEQID-03659  0.46  0.14
SEQID-03786  0.35  0.14
SEQID-03784  0.38  0.14
SEQID-03823  0.39  0.14

TABLE E5AH
SEQID        EAA   Y
SEQID-03468  0.62  0.00
SEQID-03442  0.61  0.00
SEQID-03417  0.61  0.00
SEQID-03563  0.61  0.00
SEQID-03606  0.60  0.00
SEQID-03469  0.60  0.00
SEQID-03443  0.60  0.00
SEQID-03581  0.60  0.00
SEQID-03796  0.59  0.00
SEQID-03762  0.59  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in valine that met the above cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E5AI. The top 10 nutritive polypeptide sequences reduced in valine are shown in table E5AJ.

TABLE E5AI
SEQID        EAA   V
SEQID-03881  0.49  0.17
SEQID-03790  0.57  0.17
SEQID-03808  0.56  0.17
SEQID-03606  0.60  0.17
SEQID-03688  0.53  0.16
SEQID-03806  0.55  0.16
SEQID-03643  0.51  0.15
SEQID-03788  0.58  0.14
SEQID-03762  0.59  0.14
SEQID-03674  0.51  0.14

TABLE E5AJ
SEQID        EAA   V
SEQID-03879  0.56  0.00
SEQID-03552  0.56  0.00
SEQID-03835  0.54  0.00
SEQID-03851  0.53  0.00
SEQID-03757  0.53  0.00
SEQID-03648  0.52  0.00
SEQID-03766  0.50  0.00
SEQID-03710  0.46  0.00
SEQID-03764  0.46  0.00
SEQID-00552  0.45  0.00

Selection of Expressed Proteins Enriched in Essential Amino Acids and Enriched or Reduced in Various Individual Amino Acids.

Using the database of all expressed protein sequences described herein, candidate sequences enriched in essential amino acids with elevated or reduced amounts of each amino acid were identified. When searching for proteins enriched or reduced in a given amino acid, proteins were rank ordered by their calculated amino acid mass fraction of the desired amino acid and then by their essential amino acid content.

An exemplary list of the top 10 nutritive polypeptide sequences enriched in alanine is shown in table E5AK. The top 10 nutritive polypeptide sequences reduced in alanine are shown in table E5AL.

TABLE E5AK
SEQID        EAA   A
SEQID-00499  0.34  0.18
SEQID-00512  0.44  0.17
SEQID-00651  0.39  0.17
SEQID-00519  0.38  0.16
SEQID-02704  0.42  0.16
SEQID-02703  0.42  0.13
SEQID-00530  0.37  0.13
SEQID-00544  0.45  0.12
SEQID-00549  0.40  0.12
SEQID-02675  0.50  0.11

TABLE E5AL
SEQID        EAA   A
SEQID-00140  0.70  0.00
SEQID-00057  0.41  0.00
SEQID-00652  0.28  0.00
SEQID-00199  0.52  0.00
SEQID-00198  0.53  0.00
SEQID-00197  0.52  0.00
SEQID-00196  0.53  0.00
SEQID-00722  0.53  0.01
SEQID-00204  0.51  0.01
SEQID-00203  0.51  0.01

An exemplary list of the top 10 nutritive polypeptide sequences enriched in arginine is shown in table E5AM. The top 10 nutritive polypeptide sequences reduced in arginine are shown in table E5AN.

TABLE E5AM
SEQID        EAA   R
SEQID-00540  0.41  0.23
SEQID-00567  0.42  0.22
SEQID-00636  0.47  0.22
SEQID-00556  0.33  0.22
SEQID-00637  0.42  0.22
SEQID-00575  0.33  0.22
SEQID-00492  0.42  0.21
SEQID-00631  0.38  0.21
SEQID-00551  0.41  0.21
SEQID-00328  0.45  0.20

TABLE E5AN
SEQID        EAA   R
SEQID-00140  0.70  0.00
SEQID-00146  0.67  0.00
SEQID-00150  0.67  0.00
SEQID-00143  0.65  0.00
SEQID-00525  0.64  0.00
SEQID-00162  0.64  0.00
SEQID-00175  0.63  0.00
SEQID-00169  0.60  0.00
SEQID-00548  0.59  0.00
SEQID-00536  0.58  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in asparagine is shown in table E5AO. The top 10 nutritive polypeptide sequences reduced in asparagine are shown in table E5AP.

TABLE E5AO
SEQID        EAA   N
SEQID-00195  0.52  0.14
SEQID-00194  0.54  0.14
SEQID-00193  0.53  0.12
SEQID-03872  0.47  0.12
SEQID-01388  0.36  0.12
SEQID-00552  0.45  0.12
SEQID-00169  0.60  0.11
SEQID-00196  0.53  0.10
SEQID-00197  0.52  0.10
SEQID-03693  0.34  0.10

TABLE E5AP
SEQID        EAA   N
SEQID-00536  0.58  0.00
SEQID-00284  0.54  0.00
SEQID-00212  0.51  0.00
SEQID-00101  0.51  0.00
SEQID-00219  0.50  0.00
SEQID-00634  0.50  0.00
SEQID-00624  0.49  0.00
SEQID-00639  0.46  0.00
SEQID-00597  0.45  0.00
SEQID-00527  0.40  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in aspartic acid is shown in table E5AQ. The top 10 nutritive polypeptide sequences reduced in aspartic acid are shown in table E5AR.

TABLE E5AQ
SEQID        EAA   D
SEQID-00562  0.40  0.19
SEQID-03853  0.34  0.17
SEQID-00116  0.32  0.16
SEQID-00102  0.45  0.16
SEQID-00115  0.32  0.16
SEQID-00484  0.38  0.16
SEQID-00100  0.46  0.15
SEQID-00220  0.52  0.15
SEQID-00098  0.50  0.15
SEQID-00078  0.46  0.14

TABLE E5AR
SEQID        EAA   D
SEQID-00166  0.65  0.00
SEQID-00051  0.59  0.00
SEQID-00052  0.59  0.00
SEQID-00053  0.57  0.00
SEQID-00054  0.55  0.00
SEQID-00055  0.55  0.00
SEQID-00523  0.51  0.00
SEQID-00635  0.46  0.00
SEQID-00230  0.45  0.00
SEQID-00637  0.42  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in cysteine is shown in table E5AS. The top 10 nutritive polypeptide sequences reduced in cysteine are shown in table E5AT.

TABLE E5AS
SEQID        EAA   C
SEQID-00737  0.28  0.18
SEQID-00652  0.28  0.16
SEQID-00007  0.28  0.13
SEQID-00007  0.28  0.13
SEQID-00558  0.36  0.13
SEQID-00013  0.28  0.12
SEQID-00014  0.30  0.12
SEQID-00989  0.34  0.12
SEQID-00566  0.38  0.11
SEQID-00596  0.44  0.11

TABLE E5AT
SEQID        EAA   C
SEQID-00166  0.65  0.00
SEQID-00137  0.64  0.00
SEQID-00525  0.64  0.00
SEQID-00162  0.64  0.00
SEQID-03297  0.62  0.00
SEQID-00169  0.60  0.00
SEQID-00132  0.60  0.00
SEQID-00298  0.59  0.00
SEQID-00536  0.58  0.00
SEQID-00297  0.58  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in glutamine is shown in table E5AU. The top 10 nutritive polypeptide sequences reduced in glutamine are shown in table E5AV.

TABLE E5AU
SEQID        EAA   Q
SEQID-00743  0.29  0.22
SEQID-00513  0.33  0.17
SEQID-03695  0.38  0.17
SEQID-00522  0.40  0.17
SEQID-00515  0.25  0.17
SEQID-03692  0.44  0.14
SEQID-03666  0.35  0.13
SEQID-00613  0.36  0.13
SEQID-00585  0.44  0.13
SEQID-00223  0.50  0.13

TABLE E5AV
SEQID        EAA   Q
SEQID-00143  0.65  0.00
SEQID-00137  0.64  0.00
SEQID-00525  0.64  0.00
SEQID-00134  0.58  0.00
SEQID-00194  0.54  0.00
SEQID-00193  0.53  0.00
SEQID-00195  0.52  0.00
SEQID-00650  0.50  0.00
SEQID-00563  0.50  0.00
SEQID-00598  0.49  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in histidine is shown in table E5AW. The top 10 nutritive polypeptide sequences reduced in histidine are shown in table E5AX.

TABLE E5AW
SEQID        EAA   H
SEQID-00536  0.58  0.23
SEQID-00560  0.55  0.18
SEQID-01162  0.48  0.12
SEQID-00585  0.44  0.10
SEQID-00298  0.59  0.10
SEQID-00615  0.40  0.10
SEQID-00525  0.64  0.10
SEQID-00297  0.58  0.10
SEQID-00764  0.56  0.09
SEQID-00128  0.53  0.08

TABLE E5AX
SEQID        EAA   H
SEQID-00043  0.57  0.00
SEQID-00531  0.55  0.00
SEQID-00592  0.53  0.00
SEQID-00224  0.53  0.00
SEQID-00024  0.52  0.00
SEQID-00625  0.52  0.00
SEQID-00233  0.52  0.00
SEQID-00587  0.51  0.00
SEQID-00213  0.51  0.00
SEQID-00214  0.51  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in isoleucine is shown in table E5AY. The top 10 nutritive polypeptide sequences reduced in isoleucine are shown in table E5AZ.

TABLE E5AY
SEQID        EAA   I
SEQID-00561  0.68  0.18
SEQID-00134  0.58  0.14
SEQID-00175  0.63  0.14
SEQID-00162  0.64  0.14
SEQID-00234  0.51  0.13
SEQID-00233  0.52  0.13
SEQID-00169  0.60  0.13
SEQID-00025  0.48  0.13
SEQID-00043  0.57  0.12
SEQID-00584  0.50  0.12

TABLE E5AZ
SEQID        EAA   I
SEQID-00762  0.56  0.00
SEQID-00764  0.56  0.00
SEQID-00571  0.54  0.00
SEQID-00212  0.51  0.00
SEQID-00237  0.48  0.00
SEQID-00236  0.45  0.00
SEQID-00551  0.41  0.00
SEQID-00515  0.25  0.01
SEQID-00128  0.53  0.01
SEQID-00651  0.39  0.01

An exemplary list of the top 10 nutritive polypeptide sequences enriched in leucine is shown in table E5BA. The top 10 nutritive polypeptide sequences reduced in leucine are shown in table E5BB.

TABLE E5BA
SEQID        EAA   L
SEQID-00162  0.64  0.34
SEQID-00132  0.60  0.32
SEQID-00195  0.52  0.26
SEQID-00194  0.54  0.26
SEQID-00193  0.53  0.26
SEQID-00166  0.65  0.26
SEQID-00169  0.60  0.25
SEQID-00134  0.58  0.24
SEQID-00212  0.51  0.23
SEQID-00139  0.49  0.23

TABLE E5BB
SEQID        EAA   L
SEQID-00553  0.39  0.00
SEQID-00743  0.29  0.00
SEQID-00522  0.40  0.01
SEQID-00554  0.38  0.01
SEQID-00585  0.44  0.01
SEQID-00560  0.55  0.01
SEQID-00529  0.38  0.01
SEQID-00552  0.45  0.01
SEQID-00547  0.49  0.01
SEQID-00575  0.33  0.01

An exemplary list of the top 10 nutritive polypeptide sequences enriched in lysine is shown in table E5BC. The top 10 nutritive polypeptide sequences reduced in lysine are shown in table E5BD.

TABLE E5BC
SEQID        EAA   K
SEQID-00560  0.55  0.26
SEQID-00573  0.54  0.23
SEQID-00619  0.51  0.23
SEQID-00553  0.39  0.23
SEQID-00572  0.54  0.23
SEQID-00623  0.54  0.23
SEQID-03691  0.52  0.23
SEQID-00503  0.49  0.22
SEQID-00564  0.51  0.22
SEQID-00517  0.49  0.22

TABLE E5BD
SEQID        EAA   K
SEQID-00166  0.65  0.00
SEQID-00175  0.63  0.00
SEQID-00169  0.60  0.00
SEQID-00134  0.58  0.00
SEQID-00535  0.30  0.00
SEQID-00513  0.33  0.01
SEQID-02675  0.50  0.01
SEQID-00490  0.58  0.01
SEQID-00512  0.44  0.01
SEQID-00500  0.40  0.01

An exemplary list of the top 10 nutritive polypeptide sequences enriched in methionine is shown in Table E5BE. The top 10 nutritive polypeptide sequences reduced in methionine are shown in Table E5BF.

TABLE E5BE
SEQID        EAA   M
SEQID-00552  0.45  0.16
SEQID-00513  0.33  0.13
SEQID-00529  0.38  0.09
SEQID-00526  0.41  0.09
SEQID-00868  0.48  0.09
SEQID-00595  0.44  0.09
SEQID-00584  0.50  0.09
SEQID-00486  0.56  0.09
SEQID-00092  0.42  0.08
SEQID-00074  0.42  0.08

TABLE E5BF
SEQID        EAA   M
SEQID-00132  0.60  0.00
SEQID-00051  0.59  0.00
SEQID-00052  0.59  0.00
SEQID-00043  0.57  0.00
SEQID-00053  0.57  0.00
SEQID-00055  0.55  0.00
SEQID-00054  0.55  0.00
SEQID-00224  0.53  0.00
SEQID-00024  0.52  0.00
SEQID-00220  0.52  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in phenylalanine is shown in table E5BG. The top 10 nutritive polypeptide sequences reduced in phenylalanine are shown in table E5BH.

TABLE E5BG
SEQID        EAA   F
SEQID-00150  0.67  0.13
SEQID-00595  0.44  0.13
SEQID-00561  0.68  0.13
SEQID-00118  0.51  0.12
SEQID-00597  0.45  0.12
SEQID-00507  0.51  0.12
SEQID-00594  0.44  0.12
SEQID-00501  0.50  0.12
SEQID-00175  0.63  0.11
SEQID-00485  0.46  0.11

TABLE E5BH
SEQID        EAA   F
SEQID-00162  0.64  0.00
SEQID-00560  0.55  0.00
SEQID-00224  0.53  0.00
SEQID-00220  0.52  0.00
SEQID-00195  0.52  0.00
SEQID-00241  0.52  0.00
SEQID-00215  0.51  0.00
SEQID-00213  0.51  0.00
SEQID-00214  0.51  0.00
SEQID-00212  0.51  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in proline is shown in table E5BI. The top 10 nutritive polypeptide sequences reduced in proline are shown in table E5BJ.

TABLE E5BI
SEQID        EAA   P
SEQID-00743  0.29  0.28
SEQID-00553  0.39  0.24
SEQID-03641  0.24  0.20
SEQID-03444  0.23  0.16
SEQID-00169  0.60  0.14
SEQID-00005  0.48  0.14
SEQID-00805  0.50  0.13
SEQID-00737  0.28  0.13
SEQID-03451  0.40  0.11
SEQID-03447  0.30  0.10

TABLE E5BJ
SEQID        EAA   P
SEQID-00150  0.67  0.00
SEQID-00137  0.64  0.00
SEQID-00287  0.59  0.00
SEQID-00548  0.59  0.00
SEQID-00142  0.56  0.00
SEQID-00560  0.55  0.00
SEQID-00224  0.53  0.00
SEQID-00220  0.52  0.00
SEQID-00241  0.52  0.00
SEQID-00216  0.52  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in serine is shown in table E5BK. The top 10 nutritive polypeptide sequences reduced in serine are shown in table E5BL.

TABLE E5BK
SEQID        EAA   S
SEQID-03447  0.30  0.27
SEQID-00483  0.42  0.16
SEQID-00535  0.30  0.16
SEQID-00630  0.35  0.14
SEQID-00134  0.58  0.14
SEQID-00557  0.47  0.13
SEQID-03760  0.39  0.12
SEQID-03642  0.37  0.12
SEQID-00652  0.28  0.12
SEQID-00577  0.47  0.12

TABLE E5BL
SEQID        EAA   S
SEQID-00175  0.63  0.00
SEQID-00051  0.59  0.00
SEQID-00052  0.59  0.00
SEQID-00536  0.58  0.00
SEQID-00043  0.57  0.00
SEQID-00053  0.57  0.00
SEQID-00643  0.57  0.00
SEQID-00055  0.55  0.00
SEQID-00054  0.55  0.00
SEQID-00112  0.50  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in threonine is shown in table E5BM. The top 10 nutritive polypeptide sequences reduced in threonine are shown in table E5BN.

TABLE E5BM
SEQID        EAA   T
SEQID-00404  0.49  0.14
SEQID-00547  0.49  0.13
SEQID-00522  0.40  0.13
SEQID-00569  0.56  0.13
SEQID-00528  0.53  0.11
SEQID-00504  0.47  0.11
SEQID-03768  0.42  0.11
SEQID-00523  0.51  0.11
SEQID-03649  0.49  0.11
SEQID-00116  0.32  0.11

TABLE E5BN
SEQID        EAA   T
SEQID-00560  0.55  0.00
SEQID-00057  0.41  0.00
SEQID-00542  0.41  0.00
SEQID-00059  0.41  0.00
SEQID-00015  0.30  0.00
SEQID-00014  0.30  0.00
SEQID-00013  0.28  0.00
SEQID-00007  0.28  0.00
SEQID-00007  0.28  0.00
SEQID-00621  0.39  0.01

An exemplary list of the top 10 nutritive polypeptide sequences enriched in tryptophan is shown in table E5BM. The top 10 nutritive polypeptide sequences reduced in tryptophan are shown in table E5BN.

TABLE E5BM
SEQID        EAA   W
SEQID-01546  0.51  0.08
SEQID-00642  0.45  0.08
SEQID-03690  0.43  0.08
SEQID-03776  0.43  0.08
SEQID-03297  0.62  0.07
SEQID-03244  0.46  0.07
SEQID-00512  0.44  0.07
SEQID-00814  0.42  0.07
SEQID-00110  0.49  0.06
SEQID-03137  0.50  0.06

TABLE E5BN
SEQID        EAA   W
SEQID-00166  0.65  0.00
SEQID-00137  0.64  0.00
SEQID-00525  0.64  0.00
SEQID-00162  0.64  0.00
SEQID-00169  0.60  0.00
SEQID-00051  0.59  0.00
SEQID-00052  0.59  0.00
SEQID-00134  0.58  0.00
SEQID-00536  0.58  0.00
SEQID-00043  0.57  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in tyrosine is shown in table E5BO. The top 10 nutritive polypeptide sequences reduced in tyrosine are shown in table E5BP.

TABLE E5BO
SEQID        EAA   Y
SEQID-00013  0.28  0.16
SEQID-00007  0.28  0.15
SEQID-00007  0.28  0.15
SEQID-00015  0.30  0.14
SEQID-00325  0.42  0.14
SEQID-00014  0.30  0.13
SEQID-00513  0.33  0.12
SEQID-03689  0.41  0.11
SEQID-00521  0.41  0.11
SEQID-00640  0.47  0.11

TABLE E5BP
SEQID        EAA   Y
SEQID-00140  0.70  0.00
SEQID-00146  0.67  0.00
SEQID-00051  0.59  0.00
SEQID-00052  0.59  0.00
SEQID-00548  0.59  0.00
SEQID-00134  0.58  0.00
SEQID-00043  0.57  0.00
SEQID-00053  0.57  0.00
SEQID-00054  0.55  0.00
SEQID-00055  0.55  0.00

An exemplary list of the top 10 nutritive polypeptide sequences enriched in valine is shown in table E5BQ. The top 10 nutritive polypeptide sequences reduced in valine are shown in table E5BR.

TABLE E5BQ
SEQID        EAA   V
SEQID-00550  0.49  0.18
SEQID-00592  0.53  0.16
SEQID-00532  0.44  0.15
SEQID-00620  0.50  0.15
SEQID-00644  0.42  0.14
SEQID-00514  0.46  0.13
SEQID-00518  0.52  0.13
SEQID-00598  0.49  0.13
SEQID-00581  0.51  0.13
SEQID-00145  0.51  0.13

TABLE E5BR
SEQID        EAA   V
SEQID-00239  0.48  0.00
SEQID-00552  0.45  0.00
SEQID-00240  0.45  0.00
SEQID-00615  0.40  0.00
SEQID-00652  0.28  0.00
SEQID-00515  0.25  0.01
SEQID-00522  0.40  0.01
SEQID-00560  0.55  0.01
SEQID-00645  0.56  0.01
SEQID-00647  0.42  0.01

Example 6. Selection of Amino Acid Sequences of Nutritive Polypeptides Enriched in Essential Amino Acids to Provide Protein Nutrition and for the Treatment of Protein Malnutrition

It has been shown that humans cannot endogenously synthesize nine of the twenty naturally occurring amino acids: histidine, leucine, isoleucine, valine, phenylalanine, methionine, threonine, lysine, and tryptophan (Young, V. R. and Tharakan, J. F. Nutritional essentiality of amino acids and amino acid requirements in healthy adults. In Metabolic and Therapeutic Aspects of Amino Acids in Clinical Nutrition. Second Edition. Cynober, L. A. Ed.; CRC Press: New York, 2004; pp 439-470). As such, there is a need to ingest sufficient quantities of these nine essential amino acids to avoid protein malnutrition and the deleterious health effects that result from this state. Nutritive polypeptides are identified that are useful for the fulfillment of these essential amino acid requirements either in healthy or malnourished individuals by selecting those that are enriched in essential amino acids by mass and contain a non-zero amount of each essential amino acid (i.e., the nutritive polypeptide sequence is essential amino acid complete).
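The two selection properties named above, essential amino acid mass fraction and essential amino acid completeness, can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, and it assumes that mass fractions are computed from average residue masses (in Da, terminal water ignored), which the text does not specify.

```python
# Average amino acid residue masses in Da (assumed basis for "by mass").
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}

# The nine essential amino acids listed in the text (one-letter codes).
ESSENTIAL = frozenset("HILVFMTKW")


def mass_fraction(seq, residues):
    """Fraction of the sequence mass contributed by the given residues."""
    total = sum(RESIDUE_MASS[aa] for aa in seq)
    selected = sum(RESIDUE_MASS[aa] for aa in seq if aa in residues)
    return selected / total


def is_eaa_complete(seq):
    """True when every essential amino acid occurs at least once."""
    return ESSENTIAL.issubset(seq)
```

Under these assumptions, `mass_fraction(seq, ESSENTIAL)` corresponds to the EAA column of the tables in this example, and `is_eaa_complete(seq)` to the EAAc column.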

Using a database of all protein sequences derived from edible species as described herein, candidate sequences that are essential amino acid complete and enriched in essential amino acids were identified. In order to increase the probability of these proteins being solubly expressed and highly soluble at pH 7 with reduced aggregation propensity, solvation score and aggregation score upper bounds of −20 kcal/mol/AA and 0.5 were applied. In order to reduce the likelihood that these proteins would elicit an allergenic response, upper bounds of 50% and 35% were set for the global allergen homology and allergenicity scores, respectively. In order to reduce the likelihood that these proteins would have toxic effects upon ingestion, an upper bound of 35% was set for the toxicity score. In order to reduce the likelihood that these proteins would act as inhibitors of digestive proteases, an upper bound of 35% was set for the anti-nutricity score.
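The cutoff-and-rank procedure described above can be sketched as a simple filter followed by a sort. The record fields, the candidate identifiers, and the treatment of each upper bound as inclusive are assumptions made for illustration; the scores themselves are computed elsewhere, as described herein.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    seqid: str
    eaa: float                # essential amino acid mass fraction
    eaa_complete: bool        # contains every essential amino acid
    solvation: float          # kcal/mol/AA
    aggregation: float
    allergen_homology: float  # expressed as a fraction (0.50 = 50%)
    allergenicity: float
    toxicity: float
    antinutricity: float


# Upper bounds quoted in the text (percentages written as fractions).
UPPER_BOUNDS = {
    "solvation": -20.0,
    "aggregation": 0.5,
    "allergen_homology": 0.50,
    "allergenicity": 0.35,
    "toxicity": 0.35,
    "antinutricity": 0.35,
}


def passes_cutoffs(c):
    """Assumes each bound is inclusive (score <= bound)."""
    return all(getattr(c, name) <= bound for name, bound in UPPER_BOUNDS.items())


def top_candidates(candidates, n=10):
    """Keep complete, in-bounds candidates and rank by EAA mass fraction."""
    kept = [c for c in candidates if c.eaa_complete and passes_cutoffs(c)]
    return sorted(kept, key=lambda c: c.eaa, reverse=True)[:n]


# Hypothetical records, not taken from the sequence database.
demo = [
    Candidate("cand-A", 0.55, True, -25.0, 0.3, 0.2, 0.1, 0.1, 0.1),
    Candidate("cand-B", 0.60, True, -22.0, 0.4, 0.4, 0.3, 0.2, 0.3),
    Candidate("cand-C", 0.70, True, -10.0, 0.3, 0.2, 0.1, 0.1, 0.1),  # fails solvation
    Candidate("cand-D", 0.65, False, -25.0, 0.3, 0.2, 0.1, 0.1, 0.1),  # not complete
    Candidate("cand-E", 0.62, True, -25.0, 0.3, 0.2, 0.1, 0.5, 0.1),  # fails toxicity
]
best = top_candidates(demo)
```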

An exemplary list of the top 10 nutritive polypeptide sequences that are essential amino acid complete, enriched in essential amino acids, and meet the aforementioned cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E6A.

TABLE E6A
SEQID        EAAc  EAA
SEQID-03636  1     0.65
SEQID-03492  1     0.63
SEQID-03468  1     0.62
SEQID-03544  1     0.62
SEQID-03484  1     0.62
SEQID-03442  1     0.61
SEQID-03417  1     0.61
SEQID-03563  1     0.61
SEQID-03469  1     0.60
SEQID-03443  1     0.60

An exemplary list of the top 10 nutritive polypeptide sequences from the expressed protein database that are essential amino acid complete and enriched in essential amino acids is shown in table E6B.

TABLE E6B
SEQID        EAAc  EAA
SEQID-00140  1     0.70
SEQID-00561  1     0.68
SEQID-00146  1     0.67
SEQID-00150  1     0.67
SEQID-00143  1     0.65
SEQID-03297  1     0.62
SEQID-00487  1     0.61
SEQID-00287  1     0.59
SEQID-00298  1     0.59
SEQID-00548  1     0.59

Example 7. Selection of Amino Acid Sequences of Nutritive Polypeptides Enriched in Branched Chain Amino Acids for Muscle Health, and Selection of Amino Acid Sequences of Nutritive Polypeptides Reduced in Branched Chain Amino Acids for Treatment of Diabetes, Cardiovascular Disease, Chronic Kidney Disease and Stroke

Identification of Proteins Enriched in Branched Chain Amino Acids for the Treatment of Hepatic and/or Renal Disease.

Using a database of all protein sequences derived from edible species as described herein, candidate sequences that are enriched or reduced in branched chain amino acids were identified. In order to increase the probability that these proteins are solubly expressed, as well as highly soluble at pH 7 with reduced aggregation propensity, solvation score and aggregation score upper bounds of −20 kcal/mol/AA and 0.5 were applied. In order to reduce the likelihood that these proteins would elicit an allergenic response, upper bounds of 50% and 35% were set for the global allergen homology and allergenicity scores, respectively. In order to reduce the likelihood that these proteins would have toxic effects upon ingestion, an upper bound of 35% was set for the toxicity score. In order to reduce the likelihood that these proteins would act as inhibitors of digestive proteases, an upper bound of 35% was set for the anti-nutricity score.

An exemplary list of the top 10 nutritive polypeptide sequences that are enriched in branched chain amino acids, and meet the aforementioned cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E7A.

TABLE E7A
SEQID        EAA   BCAA
SEQID-03532  0.58  0.31
SEQID-03616  0.53  0.31
SEQID-03629  0.56  0.31
SEQID-03619  0.52  0.29
SEQID-03542  0.49  0.29
SEQID-03519  0.49  0.29
SEQID-03603  0.53  0.29
SEQID-03536  0.52  0.29
SEQID-03597  0.48  0.29
SEQID-03623  0.49  0.29

An exemplary list of the top 10 nutritive polypeptide sequences from the expressed protein database that are enriched in branched chain amino acids is shown in table E7B.

TABLE E7B
SEQID        EAA   BCAA
SEQID-00162  0.64  0.53
SEQID-00166  0.65  0.46
SEQID-00134  0.58  0.46
SEQID-00169  0.60  0.43
SEQID-00043  0.57  0.41
SEQID-00132  0.60  0.41
SEQID-00137  0.64  0.39
SEQID-00175  0.63  0.38
SEQID-00550  0.49  0.38
SEQID-00234  0.51  0.37

An exemplary list of the top 10 nutritive polypeptide sequences that are reduced in branched chain amino acids, and meet the aforementioned cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E7C.

TABLE E7C
SEQID        EAA   BCAA
SEQID-03471  0.36  0.01
SEQID-03473  0.06  0.01
SEQID-03571  0.24  0.01
SEQID-03495  0.24  0.01
SEQID-03514  0.24  0.01
SEQID-00552  0.45  0.03
SEQID-03611  0.37  0.03
SEQID-03457  0.37  0.03
SEQID-03456  0.34  0.03
SEQID-03520  0.36  0.03

An exemplary list of the top 10 nutritive polypeptide sequences from the expressed protein database that are reduced in branched chain amino acids is shown in table E7D.

TABLE E7D
SEQID        EAA   BCAA
SEQID-00552  0.45  0.03
SEQID-00522  0.40  0.03
SEQID-00515  0.25  0.05
SEQID-00553  0.39  0.05
SEQID-00585  0.44  0.06
SEQID-00637  0.42  0.07
SEQID-00652  0.28  0.08
SEQID-00615  0.40  0.08
SEQID-00743  0.29  0.08
SEQID-00547  0.49  0.09

Example 8. Selection of Amino Acid Sequences of Nutritive Polypeptides Having Low or No Phenylalanine and Enriched in Tyrosine and all Other Essential Amino Acids for Treatment or Prevention of Phenylketonuria

Individuals who suffer from phenylketonuria (PKU) are unable to process the amino acid phenylalanine and catalyze its conversion to tyrosine, often because of a malfunctioning hepatic enzyme, phenylalanine hydroxylase (MacLeod E. L. and Ney D. M. Nutritional Management of Phenylketonuria. Annales Nestle. (2010) 68:58-69). In these individuals, when protein containing the amino acid phenylalanine is ingested, phenylalanine accumulates in the blood. Untreated PKU has serious untoward health effects, including impaired school performance, impaired executive functioning, and long term intellectual disability (Matalon, R., Michals-Matalon, K., Bhatia, G., Grechanina, E., Novikov, P., McDonald, J. D., Grady, J., Tyring, S. K., Guttler, F. Large neutral amino acids in the treatment of phenylketonuria. J. Inherit. Metab. Dis. (2006) 29: 732-738). One way phenylalanine blood levels can be kept low to avoid neurological effects is to avoid the ingestion of phenylalanine-containing proteins and/or only consume protein sources that are low in phenylalanine. As basic protein nutritional requirements of all other amino acids must also be met, sufficient intake of the other essential amino acids (histidine, leucine, isoleucine, valine, methionine, threonine, lysine, and tryptophan) and tyrosine, which becomes conditionally essential in these individuals, is required. One can identify beneficial nutritive polypeptides for individuals that suffer from phenylketonuria by selecting proteins that contain low or no phenylalanine and are enriched by mass in tyrosine and the other essential amino acids.

Using a database of all protein sequences derived from edible species as described herein, candidate sequences that contain low or no phenylalanine by mass, are essential amino acid and tyrosine complete (aside from phenylalanine), and enriched in tyrosine and essential amino acids were identified and rank ordered first by their phenylalanine mass fraction and then by their total tyrosine plus essential amino acid mass fraction. In order to increase the probability that these proteins are solubly expressed, as well as highly soluble at pH 7 with reduced aggregation propensity, solvation score and aggregation score upper bounds of −20 kcal/mol/AA and 0.5 were applied. In order to reduce the likelihood that these proteins would elicit an allergenic response, upper bounds of 50% and 35% were set for the global allergen homology and allergenicity scores, respectively. In order to reduce the likelihood that these proteins would have toxic effects upon ingestion, an upper bound of 35% was set for the toxicity score. In order to reduce the likelihood that these proteins would act as inhibitors of digestive proteases, an upper bound of 35% was set for the anti-nutricity score.
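The two-level rank ordering described above (first by phenylalanine mass fraction, ascending, then by total tyrosine plus essential amino acid mass fraction, descending) can be sketched with a tuple sort key. The function name is hypothetical, and the sketch assumes that the second criterion is the simple sum of the EAA and Y values. The rows are transcribed from Tables E8A and E8B of this example and deliberately listed out of order.

```python
def rank_for_pku(rows):
    """Sort (seqid, eaa, phe, tyr) rows: lowest F first, then highest EAA + Y."""
    return sorted(rows, key=lambda r: (r[2], -(r[1] + r[3])))


# (SEQID, EAA, F, Y) values transcribed from Tables E8A and E8B.
rows = [
    ("SEQID-00514", 0.46, 0.01, 0.01),
    ("SEQID-03465", 0.40, 0.00, 0.08),
    ("SEQID-03584", 0.53, 0.00, 0.09),
    ("SEQID-03498", 0.45, 0.00, 0.06),
    ("SEQID-00634", 0.50, 0.00, 0.05),
]
ranked = rank_for_pku(rows)
for seqid, eaa, phe, tyr in ranked:
    print(seqid, phe, round(eaa + tyr, 2))
```

Because the tabulated values are rounded to two decimal places, ties in the phenylalanine column may hide finer differences that the original ranking took into account.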

An exemplary list of the top 10 nutritive polypeptide sequences that contain low or no phenylalanine by mass, are essential amino acid and tyrosine complete (aside from phenylalanine), enriched in tyrosine and essential amino acids, and meet the aforementioned cutoffs in solvation score, aggregation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E8A.

TABLE E8A
SEQID        EAA   F     Y
SEQID-03584  0.53  0.00  0.09
SEQID-03479  0.52  0.00  0.04
SEQID-03573  0.55  0.00  0.01
SEQID-00634  0.50  0.00  0.05
SEQID-03466  0.49  0.00  0.05
SEQID-03609  0.53  0.00  0.01
SEQID-03498  0.45  0.00  0.06
SEQID-03465  0.40  0.00  0.08
SEQID-03587  0.46  0.00  0.01
SEQID-03463  0.41  0.00  0.05

An exemplary list of the top 10 nutritive polypeptide sequences from the expressed protein database that contain low or no phenylalanine by mass, are essential amino acid and tyrosine complete (aside from phenylalanine), and enriched in tyrosine and essential amino acids is shown in table E8B.

TABLE E8B
SEQID        EAA   F     Y
SEQID-00634  0.50  0.00  0.05
SEQID-00514  0.46  0.01  0.01
SEQID-00329  0.47  0.01  0.03
SEQID-00628  0.51  0.01  0.02
SEQID-00636  0.47  0.01  0.04
SEQID-03634  0.45  0.01  0.04
SEQID-00335  0.38  0.01  0.07
SEQID-00639  0.46  0.01  0.06
SEQID-03448  0.40  0.01  0.03
SEQID-03889  0.42  0.02  0.01

Example 9. Selection of Amino Acid Sequences of Nutritive Polypeptides Containing Fragments or Regions of Naturally-Occurring Protein Sequences: Nutritive Polypeptide Fragments Enriched in Leucine and all Essential Amino Acids

In some cases, full length proteins identified from the databases described herein are not particularly advantageous in view of one or more selection requirements defined by one or more important parameters, or otherwise do not provide enough of one or more specific amino acid(s) by mass relative to the total mass of the nutritive polypeptide. In these cases, one or more fragments (also termed “regions” herein) of nutritive polypeptides identified in the database are able to meet the desired search criteria. Databases containing possible fragments of nutritive polypeptides are generated and searched by taking each full length sequence in the database and examining all possible subsequences at least 25 amino acids in length contained therein. For example, it was desired to find a nutritive polypeptide sequence that had a leucine mass fraction greater than about 0.2 and was highly charged to increase the likelihood of soluble expression. The protein edible species database described herein was searched using a solvation score cutoff of less than −30, and in order to reduce the likelihood that these proteins would elicit an allergenic response, upper bounds of 50% were set for the global allergen homology and allergenicity scores. In order to reduce the likelihood that these proteins would have toxic effects upon ingestion, an upper bound of 35% was set for the toxicity score. In order to reduce the likelihood that these proteins would act as inhibitors of digestive proteases, an upper bound of 35% was set for the anti-nutricity score.
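The fragment search described above, which examines every subsequence of at least 25 amino acids, can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function name is hypothetical, mass fractions are assumed to be computed from average residue masses, and only the leucine criterion is applied (the solvation, allergenicity, toxicity, and anti-nutricity scores are outside the scope of the sketch). Prefix sums keep the cost of scoring each subsequence constant.

```python
# Average amino acid residue masses in Da (assumed basis for "by mass").
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}


def leucine_rich_fragments(seq, min_len=25, min_leu=0.2):
    """Return (start, end, leucine_mass_fraction) for every subsequence
    seq[start:end] of at least min_len residues whose leucine mass
    fraction exceeds min_leu."""
    total = [0.0]  # total[i] = mass of seq[:i]
    leu = [0.0]    # leu[i]   = leucine mass in seq[:i]
    for aa in seq:
        mass = RESIDUE_MASS[aa]
        total.append(total[-1] + mass)
        leu.append(leu[-1] + (mass if aa == "L" else 0.0))

    hits = []
    for start in range(len(seq) - min_len + 1):
        for end in range(start + min_len, len(seq) + 1):
            fraction = (leu[end] - leu[start]) / (total[end] - total[start])
            if fraction > min_leu:
                hits.append((start, end, fraction))
    return hits
```

A sequence of length n contains on the order of n²/2 such subsequences, so a fragment database built this way is far larger than the database of full length sequences it is derived from.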

An exemplary list of the top 10 nutritive polypeptide fragments that are enriched in leucine (≥20% by mass) and meet the aforementioned cutoffs in solvation score, global allergen homology, allergenicity score, toxicity score, and anti-nutricity score is shown in table E9A.

TABLE E9A
DBID    EAA   L
P58797  0.47  0.26
A7A1V1  0.49  0.24
P04467  0.56  0.23
Q9AWA5  0.45  0.22
Q2NL14  0.52  0.22
Q60CZ8  0.42  0.22
Q10MN8  0.32  0.22
P50275  0.47  0.22
Q2YDE5  0.45  0.22
Q0P5B4  0.51  0.22

An exemplary list of the top 10 nutritive polypeptide fragments from the expressed protein database that are enriched in leucine (≥20% by mass) is shown in Table E9B.

TABLE E9B
SEQID        EAA   L
SEQID-00132  0.60  0.32
SEQID-00195  0.52  0.26
SEQID-00194  0.54  0.26
SEQID-00193  0.53  0.26
SEQID-00166  0.65  0.26
SEQID-00134  0.58  0.24
SEQID-00212  0.51  0.23
SEQID-00139  0.49  0.23
SEQID-00213  0.51  0.21
SEQID-00148  0.47  0.21

Example 10. Purification of Nutritive Polypeptides

Various methods of purification have been used to isolate nutritive polypeptides from, or away from, other materials such as raw foods, cells, salts, small molecules, host cell proteins, and lipids. These methods include diafiltration, precipitation, flocculation, aqueous two phase extraction, and chromatography.

Purification by Anti-FLAG Affinity Chromatography.

Anti-FLAG purification provides a method to purify nutritive polypeptides from low-titer expression systems or from similarly charged host cell proteins. Nutritive polypeptides were engineered to contain either a single FLAG tag (DYKDDDDK (SEQ ID NO: 3914)) or a triple tandem FLAG tag (DYKDHDGDYKDHDIDYKDDDDK (SEQ ID NO: 3915)) appended to the C-terminus of the protein. Anti-FLAG affinity purification is a single-step process that uses non-denaturing conditions and yields elution purities of >95% (Einhauer et al., 2001 Journal of Biochemical and Biophysical Methods).

Nutritive polypeptides were purified using Anti-FLAG M2 Affinity Agarose Gel (Sigma Aldrich, St. Louis, Mo.). The M2 affinity resin is designed specifically for use with C-terminal FLAG epitopes. For purification of N-terminally appended FLAG epitopes, the M1 Affinity Agarose Gel was used. The M2 Affinity Agarose Gel (resin) has an advertised static binding capacity (SBC) of approximately 0.5 mg nutritive polypeptide per mL of resin.

Purification of nutritive polypeptides from Aspergillus niger secretion media and Bacillus subtilis secretion media was performed using 20-40 mL of anti-FLAG resin. Prior to purification, secretion media was adjusted to 150 mM NaCl and pH 7.4. Resin was equilibrated by rinsing the media with an excess of 1× tris-buffered saline (TBS) pH 7.4±0.1 and collecting it through a 0.2 um polyethersulfone (PES) vacuum filter. Equilibrated resin was then mixed with secretion media in batch mode and allowed to mix at room temperature for one hour. Unbound material was removed from the resin by passing the entire mixture through a 0.2 um PES vacuum filter. The resin was physically collected on the surface of the filter and was subsequently washed with 20 resin volumes of TBS pH 7.4±0.1 to further remove unbound material through the 0.2 um PES vacuum filter. Washed resin was transferred to drip columns (10 mL each) and the bound polypeptides were eluted with two column volumes (CV) of 0.1 M glycine pH 3.0. The eluted polypeptides were flowed directly from the drip columns into conical tubes that contained 1 M Tris pH 8.0; this strategy was used to neutralize the pH of the eluted polypeptide solution as quickly as possible. Resin was regenerated using an additional 3 CV of 0.1 M glycine pH 3.0. For short term storage, resin was stored in 1×TBS pH 7.4 at 4° C.; for long term storage, resin was stored in 0.5×TBS pH 7.4, 50% glycerol at −20° C.

Exemplary anti-FLAG purification of SEQID-00105 from B. subtilis yielded 4.0 mg of protein in a 4.3 mL elution. The sample was loaded onto a polyacrylamide gel at three different dilutions for increased sensitivity and SEQID-00105 was found to be 95% pure. Exemplary anti-FLAG purification of SEQID-00298 from A. niger was performed according to the same procedure. The elution fraction was neutralized, as described, and analyzed by SDS-PAGE and Bradford assay as described herein. The main band in the elution was found to be 95% pure. The main band in the elution was compared to the MW ladder on the same gel and matched the expected molecular weight of SEQID-00298. Forty mL of anti-FLAG resin captured 4.0 mg of material, resulting in an estimated resin capacity of 0.10 mg/mL.

Purification by 5 ml Immobilized Metal Affinity Chromatography (IMAC).

E. coli was grown in shake flask fermentation with targeted expression of individual nutritive polypeptides with HIS8 tags (SEQ ID NO: 3919), as described herein. Cells were harvested from each shake-flask by bucket centrifugation. The supernatant was discarded, and the cells were suspended in 30 mM imidazole, 50 mM sodium phosphate, 0.5 M NaCl, pH 7.5 at a wet cell weight (WCW) concentration of 20% w/v. The suspended cells were then lysed with two passes through a M110-P microfluidizer (Microfluidics, Westwood, Mass.) at 20,000 psi through an 87 um interaction chamber. The lysed cells were centrifuged at 15,000 relative centrifugal force (RCF) for 120 minutes, and then decanted. Cellular debris was discarded, and the supernatants were 0.2 um filtered. These filtered protein solutions were then purified by immobilized metal affinity chromatography (IMAC) on an ÄKTA Explorer 100 FPLC (GE Healthcare, Piscataway, N.J.). Nutritive polypeptides were purified over 5 mL (1.6 cm diameter×2.5 cm height) IMAC Sepharose 6 Fast Flow columns (GE Healthcare, Piscataway, N.J.).

IMAC resin (GE Healthcare, IMAC Sepharose 6 Fast Flow) was charged with nickel using 0.2 NiSO4 and washed with 500 mM NaCl, 200 mM imidazole, pH 7.5 followed by equilibration in 30 mM imidazole, 50 mM sodium phosphate, 0.5 M NaCl, pH 7.5. 50 mL of each protein load solution was applied onto a 5 mL IMAC column, and washed with additional equilibration solution to remove unbound impurities. The protein of interest was then eluted with 15 mL of IMAC Elution Solution, 0.25 M imidazole, 0.5 M NaCl, pH 7.5. All column blocks were performed at a linear flow rate of 150 cm/hr. Each IMAC elution fraction was buffer exchanged by dialysis into a neutral pH formulation solution. The purified proteins were analyzed for concentration and purity by capillary electrophoresis and/or SDS-PAGE. Concentration was also tested by Bradford and A280 measurement, as described herein. Table E9A lists nutritive polypeptides that were purified by IMAC at 5 mL scale.

TABLE E9A Nutritive polypeptides that were purified by IMAC at 5 mL
SEQID  Mass (mg)  Purity
00533  3          22%
00522  25.5       36%
00085  34         51%
00103  4.5        56%
00359  40.5       56%
00346  30.7       56%
00510  112        61%
00622  70         70%
00522  47         72%
00546  235.6      75%
00353  5.6        76%
00601  83.8       77%
00418  14         80%
00502  93.2       84%
00100  68         87%
00606  77.8       87%
00104  93         89%
00076  92         91%
00341  176.6      91%
00598  60.3       91%
00647  73.7       93%
00105  3.8        93%
00343  35.3       95%
00103  112        95%
00511  179        95%
00354  85.8       96%
00587  93         96%
00610  90.5       97%
00485  269        98%
00356  76.9       98%
00352  134.9      99%
00345  196.2      100%
00338  123.2      100%
00298  0.6        100%
00357  104.8      100%
00605  202        100%
00559  241.8      100%
00338  268        100%
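The column dimensions and linear flow rates quoted in this example can be related to bed volume and volumetric flow rate by treating the packed bed as a cylinder. The following sketch is illustrative only; the function names are hypothetical, and 1 cm³ is taken as 1 mL.

```python
import math


def bed_volume_ml(diameter_cm, height_cm):
    """Volume of a cylindrical packed bed in mL (1 cm^3 = 1 mL)."""
    return math.pi * (diameter_cm / 2) ** 2 * height_cm


def volumetric_flow_ml_per_min(linear_cm_per_hr, diameter_cm):
    """Convert a linear flow rate (cm/hr) to a volumetric flow rate (mL/min)."""
    cross_section_cm2 = math.pi * (diameter_cm / 2) ** 2
    return linear_cm_per_hr * cross_section_cm2 / 60.0


# Column dimensions (diameter, height in cm) as stated for each IMAC scale.
for label, diameter, height in [("5 mL", 1.6, 2.5), ("1 L", 9.0, 13.8), ("10 L", 20.0, 27.1)]:
    volume = bed_volume_ml(diameter, height)
    flow = volumetric_flow_ml_per_min(150, diameter)
    print(f"{label}: bed volume {volume:.1f} mL, 150 cm/hr = {flow:.1f} mL/min")
```

With the stated dimensions, the bed volumes come to approximately 5.0 mL, 0.88 L, and 8.5 L, and a linear flow rate of 150 cm/hr corresponds to approximately 5.0, 159, and 785 mL/min, respectively. Holding the linear flow rate constant across scales keeps the residence time per unit bed height the same.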

Purification by 1 L Immobilized Metal Affinity Chromatography (IMAC).

E. coli was grown in 20 L fermentation with targeted expression of individual nutritive polypeptides with HIS8 tags (SEQ ID NO: 3919), as described herein. Cells were harvested from the fermenter and centrifuged using a Sharples AS-16P centrifuge to collect wet cell mass. Cells were subsequently resuspended in 30 mM imidazole, 50 mM sodium phosphate, 0.5 M NaCl, pH 7.5 at a wet cell weight (WCW) concentration of 20% w/v. The cell suspension was then lysed using four passes through a Niro Soavi Homogenizer (Niro Soavi, Parma, Italy) at an operational pressure of 12,500-15,000 psi and a flow rate of 1 L/min. The lysate was clarified using a Beckman J2-HC bucket centrifuge (Beckman-Coulter, Brea, Calif.) at 13,700×g for 1 hour. Cellular debris was discarded, and the supernatant was filtered through a Sartopore II XLG 0.8/0.2 um filter (Sartorius Stedim, Bohemia, N.Y.) at 30 L/m2/hr. Filtered lysate was purified by IMAC using IMAC Sepharose 6 Fast Flow resin packed in a 0.9 L column (9 cm diameter×13.8 cm height).

IMAC resin was equilibrated, as described herein, at a linear flow rate of 300 cm/hr. Once equilibrated, the entirety of the filtered lysate was passed over the column at a linear flow rate of 150 cm/hr. Load volumes ranged from six to ten column volumes. After the load, unbound material was washed off of the column, and the target protein was eluted. Elution pools were shipped at room temperature, 4° C. or frozen. This decision was dependent on the stability of the nutritive polypeptide in Elution Solution. Table E9B summarizes a number of nutritive polypeptides that have been purified by IMAC at the 1 L column scale.

TABLE E9B Nutritive polypeptides that have been purified by IMAC at 1 L scale
SEQID  IMAC Elution Mass  IMAC Elution Purity
00240  9.00 grams         98%
00338  43.5 grams         100%
00341  54.3 grams         100%
00352  19.8 grams         100%
00559  19.5 grams         89%
00587  8.6 grams          69%

Nutritive polypeptides were filtered through a Sartopore II XLG 0.8/0.2 um filter and loaded directly into an ultrafiltration/diafiltration (UF/DF) unit operation. Membrane area and nominal molecular weight cutoff were chosen as appropriate for each nutritive polypeptide. Nutritive polypeptides were ultrafiltered at a cross flow of 12 L/m2/min and a TMP target of 25 psi. Nutritive polypeptides were concentrated approximately ten-fold on Hydrosart ultrafiltration cassettes (Sartorius Stedim, Bohemia, N.Y.), and diafiltered seven diavolumes into a formulation buffer that is specific to the nutritive polypeptide. Ultrafiltration permeate was discarded. The diafiltered, concentrated retentate was collected, filtered through a 0.22 um membrane filter and frozen at −80° C.

In some cases, frozen protein concentrates were lyophilized using a Labconco lyophilizer (Labconco, Kansas City, Mo.). Residual water content of the cake was analyzed using the Karl Fischer method.

Purification by 10 L Immobilized Metal Affinity Chromatography (IMAC).

E. coli was grown in 250 L fermentation with targeted expression of individual nutritive polypeptides with HIS8 tags (SEQ ID NO: 3919), as described herein. Cells were harvested from the 250 L fermenter and centrifuged using a Sharples AS-16P centrifuge to collect wet cell mass. Cells were subsequently resuspended in 30 mM imidazole, 50 mM sodium phosphate, 0.5 M NaCl, pH 7.5 at a WCW concentration of 20% w/v. The cell suspension was then lysed using four passes through a Niro Soavi Homogenizer (Niro Soavi, Parma, Italy) at an operational pressure of 12,500-15,000 psi and a flow rate of 1 L/min. Clarified lysate was generated using four passes through a Sharples AS-16P centrifuge at 15,000 rpm operated at 0.5 L/min. Cellular debris was discarded, and the supernatant was filtered through a series of filters. Clarified lysate was passed sequentially through a SartoPure GF+ 0.65 um, a SartoGuard PES 1.2/0.2 um and a Sartopore II XLG 0.8/0.2 um filter (Sartorius Stedim, Bohemia, N.Y.). Filtered lysate was purified by IMAC using IMAC Sepharose 6 Fast Flow resin packed in an 8.5 L column (20 cm diameter×27.1 cm height).

IMAC resin was equilibrated as described at a linear flow rate of 150 cm/hr. Once equilibrated, the filtered lysate was passed over the column at a linear flow rate of 150 cm/hr. Load volumes ranged from 3.8 to 5.0 CV. After the load, unbound material was washed off of the column with additional equilibration solution. Nutritive polypeptides manufactured at the 10 L IMAC scale were subject to an additional set of washes with 2 CV of 10 mM sodium phosphate dibasic, 300 mM NaCl; 3 CV of 0.5% w/v sodium deoxycholate, 50 mM sodium phosphate dibasic, 300 mM NaCl; and 5 CV of 10 mM sodium phosphate dibasic, 300 mM NaCl. Following the washes, the target polypeptide was eluted as described. Elution pools were stored at room temperature.

Multiple nutritive polypeptides were purified by IMAC chromatography at the 10 L column scale. Table E9C summarizes the purification of SEQID-00105 and SEQID-00338. FIG. 1 provides an exemplary SDS-PAGE analysis of the purification of SEQID-00105.

TABLE E9C Nutritive polypeptides were purified by IMAC chromatography at the 10 L column scale
SEQID  Mass   Purity
00105  179 g  98%
00105  265 g  98%
00105  131 g  91%
00105  147 g  92%
00105  164 g  94%
00105  148 g  95%
00105  229 g  100%
00105  228 g  100%
00338  137 g  92%
00338  196 g  100%
00338  169 g  100%

After IMAC purification at the 10 L column scale, nutritive polypeptides were filtered through a Sartopore II XLG 0.8/0.2 um filter and loaded directly into an ultrafiltration/diafiltration (UF/DF) unit operation. Membrane area and nominal molecular weight cutoff were chosen as appropriate for each nutritive polypeptide. Nutritive polypeptides were ultrafiltered at a cross flow of 12 L/m2/min and a TMP target of 25 psi. Nutritive polypeptides were concentrated approximately ten-fold on Hydrosart ultrafiltration cassettes (Sartorius Stedim, Bohemia, N.Y.), and diafiltered sequentially into four diavolumes of 10% phosphate buffered saline (PBS), pH 8.7; followed by two diavolumes of 25 mM tetrasodium ethylenediaminetetraacetic acid (Na4EDTA); followed by seven diavolumes of 10% PBS, pH 8.7. Intermediate diafiltration into Na4EDTA was performed in order to chelate any leached nickel(II) from the IMAC resin. Ultrafiltration permeate was discarded; the diafiltered, concentrated retentate was filtered through a 0.2 um membrane filter, and frozen at −80° C.
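The effect of the number of diavolumes on buffer exchange can be estimated with the standard washout relationship for constant-volume diafiltration, in which the fraction of a solute remaining in the retentate after N diavolumes is exp(−S·N), where S is the sieving coefficient of the solute. The sketch below is an idealized estimate, not a description of the measured process: it assumes constant retentate volume, ideal mixing, and a freely permeating solute (S = 1), none of which is stated in the source, and the function name is hypothetical.

```python
import math


def residual_fraction(diavolumes, sieving=1.0):
    """Fraction of a solute left in the retentate after constant-volume
    diafiltration; sieving=1.0 assumes the solute passes the membrane freely."""
    return math.exp(-sieving * diavolumes)


# Diafiltration steps from the text: 4 diavolumes of 10% PBS,
# 2 diavolumes of Na4EDTA, then 7 diavolumes of 10% PBS.
for step, n in [("first PBS step", 4), ("Na4EDTA step", 2), ("final PBS step", 7)]:
    print(f"{step}: {n} diavolumes leave {residual_fraction(n):.4%} of the prior buffer solutes")
```

Under these assumptions, four diavolumes leave roughly 2% of the IMAC elution buffer solutes, two diavolumes bring the retentate to roughly 86% of the Na4EDTA feed concentration, and the final seven diavolumes reduce freely permeating species such as unbound EDTA to roughly 0.1% of their prior level.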

The ultrafiltration pool was filtered with a sterilizing-grade filter with the goal of bioburden reduction. The nutritive polypeptide was filtered into glass trays that were rinsed with ethanol. Filled glass trays were subsequently frozen at −80° C. The frozen material was then lyophilized to a dry cake using a Labconco lyophilization unit (Labconco, Kansas City, Mo.). The mass of the protein in the tray was monitored with time, until it plateaued, which was considered to be complete drying. The dried protein cake was sealed by the lid of the tray, and over-packaged by vacuum sealing in a plastic bag. The entire package was stored at −80° C.

Ion Exchange Chromatography.

Selecting an appropriate method of purifying a nutritive polypeptide has implications for the speed of process development, cost of manufacture, final purity, and robustness of the purification. Nutritive polypeptides have been isolated by various chromatographic methods. The mode of chromatography selected for use depends on the physicochemical properties of the target nutritive polypeptide. Charged nutritive polypeptides bind to ion exchange chromatography resin through electrostatic interactions.

In the present application, we have defined two methods of screening a library of polypeptides to rank-order them for their ability to bind to ion exchange resins. One method is an in silico prediction based on calculation of protein net charge across a range of pH using the primary sequence of the polypeptides, as described herein. The second method is a multiplexed purification screen in vitro, as described herein. The two methods have successfully been used independently of each other, and they have been used together on the same set of 168 nutritive polypeptides with supportive data, as described herein.

The in silico method of predictive ranking for ion exchange purification is based on calculating net charge of a nutritive polypeptide at a range of pH based on the primary sequence. The primary sequence of a nutritive polypeptide is used to predict the mode of chromatography that is most likely to successfully isolate that nutritive polypeptide from host cell proteins and other impurities. Highly charged nutritive polypeptides are likely to bind tightly to ion exchange chromatography resin. The tightest binding is achieved for nutritive polypeptides which have one predominant charge, either positive or negative. It is possible for a nutritive polypeptide with a mixture of positive and negative charges to have tight binding to ion exchange resin, but it is also possible that those charges may work against each other. Similarly, a nutritive polypeptide with alternating positive and negative patches on its surface may not bind as tightly as one with a dominant portion of its surface that is one single charge. Similarly, a nutritive polypeptide that has a strongly positive or negative terminus, tail, tag, or linker sequence may effectively display that highly charged group allowing for extremely tight binding.

A prevalence of one or more certain amino acids, e.g., histidine, arginine, and lysine in a polypeptide imparts in that polypeptide, or a portion thereof, a positive charge when the pH of the protein solvent is below the pKa of the one or more amino acids. Polypeptide charge includes total protein charge, net charge, or the charge of a portion of the polypeptide. In embodiments wherein a polypeptide or portion thereof is positively charged, a cation exchange resin is used.

A prevalence of one or more certain amino acids, e.g., glutamic acid and aspartic acid in a polypeptide imparts in that polypeptide, or a portion thereof, a negative charge when the pH of the protein solvent is above the pKa of the one or more amino acids. Polypeptide charge includes total protein charge, net charge, or the charge of a portion of the polypeptide. In embodiments wherein a polypeptide or portion thereof is negatively charged, an anion exchange resin is used.

The net charge of a polypeptide changes as a function of the pH of the protein solvent. The number of positive charges and negative charges can be calculated at any pH based on the primary sequence of the polypeptide. The sum of the positive charges and negative charges at any one pH results in the calculated net charge. The isoelectric point (pI) of the polypeptide is the pH at which its calculated net charge is 0. To make comparisons, the net charge of a sequence is normalized by the number of amino acids in the sequence and the parameter “net charge per amino acid” results as the novel comparator between sequences, which is used to predict chromatographic performance.
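The calculation described above can be sketched as follows. This is an illustrative example only: the pKa values are approximate textbook numbers chosen for the sketch (published tables differ), the free amino and carboxyl termini are an added assumption, the fractional charge of each group is modeled with the Henderson-Hasselbalch relation, and the two test sequences are hypothetical rather than sequences from this disclosure.

```python
# Approximate pKa values (illustrative assumptions; published tables differ).
POSITIVE_PKA = {"H": 6.0, "K": 10.5, "R": 12.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.3}
N_TERM_PKA = 8.0  # assumed free amino terminus
C_TERM_PKA = 3.1  # assumed free carboxyl terminus


def net_charge(seq, ph):
    """Sum of fractional positive and negative charges at the given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for aa in seq:
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def net_charge_per_aa(seq, ph):
    """Net charge normalized by the number of amino acids in the sequence."""
    return net_charge(seq, ph) / len(seq)


def isoelectric_point(seq, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection for the pH at which net charge is 0.
    Net charge decreases monotonically as pH rises."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


acidic = "DDDDEEEE"  # hypothetical sequence
basic = "KKKKRRRR"   # hypothetical sequence
for name, seq in (("acidic", acidic), ("basic", basic)):
    profile = [round(net_charge_per_aa(seq, ph), 2) for ph in range(1, 15)]
    print(name, "pI ~", round(isoelectric_point(seq), 2), profile)
```

In this sketch the acidic sequence carries a negative net charge per amino acid over most of the pH scan and the basic sequence a positive one, which is the pattern the ranking looks for.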

Nutritive polypeptide sequences have been evaluated by calculating the net charge per amino acid of each polypeptide at every pH (1-14). Additionally, the pI of each polypeptide was calculated. Nutritive polypeptides were ranked by pI and by net charge per amino acid. Polypeptides with a low pI and very negative net charge per amino acid across a wide range of pH are predicted to bind to anion exchange chromatography resin with high affinity. Polypeptides with a high pI and very positive net charge per amino acid across a wide range of pH are predicted to bind to cation exchange chromatography resin with high affinity. In some embodiments herein, only a portion of the polypeptide is charged (as in a terminus, tail, tag, or linker). In such embodiments, it is recognized that the pI and net polypeptide charge may be variable, and other factors or empirical measurements may be useful to predict the binding affinity of such a polypeptide to chromatography resins.

FIG. 2 demonstrates example nutritive polypeptides, which, based on primary sequence, are predicted to bind to either anion or cation exchange resin. Nutritive polypeptides with a pI of <4.0, and a net charge per amino acid that is negative across a broad range of pH are predicted to bind anion exchange resin with high affinity ((1) SEQID-00105, (2) SEQID-00008, (3) SEQID-00009, (4) SEQID-00475). Nutritive polypeptides with a pI of >10.0, and a net charge per amino acid that is positive across a broad range of pH are predicted to bind cation exchange resin with high affinity ((5) SEQID-00472, (6) SEQID-00640, (7) SEQID-00019).

The primary sequence analyses presented herein indicate that SEQID-00105 and SEQID-00009 are likely to bind to anion exchange chromatography resin with high affinity, and that SEQID-00640 is likely to bind to cation exchange chromatography resin with high affinity. These predictions were tested and confirmed, as shown in the following four examples of polypeptide purification after microbial cell culture. In the first example, SEQID-00009 was purified directly from lysed E. coli cells to 99% purity using anion exchange chromatography. In the second example, SEQID-00105 was isolated from Bacillus subtilis supernatant by anion exchange chromatography. In the third example, SEQID-00105 expressed intracellularly in E. coli was refined to 100% purity using anion exchange chromatography after it had been initially purified by IMAC chromatography. In the fourth example, SEQID-00640 was isolated from Bacillus subtilis supernatant by cation exchange chromatography.

SEQID-00009 was expressed intracellularly in E. coli, as described herein. The cells were suspended in solution and ruptured. Three solutions were tested (0.1 M Na2CO3 pH 11.4, 0.1 M tris HCl pH 4.1, and 0.1 M potassium phosphate pH 7.0). These lysed solutions were clarified by centrifugation and mixed with anion exchange resin for binding. Two resins were tested (Fractogel® EMD TMAE Hicap (M) from EMD and POROS® D 50 μm from Life Technologies). These six binding conditions were performed in batch mode and the resins were washed with the appropriate lysis buffer to remove any unbound protein. The maximally-bound damp resin was then transferred to smaller drip columns. Each drip column was then eluted with up to six sequential washes of increasing NaCl concentration (each NaCl wash solution was buffered with the appropriate lysis buffer). SEQID-00009 was eluted in these fractions, collected, and analyzed by chip electrophoresis, as described herein. SEQID-00009 was identified as an eluting band at the expected molecular weight. In every case, SEQID-00009 eluted from the drip column at a purity higher than the load purity. This observation indicates that SEQID-00009 did bind to the anion exchange resins, as predicted, and that purification was achieved. The maximum purity achieved was 99%. In every case, SEQID-00009 was among the last proteins to elute from the resin indicating that the binding affinity of SEQID-00009 to two resins at a range of pH is generally higher than the binding affinity of any host cell protein from E. coli.

Microbial cell culture of Bacillus subtilis was performed as described herein expressing and secreting SEQID-00105 into the fermentation media. The cells were removed by centrifugation, and the supernatant was further clarified by membrane filtration. The clarified supernatant was concentrated by ultrafiltration to decrease load time on the anion exchange column and the solution was exchanged into a low salt solution buffered at pH 6.0. This solution was passed over a chromatography column (1 cm diameter, 20 cm height) containing POROS® XQ Strong Anion Exchange Resin from Life Technologies. The unbound proteins were rinsed from the resin with 20 mM Bistris, pH 6.3. The bound proteins were then eluted using a 30 column volume gradient to 400 mM NaCl, 20 mM Bistris, pH 6.3. The column effluent was collected in sequential fractions and analyzed by chip electrophoresis, as described herein. SEQID-00105 was identified as an eluting band at the expected molecular weight. The maximum purity achieved in one fraction was 100%.

SEQID-00105 was expressed intracellularly in E. coli, as described herein. The cells were ruptured, and the SEQID-00105 was purified by IMAC chromatography, according to the procedure described herein. The IMAC elution pool was further refined to 100% purity using anion exchange chromatography. The IMAC elution pool was concentrated and then diluted into 50 mM tris pH 8.0. This solution was then passed through a column (1.6 cm diameter, 20 cm height) packed with anion exchange resin: Fractogel® EMD TMAE Hicap (M) from EMD. The bound protein was rinsed with equilibration solution, and then eluted with 350 mM NaCl 50 mM tris pH 8.0. This procedure was repeated multiple times, and all samples were analyzed by chip electrophoresis, as described herein. The elution samples ranged from 79% to 99% pure.

Microbial cell culture of Bacillus subtilis was performed as described herein expressing and secreting SEQID-00640 into the fermentation media. The cells were removed by centrifugation, and the supernatant was further clarified by membrane filtration. The clarified supernatant was diluted 1:2 with deionized water and titrated to pH 5 with 1 M acetic acid. The resulting solution was membrane filtered prior to loading onto a cation exchange column (1.2 cm diameter, 10 cm height) packed with POROS® XS Strong Cation Exchange Resin from Life Technologies. The bound resin was flushed with a 50 mM Acetate, 50 mM NaCl, pH 5.0 solution. The protein was then eluted with a 20 CV gradient to 1.05 M NaCl, pH 5.0. Elution fractions were collected and analyzed by SDS-PAGE with Coomassie Blue stain. The peak sample demonstrated 100% purity and no impurities eluted later in the gradient, indicating that the SEQID-00640 polypeptide bound to the cation exchange resin with higher affinity than any host cell protein.
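The column dimensions and gradient lengths given in the examples above translate into working volumes by simple geometry. The following is a minimal sketch, assuming a cylindrical packed bed and a linear gradient. The gradient starting concentrations (0 mM NaCl for the anion exchange example and 50 mM NaCl for the cation exchange example) are assumptions inferred from the rinse solutions, and the function names are hypothetical.

```python
import math


def column_volume_ml(diameter_cm, height_cm):
    """Packed-bed volume of a cylindrical column (1 cm^3 = 1 mL)."""
    return math.pi * (diameter_cm / 2.0) ** 2 * height_cm


def gradient_volume_ml(diameter_cm, height_cm, column_volumes):
    """Total liquid volume of a gradient expressed in column volumes (CV)."""
    return column_volumes * column_volume_ml(diameter_cm, height_cm)


def linear_gradient_mm(cv_elapsed, gradient_cv, start_mm, end_mm):
    """NaCl concentration (mM) at a given point along a linear gradient."""
    if not 0 <= cv_elapsed <= gradient_cv:
        raise ValueError("cv_elapsed must lie within the gradient")
    return start_mm + (end_mm - start_mm) * cv_elapsed / gradient_cv


# Columns described in the text: (label, diameter cm, height cm).
columns = [
    ("POROS XQ anion exchange", 1.0, 20.0),
    ("Fractogel EMD TMAE anion exchange", 1.6, 20.0),
    ("POROS XS cation exchange", 1.2, 10.0),
]
for label, d, h in columns:
    print(f"{label}: {column_volume_ml(d, h):.1f} mL per CV")

# Gradient volumes for the 30 CV and 20 CV gradients in the text.
print(f"30 CV on 1.0 x 20 cm: {gradient_volume_ml(1.0, 20.0, 30):.0f} mL")
print(f"20 CV on 1.2 x 10 cm: {gradient_volume_ml(1.2, 10.0, 20):.0f} mL")
```

A helper such as `linear_gradient_mm` can then be used to estimate the salt concentration at which a given fraction eluted, subject to the assumed starting concentration and to system delay volume, which this sketch does not model.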

Purification by Precipitation.

Protein precipitation is a well-known method for purification of polypeptides (Scopes R. 1987. Protein Purification: Principles and Practice. New York: Springer). Many polypeptides precipitate as salt concentrations increase, a phenomenon known as salting out. Salt types have been ranked and organized on the Hofmeister series for their different abilities to salt out proteins (F. Hofmeister Arch. Exp. Pathol. Pharmacol. 24, (1888) 247-260.). Proteins also differ in their propensity to precipitate at high salt concentration based on their physicochemical properties; however, a universal metric to rank proteins for this characteristic has not been established. The use of such a ranking metric to select nutritive polypeptides for their ability to be purified has implications for the speed of process development, cost of manufacture, final purity, and robustness of the purification.

In most industrial applications of purification by polypeptide precipitation, the polypeptide of interest is selectively precipitated, and the impurities are then rinsed away from the solid precipitate. In certain embodiments, polypeptides do not precipitate with high levels of salt, and purification is achieved by precipitating the impurities. In the present application, we have defined two methods of screening a library of polypeptides to rank-order them for their ability to remain soluble through harsh precipitation conditions. One method is an in silico prediction based on calculation of protein total charge across a range of pH using the primary sequence of the polypeptides, as described herein. The second method is a multiplexed purification screen in vitro, as described herein. The two methods have successfully been used independently of each other, and they have been used together on the same set of 168 nutritive polypeptides with supportive data, as described herein.

The solubility of a polypeptide correlates directly with the abundance of surface charges (Jim Kling, Highly Concentrated Protein Formulations: Finding Solutions for the Next Generation of Parenteral Biologics, BioProcess International, 2014.). It has been established that surface charges can impart physical characteristics to a polypeptide (Lawrence, M. S., Phillips, K. J., & Liu, D. R. (2007). Supercharging proteins can impart unusual resilience. Journal of the American Chemical Society, 129(33), 10110-2. doi:10.1021/ja071641y).

The in silico method of predictive solubility ranking is based on calculating total number of charges of a nutritive polypeptide at a range of pH based on the primary sequence.

A prevalence of one or more certain amino acids, e.g., histidine, arginine, and lysine in a polypeptide imparts in that polypeptide, or a portion thereof, a positive charge when the pH of the protein solvent is below the pKa of the one or more amino acids. A prevalence of one or more certain amino acids, e.g., glutamic acid and aspartic acid in a polypeptide imparts in that polypeptide, or a portion thereof, a negative charge when the pH of the protein solvent is above the pKa of the one or more amino acids.

The total number of charges of a polypeptide changes as a function of the pH of the protein solvent. The number of positive charges and negative charges can be calculated at any pH based on the primary sequence of the polypeptide, as described herein. The sum of the positive charges and negative charges at any one pH results in the calculated net charge. The isoelectric point (pI) of the polypeptide is the pH at which its calculated net charge is 0. To make comparisons across nutritive polypeptides, the total number of positive charges is added to the total number of negative charges (the absolute value), and that total charge is normalized by the number of amino acids in the sequence, and the parameter “total charge per amino acid” results as the novel comparator between sequences, which is used to predict the polypeptide's resistance to precipitation. The more resistant a polypeptide is, the higher the likelihood that it can be purified to a high degree by precipitating out the impurities. Unlike predicting chromatographic performance, solubility is not affected by the polarity of the charges. While it is often true that a polypeptide experiences its lowest solubility at the pI of the sequence, some polypeptides have a high total charge and are still extremely soluble at their pI, as shown herein.
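The distinction between net charge and total charge can be illustrated with a short sketch. This is an example under stated assumptions, not the procedure used to generate the data herein: the pKa values are approximate textbook numbers, free termini are assumed, fractional charges follow the Henderson-Hasselbalch relation, the ranking score (mean total charge per amino acid over pH 1-14) is one possible choice, and the toy library is hypothetical.

```python
# Approximate pKa values (illustrative assumptions; published tables differ).
POSITIVE_PKA = {"H": 6.0, "K": 10.5, "R": 12.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.3}
N_TERM_PKA, C_TERM_PKA = 8.0, 3.1  # assumed free termini


def charge_counts(seq, ph):
    """Return (positive, negative) fractional charge counts, both as magnitudes."""
    pos = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    neg = 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for aa in seq:
        if aa in POSITIVE_PKA:
            pos += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            neg += 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return pos, neg


def total_charge_per_aa(seq, ph):
    """Positive plus negative charges (absolute values), per amino acid."""
    pos, neg = charge_counts(seq, ph)
    return (pos + neg) / len(seq)


def net_charge_per_aa(seq, ph):
    """Positive minus negative charges, per amino acid."""
    pos, neg = charge_counts(seq, ph)
    return (pos - neg) / len(seq)


def rank_by_total_charge(library, ph_values=range(1, 15)):
    """Rank names by mean total charge per amino acid over the pH scan, highest first."""
    def score(seq):
        return sum(total_charge_per_aa(seq, ph) for ph in ph_values) / len(ph_values)
    return sorted(library, key=lambda name: score(library[name]), reverse=True)


# Hypothetical toy sequences, not sequences from this disclosure.
library = {"mixed": "KEKEKEKEKE", "acidic": "DEDEDGGGGG", "neutral": "GAGAGAGAGA"}
ranking = rank_by_total_charge(library)
print(ranking)
print(round(net_charge_per_aa(library["mixed"], 7.0), 3),
      round(total_charge_per_aa(library["mixed"], 7.0), 3))
```

The "mixed" sequence shows why the two parameters differ: near neutral pH its net charge per amino acid is close to 0, yet its total charge per amino acid is high.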

Nutritive polypeptide sequences have been evaluated by calculating the total charge per amino acid of each polypeptide at a range of pH (1-14). Nutritive polypeptides were ranked by total charge per amino acid. Polypeptides with a low pI and very negative net charge per amino acid across a wide range of pH and polypeptides with a high pI and very positive net charge per amino acid across a wide range of pH are all expected to score equally well by this ranking. Polypeptides with a large number of both charges score the best.

FIG. 3 demonstrates example nutritive polypeptides, which, based on primary sequence, are predicted to have extremely high solubility. This set includes polypeptides with a low pI (<4) and very negative net charge per amino acid across a wide range of pH ((1) SEQID-00475, (2) SEQID-00009). This set includes polypeptides with a high pI (>10) and very positive net charge per amino acid across a wide range of pH ((4) SEQID-00433, (5) SEQID-00472). This set includes polypeptides with more neutral pI ((3) SEQID-00478). Many of the polypeptides displayed here show high charge even at extreme pH values, such as <4 and >12. The entire set is expected to be extremely soluble and resist precipitation across a wide range of pH.

In this demonstration, the E. coli cells were harvested from shake flask fermentation by centrifugation, and the whole cells were distributed into tubes (1 gram of cells per tube). To each tube, 4 mL of lysis solution was added, and cells were lysed by sonication at 75 Amps for 30 seconds. Lysis solutions included: Water, 8 M Urea 0.1 M Tris 0.1M NaCl, 0.1 M Acetate 10% gly 0.1% Tween-80 0.3 M Arg 0.3M NaCl 10 mM EDTA, 10 mM Imidazole pH 5.0, 0.1 M Acetate 10% gly 0.1% Tween-80 0.3 M Arg 0.3M NaCl 10 mM EDTA, 10 mM Imidazole pH 7.0, 100 NaCl 100 Hepes 10 Imidazole pH 7.5, 500 NaCl 100 Hepes 10 Imidazole pH 7.5, 100 mM Phos, 150 mM NaCl, 10 mM Imidazole pH 7.5, 0.1 M NaCl 0.1 M Hepes 10 mM Imidazole 50 mM CaCl2 pH 7.5, 0.1 M Hepes 3.5 M Am Sulfate pH 7.5, 0.1 M Hepes 2 M Am. Sulfate pH 7.5, 0.1 M Tris, 0.1 M Tris 0.5 M NaCl, 150 mM NaCl 10 mM Acetate 15 mM Imidazole pH 6.04, and 500 mM NaCl 100 mM Acetate 15 mM Imidazole pH 6.04. The lysate was clarified by centrifugation and 0.2 um filtration. The clarified supernatant was analyzed by SDS-PAGE (blue stain), as described herein. SEQID-00009 demonstrated solubility in each of these conditions. E. coli host cell proteins generally demonstrated solubility in these conditions as well, with one exception. The presence of 3.5 M ammonium sulfate effectively precipitated the majority of the host cell proteins, resulting in SEQID-00009 at 85% purity after the cell harvest stage of the process. This result indicates that SEQID-00009 is more soluble than most E. coli host cell proteins and that precipitation can be used as part of a low cost method for isolation. This correlates with the high total charge of SEQID-00009 and supports the accuracy of the prediction. Furthermore, it is predicted that polypeptides with more charge than SEQID-00009 would be even more soluble, which could offer the benefit of higher polypeptide yield relative to SEQID-00009.

In subsequent experiments, SEQID-00009 was purified to 99% purity with a single stage of ammonium sulfate precipitation. In this demonstration, the E. coli cells were harvested from shake flask fermentation by centrifugation, and the whole cells were suspended in 0.1 M sodium carbonate, pH 10 (1 gram of cells in 4 mL of solution). The cells were lysed by sonication (80 Amp for 2 minutes). The lysate was clarified by centrifugation and 0.2 μm membrane filtration. The clarified supernatant was divided into a series of 3 mL fractions, to which a stock solution of 4 M ammonium sulfate, pH 9.8 was added. Variable amounts of stock solution were added to achieve a range of ammonium sulfate concentrations. The samples were mixed for 10 minutes at room temperature, and clarified by centrifugation and 0.2 um membrane filtration. The clarified supernatants were analyzed by SDS-PAGE (blue stain), as described herein. FIG. 4 shows the purity of SEQID-00009 as a function of ammonium sulfate concentration.
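The volume of 4 M stock needed to bring a 3 mL fraction to a given ammonium sulfate concentration follows from a mass balance. The sketch below assumes that the volumes of sample and stock are additive and that the lysate contains no ammonium sulfate initially; both are simplifying assumptions. The list of target concentrations is hypothetical, because the specific concentrations tested are not listed here.

```python
def stock_volume_ml(sample_ml, stock_molar, target_molar):
    """Volume of stock to add so the mixture reaches target_molar.

    Mass balance with additive volumes:
        stock_molar * v = target_molar * (sample_ml + v)
    """
    if not 0 <= target_molar < stock_molar:
        raise ValueError("target must be at least 0 and below the stock concentration")
    return sample_ml * target_molar / (stock_molar - target_molar)


SAMPLE_ML = 3.0    # fraction volume from the text
STOCK_MOLAR = 4.0  # ammonium sulfate stock from the text

# Hypothetical target concentrations (M) for illustration only.
for target in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    v = stock_volume_ml(SAMPLE_ML, STOCK_MOLAR, target)
    dilution = (SAMPLE_ML + v) / SAMPLE_ML
    print(f"{target:.1f} M: add {v:.2f} mL stock (protein diluted {dilution:.2f}-fold)")
```

The dilution factor is reported alongside the stock volume because higher target concentrations dilute the polypeptide more, which matters when comparing band intensities across samples.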

Multiplexed Purification: Ion Exchange Chromatography.

In some cases, an entire library of proteins is tested in a multiplexed screening experimental platform. 168 nutritive polypeptides were transfected and expressed in a multiplexed expression system in which a single growth condition was used to produce each polypeptide in a single container. This multiplexed expression system allows any set of polypeptide sequences to be tested in parallel for a wide range of manufacturability parameters, each of which can be used to rank order the set of polypeptides being examined. A set of manufacturability parameters includes expression level, polypeptide solubility, ability of polypeptide to be purified by chromatography, ability to resist thermal denaturation, ability of polypeptide to digest, ability of polypeptide to be purified by resisting harsh treatments.

The set of 168 nutritive polypeptides was tested for intracellular expression in E. coli. The solubly expressed polypeptides were pre-treated as a group and then subjected to a series of purification conditions so that the set could be rank ordered in terms of their ease of purification by multiple methodologies. As described herein, it is expected that the same subset of proteins will be identified from each expression system to bind a particular mode of chromatography based on the primary sequence analysis, specifically net charge per amino acid.

For E. coli multiplexed purification, the set of nutritive polypeptide sequences was HIS8 tagged (SEQ ID NO: 3919). The cells were cultured, as described herein, ruptured, and the solution was clarified, as described herein. This production resulted in a solution containing all of the polypeptides from the set which were both expressed and soluble. That set of soluble polypeptides was passed over a 5 mL IMAC column, and eluted, as described herein. This IMAC purification effectively isolated the solubly expressed nutritive polypeptides as a set by removing the majority of E. coli host cell proteins. The elution fraction was concentrated and buffer exchanged into a low salt solution buffered near neutral pH before testing various purification methods. The methods tested include anion exchange chromatography, cation exchange chromatography, and negative precipitation, in which the impurities precipitate and the polypeptides that remain soluble rank the highest. In this case, impurities have been removed, so the polypeptides are rank ordered amongst themselves. Additionally, the set of proteins was tested for thermal stability by heating, wherein the polypeptides which remain soluble after heating are more thermally stable than those which precipitate.

This mixture of polypeptides was rank ordered for their ability to bind anion exchange and cation exchange chromatography resins. Four chromatography resins were tested. Two anion exchange resins: Capto DEAE, from GE Lifesciences and Eshmuno® Q Resin from EMD. Two cation exchange resins: POROS® XS Strong Cation Exchange Resin from Life Technologies and Eshmuno® S Resin from EMD. Each resin was tested with eight different buffering conditions, as follows. Buffers used for anion exchange: Water (no buffer), pH 7; 15 mM Na2HPO4, pH 8.7; 30 mM Na2HPO4, pH 9.0; 15 mM Tris Base, pH 9.6; 30 mM Tris Base, pH 10.0; 30 mM Na2CO3, pH 11.2; 25 mM Arginine, pH 10.1. Buffers used for cation exchange: Water (no buffer), pH 7; 15 mM KH2PO4, pH 4.2; 30 mM KH2PO4, pH 4.5; 15 mM Tris Acid, pH 4.9; 30 mM Tris Acid, pH 4.7; 15 mM MES Acid, pH 3.9; 25 mM MES Acid, pH 4.1.

The resins were distributed to a 96 well filterplate (20 uL of resin per well) and each was equilibrated three times. The protein set was mixed with the equilibration buffers and allowed to bind to the resins. The unbound proteins in solution were separated from the resin by centrifuging the liquid through the filterplate for collection in a 96 well plate below. The remaining unbound proteins were further rinsed off the resin with a wash of equilibration buffer. The bound proteins were then sequentially eluted with three stages of increasing salt concentration (50, 250, 1500 mM NaCl buffered in the appropriate set of buffers above). The loosely bound proteins were removed first, and the proteins that were removed in the final elution condition were very tightly bound to the resin. Thus, a library of 168 proteins was expressed in E. coli and rank ordered for their binding affinity to anion exchange and cation exchange chromatography resin.

The experiment described herein produced 160 samples (four resins, eight buffers, five collections). The five collection stages include: the flow through fraction, the wash fraction, the 50 mM NaCl elution, the 250 mM NaCl elution, and the 1500 mM NaCl elution. All 160 samples were analyzed by UV-vis absorbance at 280 nm, by Bradford total protein assay, by Chip electrophoresis, and select samples were analyzed by LC/MS/MS. All analytical assays are described herein.

The assays demonstrated that some protein flowed through the resin and was not bound. In most cases, the wash fraction did not elute a significant amount of protein, indicating that any further protein to elute was in fact bound to the resin. As the NaCl concentration increased, the total protein being removed also increased, demonstrating successful binding and elution in nearly every condition. The proteins detected by electrophoresis in 1500 mM NaCl elution conditions remained bound through the 250 mM NaCl wash condition, indicating strong binding. Select conditions were chosen for LC/MS/MS analysis. The LC/MS/MS analysis was performed with the 1500 mM NaCl elution sample from anion exchange chromatography resin (Capto DEAE) in the 30 mM Tris Base buffering condition. The LC/MS/MS results were searched against the sequences of all 168 nutritive polypeptides originally expressed in the library. The eight unique polypeptides identified as having high binding affinity to this anion exchange resin in this condition are SEQID-00341, SEQID-00346, SEQID-00497, SEQID-00525, SEQID-00555, SEQID-00605, SEQID-00606, and SEQID-00610. For each polypeptide sequence in this set, the net charge per amino acid was calculated based on primary sequence across the range of pH tested. As described herein, it is expected that polypeptides which bind tightly to anion exchange resins have a net charge per amino acid below 0 across the pH range, and this is demonstrated to be true, with a single exception. Any exception is expected to be due to charge heterogeneity across the length of the sequence, which, as described, net charge per amino acid does not always capture. This multiplexed screen identified a set of polypeptides from a larger library based on their affinity to bind anion exchange resin, and this result can be predicted based on the primary sequence analysis, as described herein.

The LC/MS/MS analysis was performed with the 1500 mM NaCl elution sample from cation exchange chromatography resin (Poros XS) in the 15 mM Tris Acid buffering condition. The LC/MS/MS results were searched against the sequences of all 168 nutritive polypeptides originally expressed in the library. The eight unique polypeptides identified as having high binding affinity to this cation exchange resin in this condition are SEQID-00302, SEQID-495, SEQID-00522, SEQID-00537, SEQID-00546, SEQID-00547, SEQID-00560, and SEQID-00598. For each polypeptide sequence in this set, the net charge per amino acid was calculated based on primary sequence across the range of pH tested. As described herein, it is expected that polypeptides which bind tightly to cation exchange resins have a net charge per amino acid above 0 across the pH range, and this is demonstrated to be true, with minor exception. Any exception is expected to be due to charge heterogeneity across the length of the sequence, which, as described, net charge per amino acid does not always capture. This multiplexed screen identified a set of polypeptides from a larger library based on their affinity to bind cation exchange resin, and this result can be predicted based on the primary sequence analysis, as described herein.

In the Bacillus subtilis example of multiplexed purification, the set of polypeptide sequences was expressed without any type of purification tag. The cells were cultured in flasks, as described herein and expressed and secreted the polypeptides into the growth media. The cells were removed by centrifugation and the solution was further clarified by membrane filtration, as described herein. This production process resulted in a solution containing all of the polypeptides from the set which were both expressed and solubly secreted. That set of soluble polypeptides was concentrated and buffer exchanged into a solution of phosphate, pH 7.0 before testing various purification methods. The methods tested include anion exchange chromatography, cation exchange chromatography, and negative precipitation, in which the impurities precipitate and the polypeptides that remain soluble rank the highest. In these multiplexed purification studies, the polypeptides are purified away from each other and from the host cell proteins in order to be rank ordered amongst themselves.

This mixture of polypeptides was rank ordered for their ability to bind to anion exchange and cation exchange chromatography resins. Four chromatography resins were tested. Two anion exchange resins: Capto DEAE, from GE Lifesciences and Eshmuno® Q Resin from EMD. Two cation exchange resins: POROS® XS Strong Cation Exchange Resin from Life Technologies and Eshmuno® S Resin from EMD. Each resin was tested with eight different buffering conditions. Buffers used for Anion Exchange: 18 mM BIS-TRIS, pH 6.5; 13 mM HEPES, pH 7.0; 18 mM HEPES, pH 7.5; 16 mM TRIS, pH 8.0; 32 mM TRIS, pH 8.5; 88 mM TRIS, pH 9.0; 13 mM Na2CO3, pH 9.5; 20 mM Na2CO3, pH 10.0. Buffers used for Cation Exchange: 19 mM Citrate, pH 3.0; 13 mM Citrate, pH 3.5; 49 mM Acetate, pH 4.0; 22 mM Acetate, pH 4.5; 14 mM Acetate, pH 5.0; 10 mM Acetate, pH 5.5; 24 mM MES, pH 6.0; 15 mM MES, pH 6.5.

The resins were distributed to a 96 well filterplate (50 μL of resin per well) and each was equilibrated three times. The protein set was mixed with the equilibration buffers and allowed to bind to the resins. The unbound proteins in solution were separated from the resin by centrifuging the liquid through the filterplate for collection in a 96 well plate below. The remaining unbound proteins were further rinsed off the resin with two wash cycles of equilibration buffer. The bound proteins were then sequentially eluted with increasing salt concentration (250, 500, 1000, and 2000 mM NaCl). Each salt solution was buffered in the appropriate equilibration buffer except for the 2000 mM NaCl solution, which was buffered with MES at pH 6.0 for anion exchange resins and with TRIS at pH 8.0 for cation exchange resins. The loosely bound proteins were removed first, and the proteins that were removed in the final elution condition were very tightly bound to the resin. Thus, the library of proteins was expressed in Bacillus subtilis and rank ordered for their binding affinity to anion exchange and cation exchange chromatography resin.

The experiment described herein produced 192 samples (four resins, eight buffers, six collections). The six collection stages include: the flow through fraction, the wash fraction, the 250 mM NaCl elution, the 500 mM NaCl elution, the 1000 mM NaCl elution, and the 2000 mM NaCl elution. All 192 samples were analyzed by Chip electrophoresis. Select samples were analyzed by SDS-PAGE. Select samples were analyzed by LC/MS/MS. All analytical assays are described herein. Identification of strongly bound proteins is performed by a combination of Chip electrophoresis, SDS-PAGE, and LC/MS/MS.
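The sample count follows directly from the experimental design, and enumerating it can help when tracking fractions through the analytical assays. The following is a minimal sketch of such a sample manifest built from the resins, buffers, and collection stages listed above; the dictionary layout and the function name are hypothetical.

```python
from itertools import product

ANION_RESINS = ["Capto DEAE", "Eshmuno Q"]
CATION_RESINS = ["POROS XS", "Eshmuno S"]

ANION_BUFFERS = [
    "18 mM BIS-TRIS, pH 6.5", "13 mM HEPES, pH 7.0", "18 mM HEPES, pH 7.5",
    "16 mM TRIS, pH 8.0", "32 mM TRIS, pH 8.5", "88 mM TRIS, pH 9.0",
    "13 mM Na2CO3, pH 9.5", "20 mM Na2CO3, pH 10.0",
]
CATION_BUFFERS = [
    "19 mM Citrate, pH 3.0", "13 mM Citrate, pH 3.5", "49 mM Acetate, pH 4.0",
    "22 mM Acetate, pH 4.5", "14 mM Acetate, pH 5.0", "10 mM Acetate, pH 5.5",
    "24 mM MES, pH 6.0", "15 mM MES, pH 6.5",
]
COLLECTIONS = [
    "flow through", "wash", "250 mM NaCl elution", "500 mM NaCl elution",
    "1000 mM NaCl elution", "2000 mM NaCl elution",
]


def build_manifest():
    """Enumerate every resin x buffer x collection-stage sample."""
    manifest = []
    design = (("anion", ANION_RESINS, ANION_BUFFERS),
              ("cation", CATION_RESINS, CATION_BUFFERS))
    for mode, resins, buffers in design:
        for resin, buffer, fraction in product(resins, buffers, COLLECTIONS):
            manifest.append({"mode": mode, "resin": resin,
                             "buffer": buffer, "fraction": fraction})
    return manifest


manifest = build_manifest()
print(len(manifest), "samples")
tight_binders = [s for s in manifest if s["fraction"] == "2000 mM NaCl elution"]
print(len(tight_binders), "samples from the final elution stage")
```

Filtering the manifest by fraction, as in the last lines, is one way to select the subset of samples most relevant for identifying tightly bound polypeptides.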

SDS-PAGE results demonstrate that the majority of polypeptides do not bind to these resins; in fact, they are found in the flow through fraction. Therefore, the polypeptides that bind to the resins are unique in their ability to be purified from the majority of other polypeptides. The set of polypeptides in the various elution fractions has been isolated from a larger set based on their properties in a purification process, which has implications for the manufacture, cost, development time, and eventual purity of these polypeptides. Furthermore, the polypeptides that bind to resins can be rank ordered by their ability to remain bound to the resin through stringent wash conditions with increasing concentrations of NaCl. Those polypeptides found in the 2000 mM NaCl samples have been able to remain bound through 1000 mM NaCl wash conditions. It is widely accepted that any polypeptide which can remain bound to an ion exchange resin above 500 mM NaCl is considered to have very high affinity for that resin. The banding pattern is similar between the two cation exchange resins, supporting the proposed mechanism. Likewise, the banding pattern is similar between the two anion exchange resins, and represents a different sample set than those identified by cation exchange. To rank order the individual polypeptides identified in any subset, LC/MS/MS is utilized.

As an exemplary dataset, the LC/MS/MS results identified the following polypeptides as binding to the Capto DEAE anion exchange resin at pH 7.5: P39645, P37869, P80698, P80868, P21880, P80239, P50849, P12425, O34669, P39138, P37871, P19669, P29727, P80643, O34981, P80879, P54716, P37477. As an exemplary dataset, the LC/MS/MS results identified the following polypeptides as binding to the Poros XS cation exchange resin at pH 4.0: O34669, P19405, O31803, O05411, O31973, O31643, P80239, P26901, P08821, P80240, P49814, O34310, P0C178, O31925, P71014, P42111.

These polypeptides identified as having high binding affinity were analyzed for physicochemical properties based on their primary sequence. The net charge per amino acid was calculated based on primary sequence across the range of pH tested. As described herein, it is expected that polypeptides which bind tightly to anion exchange resins have a net charge per amino acid below 0 across the pH range, and this was generally demonstrated to be true. As described herein, it is expected that polypeptides which bind tightly to cation exchange resins have a net charge per amino acid above 0 across the pH range, and this was generally demonstrated to be true. Any exception is expected to be due to the fact that there is charge heterogeneity across the length of the sequence, and as described, net charge per amino acid does not always capture that. This multiplexed screen identified a set of polypeptides from a larger library based on their affinity to bind anion exchange resin, and this result can be predicted based on the primary sequence analysis, as described herein.

In the Bacillus subtilis example of multiplexed purification, the set of 168 nutritive polypeptide sequences was expressed without any type of purification tag. The cells were cultured in flasks, as described herein, and expressed and secreted the polypeptides into the growth media. The methods tested included negative flocculation/precipitation, in which the impurities precipitate and the polypeptides that remain soluble rank the highest. In these multiplexed purification studies, impurities are present in the form of soluble impurities (e.g. host cell proteins, DNA, phospholipids, and product-related impurities, such as isoforms or aggregated species), insoluble impurities, cells, or cellular debris (e.g. membrane fragments). Negative precipitation was performed both prior to and following removal of insoluble impurities, cells, and cellular debris by centrifugation and further clarification by membrane filtration.

The set of solubly expressed and secreted polypeptides in the cell suspension (prior to centrifugation and membrane filtration) and clarified supernatant (following centrifugation and membrane filtration) were rank ordered for their ability to associate with the flocculating agents. Furthermore, the flocculating agents were rank ordered for their ability to associate with the impurities. 48 flocculating agents were tested at two different concentrations, as follows: Ammonium Bicarbonate (100 mM, 200 mM); Manganese Chloride (100 mM, 200 mM); Nickel Sulfate (100 mM, 200 mM); Sodium Citrate (100 mM, 200 mM); Lithium Acetate (100 mM, 200 mM); Propylene Glycol (10% v/v, 20% v/v); Ammonium Nitrate (100 mM, 200 mM); Potassium Chloride (100 mM, 200 mM); Sodium Sulfate (100 mM, 200 mM); Sodium Molybdate (100 mM, 200 mM); Acetic Acid (100 mM, 200 mM); Chitosan MMW (0.1% w/v, 0.2% w/v); Ammonium Sulfate (100 mM, 200 mM); Sodium Chloride (0.5 M, 1.0 M); Zinc Sulfate (100 mM, 200 mM); Sodium Nitrate (100 mM, 200 mM); Citric Acid (100 mM, 200 mM); Guanidine HCl (0.6 M, 1.2 M); Ammonium Chloride (100 mM, 200 mM); Zinc Chloride (100 mM, 200 mM); Potassium Carbonate (100 mM, 200 mM); Sodium Phosphate (100 mM, 200 mM); Hydrochloric Acid (100 mM, 200 mM); PEG 1000 (5% w/v, 10% w/v); Calcium Chloride (100 mM, 200 mM); Iron Citrate (100 mM, 200 mM); Potassium Nitrate (100 mM, 200 mM); Sodium Propionate (100 mM, 200 mM); Potassium Hydroxide (100 mM, 200 mM); PEG 4000 (5% w/v, 10% w/v); Choline Chloride (100 mM, 200 mM); Copper Sulfate (100 mM, 200 mM); Potassium Phosphate (100 mM, 200 mM); Sodium Succinate (100 mM, 200 mM); Sodium Hydroxide (100 mM, 200 mM); Triton X-100 (0.5% w/v, 1.0% w/v); Iron Chloride (100 mM, 200 mM); Iron Sulfate (100 mM, 200 mM); Deoxycholic Acid (0.5% w/v, 1.0% w/v); Sodium Thiocyanate (100 mM, 200 mM); Ethanol (10% v/v, 20% v/v); Tween 80 (0.5% w/v, 1.0% w/v); Magnesium Chloride (100 mM, 200 mM); Magnesium Sulfate (100 mM, 200 mM); Sodium Carbonate (100 mM, 200 mM); Sodium Thiosulfate (100 mM, 200 mM); Isopropanol (10% v/v, 20% v/v); Urea (0.8 M, 1.6 M).

The cell suspension (prior to centrifugation and membrane filtration) and clarified supernatant (following centrifugation and membrane filtration) were distributed into a 96-well filter plate (300 μL per well for the low flocculating agent concentration and 267 μL per well for the high flocculating agent concentration). Concentrated flocculating agent solutions were then added to these loads (33 μL or 67 μL, respectively), so that the added solution made up approximately 0.1× or 0.2× of the final well volume. The resulting solution was mixed for 1 hour at room temperature. Following mixing, the remaining soluble material was separated from the insoluble material by centrifuging the liquid through the filter plate for collection in a 96-well plate below. All 192 samples were analyzed by Chip electrophoresis. Select samples were analyzed by SDS-PAGE. Select samples were analyzed by LC/MS/MS. All analytical assays are described herein.
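The well volumes and sample count stated above can be checked with a few lines of arithmetic. The following is a minimal sketch only; the function and variable names are illustrative and do not come from the described method.

```python
# Volumes in microliters, taken from the paragraph above.
LOAD_UL = {"low": 300.0, "high": 267.0}
FLOCCULANT_UL = {"low": 33.0, "high": 67.0}


def stock_fraction(load_ul: float, stock_ul: float) -> float:
    """Fraction of the final well volume contributed by the flocculant stock."""
    return stock_ul / (load_ul + stock_ul)


def sample_count(n_agents: int = 48, n_concentrations: int = 2,
                 n_feeds: int = 2) -> int:
    """Agents x concentrations x feed streams.

    The two feed streams are the cell suspension and the clarified supernatant.
    """
    return n_agents * n_concentrations * n_feeds


if __name__ == "__main__":
    for level in ("low", "high"):
        fraction = stock_fraction(LOAD_UL[level], FLOCCULANT_UL[level])
        print(f"{level}: stock is {fraction:.3f} of the final volume")
    print("samples:", sample_count())
```

Under these volumes the stock contributes roughly one tenth and one fifth of the final volume, and 48 agents at two concentrations across two feed streams account for the 192 samples.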

SDS-PAGE results demonstrated that some conditions effectively precipitated many polypeptides, indicating that the polypeptides remaining in solution under those conditions were highly soluble and can be isolated in these conditions. The polypeptides that were soluble in the various precipitation conditions were isolated from a larger set based on their properties in a purification process, which has implications for the manufacture, cost, development time, and eventual purity of these polypeptides. Some polypeptides are widely soluble across a range of conditions due to their high charge, according to the mechanism described herein. To rank order the individual polypeptides identified in any subset, LC/MS/MS is utilized.

As an exemplary dataset, the LC/MS/MS results demonstrated that the following polypeptides were isolated due to their sustained solubility in 100 mM Acetic Acid, pH 5.18: O34669, P54423, P21879, P10475, P28598, O31803, P40767, P17889, O34918, Q08352, P24327, P37871, O31973, P81101, P50849, P26901, P80700, O34385, P70960, P42111, P21880, P27876, P80868, P54716, O34313, O07603, O05411, P54531, O05497, P12425, O07921, P19405, Q06797, P02394, P24141, P09339, P37965, P07343, P37809, P0C178, P39824, P49814, P39632, P39773, P51777, P21883, O06989, P25152, P70961, O07593, O34310, P80860, P37437, P80698, P13243, P38494, P39645, P39148, O31398, P08821, P08877, O05268, P04957, P28366, P31103, P94421, P14949, P80864, P37869, P80240, P80859, O06993, O34666, O34714, P37546, Q9KWU4, O31605, P16616, P80239, O34788, P71014, P37571, P09124, P42971, O31925, P39793, P17865, P16263, P18429, P05653, P26908, P33166, O34499, P08750, P54602, Q45071, P12047, P42919, O34334, O34358, P39120, P39126, P00691, P14192, P22250, P37870, P39116, P54484, P54488, P54547, P56849, O31579, O34629, P30949, P54422, P54530, P54542, P96739.

As an exemplary dataset, the LC/MS/MS results demonstrated that the following polypeptides were isolated due to their sustained solubility in 100 mM Potassium Carbonate, pH 9.66: O34669, P54423, P21879, P24327, P40767, P17889, O31973, P10475, P28598, P80700, P37871, P80868, O31803, P81101, P70960, P27876, P19405, P28366, P71014, P26901, O34385, P21880, Q06797, P24141, P07343, P80698, P13243, P42971, P39793, O31643, P39071, O32210, P21468, P42199, P54531, P37965, P37809, P21883, P38494, P39148, P08877, P09124, P17865, P16263, P54602, P46906, O34918, Q08352, P42111, O05411, O05497, O07921, P02394, P09339, P49814, P39632, P37437, P39645, P08821, P04957, P31103, Q9KWU4, P80239, O34788, P18429, P05653, P26908, O34499, P08750, P12047, P37870, P54547, Q06796, Q45477, P25144, P46898, P40871, O31501, P21464, P21465, P40409.

As an exemplary dataset, the LC/MS/MS results demonstrated that the following polypeptides were isolated due to their sustained solubility in 100 mM Calcium Chloride, pH 7.50: O34669, P54423, P21879, O34918, O31803, P10475, P28598, P24327, P40767, P80700, P27876, P37871, O34385, P13243, Q08352, O07921, P17889, O31973, P80868, P26901, P24141, P80698, P02394, Q06797, P39148, P19405, P54531, P37965, P09339, P39645, O34788, P37571, O07909, P70960, P21880, P42971, P37809, P80239, Q45477, P94421, P81101, P07343, P39793, P39071, P38494, P17865, P42111, P12425, P39773, O06989, P80864, O05411, O05497, P25144, P0C178, P39824, P25152, P70961, O31398, O05268, P37869, P80859, O32150, P39138, O31643, P21468, P42199, P21883, P09124, P49814, P05653, Q06796, O34313, P51777, O34310, O06993, O34666, O31925, P33166, P39634, P37808, P39779, P28366, P08877, P16263, P39632, P08821, P04957, O34499, P08750, P46898, P50849, P54716, P80860, P14949, P80240, Q45071, O34334, O34358, P39120, P39126, P20278, P53001, P54375, O06006, O06988, O34667, O34981, P08164, P19669, P30950, P37487, P45694, P81102, P71014, P54602, P46906, P31103, P18429, P26908, P12047, P40871, O07603, O34714, P37546, P42919, P00691, P22250, P39116, P54488, P40924, C0SP93, O31760, O32023, O32106, O32167, O34962, P12048, P25995, P28015, P28599, P34957, P35137, P37253, P37477, P37812, P37940, P46354, P49778, P54169, P54418, P54550, P54941, P80885, P94576, Q04796, Q06004, Q07868, Q9R9I1.

As an exemplary dataset, the LC/MS/MS results demonstrated that the following polypeptides were isolated due to their sustained solubility in 100 mM Iron Chloride, pH 4.54: P26901, O34669, O34918, P54423, O31803, P96657, O31973, P37871, O07921, O31643, Q06796, P17889, P80698, P80239, O05411, O07909, O31925, P20278, P71014, P21879, P10475, P80700, P27876, Q08352, P81101, P42111, P0C178, P39824, O32210, P28598, P24327, O34385, Q06797, P19405, P37571, P38494, O31398, P09124, P51777, P08821, P18429, O07593, P80868, P09339, P39645, O34788, O05268, P49814, P08877, P39632, P04957, P14949, P31103, O06746, O07555, P40767, P02394, P54531, P37965, P70960, P37809, P07343, P39773, P33166, P39634, P16263, P46898, P50849, P54716, P80240, Q45071, P53001, P54375, P26908, P42919, P40924, P14192, P54484, P56849, O06748, P12878, P21477, P32081, P46899, P50620, P54464.

These polypeptides identified as having high solubility were analyzed for physicochemical properties based on their primary sequence. The total charge per amino acid was calculated based on primary sequence across the range of pH tested. As described herein, it is expected that the most soluble polypeptides have a high total charge per amino acid, and this was generally demonstrated to be true. This multiplexed screen identified a set of polypeptides from a larger library based on their sustained solubility in the presence of flocculating agents, and this result can be predicted based on the primary sequence analysis, as described herein.

Multiplexed Purification: Precipitation & Flocculation.

Conventional biopharmaceutical protein purification methods used to remove cells and cellular debris include centrifugation, microfiltration, and depth filters. Filter aids, such as diatomaceous earth, can be used to enhance performance of these steps, but they are not always effective and sometimes significantly bind the product of interest. Their use may also require the addition of a solid or a homogeneous suspension that can be challenging as part of large-scale biopharmaceutical operations.

Polymeric flocculants can be used to aid in the clarification of mammalian cell culture process streams, but they can have limitations. For example, protamine sulfate preparations typically used as processing aids are limited in application due to concerns about inactivation of the protein of interest or product loss due to precipitation (Scopes, Protein Purification Principles and Practice 3rd edition, Cantor eds. 22-43; 171 (1994)).

High quality reagents, such as those sold for medical use, can be expensive. In certain instances, removal to very low levels may be required to ensure there are no adverse effects in patients. For example, chitosan is not a well-defined reagent and there are concerns about its consistent performance in routine use in clarification applications. Multiple charged polymers, such as DEAE dextran, acrylamide-based polymers often used in waste-water treatment (NALCO Water Handbook, Section 2.1: Applications—Impurity Removal, 3rd ed., McGraw-Hill, 2009) and polyethylene amine (PEI) have been considered for use in clarification applications. With respect to the latter two types of polymers, the acrylamide reagents have the potential for contamination with toxic reagents, and polyethylene amine, while a highly effective clarification reagent, is often contaminated with varying amounts of ethylenimine monomer, a suspected cancer agent (Scawen et al., Handbook of Enzyme Biotechnology 2nd edition, Wiseman eds.:15-53 (1985)). Moreover, many of these polymers, including PEI, tend to bind almost irreversibly to many chromatography resins, thereby limiting downstream processing options. The regulatory and raw material reuse concerns associated with these polymers have limited their application primarily to academic studies.

Non-polymer based flocculants, such as alum and iron salts, have been utilized in the wastewater treatment industry (NALCO Water Handbook, Section 2.1: Applications—Impurity Removal, 3rd ed., McGraw-Hill, 2009). These substances may appear to be non-useful in processing protein products, because they may bind to the protein product or may catalyze chemical reactions resulting in modifications of the protein that could affect safety or efficacy.

In some cases, an entire library of proteins is tested in a multiplexed screening experimental platform. The 168 nutritive polypeptide library was transfected and expressed in a multiplexed expression system in which a single growth condition was used to produce each polypeptide in a single container. This multiplexed expression system allows any set of polypeptide sequences to be tested in parallel for a wide range of manufacturability parameters, each of which can be used to rank order the set of polypeptides being examined. A set of manufacturability parameters includes expression level, polypeptide solubility, ability of the polypeptide to be purified by chromatography, ability to resist thermal denaturation, ability of the polypeptide to be digested, and ability of the polypeptide to be purified by resisting harsh treatments.

For E. coli multiplexed purification by precipitation, the set of 168 nutritive polypeptide sequences was HIS8 tagged (SEQ ID NO: 3919). The cells were cultured, as described herein, ruptured, and the solution was clarified, as described herein. This production resulted in a solution containing all of the polypeptides from the set which were both expressed and soluble. That set of soluble polypeptides was passed over an IMAC column, and eluted, as described herein. This IMAC purification effectively isolated the solubly expressed nutritive polypeptides as a set by removing the majority of E. coli host cell proteins. The elution fraction was concentrated and buffer exchanged into a low salt solution buffered near neutral pH before testing various purification methods. The methods tested include anion exchange chromatography, cation exchange chromatography, and negative precipitation, in which the impurities precipitate and the polypeptides that remain soluble rank the highest. In this case, impurities have been removed, so the polypeptides are rank ordered amongst themselves. Additionally, the set of proteins was tested for thermal stability by heating, wherein the polypeptides which remain soluble after heating are more thermally stable than those which precipitate.

The pre-treated group of polypeptides expressed by E. coli was distributed to 32 wells of a 96 well plate (4.7 μL of protein stock per well at a total protein concentration of 43 g/L). Stock solutions were added to each well to create the following conditions: Control (No Additives); 42 mM Citrate/Phosphate, pH 7.1; 42 mM Citrate/Phosphate, pH 6.5; 42 mM Citrate/Phosphate, pH 6.0; 42 mM Citrate/Phosphate, pH 5.6; 42 mM Citrate/Phosphate, pH 5.0; 42 mM Citrate/Phosphate, pH 4.6; 42 mM Citrate/Phosphate, pH 4.3; 42 mM Citrate/Phosphate, pH 3.9; 42 mM Citrate/Phosphate, pH 3.7; 42 mM Citrate/Phosphate, pH 2.8; 75 mM Tris Base; 50 mM Na2CO3; 50 mM Piperazine Base; 100 mM sodium phosphate dibasic; 50 mM ethanolamine; 100 mM sodium phosphate monobasic; 100 mM MES Acid; 100 mM Sodium Acetate, pH 4.1; 100 mM MOPS Acid; 100 mM Tris HCl; 25 mM Acetic Acid; 25 mM Boric Acid; 25 mM Citric Acid; 50 mM PIPES Acid; 50 mM Succinic Acid; 1.2 M sodium sulfite; 1.5 M sodium sulfite; 2.5 M Ammonium Sulfate; 3.5 M Ammonium Sulfate; 200 mM CaCl2; 60% methanol. Water was added such that each well contained a total of 40 μL of solution at 5 g/L. The plates were mixed for 30 minutes at room temperature, then centrifuged at 3,000 RCF for 10 minutes to pellet any precipitated protein. A sample was taken from each well for analysis. The 96 well plate was then heated at 95° C. for 2 minutes. The plate was again centrifuged at 3,000 RCF for 10 minutes to pellet any precipitated protein, and a sample was taken from each well for analysis. All 64 samples were analyzed by Bradford assay, and select samples were analyzed by chip electrophoresis, followed by LC/MS/MS. All analytical assays are described herein. The measurements of total protein remaining in solution demonstrate that many conditions caused polypeptide precipitation, indicating that a portion of the conditions tested were rigorous, harsh conditions.
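The stated well concentration follows from a simple dilution calculation. The sketch below reproduces that arithmetic; the function names are illustrative assumptions, not part of the described method.

```python
def final_concentration(stock_g_per_l: float, stock_ul: float,
                        final_ul: float) -> float:
    """Protein concentration (g/L) after diluting stock_ul of stock to final_ul."""
    return stock_g_per_l * stock_ul / final_ul


def protein_mass_ug(stock_g_per_l: float, stock_ul: float) -> float:
    """Protein mass per well in micrograms (1 g/L equals 1 ug/uL)."""
    return stock_g_per_l * stock_ul


if __name__ == "__main__":
    # 4.7 uL of a 43 g/L stock brought to a 40 uL well volume.
    print(f"{final_concentration(43.0, 4.7, 40.0):.2f} g/L per well")
    print(f"{protein_mass_ug(43.0, 4.7):.1f} ug protein per well")
    # 32 conditions sampled before and after heating.
    print("samples:", 32 * 2)
```

With 4.7 μL of 43 g/L stock in a 40 μL well, each well holds about 202 μg of protein at approximately 5 g/L, and sampling 32 conditions before and after heating gives the 64 samples analyzed.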

LC/MS/MS analysis was performed with four select samples, described in Table E9D. The detection of nutritive polypeptide in the soluble fraction is noted with an X.

TABLE E9D Nutritive polypeptides detected in the soluble fraction of select conditions. Detection is noted with an X. Column headings: SEQID; 42 mM citrate/phosphate, pH 5.6, heated; 100 mM sodium phosphate monobasic; 50 mM PIPES acid, heated; 2.5 M ammonium sulfate. SEQID-00105 X SEQID-00115 X SEQID-00302 X X X X SEQID-00304 X SEQID-00305 X X X X SEQID-00316 X X SEQID-00323 X X SEQID-00338 X X X X SEQID-00341 X X X X SEQID-00343 X X X X SEQID-00345 X X X X SEQID-00346 X X X SEQID-00352 X X X X SEQID-00354 X X X X SEQID-00356 X SEQID-00357 X X X SEQID-00485 X X SEQID-00495 X X X SEQID-00497 X X SEQID-00502 X X X X SEQID-00507 X SEQID-00509 X SEQID-00510 X X X X SEQID-00511 X X X SEQID-00515 X SEQID-00518 X SEQID-00521 X X SEQID-00522 X X X X SEQID-00525 X X X X SEQID-00528 X X SEQID-00529 X X SEQID-00533 X X SEQID-00537 X X SEQID-00540 X X X SEQID-00546 X X SEQID-00547 X X X SEQID-00553 X X X X SEQID-00555 X X SEQID-00559 X SEQID-00560 X X X SEQID-00564 X X X X SEQID-00570 X X X SEQID-00585 X X X X SEQID-00587 X X X SEQID-00592 X X X SEQID-00598 X X X SEQID-00601 X X X SEQID-00603 X SEQID-00605 X X X SEQID-00606 X X SEQID-00610 X SEQID-00613 X X X SEQID-00619 X SEQID-00622 X X SEQID-00623 X X X X SEQID-00631 X X X SEQID-00632 X X X X SEQID-00633 X X X SEQID-00641 X SEQID-00647 X X X SEQID-00648 X

LC/MS/MS data identified a number of soluble polypeptides in each condition. The different conditions tested across the screen represent a number of different mechanisms of precipitation, and these different conditions were able to identify different sets of polypeptides based on their different physicochemical properties. Based on the number of polypeptides which remained soluble, of the conditions examined by LC/MS/MS, the harshest condition is the 2.5 M ammonium sulfate condition at room temperature. The polypeptides that were soluble in that condition were generally soluble in all three other conditions tested, with few exceptions. A large number of polypeptides were identified as being soluble after being heated to 95° C. for two minutes. In this experiment, a library of nutritive polypeptides was physically screened for solubility across a wide variety of conditions, and subsets of soluble peptides were identified within each condition.

Example 11. Selection of Amino Acid Sequences of Nutritive Polypeptides from Amino Acid Sequence Libraries Based on Solvation Scores and Aggregation Scores, and Other Sequence-Based Analyses

Solvation Score.

The solvation score is a primary sequence-based metric for assessing the hydrophilicity and potential solubility of a given protein. It is defined as the total free energy of solvation (i.e. the free energy change associated with transfer from gas phase to a dilute solution) for all amino acid side chains, assuming each residue were solvated independently, normalized by the total number of residues in the sequence. The side chain solvation free energies were found computationally by calculating the electrostatic energy difference between a vacuum dielectric of 1 and a water dielectric of 80 (by solving the Poisson-Boltzmann equation) as well as the non-polar, Van der Waals energy using a linear solvent accessible surface area model (D. Sitkoff, K. A. Sharp, B. Honig. “Accurate Calculation of Hydration Free Energies Using Macroscopic Solvent Models”. J. Phys. Chem. 98, 1994). These solvation free energies correlate well with experimental measurements. For amino acids with ionizable sidechains (Arg, Asp, Cys, Glu, His, Lys and Tyr), an average solvation free energy is based on the relative probabilities for each ionization state at the specified pH. The solvation score is effectively a measure of the solvation free energy assuming all polar residues are solvent exposed and non-polar residues are solvent excluded upon folding.

Aggregation Score.

The aggregation score is a primary sequence based metric for assessing the hydrophobicity and likelihood of aggregation of a given protein. Using the Kyte and Doolittle hydrophobicity scale (Kyte J, Doolittle R F (May 1982). “A simple method for displaying the hydropathic character of a protein”. J. Mol. Biol. 157 (1): 105-32), which gives hydrophobic residues positive values and hydrophilic residues negative values, the effective hydrophobicity as a function of sequence position is calculated using a moving average of 5 residues centered around each residue. The aggregation score is found by summing all those average hydrophobicity values greater than 0 and normalizing by the total length of the protein. The underlying understanding is that aggregation is the result of two or more hydrophobic patches coming together to exclude water and reduce surface exposure, and the likelihood that a protein will aggregate is a function of how densely packed its hydrophobic (i.e., aggregation prone) residues are.
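The aggregation score described above can be sketched in a few lines. This is an illustrative sketch, not the exact implementation used: the function names are hypothetical, and the handling of the sequence termini (windows truncated to the residues available) is an assumption, since the text does not specify it.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def windowed_hydropathy(sequence: str, window: int = 5) -> list:
    """Moving-average hydropathy centered on each residue.

    Assumption: near the termini the window is truncated to the
    residues that exist rather than padded.
    """
    half = window // 2
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    averages = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        averages.append(sum(values[lo:hi]) / (hi - lo))
    return averages


def aggregation_score(sequence: str, window: int = 5) -> float:
    """Sum of positive windowed hydropathy values divided by sequence length."""
    if not sequence:
        raise ValueError("sequence must be non-empty")
    averages = windowed_hydropathy(sequence, window)
    return sum(v for v in averages if v > 0) / len(sequence)


if __name__ == "__main__":
    print(aggregation_score("LLLLLKKKKK"))
```

A sequence with no hydrophobic window, such as a lysine homopolymer, scores zero under this sketch, while densely hydrophobic stretches raise the score.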

Charge Content.

The absolute or net charge per amino acid is calculated as a function of pH and independently of the location of the residue within the protein. Given a pH value and the pKa of a titratable residue, the Henderson-Hasselbalch equation is solved to determine the relative concentrations of each titration state (e.g. −1 or 0 for the acidic residue glutamate).

$$\mathrm{pH} = \mathrm{p}K_a + \log_{10}\left(\frac{[\mathrm{A}^-]}{[\mathrm{HA}]}\right)$$

The Henderson-Hasselbalch Equation

The average charge for that titratable residue is found by converting these relative concentrations into effective probabilities of being charged and multiplying by the charge and number of instances of that amino acid. The net or absolute charge found from this procedure is then divided by the number of amino acids to get the per amino acid value.

The residue types shown below in Table E11A are used with the corresponding pKa values and relevant titration states. These pKa values come from the pKa table provided in the European Molecular Biology Open Software Suite (Rice, P. Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite. Trends in Genetics, 16, 2000).

TABLE E11A
Residue       pKa     Titration States
Glutamate     4.1     −1, 0
Aspartate     3.9     −1, 0
Arginine      12.5    0, +1
Lysine        10.8    0, +1
Histidine     6.5     0, +1
Cysteine      8.5     −1, 0
Tyrosine      10.1    −1, 0
C-terminus    3.6     −1, 0
N-terminus    8.6     0, +1
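The charge calculation described above can be sketched as follows, using the EMBOSS pKa values of Table E11A and the Henderson-Hasselbalch equation. The function names are hypothetical, and treating the "absolute" charge as the sum of the magnitudes of the positive and negative contributions is an assumption made for illustration.

```python
# (pKa, charge of the ionized state) per titratable side chain,
# using the EMBOSS pKa values of Table E11A.
SIDE_CHAIN_PKA = {
    "E": (4.1, -1), "D": (3.9, -1), "C": (8.5, -1), "Y": (10.1, -1),
    "R": (12.5, +1), "K": (10.8, +1), "H": (6.5, +1),
}
C_TERMINUS = (3.6, -1)
N_TERMINUS = (8.6, +1)


def fraction_charged(pka: float, charge: int, ph: float) -> float:
    """Probability that a titratable group is in its charged state."""
    if charge < 0:
        # Acidic group: charged when deprotonated, [A-] / ([A-] + [HA]).
        return 1.0 / (1.0 + 10.0 ** (pka - ph))
    # Basic group: charged when protonated.
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def charge_per_amino_acid(sequence: str, ph: float,
                          absolute: bool = False) -> float:
    """Net (or absolute) charge divided by the number of residues.

    Assumption: the absolute charge sums the magnitudes of all
    positive and negative contributions.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    groups = [SIDE_CHAIN_PKA[aa] for aa in seq if aa in SIDE_CHAIN_PKA]
    groups += [C_TERMINUS, N_TERMINUS]
    total = 0.0
    for pka, charge in groups:
        contribution = charge * fraction_charged(pka, charge, ph)
        total += abs(contribution) if absolute else contribution
    return total / len(seq)


if __name__ == "__main__":
    for ph in (4.0, 7.5):
        print(ph, charge_per_amino_acid("MKEDDRKEAL", ph))
```

Evaluating the function over a range of pH values gives the charge profile used to compare a polypeptide against the expected behavior on anion or cation exchange resins.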

Weighted Euclidean Distance.

To identify candidate proteins with a similar amino acid breakdown to a known, clinically efficacious blend, a weighted Euclidean distance based search strategy is used. In principle, this means computing the weighted percent differences for each amino acid relative to the target amino acid distribution, as defined by the following equation:


$$\mathrm{Distance} = \sqrt{\sum_{i \in AA} \alpha_i \left(x_i - x_i^{T}\right)^{2}}$$

where AA is the set of all amino acids in the target distribution, xi is the fraction by weight of amino acid i in the candidate protein sequence, xiT is the fraction by weight of amino acid i in the target amino acid distribution, and αi is the relative weight associated with amino acid i. The relative weights were applied to ensure that large deviations from the most important amino acid targets were appropriately penalized.

As an example, for amino acid blends used for treatment of sarcopenia, support of exercise, and stimulation of thermogenesis, relative weights of 3:2:2 for Leucine, Valine, and Isoleucine were used, given the relative importance of Leucine and the other two branched chain amino acids. All other amino acids were given a relative weight of 1.
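The weighted distance can be sketched as below, assuming the squared-difference form of the weighted Euclidean distance and the 3:2:2 weighting of Leucine, Valine, and Isoleucine described above. The function name and the input format (dictionaries of weight fractions keyed by one-letter amino acid code) are illustrative assumptions, and the fractions in the usage example are invented for demonstration only.

```python
import math

# Relative weights from the text: Leu 3, Val 2, Ile 2; all others 1.
BCAA_WEIGHTS = {"L": 3.0, "V": 2.0, "I": 2.0}


def weighted_distance(candidate: dict, target: dict,
                      weights: dict = None,
                      default_weight: float = 1.0) -> float:
    """Weighted Euclidean distance over the amino acids in the target.

    candidate and target map one-letter amino acid codes to weight
    fractions. Amino acids absent from the candidate count as 0.
    """
    weights = BCAA_WEIGHTS if weights is None else weights
    total = 0.0
    for aa, target_fraction in target.items():
        alpha = weights.get(aa, default_weight)
        difference = candidate.get(aa, 0.0) - target_fraction
        total += alpha * difference ** 2
    return math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical weight fractions, for illustration only.
    target = {"L": 0.40, "V": 0.20, "I": 0.20, "K": 0.20}
    candidate = {"L": 0.30, "V": 0.20, "I": 0.20, "K": 0.30}
    print(weighted_distance(candidate, target))
```

Because the Leucine term carries a weight of 3, the same shortfall in Leucine contributes more to the distance than an equal shortfall in an unweighted amino acid, which is the penalizing behavior the relative weights are meant to provide.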

Allergenicity.

The allergenicity score is a primary sequence based metric based on WHO recommendations (fao.org/ag/agn/food/pdf/allergygm.pdf) for assessing how similar a protein is to any known allergen, the primary understanding being that high percent identity between a target and a known allergen is likely indicative of cross reactivity. For a given protein, the likelihood of eliciting an allergic response is assessed via a complementary pair of sequence homology based tests. The first test determines the protein's percent identity across the entire sequence via a global-local sequence alignment to a database of known allergens using the FASTA algorithm with the BLOSUM50 substitution matrix, a gap open penalty of 10, and a gap extension penalty of 2. It is suggested that proteins with less than 50% global homology across both sequences are unlikely to be allergenic (Goodman R. E. et al. Allergenicity assessment of genetically modified crops—what makes sense? Nat. Biotech. 26, 73-81 (2008); Aalberse R. C. Structural biology of allergens. J. Allergy Clin. Immunol. 106, 228-238 (2000)). The second test assesses the local allergenicity along the protein sequence by determining the local allergenicity of all possible contiguous 80 amino acid fragments via a global-local sequence alignment of each fragment to a database of known allergens using the FASTA algorithm with the BLOSUM50 substitution matrix, a gap open penalty of 10, and a gap extension penalty of 2. The highest percent identity of any 80 amino acid window with any allergen is taken as the final score for the protein of interest. The WHO guidelines suggest using a 35% identity cutoff. The custom database comprises pooled allergen lists collected by the Food Allergy Research and Resource Program (allergenonline.org/), UNIPROT annotations (uniprot.org/docs/allergen), and the Structural Database of Allergenic Proteins (SDAP, fermi.utmb.edu/SDAP/sdap_lnk.html).
This database includes all currently recognized allergens by the International Union of Immunological Societies (IUIS, allergen.org/) as well as a large number of additional allergens not yet officially named.
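The second (local) test can be sketched as a scan over every contiguous 80 amino acid fragment, keeping the highest percent identity found against any allergen. The sketch below shows only the windowing and maximization logic. The alignment itself is passed in as a function, and the `naive_identity` helper is a toy, ungapped stand-in introduced purely so the example runs; it is not the FASTA algorithm with BLOSUM50 scoring described above. All names are hypothetical.

```python
from typing import Callable, Iterable, Iterator


def fragments(sequence: str, window: int = 80) -> Iterator[str]:
    """Yield every contiguous fragment of `window` residues.

    Assumption: a sequence no longer than the window is scanned
    as a single fragment.
    """
    if len(sequence) <= window:
        yield sequence
        return
    for start in range(len(sequence) - window + 1):
        yield sequence[start:start + window]


def local_allergenicity(sequence: str, allergens: Iterable[str],
                        percent_identity: Callable[[str, str], float],
                        window: int = 80) -> float:
    """Highest percent identity of any window against any allergen."""
    allergen_list = list(allergens)
    return max(
        (percent_identity(fragment, allergen)
         for fragment in fragments(sequence, window)
         for allergen in allergen_list),
        default=0.0,
    )


def naive_identity(a: str, b: str) -> float:
    """Toy stand-in: ungapped, position-by-position percent identity.

    NOT a FASTA alignment; used only to make the sketch runnable.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return 100.0 * sum(x == y for x, y in zip(a, b)) / n


if __name__ == "__main__":
    score = local_allergenicity("A" * 100, ["A" * 40 + "C" * 40], naive_identity)
    print(score, "exceeds 35% cutoff:", score > 35.0)
```

In practice the `percent_identity` argument would wrap a real global-local alignment tool, and the resulting score would be compared against the 35% identity cutoff.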

Toxicity/Nonallergenicity/Antinutricity.

The toxicity, nonallergenicity, and anti-nutricity of a protein are all assessed similarly, by determining the protein's percent identity to databases of known toxic, nonallergenic, and protease inhibitory proteins, respectively. The toxicity and anti-nutritive qualities are assumed to be a function of the whole protein (i.e., a fragment of a known toxic protein will not be toxic), as their toxic and inhibitory mechanisms of action are often structural in nature (Huntington J, Read R, Carrell R. “Structure of a serpin-protease complex shows inhibition by deformation”. Nature 407 (2000): 923-6; Van den Born H. K. et al. Theoretical analysis of the structure of the peptide fasciculin and its docking to acetylcholinesterase. Protein Sci. 4 (1995): 703-715; and Harel M. Crystal structure of an acetylcholinesterase-fasciculin complex: interaction of a three-fingered toxin from snake venom with its target. Structure. 3 (1995): 1355-1366). Given that protein structure is a function of the entire protein sequence, a global-global alignment is performed of the protein of interest against the two respective databases using the FASTA algorithm with the BLOSUM50 substitution matrix, a gap open penalty of 10, and a gap extension penalty of 2. A cutoff of 35% can be used. While it does not provide specific instructions on how to avoid toxic/antinutritive polypeptides, the reference Delaney B. et al. Evaluation of protein safety in the context of agricultural biotechnology. Food. Chem. Toxicol. 46 (2008): S71-S97 suggests that one should avoid both known toxic and antinutritive polypeptides when assessing the safety of a possible food protein.

The nonallergenicity of a protein is related to its likelihood of eliciting an allergenic response upon exposure (similar to but opposite of allergenicity). Specifically, the human immune system is exposed to a multitude of possible allergenic proteins on a regular basis, and has the intrinsic ability to determine self from non-self. The exact nature of this ability is not always clear, and there are many diseases that arise as a result of the failure of the body to differentiate self from non-self (e.g. arthritis). Nonetheless, the understanding is that proteins that closely resemble (i.e. share a large degree of sequence homology with) nonallergenic (i.e., human) proteins are less likely to elicit an immune response. In particular, it has been shown that for some protein families with known allergenic members (tropomyosins, parvalbumins, caseins), those proteins that bear more sequence homology to their human counterparts relative to known allergenic proteins are not thought to be allergenic (Jenkins J. A. et al. Evolutionary distance from human homologs reflects allergenicity of animal food proteins. J. Allergy Clin Immunol. 120 (2007): 1399-1405). For a given protein, the nonallergenicity score is measured by determining the maximum percent identity of the protein to a database of human proteins from a global-local alignment using the FASTA algorithm with the BLOSUM50 substitution matrix, a gap open penalty of 10, and a gap extension penalty of 2. Cutoffs can vary. For example, Jenkins J. A. et al. (Evolutionary distance from human homologs reflects allergenicity of animal food proteins. J. Allergy Clin Immunol. 120 (2007): 1399-1405) claim that proteins with a sequence identity to a human protein above ˜62% are less likely to be allergenic.

Example 12. Expression of Nutritive Polypeptides

The lists below include all the nutritive protein sequences that were expressed in Escherichia coli, Bacillus, Aspergillus niger, and mammalian cells. In E. coli, the proteins were detected in either whole cell lysates or in the soluble fraction of the cell lysate. In Bacillus, expression was detected in either cell lysates or secreted supernatants of Bacillus subtilis or Bacillus megaterium. In Aspergillus niger, proteins were secreted from the fungus and detected in the supernatant. Proteins expressed in mammalian cells were expressed in either the Chinese Hamster Ovary-S strain (CHO-S) or the Human Embryonic Kidney 293F strain (HEK293F). Expression was measured by the following metrics: mass spectrometry spectrum counts for protein expression data acquired from LC-MS/MS in pooled library, and SDS-PAGE, Chip Electrophoresis, dot blot, Western blot and ELISA for individual protein expression, as described above.

The following nutritive polypeptides were detected in either whole cell lysates or in the soluble fraction of the cell lysates of Escherichia coli: SEQID-00001, SEQID-00002, SEQID-00003, SEQID-00004, SEQID-00005, SEQID-00007, SEQID-00008, SEQID-00009, SEQID-00011, SEQID-00012, SEQID-00013, SEQID-00014, SEQID-00015, SEQID-00016, SEQID-00020, SEQID-00021, SEQID-00024, SEQID-00025, SEQID-00027, SEQID-00028, SEQID-00029, SEQID-00030, SEQID-00031, SEQID-00033, SEQID-00043, SEQID-00049, SEQID-00051, SEQID-00052, SEQID-00053, SEQID-00054, SEQID-00055, SEQID-00057, SEQID-00059, SEQID-00060, SEQID-00061, SEQID-00068, SEQID-00070, SEQID-00071, SEQID-00073, SEQID-00074, SEQID-00075, SEQID-00076, SEQID-00077, SEQID-00078, SEQID-00083, SEQID-00084, SEQID-00085, SEQID-00086, SEQID-00087, SEQID-00088, SEQID-00090, SEQID-00091, SEQID-00092, SEQID-00093, SEQID-00098, SEQID-00099, SEQID-00100, SEQID-00101, SEQID-00102, SEQID-00103, SEQID-00104, SEQID-00105, SEQID-00106, SEQID-00107, SEQID-00108, SEQID-00110, SEQID-00112, SEQID-00113, SEQID-00115, SEQID-00116, SEQID-00117, SEQID-00118, SEQID-00123, SEQID-00124, SEQID-00128, SEQID-00130, SEQID-00131, SEQID-00132, SEQID-00134, SEQID-00137, SEQID-00139, SEQID-00140, SEQID-00141, SEQID-00142, SEQID-00143, SEQID-00145, SEQID-00146, SEQID-00148, SEQID-00150, SEQID-00151, SEQID-00152, SEQID-00153, SEQID-00154, SEQID-00155, SEQID-00157, SEQID-00158, SEQID-00159, SEQID-00162, SEQID-00166, SEQID-00169, SEQID-00175, SEQID-00193, SEQID-00194, SEQID-00195, SEQID-00196, SEQID-00197, SEQID-00198, SEQID-00199, SEQID-00200, SEQID-00201, SEQID-00202, SEQID-00203, SEQID-00204, SEQID-00205, SEQID-00211, SEQID-00212, SEQID-00213, SEQID-00214, SEQID-00215, SEQID-00216, SEQID-00218, SEQID-00219, SEQID-00220, SEQID-00221, SEQID-00223, SEQID-00224, SEQID-00225, SEQID-00226, SEQID-00227, SEQID-00228, SEQID-00230, SEQID-00232, SEQID-00233, SEQID-00234, SEQID-00235, SEQID-00236, SEQID-00237, SEQID-00239, SEQID-00240, SEQID-00241, SEQID-00264, SEQID-00265, 
SEQID-00266, SEQID-00267, SEQID-00268, SEQID-00269, SEQID-00270, SEQID-00271, SEQID-00273, SEQID-00274, SEQID-00275, SEQID-00276, SEQID-00284, SEQID-00287, SEQID-00297, SEQID-00298, SEQID-00299, SEQID-00302, SEQID-00303, SEQID-00304, SEQID-00305, SEQID-00306, SEQID-00307, SEQID-00309, SEQID-00318, SEQID-00322, SEQID-00325, SEQID-00326, SEQID-00327, SEQID-00328, SEQID-00329, SEQID-00332, SEQID-00335, SEQID-00336, SEQID-00337, SEQID-00338, SEQID-00341, SEQID-00343, SEQID-00344, SEQID-00345, SEQID-00346, SEQID-00349, SEQID-00350, SEQID-00352, SEQID-00353, SEQID-00354, SEQID-00355, SEQID-00356, SEQID-00357, SEQID-00358, SEQID-00359, SEQID-00360, SEQID-00362, SEQID-00363, SEQID-00408, SEQID-00409, SEQID-00415, SEQID-00416, SEQID-00418, SEQID-00424, SEQID-00481, SEQID-00482, SEQID-00483, SEQID-00484, SEQID-00485, SEQID-00486, SEQID-00487, SEQID-00488, SEQID-00489, SEQID-00490, SEQID-00491, SEQID-00492, SEQID-00493, SEQID-00494, SEQID-00495, SEQID-00496, SEQID-00497, SEQID-00498, SEQID-00499, SEQID-00500, SEQID-00501, SEQID-00502, SEQID-00503, SEQID-00504, SEQID-00505, SEQID-00506, SEQID-00507, SEQID-00508, SEQID-00509, SEQID-00510, SEQID-00511, SEQID-00512, SEQID-00513, SEQID-00514, SEQID-00515, SEQID-00516, SEQID-00517, SEQID-00518, SEQID-00519, SEQID-00520, SEQID-00521, SEQID-00522, SEQID-00523, SEQID-00524, SEQID-00525, SEQID-00526, SEQID-00527, SEQID-00528, SEQID-00529, SEQID-00530, SEQID-00531, SEQID-00532, SEQID-00533, SEQID-00534, SEQID-00535, SEQID-00536, SEQID-00537, SEQID-00538, SEQID-00539, SEQID-00540, SEQID-00541, SEQID-00542, SEQID-00543, SEQID-00544, SEQID-00545, SEQID-00546, SEQID-00547, SEQID-00548, SEQID-00549, SEQID-00550, SEQID-00551, SEQID-00552, SEQID-00553, SEQID-00554, SEQID-00555, SEQID-00556, SEQID-00557, SEQID-00558, SEQID-00559, SEQID-00560, SEQID-00561, SEQID-00562, SEQID-00563, SEQID-00564, SEQID-00565, SEQID-00566, SEQID-00567, SEQID-00568, SEQID-00569, SEQID-00570, SEQID-00571, SEQID-00572, SEQID-00573, SEQID-00574, SEQID-00575, 
SEQID-00576, SEQID-00577, SEQID-00578, SEQID-00579, SEQID-00580, SEQID-00581, SEQID-00582, SEQID-00583, SEQID-00584, SEQID-00585, SEQID-00586, SEQID-00587, SEQID-00588, SEQID-00589, SEQID-00590, SEQID-00591, SEQID-00592, SEQID-00593, SEQID-00594, SEQID-00595, SEQID-00596, SEQID-00597, SEQID-00598, SEQID-00599, SEQID-00600, SEQID-00601, SEQID-00602, SEQID-00603, SEQID-00604, SEQID-00605, SEQID-00606, SEQID-00607, SEQID-00608, SEQID-00609, SEQID-00610, SEQID-00611, SEQID-00612, SEQID-00613, SEQID-00614, SEQID-00615, SEQID-00616, SEQID-00617, SEQID-00618, SEQID-00619, SEQID-00620, SEQID-00621, SEQID-00622, SEQID-00623, SEQID-00624, SEQID-00625, SEQID-00626, SEQID-00627, SEQID-00628, SEQID-00629, SEQID-00630, SEQID-00631, SEQID-00632, SEQID-00633, SEQID-00634, SEQID-00635, SEQID-00636, SEQID-00637, SEQID-00638, SEQID-00639, SEQID-00640, SEQID-00641, SEQID-00642, SEQID-00643, SEQID-00644, SEQID-00645, SEQID-00646, SEQID-00647, SEQID-00648, SEQID-00669, SEQID-00670, SEQID-00671, SEQID-00672, SEQID-00673, SEQID-00674, SEQID-00675, SEQID-00676, SEQID-00677, SEQID-00678, SEQID-00679, SEQID-00680, SEQID-00681, SEQID-00682, SEQID-00716, SEQID-00717, SEQID-00718, SEQID-00719, SEQID-00720, SEQID-00723, SEQID-00724, SEQID-00725, SEQID-00726, SEQID-00727, SEQID-00728, SEQID-00729, SEQID-00730, SEQID-00731, SEQID-00732, SEQID-00734, SEQID-00735, SEQID-00736, SEQID-00737, SEQID-00738, SEQID-00739, SEQID-00740, SEQID-00741, SEQID-00742, SEQID-00743, SEQID-00744, SEQID-00745, SEQID-00746, SEQID-00747, SEQID-00748, SEQID-00749, SEQID-00750, SEQID-00751, SEQID-00752, SEQID-00753, SEQID-00754, SEQID-00755, SEQID-00756, SEQID-00757, SEQID-00758, SEQID-00759, SEQID-00760, SEQID-00761, SEQID-00763, SEQID-00765, SEQID-00766, SEQID-00767, SEQID-00768, SEQID-00769, SEQID-00770, SEQID-00771, SEQID-00772, SEQID-00773, SEQID-00774, SEQID-00775, SEQID-00777, SEQID-00778, SEQID-00780, SEQID-00781, SEQID-00782, SEQID-00783, SEQID-00784, SEQID-00785, SEQID-00786, SEQID-00787, SEQID-00788, 
SEQID-00789, SEQID-00790, SEQID-00791, SEQID-00792, SEQID-00793, SEQID-00794, SEQID-00795, SEQID-00796, SEQID-00797, SEQID-00798, SEQID-00799, SEQID-00801, SEQID-00802, SEQID-00803, SEQID-00804, SEQID-00805, SEQID-00806, SEQID-00807, SEQID-00808, SEQID-00809, SEQID-00810, SEQID-00811, SEQID-00812, SEQID-00813, SEQID-00814, SEQID-00815, SEQID-00816, SEQID-00817, SEQID-00818, SEQID-00819, SEQID-00820, SEQID-00821, SEQID-00822, SEQID-00823, SEQID-00824, SEQID-00825, SEQID-00826, SEQID-00827, SEQID-00828, SEQID-00829, SEQID-00830, SEQID-00831, SEQID-00832, SEQID-00833, SEQID-00834, SEQID-00835, SEQID-00836, SEQID-00837.

The following nutritive polypeptides were detected in either cell lysates or secreted supernatants of Bacillus subtilis or Bacillus megaterium: SEQID-00003, SEQID-00004, SEQID-00005, SEQID-00087, SEQID-00099, SEQID-00102, SEQID-00103, SEQID-00105, SEQID-00115, SEQID-00218, SEQID-00220, SEQID-00223, SEQID-00226, SEQID-00236, SEQID-00240, SEQID-00267, SEQID-00271, SEQID-00276, SEQID-00297, SEQID-00298, SEQID-00299, SEQID-00302, SEQID-00303, SEQID-00304, SEQID-00305, SEQID-00306, SEQID-00307, SEQID-00309, SEQID-00318, SEQID-00322, SEQID-00325, SEQID-00326, SEQID-00327, SEQID-00328, SEQID-00329, SEQID-00330, SEQID-00332, SEQID-00335, SEQID-00336, SEQID-00337, SEQID-00338, SEQID-00340, SEQID-00341, SEQID-00343, SEQID-00344, SEQID-00345, SEQID-00346, SEQID-00349, SEQID-00350, SEQID-00352, SEQID-00353, SEQID-00354, SEQID-00355, SEQID-00356, SEQID-00357, SEQID-00358, SEQID-00359, SEQID-00360, SEQID-00361, SEQID-00362, SEQID-00363, SEQID-00374, SEQID-00389, SEQID-00398, SEQID-00403, SEQID-00404, SEQID-00405, SEQID-00407, SEQID-00409, SEQID-00415, SEQID-00416, SEQID-00417, SEQID-00418, SEQID-00419, SEQID-00420, SEQID-00421, SEQID-00424, SEQID-00481, SEQID-00482, SEQID-00483, SEQID-00484, SEQID-00485, SEQID-00486, SEQID-00487, SEQID-00488, SEQID-00489, SEQID-00490, SEQID-00491, SEQID-00492, SEQID-00493, SEQID-00494, SEQID-00495, SEQID-00496, SEQID-00497, SEQID-00498, SEQID-00499, SEQID-00500, SEQID-00501, SEQID-00502, SEQID-00503, SEQID-00504, SEQID-00505, SEQID-00506, SEQID-00507, SEQID-00508, SEQID-00509, SEQID-00510, SEQID-00511, SEQID-00512, SEQID-00513, SEQID-00514, SEQID-00515, SEQID-00516, SEQID-00517, SEQID-00518, SEQID-00519, SEQID-00520, SEQID-00521, SEQID-00522, SEQID-00523, SEQID-00524, SEQID-00525, SEQID-00526, SEQID-00527, SEQID-00528, SEQID-00529, SEQID-00530, SEQID-00531, SEQID-00532, SEQID-00533, SEQID-00534, SEQID-00535, SEQID-00536, SEQID-00537, SEQID-00538, SEQID-00539, SEQID-00540, SEQID-00541, SEQID-00542, SEQID-00543, SEQID-00544, SEQID-00545, 
SEQID-00546, SEQID-00547, SEQID-00548, SEQID-00549, SEQID-00550, SEQID-00551, SEQID-00552, SEQID-00553, SEQID-00554, SEQID-00555, SEQID-00556, SEQID-00557, SEQID-00558, SEQID-00559, SEQID-00560, SEQID-00561, SEQID-00562, SEQID-00563, SEQID-00564, SEQID-00565, SEQID-00566, SEQID-00567, SEQID-00568, SEQID-00569, SEQID-00570, SEQID-00571, SEQID-00572, SEQID-00573, SEQID-00574, SEQID-00575, SEQID-00576, SEQID-00577, SEQID-00578, SEQID-00579, SEQID-00580, SEQID-00581, SEQID-00582, SEQID-00583, SEQID-00584, SEQID-00585, SEQID-00586, SEQID-00587, SEQID-00588, SEQID-00589, SEQID-00590, SEQID-00591, SEQID-00592, SEQID-00593, SEQID-00594, SEQID-00595, SEQID-00596, SEQID-00597, SEQID-00598, SEQID-00599, SEQID-00600, SEQID-00601, SEQID-00602, SEQID-00603, SEQID-00604, SEQID-00605, SEQID-00606, SEQID-00607, SEQID-00608, SEQID-00609, SEQID-00610, SEQID-00611, SEQID-00612, SEQID-00613, SEQID-00614, SEQID-00615, SEQID-00616, SEQID-00617, SEQID-00618, SEQID-00619, SEQID-00620, SEQID-00621, SEQID-00622, SEQID-00623, SEQID-00624, SEQID-00625, SEQID-00626, SEQID-00627, SEQID-00628, SEQID-00629, SEQID-00630, SEQID-00631, SEQID-00632, SEQID-00633, SEQID-00634, SEQID-00635, SEQID-00636, SEQID-00637, SEQID-00638, SEQID-00639, SEQID-00640, SEQID-00641, SEQID-00642, SEQID-00643, SEQID-00644, SEQID-00645, SEQID-00646, SEQID-00647, SEQID-00648, SEQID-00653, SEQID-00654, SEQID-00655, SEQID-00656, SEQID-00657, SEQID-00659, SEQID-00660, SEQID-00664, SEQID-00668, SEQID-00670, SEQID-00671, SEQID-00672, SEQID-00673, SEQID-00674, SEQID-00675, SEQID-00676, SEQID-00678, SEQID-00679, SEQID-00680, SEQID-00681, SEQID-00682, SEQID-00690, SEQID-00710, SEQID-00711, SEQID-00712, SEQID-00713, SEQID-00714, SEQID-00715, SEQID-00716, SEQID-00717, SEQID-00718, SEQID-00719, SEQID-00720, SEQID-00723, SEQID-00724, SEQID-00725, SEQID-00726, SEQID-00727, SEQID-00728, SEQID-00729, SEQID-00730, SEQID-00731, SEQID-00734, SEQID-00735, SEQID-00736, SEQID-00737, SEQID-00738, SEQID-00739, SEQID-00740, SEQID-00741, 
SEQID-00742, SEQID-00743, SEQID-00744, SEQID-00745, SEQID-00746, SEQID-00747, SEQID-00748, SEQID-00749, SEQID-00750, SEQID-00751, SEQID-00752, SEQID-00753, SEQID-00754, SEQID-00755, SEQID-00756, SEQID-00757, SEQID-00758, SEQID-00759, SEQID-00760, SEQID-00761, SEQID-00763, SEQID-00765, SEQID-00766, SEQID-00767, SEQID-00768, SEQID-00769, SEQID-00770, SEQID-00771, SEQID-00772, SEQID-00773, SEQID-00774, SEQID-00775, SEQID-00777, SEQID-00778, SEQID-00780, SEQID-00781, SEQID-00782, SEQID-00783, SEQID-00784, SEQID-00785, SEQID-00786, SEQID-00787, SEQID-00788, SEQID-00789, SEQID-00790, SEQID-00791, SEQID-00792, SEQID-00793, SEQID-00794, SEQID-00795, SEQID-00796, SEQID-00797, SEQID-00798, SEQID-00799, SEQID-00800, SEQID-00801, SEQID-00802, SEQID-00803, SEQID-00804, SEQID-00805, SEQID-00806, SEQID-00807, SEQID-00808, SEQID-00809, SEQID-00810, SEQID-00811, SEQID-00812, SEQID-00813, SEQID-00814, SEQID-00815, SEQID-00816, SEQID-00817, SEQID-00818, SEQID-00819, SEQID-00820, SEQID-00821, SEQID-00822, SEQID-00823, SEQID-00824, SEQID-00825, SEQID-00826, SEQID-00827, SEQID-00828, SEQID-00829, SEQID-00830, SEQID-00831, SEQID-00832, SEQID-00833, SEQID-00834, SEQID-00835, SEQID-00836, SEQID-00837.

The following nutritive polypeptides were detected from the secreted supernatant of Aspergillus niger: SEQID-00087, SEQID-00103, SEQID-00105, SEQID-00112, SEQID-00115, SEQID-00218, SEQID-00298, SEQID-00341, SEQID-00352, SEQID-00354, SEQID-00363, SEQID-00406, SEQID-00409, SEQID-00415, SEQID-00416, SEQID-00417, SEQID-00418, SEQID-00419, SEQID-00420, SEQID-00421, SEQID-00424, SEQID-00552, SEQID-00554.

The following nutritive polypeptides were expressed in the mammalian cell lines Chinese Hamster Ovary-S (CHO-S) or Human Embryonic Kidney 293F (HEK293F): SEQID-00001, SEQID-00103, SEQID-00105, SEQID-00298.

Example 13. Expression of Nutritive Polypeptides in E. coli Bacteria

A nutritive polypeptide sequence library was generated from edible species and screened to demonstrate nutritive polypeptide expression in E. coli.

Gene Synthesis & Plasmid Construction.

All genes were made synthetically by either Life Technologies/GeneArt or DNA2.0, and optimized for expression in Escherichia coli. The genes were cloned into pET15b (EMD Millipore/Novagen) either using the NdeI-BamHI restriction sites within the multiple cloning site (therefore containing an amino-terminal MGSSHHHHHHSSGLVPRGSH tag (SEQ ID NO: 3916)), or using the NcoI-BamHI sites (therefore removing the amino-terminal tag on the plasmid) with primers that add an amino-terminal tag containing MGSHHHHHHHH (SEQ ID NO: 3917) or MGSHHHHHHHHSENLYFQG (SEQ ID NO: 3918). pET15b contains a pBR322 origin of replication, a lac-controlled T7 promoter, and a bla gene conferring resistance to carbenicillin. For manually cloned fragments, inserts were verified by Sanger sequencing using both the T7 promoter primer and the T7 terminator primer. For the secreted constructs, the genes were cloned into the pJ444 vector (DNA 2.0, USA) upstream of the T5 terminator, with a C-terminal HHHHHHHH tag (SEQ ID NO: 3919) and an N-terminal DsbA signal peptide (ATGAAAAAGATTTGGCTGGCGCTGGCTGGTTTAGTTTTAGCGTTTAGCGCATCGGCG (SEQ ID NO: 3920)).
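As a quick consistency check, the DsbA signal peptide coding sequence quoted above can be translated with the standard genetic code. The Python sketch below is illustrative only; the function name `translate` and the variable names are hypothetical and are not part of the described cloning workflow.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence (reading frame 1) into one-letter amino acids."""
    dna = dna.replace(" ", "").upper()  # ignore any stray whitespace
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# DsbA signal peptide coding sequence as given in the text (SEQ ID NO: 3920).
DSBA_SIGNAL_DNA = "ATGAAAAAGATTTGGCTGGCGCTGGCTGGTTTAGTTTTAGCGTTTAGCGCATCGGCG"
DSBA_SIGNAL_PEPTIDE = translate(DSBA_SIGNAL_DNA)
```

Under these assumptions the 57-nucleotide sequence translates to the 19-residue peptide MKKIWLALAGLVLAFSASA.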

Strain Construction.

T7 Express Competent E. coli was purchased from New England Biolabs and was used as the parent strain. T7 Express is an enhanced BL21 derivative which contains the T7 RNA polymerase in the lac operon, while still lacking the Lon and OmpT proteases. The genotype of T7 Express is: fhuA2 lacZ::T7 gene1 [lon] ompT gal sulA11 R(mcr-73::miniTn10--TetS)2 [dcm] R(zgb-210::Tn10--TetS) endA1 Δ(mcrC-mrr)114::IS10. For secreted constructs, CGSC 5610 (Yale E. coli genetic stock center, USA) was used for the study. The genotype of CGSC 5610 is F-, lacY1 or Δ(cod-lacI)6, glnV44(AS), galK2(Oc), galT22, λ-, mcrA0, rfbC1, metB1, mcrB1, hsdR2. Roughly 1 ng of purified plasmid DNA described above was used to transform chemically competent T7 Express and single colonies were selected on LB agar plates containing 100 mg/l carbenicillin after roughly 16 hr of incubation at 37° C. Single colonies were inoculated into liquid LB containing 100 mg/l carbenicillin and grown to a cell density of OD600 nm 0.6, at which point glycerol was added to the medium at 10% (v/v) and an aliquot was taken for storage in a cryovial at −80° C.

Expression Testing.

Expression cultures were grown in LB medium (10 g/l NaCl, 10 g/l tryptone, and 5 g/l yeast extract) or in BioSilta EnBase medium and induced with isopropyl β-D-1-thiogalactopyranoside (IPTG). For expression testing in LB medium, a colony or stab from a glycerol stock was inoculated into 3 ml LB supplemented with 100 mg/l carbenicillin and grown overnight (roughly 16 hr) at 37° C. and 250 rpm. The next morning, the cell density was measured spectrophotometrically (OD600 nm) and the culture was diluted back to OD600 nm=0.05 in 3 ml LB medium supplemented with carbenicillin and grown at 37° C. and 250 rpm. At OD600 nm≈0.8±0.2 the cultures were induced with 1 mM IPTG. Heterologous expression was allowed to proceed for 2 hr at 37° C. and 250 rpm, at which point the cultures were terminated. The terminal cell density was measured. For expression in EnBase medium, a colony or stab from a glycerol stock was inoculated into 3 ml LB with 100 mg/l carbenicillin, transferred to EnBase medium with 100 mg/l carbenicillin and 600 mU/l of glucoamylase at OD600=0.1, grown overnight at 37° C. and 250 rpm, and induced with 1 mM IPTG the next day. Heterologous expression was allowed to proceed for 24 hours at 37° C. and 250 rpm, at which point the cultures were terminated. The terminal cell density was measured and the cells were harvested by centrifugation (3000 rpm, 10 min, RT). To determine intracellular production, the cells were lysed with B-PER (Pierce) according to the manufacturer's protocol and then assayed to measure the protein of interest (POI). To determine the levels of secreted protein, 0.5-ml aliquots of the culture supernatants were filtered through a 0.22 μm filter. The filtrates were then assayed to determine the levels of secreted POI.
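The back-dilution and induction window described above reduce to simple arithmetic. The following Python sketch is a minimal illustration; it assumes that OD600 scales linearly with cell density over the range used, and the function names and the example overnight OD600 of 4.0 are hypothetical.

```python
def inoculum_volume_ml(overnight_od600: float,
                       target_od600: float = 0.05,
                       final_volume_ml: float = 3.0) -> float:
    """Volume of overnight culture needed to reach the target OD600 in the
    final culture volume, using C1*V1 = C2*V2 (assumes OD is linear in density)."""
    if overnight_od600 <= target_od600:
        raise ValueError("overnight culture is not denser than the target")
    return target_od600 * final_volume_ml / overnight_od600


def in_induction_window(od600: float,
                        setpoint: float = 0.8,
                        tolerance: float = 0.2) -> bool:
    """True when the culture OD600 lies within the setpoint +/- tolerance window."""
    return abs(od600 - setpoint) <= tolerance


# Hypothetical example: an overnight culture measured at OD600 = 4.0.
EXAMPLE_VOLUME_ML = inoculum_volume_ml(4.0)
```

For the hypothetical overnight OD600 of 4.0, roughly 0.0375 ml (37.5 μl) of overnight culture would be added to make a 3 ml culture at OD600=0.05.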

Fermentation. The intracellular soluble proteins SEQID-00105, SEQID-00240, SEQID-00338, SEQID-00341, SEQID-00352, SEQID-00363, SEQID-00423, SEQID-00424, SEQID-00425, SEQID-00426, SEQID-00429, SEQID-00559, and SEQID-00587 were expressed in E. coli host cells NEB T7 Express (New England BioLabs) in 20 and/or 250 L fermentations. The fermentation occurred in a carbon and nitrogen rich media containing: Yeast Extract, Soy Hydrolysate, Glycerol, Glucose, and Lactose. The fermentation occurred at 30° C. and induction occurred when the Glucose present in the media was exhausted and Lactose became the primary carbon source sugar. The fermentation process time generally lasted 24-26 hrs. Fermentation run parameters were controlled at a pH of 6.9, temperature of 30° C., and a percent dissolved oxygen of 35%. The culture was supplemented with a Glycerol based feed in the later stages of the culture duration. Harvest occurred when the cells entered stationary phase and no longer required oxygen supplementation to maintain the 35% set point.

Nutritive Polypeptide Library Gene Synthesis & Plasmid Construction.

Genes encoding for 168 different edible species polypeptide sequences were generated as linear fragments and codon-selected for expression in Escherichia coli. In some cases, a single linear fragment contained two or more nucleic acid sequences encoding two or more distinct polypeptide sequences that had a flanking 5′ NdeI restriction site and 3′ BamHI restriction site. All linear fragments were combined at equal molar concentrations. The linear fragments were digested with NdeI and BamHI and the digested mixture was cloned into pET15b (EMD Millipore/Novagen) using primers to include an amino terminal tag containing MGSHHHHHHHH (SEQ ID NO: 3917) and NdeI-BamHI restriction sites. pET15b contains a pBR322 origin of replication, a lac-controlled T7 promoter, and a bla gene conferring resistance to carbenicillin. For cloned fragments, inserts were verified by Sanger sequencing using both the T7 promoter primer and the T7 terminator primer.
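Before pooling and digestion, each synthetic fragment can be checked in silico for the flanking NdeI (CATATG) and BamHI (GGATCC) recognition sites. The sketch below is illustrative only: the helper names and the toy fragment are hypothetical, and because both sites are palindromic only one strand is scanned.

```python
NDEI_SITE = "CATATG"   # NdeI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence


def site_positions(seq: str, site: str) -> list:
    """Return 0-based start positions of every occurrence of a recognition site."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site]


def is_flanked(seq: str) -> bool:
    """True if an NdeI site occurs upstream of a BamHI site in the fragment."""
    nde = site_positions(seq, NDEI_SITE)
    bam = site_positions(seq, BAMHI_SITE)
    return bool(nde) and bool(bam) and nde[0] < bam[-1]


# Toy fragment (hypothetical): 5' NdeI site, a short open reading frame, 3' BamHI site.
TOY_FRAGMENT = "GC" + NDEI_SITE + "AAAGGTTCT" + "TAA" + BAMHI_SITE + "GC"
```

A fragment with more than two sites in total would be cut into several pieces by the double digest, so the position lists can also be used to flag sites that fall inside a coding region.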

168 Nutritive Polypeptide Library Strain Construction.

T7 Express Competent E. coli (New England Biolabs) was used as the parent strain. T7 Express is an enhanced BL21 derivative which contains the T7 RNA polymerase in the lac operon, while lacking the Lon and OmpT proteases. The genotype of T7 Express is: fhuA2 lacZ::T7 gene1 [lon] ompT gal sulA11 R(mcr-73::miniTn10--TetS)2 [dcm] R(zgb-210::Tn10--TetS) endA1 Δ(mcrC-mrr)114::IS10. For secreted constructs, CGSC 5610 (Yale E. coli genetic stock center, USA) was used. The genotype of CGSC 5610 is F-, lacY1 or Δ(cod-lacI)6, glnV44(AS), galK2(Oc), galT22, λ-, e14-, mcrA0, rfbC1, metB1, mcrB1, hsdR2. Approximately 10 ng of the ligated DNA mixture was transformed into chemically competent T7 Express and single colonies were selected on LB agar plates containing 100 mg/l carbenicillin after roughly 16 hr of incubation at 37° C. Multiple transformations were performed and approximately 1000 colonies were pooled together and suspended in LB medium. The DNA from 50-100 colonies was sequenced to determine the diversity of the generated library.
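To put the pool size in context, a rough estimate of library coverage can be computed under a simplifying assumption that is not stated in the text: each pooled colony carries exactly one library member drawn uniformly at random. The function name below is hypothetical, and real libraries deviate from this model because of ligation bias and multi-gene fragments.

```python
def expected_coverage(library_size: int, colonies: int) -> float:
    """Expected fraction of library members present at least once among the
    colonies, assuming uniform random sampling with replacement."""
    return 1.0 - (1.0 - 1.0 / library_size) ** colonies


# Values taken from the text: 168 library members, about 1000 pooled colonies.
POOL_COVERAGE = expected_coverage(168, 1000)

# Expected number of distinct members seen when 50 or 100 colonies are sequenced.
DISTINCT_IN_50 = 168 * expected_coverage(168, 50)
DISTINCT_IN_100 = 168 * expected_coverage(168, 100)
```

Under this idealized model a pool of about 1000 colonies would be expected to contain more than 99% of the 168 members, whereas sequencing 50-100 colonies samples only a subset and serves as a diversity spot check rather than a complete census.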

168 Nutritive Polypeptide Library Expression Testing.

The colony mixtures resuspended in LB medium were initially grown in 3 ml of LB medium (10 g/l NaCl, 10 g/l tryptone, and 5 g/l yeast extract) with 100 mg/l carbenicillin, then transferred to BioSilta EnBase medium with 100 mg/l carbenicillin and 600 mU/l of glucoamylase at OD600=0.1, grown overnight at 37° C. and 250 rpm, and induced with 1 mM IPTG the next day. Heterologous expression was allowed to proceed for 24 hours at 37° C. and 250 rpm, at which point the cultures were terminated. The terminal cell density was measured and the cells were harvested by centrifugation (3000 rpm, 10 min, RT). To determine intracellular production, the cells were lysed using a microfluidizer, and the soluble fraction was purified on a 5 ml nickel affinity column according to the manufacturer's protocol and then assayed by LC-MS/MS, as described below, to identify the proteins that were expressed. Based on MS spectral counts, 114 of the 168 proteins were expressed in soluble form in E. coli. Based on this result, certain genes were individually cloned and tested for intracellular soluble expression in E. coli.

Fungal Nutritive Polypeptides Expressed in E. coli Strain Backgrounds.

Four strains were used to express fungal nutritive polypeptides: T7 Express, Shuffle T7, and Shuffle T7 Express from New England Biolabs; and Origami B(DE3) from EMD Millipore.

T7 Express is an enhanced BL21 derivative which contains the T7 RNA polymerase in the lac operon, while lacking the Lon and OmpT proteases. The genotype of T7 Express is: fhuA2 lacZ::T7 gene1 [lon] ompT gal sulA11 R(mcr-73::miniTn10--TetS)2 [dcm] R(zgb-210::Tn10--TetS) endA1 Δ(mcrC-mrr)114::IS10.

Shuffle T7 is a K12 derivative strain that promotes cytoplasmic disulfide bond formation and expresses a chromosomal copy of T7 RNAP. The genotype of Shuffle T7 is F′ lac, pro, lacIQ/Δ(ara-leu)7697 araD139 fhuA2 lacZ::T7 gene1 Δ(phoA)PvuII phoR ahpC* galE (or U) galK λatt::pNEB3-rl-cDsbC (SpecR, lacIq) ΔtrxB rpsL150(StrR) Δgor Δ(malF)3.

Shuffle T7 Express is a BL21 derivative strain that promotes cytoplasmic disulfide bond formation and expresses a chromosomal copy of T7 RNAP. The genotype of Shuffle T7 Express is fhuA2 lacZ::T7 gene1 [lon] ompT ahpC gal λatt::pNEB3-rl-cDsbC (SpecR, lacIq) ΔtrxB sulA11 R(mcr-73::miniTn10--TetS)2 [dcm] R(zgb-210::Tn10--TetS) endA1 Δgor Δ(mcrC-mrr)114::IS10.

Origami B(DE3) is a BL21 derivative that contains mutations in trxB and gor that promote disulfide bond formation in the cytoplasm. The genotype of Origami B(DE3) is F- ompT hsdSB(rB- mB-) gal dcm lacY1 ahpC (DE3) gor522::Tn10 trxB (KanR, TetR).

Expression screening of these strains was completed as described for E. coli.

Example 14. Expression of Nutritive Polypeptides in B. subtilis Bacteria

Gene Synthesis & Plasmid Construction.

All of the genes encoding proteins of interest (POI) were cloned by PCR. The templates for PCR amplification of these genes were either synthetic genes, generated for expression in E. coli (see above), or genomic DNA from a source organism (e.g. B. subtilis). Synthetic genes were codon optimized for expression in E. coli. All of the genes were cloned with a sequence encoding a 1× FLAG tag (a.a.=DYKDDDDK (SEQ ID NO: 3914)) fused, in frame, to their 3′-terminus immediately preceding the stop codon. For expression in B. subtilis, genes were cloned in the MoBiTec (Göttingen, Germany) expression vector, pHT43, using the Gibson Assembly Master Mix (New England Biolabs, Beverly, Mass.) and the cloning host E. coli Turbo (New England Biolabs) according to the manufacturer's instructions. The genes were cloned into pHT43 either as secretion expression constructs (gene fused in-frame with DNA encoding the α-amylase signal peptide (SamyQ) from Bacillus amyloliquefaciens) or as intracellular expression constructs (gene inserted immediately downstream of the ribosomal binding site (RBS), removing the SamyQ sequence). Following transformation into E. coli, cells containing recombinant plasmids were selected on LB agar plates containing 100 μg/ml carbenicillin (Cb100). Recombinant plasmids were isolated from E. coli and their DNA sequences were verified by Sanger sequencing prior to transformation into the B. subtilis expression host.

Strain Construction.

B. subtilis strain WB800N was purchased from MoBiTec (Göttingen, Germany) and used as the expression host. WB800N is a derivative of a well-studied strain (B. subtilis 168) and it has been engineered to reduce proteolytic degradation of secreted proteins by deletion of genes encoding 8 extracellular proteases (nprE, aprE, epr, bpr, mpr, nprB, vpr and wprA). B. subtilis transformations were performed according to the manufacturer's instructions. Approximately 1 μg of each expression construct was transformed into WB800N and single colonies were selected at 37° C. by plating on LB agar containing 5.0 μg/ml chloramphenicol (Cm5). Individual transformants were grown in LB broth containing Cm5 until they reached log phase. Aliquots of these cultures were mixed with glycerol (20% final concentration) and frozen at −80° C.

Expression Testing.

Frozen glycerol stocks of B. subtilis expression strains were used to inoculate 1 ml of 2×-MAL medium (20 g/l NaCl, 20 g/l tryptone, 10 g/l yeast extract, and 75 g/l maltose) with Cm5, in deep well blocks (96-square wells). Culture blocks were covered with porous adhesive plate seals and incubated overnight in a micro-expression chamber (Glas-Col, Terre Haute, Ind.) at 37° C. and 880 rpm. Overnight cultures were used to inoculate fresh 2×-MAL, Cm5 cultures, in deep well blocks, to a starting OD600=0.1. These expression cultures were incubated at 37° C. and 880 rpm until the OD600=1.0 (approx. 4 hrs), at which time they were induced by adding isopropyl β-D-1-thiogalactopyranoside (IPTG) at a final concentration of 0.1 M and continuing incubation for 4 hrs. After 4 hrs, the cell density of each culture was measured (OD600) and cells were harvested by centrifugation (3000 rpm, 10 min, RT). After centrifugation, culture supernatant was carefully removed and transferred to a new block and cell pellets were frozen at −80° C. To determine the levels of secreted protein, 0.5-ml aliquots of the culture supernatants were filtered first through a 0.45-μm filter followed by a 0.22 μm filter. The filtrates were then assayed to determine the levels of secreted protein of interest (POI).
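Scaling the 2×-MAL recipe to a given working volume is a matter of multiplying each concentration by the volume. The Python sketch below illustrates this; the function name is hypothetical, and the 96 ml batch (one 1 ml culture in each well of a 96-well block) is an illustrative assumption rather than a volume stated in the text.

```python
# 2x-MAL medium composition from the text, in grams per litre.
TWO_X_MAL_G_PER_L = {
    "NaCl": 20.0,
    "tryptone": 20.0,
    "yeast extract": 10.0,
    "maltose": 75.0,
}


def grams_needed(recipe_g_per_l: dict, volume_ml: float) -> dict:
    """Mass of each component (g) required for the requested volume (ml)."""
    return {name: conc * volume_ml / 1000.0
            for name, conc in recipe_g_per_l.items()}


# Illustrative batch: enough medium for 96 wells at 1 ml each.
BLOCK_BATCH = grams_needed(TWO_X_MAL_G_PER_L, 96.0)
```

The same helper applies to any of the media listed in these examples by swapping in the corresponding concentration table.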

To determine levels of intracellularly produced POI, frozen cell pellets were thawed and 0.5 g of 0.1 mm zirconium beads were added to each sample followed by 0.5 ml of PBS. The cells were lysed in the cold room (4° C.) by bead-beating for 5 min in a Qiagen TissueLyser II (Qiagen, Hilden, Germany) equipped with a 96-well plate adapter. Cell lysates were centrifuged at 3000 rpm for 10 min and the supernatant was removed and analyzed for POI concentration as described below. To determine the levels of secreted protein, 0.5-ml aliquots of the culture supernatants were filtered through a 0.22 μm filter. The filtrates were then assayed to determine the levels of secreted protein of interest (POI).

Shake Flask Expression.

A single colony was picked from an agar plate for each SEQID and inoculated into 5 mL of 2×Mal medium. These shake flask inocula were grown in a 30° C. shaker incubator overnight. 5 mL of this overnight culture was used to inoculate 250 mL of 2×Mal medium. Cultures were grown for 4 hours at 30° C. then induced for 4 hours at 30° C. Cultures were harvested by centrifugation. Centrifuged supernatants for each SEQID were sterile filtered and frozen at −80° C.

Fermentation Expression.

Soluble protein for SEQID-00105 has been secreted from engineered Bacillus subtilis strains containing episomal or integrated plasmids. The fermentation occurred at a volume of 4 L in a carbon and nitrogen rich media containing Phytone Peptone, Yeast Extract, and Glucose. Fermentation cultures were grown at 30° C., at a pH of 7.0 and a percent dissolved oxygen of 40%. Induction, via the addition of IPTG, occurred at an OD600 nm of 5.0 (+/−1.0). The cultures were supplemented with a Glucose based feed post induction. The cultures were harvested 5-8 hours post induction. The biomass was then removed via centrifugation and the supernatant was clarified via filtration and stored at 4° C. until processing.

168 Nutritive Polypeptide Library Gene Synthesis & Plasmid Construction.

For expression in B. subtilis, the vector used was derived from the pHT43 backbone vector (MoBiTec, Göttingen, Germany) with no signal peptide, with the grac promoter substituted with the aprE promoter and the lacI expression cassette removed. 168 genes encoding 168 different protein sequences that were identified above were made synthetically by Life Technologies/GeneArt as linear fragments (GeneStrings) and selected for expression in Escherichia coli. In most cases two genes were synthesized together in a single linear fragment that had a flanking 5′ NdeI restriction site and 3′ BamHI restriction site. All the linear fragments were mixed together at equal molar concentrations. The linear fragment mixture was then digested with NdeI and BamHI. The digested mixtures were cloned into the vector either as secretion expression constructs (gene fused in-frame with DNA encoding the lipase signal peptide (LipAsp) from Bacillus subtilis) or as intracellular expression constructs (gene inserted immediately downstream of the ribosomal binding site (RBS), with and without an N-terminal tag containing MGSHHHHHHHH (SEQ ID NO: 3917)). The library of genes was ligated to the vector PCR product using T4 DNA ligase (New England Biolabs, Beverly, Mass.). The ligation products were transformed into the cloning host, E. coli Turbo (New England Biolabs), according to the manufacturer's instructions. 50-100 colonies were sequenced to determine the diversity of the leader peptide library. The colonies on the agar plate were then suspended in LB medium and harvested for plasmid purification.

168 Nutritive Polypeptide Library Strain Construction.

WB800N is a derivative of a well-studied strain (B. subtilis 168) and it has been engineered to reduce proteolytic degradation of secreted proteins by deletion of genes encoding 8 extracellular proteases (nprE, aprE, epr, bpr, mpr, nprB, vpr and wprA). B. subtilis strain WB800N was purchased from MoBiTec (Göttingen, Germany) and was modified to have the following mutations WB800N: pXylA-comK::Erm, degU32(Hy), ΔsigF. This new strain was used as the expression host. Roughly 1 μg of the plasmid mixture purified from E. coli cells was transformed into the expression strain. After transformation, 100 μL of the culture were plated onto four LB 1.5% agar plates containing 5 mg/L chloramphenicol and incubated at 37° C. for 16 hrs. After incubation, 2 mL of LB media with 5 mg/L chloramphenicol were added to the surface of each plate containing several thousand transformants, and the cells were suspended in the surface medium by scraping with a cell spreader and mixing. Suspended cells from the four replicates were pooled together to form the preinoculum culture for the expression experiment.

168 Nutritive Polypeptide Expression Testing.

The OD600 of the preinoculum culture made from resuspended cells was measured using a plate reader to be roughly 20-25. A 500 mL baffled shake flask containing 50 mL of 2×Mal medium (20 g/L NaCl, 20 g/L tryptone, 10 g/L yeast extract, 75 g/L D-maltose) with 5 mg/L chloramphenicol was inoculated to OD600≈0.2 to form the inoculum culture, and incubated at 30° C. with shaking at 250 rpm for roughly 6 hours. OD600 was measured and the inoculum culture was used to inoculate the expression culture in a 2 L baffled shake flask containing 250 mL of 2×Mal medium with 5 mg/L chloramphenicol, 1× Teknova Trace Metals, and 0.01% Antifoam 204 to an OD600 of 0.1. The culture was shaken at 30° C. and 250 rpm for 18 hours, at which point the culture was harvested. For the secreted protein library constructs, the terminal cell density was measured and the supernatant was harvested by centrifugation (5000×g, 30 min, RT) and filtered using a 0.22 μm filter. For the intracellular protein library constructs, the terminal cell density was measured and the cells were harvested by centrifugation (5000×g, 30 min, RT). Cells were then lysed using a microfluidizer, and the soluble fraction was purified using a nickel affinity column if the construct library had an N-terminal His tag. Otherwise the soluble fraction was used directly for further analysis. All the samples were run on SDS-PAGE gels, separated into ten fractions, and then analyzed using LC-MS/MS as described below. In Bacillus subtilis, 40/168 proteins were successfully secreted, 10/168 proteins were successfully produced intracellularly without a His tag, and 28/168 proteins were successfully produced intracellularly with an N-terminal 8×His tag (SEQ ID NO: 3919). Based on the results, certain genes were individually cloned and tested for individual secretion in Bacillus subtilis.
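The library screening outcomes reported in these examples can be put on a common percentage scale. The short Python sketch below only restates the counts given in the text (including the 114/168 E. coli result from Example 13); the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
LIBRARY_SIZE = 168

# Counts of library members detected, as reported in the text.
DETECTED = {
    "E. coli, intracellular soluble": 114,
    "B. subtilis, secreted": 40,
    "B. subtilis, intracellular, no His tag": 10,
    "B. subtilis, intracellular, 8xHis tag": 28,
}


def percent_detected(count: int, total: int = LIBRARY_SIZE) -> float:
    """Detected fraction of the library as a percentage, rounded to one decimal."""
    return round(100.0 * count / total, 1)


SUMMARY = {label: percent_detected(n) for label, n in DETECTED.items()}
```

On this scale the four conditions correspond to roughly 67.9%, 23.8%, 6.0% and 16.7% of the 168-member library, respectively.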

Secreted Nutritive Polypeptide Plasmid Construction.

For this study, the pHT43 backbone vector with no signal peptide was modified by removing the SamyQ signal peptide to allow for the native signal peptide to guide secretion, substituting the grac promoter with the aprE promoter, removing the lacI region, and adding a 1×FLAG tag (DYKDDDDK (SEQ ID NO: 3914)) before the terminator region. The unmodified pHT43 vector from MoBiTec contains the Pgrac promoter, the SamyQ signal peptide, Amp and Cm resistance genes, a lacI region, a repA region, and the ColE1 origin of replication. To amplify the genes of interest, genomic DNA preps were made from wild-type Bacillus strains. Secreted nutritive polypeptide genes, including their native signal peptide coding regions, were PCR amplified using PCR primers with tails containing 25 bp homology regions to the pHT43 backbone and were run on a 1% agarose TAE gel to check for correct insert size. 10 μL from each PCR were pooled together into a single library of inserts, and the mix was Gibson ligated to a pHT43 backbone vector. The ligation product was transformed into 10-Beta electrocompetent cells (New England Biolabs), and transformed cells were plated at a 10⁻¹ dilution onto four LB agar plates with 100 mg/L carbenicillin. Colonies from one plate were sequenced using a forward primer that binds in the promoter region and a reverse primer that binds in the terminator region. 2 mL of LB medium with 100 mg/L carbenicillin was added to each of the remaining three plates. Cells were scraped and suspended into the LB medium, and the plasmids were extracted from the cell suspensions to form the multiplex plasmid mix to be transformed into the expression strain. The secreted polypeptide library strain construction and expression were performed as described for the 168 nutritive polypeptide library strain construction and expression testing.

Secretion Leader Peptide Library Construction.

Secretion signal peptide libraries facilitate the secretion of any given protein of interest. One approach to enhancing secretion is to fuse a library of signal peptide sequences to the protein of interest and screen for those that result in the highest level of secretion. The signal peptide library described here consists of 173 signal peptides that were identified as being associated with naturally Sec-mediated secreted proteins in Bacillus subtilis (Brockmeier et al., Molecular Biology, 2006). A signal peptide library was generated for SEQID-43136, starting with plasmid pES1207, which has SEQID-43136 fused to the signal peptide from the B. amyloliquefaciens α-amylase (SamyQ). pES1207 was used as template for a PCR reaction with primers Pfwd and Prev, which amplified the entire plasmid except for the SamyQ sequence. Pfwd and Prev possessed tails that had AarI restriction sites, and the PCR products were purified and cut with AarI. The fragment was then dephosphorylated using Antarctic phosphatase (New England Biolabs, Beverly, Mass.). DNA encoding the individual signal peptides was constructed by duplexing single stranded oligonucleotides comprising the forward and reverse strands of each signal peptide sequence. The oligonucleotides were designed such that single strand tails were formed at the 5′-ends of the duplexed molecule. These tails were complementary to the overhangs generated by the AarI digestion of the vector PCR fragment. To duplex the oligonucleotides, the direct strand and the reverse strand oligonucleotides were mixed together, phosphorylated using T4 polynucleotide kinase (New England Biolabs, Beverly, Mass.) and annealed. After annealing, signal peptide DNA sequences were mixed in equal proportion in a single tube. The library of signal peptides was ligated to the vector PCR product using T4 DNA ligase (New England Biolabs, Beverly, Mass.). The ligation products were transformed into the cloning host, E. coli Turbo (New England Biolabs), according to the manufacturer's instructions. 50-100 colonies were sequenced to determine the diversity of the leader peptide library for SEQID-00298 and SEQID-00338. The colonies on the agar plate were then suspended in LB media and harvested for plasmid purification.
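The duplexed-oligonucleotide design described above can be sketched as follows. This is an illustrative sketch only: the insert and tail sequences are hypothetical placeholders, not sequences from this disclosure, and the tails must in practice be chosen to pair with the overhangs actually left on the AarI-digested vector fragment.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string containing only A/C/G/T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def duplex_oligos(insert: str, fwd_tail: str, rev_tail: str):
    """Return (forward, reverse) oligonucleotides, each written 5'->3'.

    The two oligos base-pair across `insert`. `fwd_tail` and `rev_tail`
    remain single-stranded, forming 5' tails at opposite ends of the
    annealed duplex.
    """
    insert = insert.upper()
    forward = fwd_tail.upper() + insert
    reverse = rev_tail.upper() + reverse_complement(insert)
    return forward, reverse


# Hypothetical placeholder sequences for illustration only.
fwd, rev = duplex_oligos("ATGAAACGTTTA", fwd_tail="AATG", rev_tail="GCTT")
print("forward 5'->3':", fwd)
print("reverse 5'->3':", rev)
```

A quick consistency check is that the forward oligo without its tail equals the reverse complement of the reverse oligo without its tail, which confirms that only the tails are left unpaired.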

Secretion Leader Peptide Library Strain Construction.

B. subtilis strain WB800N (MoBiTec, Göttingen, Germany) was used as the expression host. Approximately 10 μg of the signal peptide library of a particular protein construct was transformed into WB800N and single colonies were selected at 37° C. by plating on LB agar containing 5.0 μg/ml chloramphenicol (Cm5). The leader peptide library screening for SEQID-00105, SEQID-00352, SEQID-00341, and SEQID-00103 was carried out in a B. subtilis WB800N derivative that had mutations in the sigF sporulation factor and in which the intracellular serine protease (ispA) was disrupted with an antibiotic marker. The strain also had an inducible comK (the competence initiation transcription factor) integrated into the chromosome for higher transformation efficiency.

Secretion Leader Peptide Library Expression Screening.

400-500 individual transformants of the B. subtilis signal peptide library were used to inoculate individual, 1-ml cultures of 2×-MAL medium (20 g/l NaCl, 20 g/l tryptone, and 10 g/l yeast extract, 75 g/l maltose) with Cm5, in deep well blocks (96-square wells). In addition to the signal peptide library strains, a strain containing plasmid with the protein of interest and the SamyQ leader peptide was inoculated as a control. Culture blocks were covered with porous adhesive plate seals and incubated overnight in a micro-expression chamber (Glas-Col, Terre Haute, Ind.) at 37° C. and 800 rpm. Overnight cultures were used to inoculate fresh, 2×-MAL, Cm5 cultures, in deep well blocks, to a starting OD600=0.15.

Expression cultures were incubated at 37° C., 880 rpm until the OD600=1.0 (approx. 4 hrs), at which time they were induced by adding isopropyl β-D-1-thiogalactopyranoside (IPTG) at a final concentration of 1 mM and continuing incubation for 4 hrs. After 4 hrs, the cell density of each culture was measured (OD600) and cells were harvested by centrifugation (3000 rpm, 10 min, RT). After centrifugation, the culture supernatant was carefully removed and transferred to a new block and cell pellets were frozen at −80° C. To determine the levels of secreted protein, 0.5-ml aliquots of the culture supernatants were filtered first through a 0.45-μm filter followed by a 0.22 μm filter. The filtrates were then assayed by Chip electrophoresis, as described herein, to determine the levels of secreted protein of interest (POI), which were compared with the level of secretion of the base construct.

Diluted overnight cultures were used as inoculum for LB broth cultures containing Cm5. These cultures were grown at 37° C. until they reached log phase. Aliquots of these cultures were mixed with glycerol (20% final concentration) and frozen at −80° C. The top 10-15 hits were then purified using Instagene matrix (Biorad, USA), amplified around the signal peptide, and sent for sequencing to identify the signal peptide sequence.

TABLE E14A. Exemplary results of B. subtilis leader peptide library screening. All numeric values are in mg/l/OD.

SEQ ID NO | Gene Name | Protein Sequence | SEQID-00298 | SEQID-00338 | SEQID-00105 | SEQID-00352 | SEQID-00341 | SEQID-00103 | Other SEQ IDs | Other SEQ ID
3921 | abnA | MKKKKTWKRFLHFSSAALAAGLIFTSAAPAEA | 4.4 | 9.5 | 135.7 | ~ | 148.14 | 12.1 | 3.1 | SEQID-00405
3922 | bglC | MKRSISIFITCLLITLLTMGGMIASPASA | 7.8 | ~ | 73.6 | 2.7 | 33.9 | 12.1 | ~ |
3923 | bpr | MRKKTKNRLISSVLSTVVISSLLFPGAAGA | ~ | ~ | ~ | ~ | 40.4 | ~ | ~ |
3924 | glpQ | MRKNRILALFVLSLGLLSFMVTPVSA | ~ | 5.7 | ~ | ~ | 41.9 | ~ | 5.5 | SEQID-00398
3925 | lipA | MKFVKRRIIALVTILMLSVTSLFALQPSAKAA | 14.1 | ~ | ~ | 10.9 | ~ | 16.1 | ~ |
3926 | lytB | MKSCKQLIVCSLAAILLLIPSVSFA | ~ | 3.4 | ~ | ~ | ~ | ~ | ~ |
3927 | lytF | MKKKLAAGLTASAIVGTTLVVTPAEA | ~ | ~ | ~ | ~ | ~ | 14.8 | ~ |
3928 | mpr | MKLVPRFRKQWFAYLTVLCLALAAAVSFGVPAKA | 10.8 | ~ | ~ | ~ | ~ | ~ | ~ |
3929 | nprB | MRNLTKTSLLLAGLCTAAQMVFVTHASA | ~ | ~ | ~ | ~ | 37.5 | ~ | ~ |
3930 | pelB | MKRLCLWFTVFSLFLVLLPGKALG | ~ | ~ | ~ | ~ | 64.4 | ~ | ~ |
3931 | penP | MKLKTKASIKFGICVGLLCLSITGFTPFFNSTHAEA | ~ | ~ | ~ | 2.8 | ~ | ~ | ~ |
3932 | phoB | MKKFPKKLLPIAVLSSIAFSSLASGSVPEASA | 7.9 | ~ | ~ | ~ | ~ | ~ | ~ |
3933 | wapA | MKKRKRRNFKRFIAAFLVLALMISLVPADVLA | ~ | 7 | ~ | 2.14 | ~ | ~ | ~ |
3934 | xynA | MFKFKKNFLVGLSAALNSISLFSATASA | 4.3 | 9.8 | 114.4 | ~ | 83.3 | 13.4 | ~ |
3935 | ybfO | MKRMIVRMTLPLLIVCLAFSSFSASARA | ~ | ~ | ~ | 0.69 | ~ | ~ | ~ |
3936 | yckD | MKRITINIITMFIAAAVISLTGTAEA | ~ | ~ | ~ | ~ | 61.7 | ~ | ~ |
3937 | yddT | MRKKRVITCVMAASLTLGSLLPAGYASA | ~ | ~ | 135.6 | ~ | ~ | ~ | ~ |
3938 | yfhK | MKKKQVMLALTAAAGLGLTALHSAPAAKA | ~ | ~ | ~ | ~ | ~ | 20.8 | ~ |
3939 | yfjS | MKWMCSICCAAVLLAGGAAQA | ~ | ~ | ~ | ~ | ~ | 9.4 | ~ |
3940 | yjcM | MKKELLASLVLCLSLSPLVSTNEVFA | ~ | ~ | 83.4 | ~ | 80.2 | ~ | ~ |
3941 | yjdB | MNFKKTVVSALSISALALSVSGVASA | ~ | ~ | 152.7 | ~ | ~ | 14.3 | ~ |
3942 | yjfA | MKRLFMKASLVLFAVVFVFAVKGAPAKA | ~ | ~ | ~ | ~ | ~ | 20 | ~ |
3943 | ykoJ | MLKKKWMVGLLAGCLAAGGFSYNAFA | ~ | ~ | ~ | 2.23 | 132.5 | ~ | ~ |
3944 | ylqB | MKKIGLLFMLCLAALFTIGFPAQQADA | ~ | ~ | ~ | ~ | ~ | 13.3 | ~ |
3945 | yndA | MRFTKVVGFLSVLGLAAVFPLTAQA | ~ | ~ | 124.3 | ~ | ~ | ~ | ~ |
3946 | yqgA | MKQGKFSVFLILLLMLTLVVAPKGKAEA | 3.9 | ~ | ~ | ~ | ~ | ~ | ~ |
3947 | yraJ | MTLTKLKMLSMLTVMIASLFIFSSQALA | ~ | 6.3 | ~ | ~ | ~ | ~ | ~ |
3948 | yuaB | MKRKLLSSLAISALSLGLLVSAPTASFAAE | ~ | ~ | ~ | ~ | ~ | ~ | 2.5 | SEQID-00404
3949 | yurI | MTKKAWFLPLVCVLLISGWLAPAASASA | ~ | ~ | ~ | 2.26 | ~ | ~ | ~ |
3950 | yvcE | MRKSLITLGLASVIGTSSFLIPFTSKTASA | ~ | ~ | 124.4 | ~ | ~ | 14.6 | ~ |
3951 | yvgO | MKRIRIPMTLALGAALTIAPLSFASA | ~ | ~ | ~ | ~ | ~ | 17.7 | ~ |
3952 | yvnB | MRKYTVIASILLSFLSVLSGG | ~ | ~ | ~ | ~ | ~ | 26.2 | ~ |
3953 | ywaD | MKKLLTVMTMAVLTAGTLLLPAQSVTPAAHA | ~ | ~ | ~ | ~ | ~ | ~ | 3.4 | SEQID-00403
3954 | ywsB | MNKPTKLFSTLALAAGMTAAAAGGAGTIHA | ~ | ~ | 131 | ~ | ~ | ~ | ~ |
3954 | yxaK | MVKSFRMKALIAGAAVAAAVSAGAVSDVPAAKVLQPTAAYA | ~ | ~ | ~ | 3.7 | ~ | ~ | ~ |
3956 | yxiT | MKWNNMLKAAGIAVLLFSVFAYAAPSLKAVQA | ~ | 7.9 | ~ | 0.95 | 41.5 | ~ | ~ |
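As an illustration of how screening results of this kind can be ranked, the sketch below normalizes a titer by cell density and sorts leader peptides by the resulting mg/l/OD value. The numbers are the entries reported for SEQID-00105 in Table E14A; the function names and the ranking step itself are assumptions made for illustration and are not described in the source as the selection procedure.

```python
def specific_titer(titer_mg_per_l: float, od600: float) -> float:
    """Normalize a secreted-protein titer by cell density (mg/l/OD)."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return titer_mg_per_l / od600


def top_hits(values: dict, n: int) -> list:
    """Return the n (leader, mg/l/OD) pairs with the highest values."""
    return sorted(values.items(), key=lambda kv: kv[1], reverse=True)[:n]


# mg/l/OD values reported for SEQID-00105 in Table E14A.
seqid_00105 = {
    "abnA": 135.7, "bglC": 73.6, "xynA": 114.4, "yddT": 135.6,
    "yjcM": 83.4, "yjdB": 152.7, "yndA": 124.3, "yvcE": 124.4,
    "ywsB": 131.0,
}

for leader, value in top_hits(seqid_00105, 3):
    print(f"{leader}: {value} mg/l/OD")
```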

Example 15. Expression of Nutritive Polypeptides in Aspergillus niger Fungi

Gene Synthesis & Plasmid Construction. Genes encoding natively secreted proteins were PCR amplified from the Aspergillus niger ATCC 64974 using primers designed from the genome sequence of Aspergillus niger CBS 513.88. In one example, genes included native 5′ secretion sequences and were cloned into the expression vector pAN56-1 (Genbank: Z32700.1) directly under the control of the gpdA promoter from Aspergillus nidulans with the addition of a C-terminal 3× FLAG tag (DYKDHDGDYKDHDIDYKDDDDK (SEQ ID NO: 3915)). In another example genes included only the mature peptide with the addition of a heterologous 5′ secretion signal. Plasmids were constructed using the Gibson Assembly Kit (New England Biolabs, Beverly, Mass.). Recombinant plasmids were sequence verified before transformation into Aspergillus hosts.

pFGLAHIL6T was obtained from the BCCM/LMBP (Ghent, Belgium). Plasmids were constructed using the Gibson Assembly Kit (New England Biolabs, Beverly, Mass.). Recombinant plasmids were sequence verified before transformation into Aspergillus hosts.

The pyrA nutritional marker was PCR amplified from Aspergillus niger ATCC 64974 using primers designed from the genome sequence of Aspergillus niger ATCC 1015. The pyrA PCR fragment was digested with XbaI and ligated into an XbaI fragment of pCSN44 (Staben et al., 1989) to construct pES1947. pCSN44 was obtained from the BCCM/LMBP (Ghent, Belgium). The recombinant plasmid was sequence verified before transformation into Aspergillus hosts.

Strain Construction.

A protease deficient derivative of Aspergillus niger ATCC 62590 was used as the expression host. Expression vectors were co-transformed with pES1947 using the protoplast method as described in Punt et al., 1992, Methods in Enzymology, 216, 447-457. Approximately 5 μg of each plasmid was transformed into Aspergillus niger protoplasts. Transformants were selected on minimal media supplemented with 1.2 M sorbitol and 1.5% bacto agar (10 g/l glucose, 4 g/l sodium nitrate, 20 ml/l salts solution (containing 26.2 g/l potassium chloride and 74.8 g/l Potassium phosphate monobasic at pH 5.5), 1 ml/l vitamin solution (containing 100 mg/l Pyridoxine hydrochloride, 150 mg/l Thiamine hydrochloride, 750 mg/l 4-Aminobenzoic acid, 2.5 g/l Nicotinic acid, 2.5 g/l riboflavin, 20 g/l choline chloride, and 30 mg/l biotin), and 1 ml/l of metals solution (containing 20 g/l Zinc sulfate heptahydrate (ZnSO4-7H2O), 11 g/l Boric acid (H3BO3), 5 g/l Manganese (II) chloride tetrahydrate (MnCl2-4H2O), 5 g/l Iron (II) sulfate heptahydrate (FeSO4-7H2O), 1.7 g/l Cobalt(II) chloride hexahydrate (CoCl2-6H2O), 1.6 g/l Copper(II) sulfate pentahydrate (CuSO4-5H2O), 1.5 g/l Sodium molybdate dihydrate (Na2MoO4-2H2O), and 5.0 g/l EDTA disodium salt dihydrate (Na2EDTA-2H2O) at pH 6.5)). Individual transformants were isolated on minimal media plates and allowed to grow at 30° C. until they sporulated. Spores were harvested in water and stored at 4° C.

Expression Testing. Spore stocks of Aspergillus niger strains were inoculated at 10^6 spores/mL into 2 mL of CM (MM plus 5.0 g/l yeast extract, 2.0 g/l casamino acids) adjusted to pH 7 with 40 mM MES and supplemented with SigmaFast Protease Inhibitor Cocktail (1 tab/100 mL, Sigma Aldrich) in 24 well square bottom deep well blocks. Culture blocks were covered with porous adhesive plate seals and incubated for 48 hrs in a micro-expression chamber (Glas-Col, Terre Haute, Ind.) at 30° C. at 600 rpm. After the growth period, 0.5-ml aliquots of the culture supernatants were filtered first through a 25 μm/0.45-μm dual stage filter followed by a 0.22 μm filter. The filtrates were then assayed to determine the levels of secreted protein of interest (POI).

TABLE E15A. Exemplary results of Aspergillus leader peptide library screening.

SEQ ID NO | Signal peptide name (gene name_species name) | Signal sequence | Genotype
3957 | native signal peptide | MRWLLTSSALLVPAAA | PgpdA-native signal peptide-SEQID-00409-3XFlag
3958 | AXHA_ASPNG | MKFLKAKGSLLSSGIYLIALAPFVNA | PgpdA-AXHA_ASPNG-SEQID-00409-3XFlag
3959 | PPIB_ASPNG | MNFKNIFLSFFFVLAVGLALVHA | PgpdA-PPIB_ASPNG-SEQID-00409-3XFlag
3960 | FAEA_ASPNG | MKQFSAKYALILLATAGQALA | PgpdA-FAEA_ASPNG-SEQID-00409-3XFlag
3961 | BGAL_ASPNG | MKLSSACAIALLAAQAAGA | PgpdA-BGAL_ASPNG-SEQID-00409-3XFlag
3962 | PLYA_ASPNG | MKYSTIFSAAAAVFAGSAAA | PgpdA-PLYA_ASPNG-SEQID-00409-3XFlag
3963 | PRTA_ASPNG | MKFSTILTGSLFATAALA | PgpdA-PRTA_ASPNG-SEQID-00409-3XFlag
3964 | AGALC_ASPNG | MIGSSHAVVALGLFTLYGHSAA | PgpdA-AGALC_ASPNG-SEQID-00409-3XFlag
3965 | PHYB_ASPNG | MPRTSLLTLACALATGASA | PgpdA-PHYB_ASPNG-SEQID-00409-3XFlag
3966 | AORSN_ASPOR | MRPLSHLSFFNGLLLGLSALSA | PgpdA-AORSN_ASPOR-SEQID-00409-3XFlag
3967 | DPP5_ASPOR | MGALRWLSIAATASTALA | PgpdA-DPP5_ASPOR-SEQID-00409-3XFlag
3968 | PHYA_ASPNG | MGVSAVLLPLYLLSGVTSGLAVP | PgpdA-PHYA_ASPNG-SEQID-00409-3XFlag
3969 | EXG_ASPOR | MLPLLLCIVPYCWS | PgpdA-EXG_ASPOR-SEQID-00409-3XFlag
3970 | native signal peptide | MHFLQNAVVAATMGAALA | PgpdA-native signal peptide-SEQID-00420-3XFlag
3971 | PPA1_ASPNG | MKGTAASALLIALSATAAQA | PgpdA-PPA1_ASPNG-SEQID-00420-3XFlag
3972 | PEPC_ASPNG | MKGILGLSLLPLLTAA | PgpdA-PEPC_ASPNG-SEQID-00420-3XFlag
3973 | PRTA_ASPNG | MKFSTILTGSLFATAALA | PgpdA-PRTA_ASPNG-SEQID-00420-3XFlag
3974 | AGALC_ASPNG | MIGSSHAVVALGLFTLYGHSAA | PgpdA-AGALC_ASPNG-SEQID-00420-3XFlag
3975 | RNT1_ASPOR | MMYSKLLTLTTLLLPTALALPSLVER | PgpdA-RNT1_ASPOR-SEQID-00420-3XFlag
3976 | PGLR1_ASPNG | MHSYQLLGLAAVGSLVSA | PgpdA-PGLR1_ASPNG-SEQID-00420-3XFlag
3977 | ORYZ_ASPOR | MQSIKRTLLLLGAILPAVLGA | PgpdA-ORYZ_ASPOR-SEQID-00420-3XFlag
3978 | PLYB_ASPNG | MHYKLLFAAAAASLASAVSA | PgpdA-PLYB_ASPNG-SEQID-00420-3XFlag
3979 | NUS1_ASPOR | MPRLLPISAATLALAQLTYG | PgpdA-NUS1_ASPOR-SEQID-00420-3XFlag
3980 | PHYB_ASPNG | MPRTSLLTLACALATGASA | PgpdA-PHYB_ASPNG-SEQID-00420-3XFlag
3981 | TAN_ASPOR | MRQHSRMAVAALAAGANA | PgpdA-TAN_ASPOR-SEQID-00420-3XFlag
3982 | PDI_ASPNG | MRSFAPWLVSLLGASAVVAA | PgpdA-PDI_ASPNG-SEQID-00420-3XFlag
3983 | XYN2_ASPNG | MLTKNLLLCFAAAKAALA | PgpdA-XYN2_ASPNG-SEQID-00420-3XFlag
3984 | PHYA_ASPOR | MAVLSVLLPITFLLSSVTG | PgpdA-PHYA_ASPOR-SEQID-00420-3XFlag
3985 | DPP5_ASPOR | MGALRWLSIAATASTALA | PgpdA-DPP5_ASPOR-SEQID-00420-3XFlag
3986 | PHYA_ASPNG | MGVSAVLLPLYLLSGVTSGLAVP | PgpdA-PHYA_ASPNG-SEQID-00420-3XFlag
3987 | PEPA_ASPNG | MVVFSKTAALVLGLSTAVSA | PgpdA-PEPA_ASPNG-SEQID-00420-3XFlag
3988 | AGLU_ASPNG | MVKLTHLLARAWLVPLAYGASQSLL | PgpdA-AGLU_ASPNG-SEQID-00420-3XFlag
3989 | ABFA_ASPNG | MVAFSALSGVSAVSLLLSLVQNAHG | PgpdA-ABFA_ASPNG-SEQID-00420-3XFlag
3990 | AMYG_ASPNG | MSFRSLLALSGLVCTGLA | PgpdA-AMYG_ASPNG-SEQID-00420-3XFlag
3991 | PHOA_ASPNG | MFTKQSLVTLLGGLSLAVA | PgpdA-PHOA_ASPNG-SEQID-00420-3XFlag
3992 | ABFB_ASPNG | MFSRRNLVALGLAATVSA | PgpdA-ABFB_ASPNG-SEQID-00420-3XFlag

Aspergillus niger Signal Peptide Library Construction.

It is difficult to predict which secretion signal peptide will facilitate the secretion of any given protein of interest best. Therefore, one approach to optimizing secretion is to fuse a library of signal peptide sequences to the protein and screen for those that result in the highest level of secretion. We constructed a signal peptide library for SEQID-00409 and SEQID-00420. Table E15A shows the signal peptides that were fused with SEQID-00409 and SEQID-00420. DNA encoding the individual signal peptides was constructed by duplexing single stranded oligonucleotides comprising the forward- and reverse-strands of each signal peptide sequence. The oligonucleotides were designed such that single strand tails were formed at the 5′-ends of the duplexed molecule. Genes encoding natively secreted proteins SEQID-00409 and SEQID-00420 were PCR amplified from the Aspergillus niger ATCC 64974 using primers designed from the genome sequence of Aspergillus niger CBS 513.88. Genes included native 5′ secretion sequences and were cloned into the expression vector pAN56-1 (Genbank: Z32700.1) directly under the control of the gpdA promoter from Aspergillus nidulans with the addition of a C-terminal 3× FLAG tag (DYKDHDGDYKDHDIDYKDDDDK (SEQ ID NO: 3915)). Then the vectors were amplified without the native signal peptide and plasmids with different signal peptides were reconstructed with the duplex signal peptide sequences using the Gibson Assembly Kit (New England Biolabs, Beverly, Mass.). Recombinant plasmids were sequence verified before transformation into Aspergillus hosts.

The pyrA nutritional marker was PCR amplified from Aspergillus niger ATCC 64974 using primers designed from the genome sequence of Aspergillus niger ATCC 1015. The pyrA PCR fragment was digested with XbaI and ligated into an XbaI fragment of pCSN44 (Staben et al., 1989) to construct pES1947. pCSN44 was obtained from the BCCM/LMBP (Ghent, Belgium). The recombinant plasmid was sequence verified before transformation into Aspergillus hosts.

Aspergillus niger signal peptide library strain construction. A protease deficient derivative of Aspergillus niger ATCC 62590 was used as the expression host. Each signal peptide-gene combination vector was individually co-transformed with a plasmid containing the nutritional marker pyrG using the protoplast method as described in Punt et al., 1992. Approximately 5 μg of each plasmid was transformed into Aspergillus niger protoplasts. Transformants were selected on minimal media supplemented with 1.2 M sorbitol and 1.5% bacto agar (10 g/l glucose, 4 g/l sodium nitrate, 20 ml/l salts solution (containing 26.2 g/l potassium chloride and 74.8 g/l Potassium phosphate monobasic at pH 5.5), 1 ml/l vitamin solution (containing 100 mg/l Pyridoxine hydrochloride, 150 mg/l Thiamine hydrochloride, 750 mg/l 4-Aminobenzoic acid, 2.5 g/l Nicotinic acid, 2.5 g/l riboflavin, 20 g/l choline chloride, and 30 mg/l biotin), and 1 ml/l of metals solution (containing 20 g/l Zinc sulfate heptahydrate (ZnSO4-7H2O), 11 g/l Boric acid (H3BO3), 5 g/l Manganese (II) chloride tetrahydrate (MnCl2-4H2O), 5 g/l Iron (II) sulfate heptahydrate (FeSO4-7H2O), 1.7 g/l Cobalt(II) chloride hexahydrate (CoCl2-6H2O), 1.6 g/l Copper(II) sulfate pentahydrate (CuSO4-5H2O), 1.5 g/l Sodium molybdate dihydrate (Na2MoO4-2H2O), and 5.0 g/l EDTA disodium salt dihydrate (Na2EDTA-2H2O) at pH 6.5)). Individual transformants were isolated on minimal media plates and allowed to grow at 30° C. until they sporulated.

Aspergillus niger Signal Peptide Library Expression Testing.

Six different primary transformants from each construct were inoculated into 1 ml of minimal media as defined above (supplemented with 5.0 g/l yeast extract and 2.0 g/l casamino acids) adjusted to pH 7 with 160 mM MES in 96 deep well culture blocks. Culture blocks were covered with porous adhesive plate seals and incubated for 48 hrs in a micro-expression chamber (Glas-Col, Terre Haute, Ind.) at 33° C. at 800 rpm. After the growth period, 0.5-ml aliquots of the culture supernatants were filtered first through a 25 μm/0.45-μm dual stage filter followed by a 0.22 μm filter. The filtered supernatants were then analyzed using Chip Electrophoresis, or anti-FLAG dot blot and SDS-PAGE, as described below. Based on these results, the primary transformants that demonstrated higher secretion than the native signal peptide were isolated on minimal media plates and allowed to grow at 30° C. until they sporulated.

Spore stocks of the above Aspergillus niger strains, along with the control Aspergillus niger strains that contain the expression construct with the pgpdA promoter and native signal peptide of SEQID-00409 and SEQID-00424, were inoculated at 10^6 spores/mL into 10 mL of minimal media as defined above (supplemented with 5.0 g/l yeast extract and 2.0 g/l casamino acids) adjusted to pH 7 with 160 mM MES in a 125 ml plastic flask. Aspergillus spores were then grown at 30° C. with 150 RPM for two days. After the growth period, aliquots of the culture supernatants were filtered first through a 25 μm/0.45-μm dual stage filter followed by a 0.22 μm filter. The filtrates were then analyzed using SDS-PAGE as described herein.

FIG. 5 demonstrates the secretion of SEQID-00409 (left) and SEQID-00420 (right) with a new signal peptide compared to the native signal peptide.

Aspergillus niger Heterologous Nutritive Polypeptide Gene Synthesis & Plasmid Construction.

Genes encoding nutritive polypeptides were synthesized (Geneart, Life Technologies). Genes were codon optimized for expression in Aspergillus niger. Synthesized genes were PCR amplified and cloned into the expression vector pAN56-1 (Genbank: Z32700.1) fused to glucoamylase with its native leader sequence under the control of the gpdA promoter from Aspergillus nidulans with the addition of a C-terminal 3× FLAG tag (DYKDHDGDYKDHDIDYKDDDDK (SEQ ID NO: 3915)) and Kexin protease site (NVISKR (SEQ ID NO: 3993)) between glucoamylase gene and gene of interest. Plasmids were constructed using the Gibson Assembly Kit (New England Biolabs, Beverly, Mass.). Recombinant plasmids were sequence verified before transformation into Aspergillus hosts.

SEQID-00087, SEQID-00103, SEQID-00105, SEQID-00115, SEQID-00218, SEQID-00298, SEQID-00302, SEQID-00341, SEQID-00352, SEQID-00354 genes were utilized.

Aspergillus niger heterologous nutritive polypeptide strain construction. A protease deficient derivative of Aspergillus niger D15#26 (E. Karnaukhova et al, Microbial Cell Factories, 6:34) was used as the expression host. 10 μg of each expression vector was co-transformed with 1 μg of plasmid containing the pyrG selection marker using the protoplast method described in Punt et al., 1992, Methods in Enzymology, 216, 447-457. Transformants were selected on minimal media containing 10 g/l glucose, 6 g/l sodium nitrate, 20 ml/l salts solution (containing 26 g/l potassium chloride and 76 g/l Potassium phosphate monobasic at pH 5.5), 2 mM magnesium sulphate, 1 ml/l vitamin solution (containing 100 mg/l Pyridoxine hydrochloride, 150 mg/l Thiamine hydrochloride, 750 mg/l 4-Aminobenzoic acid, 2.5 g/l Nicotinic acid, 2.5 g/l riboflavin, 20 g/l choline chloride, and 30 mg/l biotin), and 1 ml/l of metals solution (containing 20 g/l Zinc sulfate heptahydrate (ZnSO4-7H2O), 11 g/l Boric acid (H3BO3), 5 g/l Manganese (II) chloride tetrahydrate (MnCl2-4H2O), 5 g/l Iron (II) sulfate heptahydrate (FeSO4-7H2O), 1.7 g/l Cobalt(II) chloride hexahydrate (CoCl2-6H2O), 1.6 g/l Copper(II) sulfate pentahydrate (CuSO4-5H2O), 1.5 g/l Sodium molybdate dihydrate (Na2MoO4-2H2O), and 5.0 g/l EDTA disodium salt dihydrate (Na2EDTA-2H2O) at pH 6.5) and supplemented with 1.2 M sorbitol and 1.5% bacto agar. Individual transformants were isolated on minimal media plates and allowed to grow at 30° C. until they sporulated. Spores were harvested in water and stored at 4° C.

Aspergillus niger Heterologous Nutritive Polypeptide Expression Testing.

90 different primary transformants from each construct were inoculated into 1 ml of minimal media as defined above supplemented with 1 g/L casamino acids in 96 deep well culture blocks (1st MTP). Culture blocks were covered with porous adhesive plate seals and incubated for 72 hrs in a micro-expression chamber (Glas-Col, Terre Haute, Ind.) at 33° C. at 800 rpm. After the growth period, 0.5-ml aliquots of the culture supernatants were filtered first through a 25 μm/0.45-μm dual stage filter followed by a 0.22 μm filter. The filtrates were then assayed using an anti-FLAG ELISA method, as described herein, to determine the levels of secreted protein of interest (POI). At least five colonies from nine expression constructs excluding SEQID-00302 yielded positive signals in an anti-flag ELISA as reported in Table E15B. At least 5 primary transformants that showed positive signals in anti-FLAG ELISA assay from each of the nine expression strains were also streaked onto a fresh minimal media agar plate for single spore purification and retested for confirmation (2nd MTP).

Spores were harvested from the plate and inoculation was done with fresh spore crops with a density of approximately 1E9 spores/ml. 10 μl of these spore crops were added to 10 ml of minimal medium, giving a start density of 1E6 spores/ml. Aspergillus spores were then grown at 33° C. with 150 RPM for three days. After the growth period, aliquots of the culture supernatants were filtered first through a 25 μm/0.45-μm dual stage filter followed by a 0.22 μm filter. The filtrates were then analyzed using anti-FLAG ELISA, anti-FLAG western blot and SDS-PAGE as described below.
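The stated start density follows directly from the dilution. A minimal sketch of the arithmetic, with a function name chosen here for illustration:

```python
def final_density(stock_per_ml: float, stock_volume_ml: float,
                  medium_volume_ml: float) -> float:
    """Spore density after adding a small stock volume to fresh medium."""
    total_volume_ml = stock_volume_ml + medium_volume_ml
    return stock_per_ml * stock_volume_ml / total_volume_ml


# 10 ul (0.010 ml) of a ~1E9 spores/ml crop added to 10 ml of medium.
density = final_density(1e9, 0.010, 10.0)
print(f"start density: {density:.3g} spores/ml")
```

The result is within about 0.1% of 1E6 spores/ml; the small difference comes from counting the 10 μl of added stock in the total volume.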

Certain clones from different expression constructs were grown using fresh spore crops with a final density of approximately 1E6 spores/ml in one litre of minimal media. Aspergillus spores were then grown at 33° C. with 150 RPM for three days. After the growth period, aliquots of the culture supernatants were filtered first through a 25 μm/0.45-μm dual stage filter followed by a 0.22 μm filter. The filtrates were then analyzed using anti-FLAG ELISA, anti-FLAG western blot and SDS-PAGE as described below. Any secreted protein above 39 mg/l in an anti-FLAG ELISA from a one litre shake flask was detected by anti-FLAG western blot and SDS-PAGE.

TABLE E15B. Anti-FLAG ELISA results and anti-FLAG western blot results for secretion of different proteins in Aspergillus niger. Blank cells have no value in the source.

Seq ID # | ELISA 1st MTP (mg/l) | ELISA 2nd MTP (mg/l) | ELISA 10 ml shake flask (mg/l) | Western 10 ml shake flask | ELISA 1 litre shake flask (mg/l)
SEQID-00103 | 11.7-42.2 | 3.0-29.1 | 1.6-4.0 | ++ | 1.2-5.6
SEQID-00105 | 1.4-20.2 | 0-2.1 | 0-2.4 | + | 0.2-1.6
SEQID-00298 | 66.0-215.4 | 5.2-119.5 | 1.3-79.4 | +++ | 93.9-224.3
SEQID-00897 | 11.9-21.4 | 3.6-17.5 | 0.6-2.2 | + |
SEQID-00115 | 16.3-96.2 | 0-4.9 | 0.7-3.7 | |
SEQID-00218 | 28.2-69.3 | 0.4-191.7 | 2.1-13.3 | + |
SEQID-00341 | 10.4-25.3 | 1.9-20.5 | 3.0-8.3 | +++ | 1.0-3.6
SEQID-00352 | 27.1-207.6 | 0-16.2 | 0.7-245.70 | +++ | 39.0-124.3
SEQID-00354 | 9.1-120.4 | 0-47.2 | 0.9-7.8 | +++ | 0-1.2

Example 16. Expression of Nutritive Polypeptides in Cultured Mammalian Cells

Gene Synthesis and Plasmid Construction. All genes were made synthetically (GeneArt, Life Technologies) and optimized for expression in Homo sapiens. Genes encoding SEQID-00001, SEQID-00103, SEQID-00105, SEQID-00298 and SEQID-00363 were chosen for secretion in mammalian cells. All the genes were fused with a 5′ signal peptide sequence from Ig-kappa protein (METDTLLLWVLLLWVPGSTGD (SEQ ID NO: 3994)) and an HHHHHHHH tag (SEQ ID NO: 3919) at the 3′ end. All the gene constructs were cloned into the pcDNA 3.1 (+) vector (Life Technologies) downstream of the pCMV promoter, between the NheI and BamHI sites of the multiple cloning site. All gene sequences contained the GCC sequence upstream of the start codon to generate the GCCATGG Kozak sequence. All plasmids were transformed into E. coli and sequence verified, and 10 mg of each plasmid was purified for transfection into mammalian cells.

Strain Construction.

The cell lines selected for experimentation were two transient lines from Invitrogen, FreeStyle™ CHO-S Cells (PN 51-4448) and FreeStyle™ 293F Cells (PN 51-0029). Cell lines were received from Invitrogen and stored in liquid nitrogen until sub-culturing was initiated. For sub-culturing prior to transfection, cells were thawed into specific media: CHO-S cells were thawed into 30 mL of FreeStyle™ CHO Expression Media (Invitrogen PN 12651-014) supplemented with 8 mM Glutamax (Invitrogen PN 35050-061) and 1×HT (Invitrogen PN 11067-030). 293F cells were thawed into 30 mL FreeStyle™ 293 Expression Medium (Invitrogen PN 12338-018). Cells were allowed to recover in suspension for 72 hrs under 80% humidity, 8% carbon dioxide, 36.5° C., shaking at 110 rotations per minute. Cells were passed from viable cell densities (VCD) not exceeding 2.0×10^6 cells/mL to 0.2×10^6 cells/mL. Sub-culturing continued for 5 passages prior to transfection.

At 26 hrs prior to transfection the cultures were passed back to 0.6×10^6 viable cells/mL in a volume of 60 mL. Each nutritive polypeptide was transfected in duplicate, with 2 mock transfections performed as controls for each cell line. On the day of transfection cells were counted and determined to be at 1.1×10^6 cells/mL with greater than 98% viability.

Transfection Procedure.

Preparation of DNA-lipid complexes for 120 mL total volume (60 mL/250 mL shake flask) transfections.

The following procedure was performed in a laminar flow hood. 150 μg of plasmid DNA was diluted into OptiPRO™ SFM Reduced Serum Medium (Invitrogen PN 12307-050) to a total volume of 2.4 mL. This solution was mixed gently and sterilized through a 0.2 μm filter. 150 μL of FreeStyle™ MAX Reagent (Invitrogen PN 16447-100) was diluted in OptiPRO™ SFM Reduced Serum Medium to a total volume of 2.4 mL, mixed gently and incubated for 5 minutes at room temperature. After the 5-minute incubation, the 2.4 mL of diluted DNA was added to the 2.4 mL of the diluted reagent, gently mixed and incubated for 20 min at room temperature to form the DNA-lipid complex. After the 20-minute incubation, 2.4 mL of complex was added to each duplicate flask. Control flasks received 2.4 mL of OptiPRO™ SFM Reduced Serum Medium. Transfected flasks were placed back in the incubated shaker under 80% humidity, 8% carbon dioxide, 36.5° C., shaking at 110 rotations per minute. Cultures were monitored for % viability and VCD. All flasks were supplemented on Day 3 with a 20% Phytone Peptone Feed made in the media specific to each cell line. The final concentration of Phytone Peptone in the flask equaled 2%. Cultures were harvested on Day 4 via centrifugation and 0.2 μm filtration of the supernatant. Supernatants were run on a non-reducing SDS-PAGE 12% Bis-Tris gel to confirm expression of nutritive polypeptides at the expected molecular weights. SEQID-00001, SEQID-00103, SEQID-00105, and SEQID-00298 were confirmed as being expressed from 293F cells, but no visible bands could be detected in any of the CHO-S cultures. SEQID-00363 could not be visualized on the gel from either 293F or CHO-S cultures. Supernatants from 293F cultures for SEQID-00001, SEQID-00103, SEQID-00105, SEQID-00298, and SEQID-00363, as well as CHO-S supernatants for SEQID-00103 and SEQID-00105, were purified via IMAC. SEQID-00103, SEQID-00105, and SEQID-00298 were scaled up to 190 mL transfection volume in a 1 L shake flask using 293F cells. The transfection procedure was also scaled accordingly. 4×190 mL cultures for both SEQID-00103 and SEQID-00105 and 2×190 mL cultures for SEQID-00298 were run. On Day 2 these cultures were fed with the 20% Phytone Peptone feed to a final concentration of 2% in culture and were harvested on Day 5.
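The scale-up from the 120 mL reference transfection to 190 mL cultures can be sketched as below. The source states only that the procedure was "scaled accordingly"; linear scaling with culture volume is an assumption here, as is the mass-balance treatment of the feed (the feed volume is counted in the final volume). Function names are illustrative.

```python
REFERENCE_CULTURE_ML = 120.0   # two 60 mL flasks per construct
REFERENCE_DNA_UG = 150.0       # plasmid DNA for the reference volume
REFERENCE_REAGENT_UL = 150.0   # FreeStyle MAX Reagent for the reference volume
REFERENCE_COMPLEX_ML = 4.8     # 2.4 mL diluted DNA + 2.4 mL diluted reagent


def scale_transfection(culture_ml: float) -> dict:
    """Scale reference amounts linearly with culture volume (assumed)."""
    return {
        "dna_ug": REFERENCE_DNA_UG * culture_ml / REFERENCE_CULTURE_ML,
        "reagent_ul": REFERENCE_REAGENT_UL * culture_ml / REFERENCE_CULTURE_ML,
        "complex_ml": REFERENCE_COMPLEX_ML * culture_ml / REFERENCE_CULTURE_ML,
    }


def feed_volume_ml(culture_ml: float, stock_pct: float = 20.0,
                   final_pct: float = 2.0) -> float:
    """Feed volume giving final_pct after mixing, counting the feed itself
    in the final volume: stock * v / (culture + v) = final."""
    return culture_ml * final_pct / (stock_pct - final_pct)


print(scale_transfection(190.0))
print(f"feed for a 60 mL flask: {feed_volume_ml(60.0):.2f} mL")
```

If the added feed volume is instead ignored, the estimate simplifies to one tenth of the culture volume; the difference is about 10% of the feed volume.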

Example 17: Nutritive Polypeptide Expression Analysis

Nutritive polypeptides intracellularly expressed and/or secreted were detected using a variety of methods. These methods include electrophoresis, western blot, dot-blot, ELISA, and quantitative LC/MS/MS.

Electrophoresis Analysis.

Extracellular and/or intracellular expressed proteins were analyzed by chip electrophoresis (Labchip GXII) or SDS-PAGE analysis to evaluate expression level.

For SDS-PAGE, 10 μl sample in Invitrogen LDS Sample Buffer mixed with 5% β-mercaptoethanol was boiled and loaded onto either: 1) a Novex® NuPAGE® 12% Bis-Tris gel (Life Technologies), or 2) a Novex®16% Tricine gel (Life Technologies), and run using standard manufacturer's protocols. Gels were stained using SimplyBlue™ SafeStain (Life Technologies) using the standard manufacturer's protocol and imaged using the Molecular Imager® Gel Doc™ XR+ System (Bio-Rad). Over-expressed heterologous proteins were identified by comparison against a molecular weight marker and control cultures.

For chip electrophoresis (Labchip GX II) samples were analyzed using a HT Low MW Protein Express LabChip® Kit (following the manufacturer's protocol) by adding 2 μl of sample to 7 μl sample buffer. A protein ladder was loaded every 12 samples for molecular weight determination and quantification (molecular weight in kDa).

LC-MS/MS Analysis.

Whole cell, cell lysate and secreted samples can be analyzed for protein expression using LC-MS/MS. To analyze samples, 10 μg of sample were loaded onto a 10% SDS-PAGE gel (Invitrogen) and separated approximately 2 cm. The gel was excised into ten segments and the gel slices were processed by washing with 25 mM ammonium bicarbonate, followed by acetonitrile. Gel slices were then reduced with 10 mM dithiothreitol at 60° C., followed by alkylation with 50 mM iodoacetamide at room temperature. Finally, the samples were digested with trypsin (Promega) at 37° C. for 4 h and the digestions were quenched with the addition of formic acid. The supernatant samples were then analyzed by nano LC/MS/MS with a Waters NanoAcquity HPLC system interfaced to a ThermoFisher Q Exactive. Peptides were loaded on a trapping column and eluted over a 75 μm analytical column at 350 nL/min; both columns were packed with Jupiter Proteo resin (Phenomenex). A 1 h gradient was employed. The mass spectrometer was operated in data-dependent mode, with MS and MS/MS performed in the Orbitrap at 70,000 FWHM resolution and 17,500 FWHM resolution, respectively. The fifteen most abundant ions were selected for MS/MS. Data were searched against an appropriate database using Mascot to identify peptides. Mascot DAT files were parsed into the Scaffold software for validation, filtering and to create a nonredundant list per sample. Data were filtered at 1% protein and peptide false discovery rate (FDR) and requiring at least two unique peptides per protein.
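The final filtering step can be illustrated with a short sketch. It assumes that each identified protein carries a q-value-like confidence estimate and a count of unique peptides; these field names and the example records are hypothetical placeholders, and the sketch does not reproduce how Scaffold itself computes or applies its false discovery rate thresholds.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    accession: str        # hypothetical identifier
    q_value: float        # hypothetical per-protein FDR estimate
    unique_peptides: int  # number of distinct peptides observed


def passes_filter(hit: ProteinHit, max_fdr: float = 0.01,
                  min_unique_peptides: int = 2) -> bool:
    """Keep hits at or below 1% FDR with at least two unique peptides."""
    return (hit.q_value <= max_fdr
            and hit.unique_peptides >= min_unique_peptides)


# Hypothetical placeholder records, not data from the source.
hits = [
    ProteinHit("P_A", q_value=0.002, unique_peptides=5),
    ProteinHit("P_B", q_value=0.004, unique_peptides=1),
    ProteinHit("P_C", q_value=0.050, unique_peptides=7),
]
kept = [hit for hit in hits if passes_filter(hit)]
print([hit.accession for hit in kept])
```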

Anti-FLAG Western Blot.

Extracellular and/or intracellular protein was analyzed using western blot to evaluate expression level.

For SDS-PAGE, 10 μl sample in Invitrogen LDS Sample Buffer mixed with 5% β-mercaptoethanol was boiled and loaded onto a Novex® NuPAGE® 12% Bis-Tris gel (Life Technologies). For standards, 0.5 μg to 2 μg Amino-terminal FLAG-BAP™ Fusion Protein (Sigma) were loaded as a positive control. Gel electrophoresis was performed according to manufacturer's protocol. Once run, the gel was transferred onto an iBlot® Mini Transfer Stack nitrocellulose 0.2 μm pore size membrane (Life Technologies) according to manufacturer protocol. Next, the nitrocellulose membrane was removed from the stack and assembled into a Millipore SNAP i.d.® 2.0 Protein Detection Apparatus. 30 ml of Millipore Blok CH Noise Cancelling reagent was placed into an assembled reservoir tray and vacuumed through. 3 ml of antibody solution was prepared by diluting 2 μl of Sigma Monoclonal ANTI-FLAG® M2-Peroxidase (HRP) antibody into 3 ml of Millipore Blok CH Noise Cancelling Reagent. Antibody solution was added to reservoir tray and allowed to incubate for 10 minutes without vacuum. After incubation, the reservoir tray was filled with 90 ml of 1×PBS+0.1% Tween 20 and vacuumed through as the final wash step. After wash, the nitrocellulose membrane was removed and placed into a reagent tray. 20 ml of Millipore Luminata Classico Western HRP substrate was added and allowed to incubate for 1 minute. After incubation, the membrane was placed into imaging tray of Gel Doc™ XR+ System (Bio-rad) and imaged using a chemiluminescent protocol.

Anti-FLAG Dot Blot.

Extracellular and/or intracellular proteins were analyzed using dot blot to evaluate expression level.

110 μl of 0.2 μm filtered sample was mixed with 110 μl 8.0M Guanidine Hydrochloride, 0.1M Sodium Phosphate (Denaturing Buffer) to allow for even protein binding. A standard curve of Amino-terminal FLAG-BAP™ Fusion Protein (Sigma) was prepared in the same matrix as the samples, starting at 2 μg, diluting 2× serially to 0.0313 μg. Invitrogen 0.45 μm nitrocellulose membrane was pre-wet in 1×PBS buffer for 5 minutes and then loaded onto Bio-Rad Dot Blot Apparatus. 300 μl of PBS was vacuumed through to further wet the membrane. Next, 200 μl of the 1:1 Sample:Denaturing Buffer mixture was loaded into each well and allowed to drain through the dot blot apparatus by gravity for 30 minutes. Next, a 300 μl PBS wash was performed on all wells by vacuum, followed by loading 300 μl of Millipore Blok CH Noise Cancelling reagent and incubating for 60 minutes. After blocking, the membrane was washed with 300 μl of 1×PBS+0.1% Tween 20. Next, antibody solution was prepared by adding 2.4 μl of Sigma Monoclonal ANTI-FLAG® M2-Peroxidase (HRP) antibody to 12 ml of Millipore Blok CH Noise Cancelling reagent (1:5000 dilution). 100 μl of the resulting antibody solution was added to each well and allowed to incubate for 30 minutes by gravity. After antibody incubation, three final washes were performed with 300 μl 1×PBS+0.1% Tween 20 by vacuum. After washes, the nitrocellulose membrane was removed and placed into a reagent tray. 20 ml of Millipore Luminata Classico Western HRP substrate was added and allowed to incubate for 1 minute. After incubation, the membrane was placed into the imaging tray of the Gel Doc™ XR+ System (Bio-Rad) and imaged using a chemiluminescent protocol.
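The dilution arithmetic in the dot blot procedure can be checked with a short sketch. The function name and the seven-point count are illustrative assumptions; seven points is simply what a 2× series starting at 2 μg needs in order to reach 0.03125 μg, which is reported above as 0.0313 μg.

```python
def serial_dilution(start, factor, n_points):
    """Return n_points amounts, each 1/factor of the previous one."""
    return [start / factor**i for i in range(n_points)]

# FLAG-BAP standard curve: 2 ug diluted 2x serially.
# Seven points is an inferred count (2 / 2**6 = 0.03125 ug).
standards_ug = serial_dilution(2.0, 2, 7)

# Antibody dilution: 2.4 ul antibody into 12 ml (12,000 ul) blocking reagent.
dilution_fold = 12_000 / 2.4

print(standards_ug)
print(round(dilution_fold))
```

The second value confirms the 1:5000 antibody dilution stated in the text.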

Anti-FLAG ELISA. Protein expression was detected by direct ELISA using an anti-FLAG antibody. Briefly, a dilution series (0.005-10 μg/mL) of FLAG fusion protein (Sigma) was prepared in 0.1 M NaHCO3 (pH 9.5). A dilution series (0.01-20 μg/mL) of FLAG fusion protein was also prepared in spent medium from an empty fungus culture diluted 10-fold in 0.1 M NaHCO3 (pH 9.5). Experimental cultured medium samples were diluted 10-fold in 0.1 M NaHCO3 (pH 9.5). The FLAG fusion protein dilution series and the experimental samples were then transferred (0.2 mL) to the wells of a Nunc-Immuno™ Maxisorp™ (Thermo) 96-well plate and incubated overnight at 2-8° C. to facilitate protein adsorption. The following morning, plates were rinsed three times with Tris-buffered saline (TBS) containing 0.05% TWEEN 80 (TBST). The wells were blocked from non-specific protein binding by incubation with 0.2 mL of 1% non-fat dried milk dissolved in TBST for 1 h at room temperature. The plates were rinsed three more times with TBST and then incubated at room temperature for 1 h with 0.2 mL of the monoclonal antibody anti-FLAG M2-HRP (Sigma) diluted 1:2000 in blocking buffer. The plates were again rinsed three more times with TBST before incubation with 0.2 mL/well of SIGMAFAST™ o-phenylenediamine dihydrochloride (OPD) (Sigma) for 30 min at room temperature. The reaction was terminated with the addition of 0.05 mL/well of 1 M HCl and the absorbance of samples was measured at 492 nm on a spectrophotometer equipped with a plate reader.

Example 18. Therapeutic Liquid Formulations of Nutritive Polypeptides

Nutritive polypeptide sequences were administered for therapeutic purposes in a variety of liquid formulations. For example, the formulations utilized differ by protein concentration, solution pH, presence or absence of particulates, minerals, tastants, and/or excipient additives. In therapeutic liquid formulations, the concentration of protein ranges from 0.1% to 60% w/w in solution. In certain instances, a lower concentration is preferable. For example, SEQID-00105 has been dosed as low as 10%. In some cases a higher concentration is preferable. For example, SEQID-00105 and SEQID-00363 were dosed as 35% solutions. In some cases a nutritive polypeptide sequence is dosed at its maximum solubility, which varies by protein and is generally in the range of 0.1% to 60% w/w. Both SEQID-00105 and SEQID-00363 were shown to be a soluble liquid at 50% w/w in solution. In therapeutic liquid formulations, the solution pH generally ranges from 2 to 11. Protein solubility is known to be a strong function of solution pH (C. Tanford, Physical Chemistry of Macromolecules, p. 242, Wiley, New York, 1961.) In most cases, nutritive polypeptide sequences are least soluble at their isoelectric point (pI), and thus solution pH is often adjusted to be above or below the pI of the nutritive polypeptide sequence. This modulation of pH allows control over protein solubility. For example, SEQID-00105 and SEQID-00363 have isoelectric points near 4. These nutritive polypeptide sequences were purposely formulated at pH 8-9, so they would be above their pI. For example, SEQID-00587 has a pI of 9.7 and was purposely formulated at pH 7, so that it would be below its pI. Nutritive polypeptides that are soluble at their pI, can be formulated at their pI so that the liquid formulation does not require an additional buffering species to maintain the solution pH. In this case, the protein itself acts to buffer the solution pH, as its own amino acids are protonated and deprotonated. 
A protein solution formulated at the pI of the protein shows resistance to changes in pH when acid or base is added, indicating that the solution is, in fact, buffered by the protein itself. In some therapeutic liquid formulations, a nutritive polypeptide includes particulate matter. Particulate matter is visible to the eye, and/or it contains subvisible particulates (example: soluble aggregates). Particulate matter is product-related, and/or it is foreign. Particulate matter occurs in suspension, as a slurry, and/or settles at the bottom of the solution. In some embodiments, particulate matter is desirable because, where it is indigestible or slowly digestible in the stomach, it acts as a carrier delivering the nutritive polypeptide to the intestine. In some embodiments, particulate matter is desirable in a liquid formulation because it allows dosing the nutritive polypeptide above its limit of solubility. In the case of SEQID-00105, there is no visible particulate matter. SEQID-00424 was dosed with suspended particulate matter.

In therapeutic liquid formulations, a nutritive polypeptide sequence is typically formulated with one or more excipients dissolved in the formulation. Each excipient is included for a specific purpose. Buffers (examples: Tris, phosphate, ammonium bicarbonate, sodium carbonate, acetate, citrate, arginine) are added to control the solution pH. Sugars are added to control aggregation and solubility (examples: trehalose, glucose, sucrose, and mannitol). Detergents are added to control solubility and aggregation (examples tween, triton, CHAPS, and deoxycholate). Polyalcohols are added to control solubility and aggregation (glycerol, PEG). Chaotropes are added to increase protein solubility (example thiocyanate and urea). Antioxidants are added to prevent protein oxidation (examples ascorbic acid and methionine). Salts are added to increase protein solubility and/or are added to achieve a desired osmolality (example sodium chloride). SEQID-00105, SEQID-00363, SEQID-00426 were formulated in sodium phosphate and sodium chloride and dosed orally. SEQID-00587 and SEQID-00559 were formulated in ammonium bicarbonate and dosed orally. SEQID-00240 was formulated in sodium carbonate and dosed orally. Tastants were added to nutritive polypeptides to enhance the gustatory experience of the user, with successful result. Vanilla extract was added in combination with sucralose according to Tang et al. (Tang J E, Moore D R, Kujbida G W, Tarnopolsky M A, Phillips S M. J Appl Physiol (1985). 2009 September; 107(3):987-92). The qualitative benefit of enhanced flavor was documented by users as being “quite pleasant.”

The two tables Table E18A and Table E18B summarize a number of administrations of nutritive polypeptides to humans and to rats.

TABLE E18A Therapeutic liquid formulations of nutritive polypeptides used for human administration.

Human Administration

SEQID        Conc (g/L)  Protein dosed (g)  pH   Buffer          NaCl
SEQID-00105  350         35                 8.7  2 mM phosphate  6 mM
SEQID-00363  350         35                 7    3 mM phosphate  14 mM
SEQID-00105  117         20                 8.7  2 mM phosphate  6 mM
SEQID-00105  117         20                 8.7  2 mM phosphate  6 mM
SEQID-00363  117         20                 7    3 mM phosphate  14 mM
SEQID-00426  117         20                 7    none            none

Tastants: Each formulation contained 8 mL of vanilla extract per liter and 4 grams of sucralose per liter, according to Tang, et al.

TABLE E18B Therapeutic liquid formulations of nutritive polypeptides used for rat administration.

Rat Administration

SEQID                                 Conc (g/L)  Dosage (mg protein per kg body weight)  pH    Buffer          NaCl
SEQID-00105                           250         2,850                                   8.7   2 mM phosphate  6 mM
SEQID-00105                           250         2,850                                   8.7   2 mM phosphate  6 mM
SEQID-00338                           250         2,850                                   8.7   2 mM phosphate  6 mM
SEQID-00338                           250         2,850                                   8.7   2 mM phosphate  6 mM
SEQID-00105                           98          1,113                                   8.7   2 mM phosphate  6 mM
SEQID-00240                           135         1,539                                   10.8  25 mM Na2CO3    0 mM
SEQID-00105                           156         1,781                                   8.7   2 mM phosphate  6 mM
SEQID-00363 (α-mannosidase treated)   229         2,850                                   7     3 mM phosphate  14 mM
SEQID-00363 (hydrolyzed)              250         2,850                                   7     3 mM phosphate  14 mM
SEQID-00105                           250         2,850                                   8.7   2 mM phosphate  6 mM
SEQID-00105                           250         2,850                                   8.7   2 mM phosphate  6 mM
SEQID-00105                           250         2,850                                   8.7   2 mM phosphate  6 mM
SEQID-00105                           250         2,850                                   8.7   2 mM phosphate  6 mM
SEQID-00105                           250         2,850                                   8.7   2 mM phosphate  6 mM
SEQID-00338                           250         2,850                                   8.7   2 mM phosphate  6 mM
SEQID-00352                           250         2,850                                   7     3 mM phosphate  14 mM
SEQID-00363                           250         2,850                                   7     3 mM phosphate  14 mM
SEQID-00363                           250         2,850                                   7     3 mM phosphate  14 mM
SEQID-00423                           250         2,850                                   7     3 mM phosphate  14 mM
SEQID-00424                           250         2,850                                   7     3 mM phosphate  14 mM
SEQID-00425                           250         2,850                                   7     3 mM phosphate  14 mM
SEQID-00426                           250         2,850                                   7     none            0 mM
SEQID-00426                           250         2,850                                   7     none            0 mM
SEQID-00429                           250         2,850                                   7     3 mM phosphate  14 mM
SEQID-00559                           250         2,850                                   7.7   25 mM NH4HCO3   0 mM
SEQID-00587                           250         2,850                                   7.7   25 mM NH4HCO3   0 mM

Example 19. Therapeutic Formulations of Nutritive Polypeptides

As an alternative to soluble, homogeneous liquid formulations, nutritive polypeptides can be prepared in other forms. This example describes alternative nutritive polypeptide formulations such as slurries, gels, tablets and food ingredients.

Slurry Formulations.

Slurries are semiliquid mixtures that contain both soluble and insoluble material, which generally appears as fine granules in solution. Nutritive polypeptide slurries are prepared when a nutritive polypeptide is concentrated above its maximum solubility, or when a lyophilized or freeze-dried preparation of a nutritive polypeptide is resuspended above its maximum solubility. Optionally, an emulsifier such as soy lecithin is added at a concentration of 0.1-1% to stabilize the homogeneity of a slurry (van Nieuwenhuyzen et al., 1999 Eur. Journal of Lipid Sci. and Tech.).

Gel Formulations.

Alternatively, nutritive polypeptides are formulated as gels. Gels are solid, jelly-like formulations that are generally formed through molecular cross-linking. Nutritive polypeptides are formulated as gels through treatment with transglutaminase (EC Number 2.3.2.13). Transglutaminase can catalyze the cross-linking of nutritive polypeptides between γ-carboxamide groups of peptide-bound glutamine residues and the ε-amino groups of nutritive polypeptide-bound lysine residues. The resulting γ-glutamyl-ε-lysine bonds cross-link proteins in solution, thus promoting gel formation.

Human transglutaminase is a calcium (Ca2+) dependent enzyme with a Kd for calcium in the range of 0.3-3 μM (Ahvazi et al., 2003 Journal of Biological Chemistry). Due to the precipitative effects of calcium in preparations of nutritive polypeptides, a microbial (Streptomyces mobaraensis) orthologue of transglutaminase has been identified that acts in a calcium-independent manner (Ando et al, 1989 Agric Biol Chem). To prepare a gelatinous preparation of nutritive polypeptides, nutritive polypeptides are solubly formulated at 250 g/L at neutral pH. Transglutaminase is spiked into the preparation at 10 EU per g nutritive polypeptide and allowed to react at 35° C. until adequate gel formation has occurred (Chen et al., 2003 Biomaterials).

Tablet Formulations.

Preparation of solid tablets of nutritive polypeptides is accomplished according to Sakarkar et al. 2009 (International Journal of Applied Pharmaceutics). Tablets are formulated as a mixture of nutritive polypeptides, excipients and binding agents. Tablets are composed of 50% w/w nutritive polypeptide, 26% w/w microcrystalline cellulose, 7.5% w/w sodium bicarbonate-citric acid mixture (70:30), 6.5% w/w lactose, 5.5% w/w magnesium stearate and bound by 4.5% w/w polyvinyl pyrrolidone solution in isopropanol. Tablets are dehumidified and granulated prior to compression into 100 mg tablets. Tablets are coated with 12.5% w/w ethyl cellulose solution in dichloromethane and diethyl phthalate as a plasticizer.
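As a quick consistency check, the tablet composition above can be tabulated per 100 mg tablet. This is a minimal sketch under two assumptions: the sodium bicarbonate-citric acid entry is read as 7.5% w/w, and the 4.5% w/w polyvinyl pyrrolidone binder is counted as a share of tablet mass. The dictionary keys are illustrative labels only.

```python
TABLET_MG = 100.0

# Percent w/w of each component, as listed in the tablet formulation.
# Counting the binder toward tablet mass is an assumption made for this check.
composition_pct = {
    "nutritive polypeptide": 50.0,
    "microcrystalline cellulose": 26.0,
    "sodium bicarbonate-citric acid (70:30)": 7.5,
    "lactose": 6.5,
    "magnesium stearate": 5.5,
    "polyvinyl pyrrolidone binder": 4.5,
}

total_pct = sum(composition_pct.values())
mass_mg = {name: TABLET_MG * pct / 100.0 for name, pct in composition_pct.items()}

print(total_pct)
for name, mg in mass_mg.items():
    print(f"{name}: {mg} mg")
```

Under these assumptions the listed components account for the full tablet mass, with 50 mg of nutritive polypeptide per 100 mg tablet.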

Inhalable Dry Powder.

Nutritive polypeptides are formulated to be administered as an inhalable dry powder as described in Lucas et al., 1998 Pharm Res. Nutritive proteins are co-processed with malto-dextrin by spray-drying to produce model protein particles. Aerosol formulations are then prepared by tumble mixing protein powders with α-lactose monohydrate (63-90 μm) or modified lactoses containing between 2.5 and 10% w/w fine particle lactose (FPL) or micronised polyethylene glycol 6000. Powder blends are then characterized in terms of particle size distribution, morphology and powder flow.

Conventional Food Formulations.

Nutritive polypeptides are formulated as food ingredients in dry, solid formulations. For example, nutritive polypeptides are incorporated into pasta dough by mixing dry nutritive polypeptide formulations into water, durum semolina, Arabica gum, mono- and diglycerides, fiber, yeast and citric acid. Dough is cut to shape, dried and packaged.

Example 20: In Vitro Screening of Amino Acids and Nutritive Polypeptides for GLP-1 Production

Glucagon-like peptide-1 (GLP-1) is a peptide hormone produced by L-cells of the intestine in response to multiple nutrient stimuli. GLP-1 is an incretin that decreases blood glucose levels by increasing insulin release from the pancreas. GLP-1 acts on other peripheral tissues, increasing glucose uptake and storage in skeletal muscle and adipose tissue, decreasing the rate of gastric emptying, and decreasing hepatic glucose production (Baggio L L & DJ Drucker. 2007. Biology of incretins: GLP-1 and GIP. Gastroenterology. 132:2131-2157). GLP-1 is a product of the post-translational cleavage of proglucagon to generate active GLP-1 (7-36), which is rapidly degraded after secretion by dipeptidyl peptidase IV (DPPIV) to GLP-1 (9-36).

Several mammalian gastrointestinal cell lines are used as models for L-cell secretion of GLP-1 (IEC-6 from rats, NCI-H716 and FHs74Int from Humans, and STC-1 and GLUTag from mice). The purpose of these experiments is to determine which amino acid combinations, nutritive protein digests, and/or full length nutritive proteins induce GLP-1 secretion in vitro.

The NCI-H716 cell line is obtained from the American Type Culture Collection (Catalog number ATCC® CCL-251™, Manassas, Va.). RPMI-1640, Dulbecco's Modified Eagle Medium (DMEM) and Dulbecco's Phosphate Buffered Saline (DPBS) are obtained from Life Technologies (Catalog numbers 11875, 11965 and 14190, respectively; Carlsbad, Calif.). Penicillin-Streptomycin Solution is obtained from ATCC (Catalog number 30-2300, Manassas, Va.). Antibiotic-Antimycotic Solution (100×) is obtained from Sigma-Aldrich (Catalog number A5955, St. Louis, Mo.). Fetal bovine serum is obtained from GE Healthcare (Catalog number SH3007103HI, Wilmington, Mass.). Bovine serum albumin is obtained from Fisher Scientific (Catalog number BP1600-100, Pittsburgh, Pa.). Fatty-acid free bovine serum albumin is obtained from Sigma-Aldrich (Catalog number A7030, St. Louis, Mo.). Phenylmethyl sulfonyl fluoride (PMSF), a protease inhibitor, is obtained from Thermo Scientific (Catalog number 36978, Waltham, Mass.). Diprotin A, a DPPIV inhibitor, is obtained from Sigma-Aldrich (Catalog number 19759, St. Louis, Mo.). Tissue Protein Extraction Reagent (T-PER) is obtained from Thermo Scientific (Catalog number 78510, Waltham, Mass.). Active GLP-1 concentration is determined using the AlphaLISA GLP-1 (7-36 amide) Immunoassay Research Kit (Catalog number AL215, PerkinElmer, Waltham, Mass.) and read on an EnSpire Alpha plate reader (PerkinElmer, Waltham, Mass.). Data are analyzed using Microsoft Excel version 14.0.7128.5000 (Microsoft Corporation, Redmond, Wash.) and GraphPad Prism version 6.03 for Windows (GraphPad Software, La Jolla, Calif.). Krebs Ringer Buffer (KRB) is prepared.

Ninety-six well plates are pre-coated with MatriGel. 80 μL of MatriGel is added to 5 mL Dulbecco's Modified Eagle Medium (DMEM) without glucose. 50 μL are added per well and the plate incubated at 37° C., 5% CO2 for 30 minutes. MatriGel solution is aspirated prior to addition of cells.

NCI-H716 cells are maintained in RPMI-1640 medium supplemented with 10% fetal bovine serum (FBS) and 1% Antibiotic-Antimycotic Solution and incubated in T-75 tissue culture flasks at 37° C., 5% CO2. Cells are passaged 1:3 or 1:6 every 2 to 3 days.

Cells are detached with 0.25% trypsin-EDTA incubated at 37° C., 5% CO2 and centrifuged at 750 rpm for 10 minutes to pellet cells. The cell pellet is washed twice with 1× Dulbecco's Phosphate Buffered Saline (DPBS) supplemented with 1% FBS and centrifuged at 750 rpm for 10 minutes to pellet cells. The cells are resuspended in Dulbecco's Modified Eagle Medium (DMEM) supplemented with 10% FBS and 1% penicillin/streptomycin and counted on a hemocytometer. Cells are diluted to 1.8×10^6 cells/mL and 200 μL added to each well of a 96 well plate pre-coated with MatriGel. Cells are incubated for 2 days at 37° C., 5% CO2.
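The seeding density implied by this step follows from the stated concentration and volume. The sketch below is illustrative arithmetic only; the variable names are assumptions, and the full-plate figures assume all 96 wells are seeded, which the text does not state.

```python
CELLS_PER_ML = 1_800_000   # 1.8 x 10^6 cells/mL, as stated
VOLUME_PER_WELL_UL = 200   # uL added per well, as stated
WELLS = 96                 # assumes every well of the plate is seeded

cells_per_well = CELLS_PER_ML * VOLUME_PER_WELL_UL / 1000
suspension_needed_ml = WELLS * VOLUME_PER_WELL_UL / 1000
cells_needed = cells_per_well * WELLS

print(cells_per_well)
print(suspension_needed_ml)
print(cells_needed)
```

This gives 3.6×10^5 cells per well, and a full plate would need 19.2 mL of suspension before any allowance for pipetting loss.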

Screening of GLP-1 Secretion to Amino Acid Treatments in NCI-H716 Cells

Following the two-day incubation, medium is aspirated and replaced with 200 μL Starvation Buffer [i.e. Krebs-Ringer Buffer (KRB) containing 50 μg/mL PMSF, 34 μg/mL Diprotin A, and 0.2% fatty-acid free bovine serum albumin (BSA)] and incubated for 30 minutes at 37° C., 5% CO2. Starvation Buffer is then aspirated and replaced with 100 μL/well of treatment article in Starvation Buffer. Cells are stimulated with treatment article for 2 hours, then medium is removed and frozen at −80° C. Treatment articles include individual amino acids, amino acid blends, nutritive protein digests, and/or full length nutritive proteins.

Determination of Active GLP-1 Concentration from Supernatant.

Supernatant is assayed for the active form of GLP-1 (7-36)NH2 using the AlphaLISA GLP-1 (7-36 amide) Immunoassay Research Kit (PerkinElmer, AL215) in accordance with the manufacturer's instructions. The standard is assayed in Assay Buffer supplemented to an equivalent concentration of Starvation Buffer. Alternatively, the active form of GLP-1 is measured using an enzyme-linked immunosorbent assay (ELISA) as described herein. Luminescence data from the AlphaLISA GLP-1 (7-36 amide) Immunoassay or GLP-1 ELISA is analyzed on Microsoft Excel and GraphPad Prism.

Duplicate sample concentrations are determined by non-linear regression using a 4 parameter logistic model of the standard following an x=log(x) transformation of the active GLP-1 concentration in GraphPad Prism 6. ANOVA and multiple comparison tests are conducted on GraphPad Prism 6.
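The back-calculation step can be illustrated with one common parameterization of the four parameter logistic (4PL) model on a log10 concentration axis. This is a sketch, not the fitting routine used in GraphPad Prism: the function names are illustrative, the parameter values are hypothetical, and in practice the parameters come from a non-linear regression of the standards.

```python
import math

def four_pl(log_conc, bottom, top, log_ec50, hill):
    """4PL response for x = log10(concentration)."""
    return bottom + (top - bottom) / (1.0 + 10 ** ((log_ec50 - log_conc) * hill))

def inverse_four_pl(signal, bottom, top, log_ec50, hill):
    """Back-calculate log10(concentration); signal must lie strictly between bottom and top."""
    return log_ec50 - math.log10((top - bottom) / (signal - bottom) - 1.0) / hill

# Hypothetical fitted parameters (not from the source).
params = dict(bottom=100.0, top=10_000.0, log_ec50=2.0, hill=1.2)

# Duplicate wells are interpolated individually, then averaged.
duplicate_signals = [four_pl(1.5, **params), four_pl(1.6, **params)]
log_concs = [inverse_four_pl(s, **params) for s in duplicate_signals]
mean_conc = sum(10 ** lc for lc in log_concs) / len(log_concs)
print(mean_conc)
```

Signals at or outside the fitted plateaus cannot be interpolated with this inverse and would need to be flagged as out of range.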

Comparisons of the GLP-1 content from samples collected after test article treatment to those collected after vehicle control treatment describe the degree of GLP-1 secretion due to a test article over time. Comparison of this difference across treatments describes the differential effects of each amino acid, amino acid blend, nutritive protein digest, and nutritive protein treatment relative to the other, and provides a means of ranking their efficacy.

GLP-1 Secretion by Amino Acids

Cells were starved of amino acids as described herein and stimulated with stimulation buffer alone, 19 amino acids, 17 amino acids (without leucine, isoleucine or valine), 20 amino acids or leucine alone at their concentration in DME/F12 medium (see Table E20C) for two hours. Supernatant was harvested and frozen at −80° C. GLP-1 (7-36) amide concentration was assayed subsequently. FIG. 6 shows the concentration of GLP-1 (7-36) detected in the supernatant following stimulation; error bars are the standard deviation of the technical replicates. Fourteen compositions increased the concentration of GLP-1 to more than 10% above that observed in the larger buffer-only stimulation (those lacking Asn, Met, Gln, Tyr, His, Gly, Cys, Phe, Trp, Ala, Glu, Leu, Ile or Val). Two treatments, the amino acids lacking Arg and the Leu-only treatment, decreased the concentration of GLP-1 (7-36) to more than 10% below the lower buffer-only stimulation.

TABLE E20A

Amino Acids      μM
Glycine          250
L-Alanine        50
L-Arginine       700
L-Asparagine     57
L-Aspartic Acid  50
L-Cysteine       100
L-Glutamic Acid  100
L-Glutamine      2500
L-Histidine      150
L-Isoleucine     416
L-Leucine        451
L-Lysine         500
L-Methionine     116
L-Phenylalanine  215
L-Proline        150
L-Serine         250
L-Threonine      449
L-Tryptophan     44
L-Tyrosine       214
L-Valine         452

Example 21. Use of Nutritive Polypeptides to Improve Glycemic Control in Healthy, Fasted Rats

Glucose tolerance tests are a common method of measuring glycemic control in both clinical and preclinical settings. During the test, a large dose of glucose is given and blood samples are taken at subsequent time points to determine how quickly blood glucose levels renormalize. In an oral glucose tolerance test (OGTT), a dose of glucose is ingested by mouth. Such tests are clinically used to identify individuals with poor glucose control (Cobelli et al., 2014), and pre-clinically to assess the therapeutic efficacy of anti-diabetes medications (Wagman & Nuss, 2001)(Moller, 2001). Glucose levels are modulated by a number of different gastrointestinal hormones, including insulin, glucagon, somatostatin, glucose-dependent insulinotropic peptide (GIP), and glucagon-like peptide 1 (GLP-1), which both directly and indirectly modulate glucose levels. Combined measurements of glucose and gastrointestinal hormones are used to assess glucose intolerance, insulin resistance, and the severity of metabolic disease (Ferrannini & Mari, 2014).

An OGTT was performed on twenty five healthy, fasted, male Sprague-Dawley rats with indwelling jugular vein catheters (JVC) to assess the acute effects of nutritive polypeptide dosing on glycemic control as well as insulin and GLP-1 levels.

Oral Glucose Tolerance Test.

All rodents were approximately 10-12 weeks old, weighed an average of approximately 350 g and were acclimated for 4 days prior to testing. Animals were housed singly with bedding and fed a regular rodent chow diet (Lab Diet 5001) prior to the study. Housing temperature was kept at 22±2° C., humidity at 50±20%, and a 12 hour light/12 hour dark cycle was implemented. Air circulation was ten or more air changes per hour with 100% fresh air. Prior to treatment, all rats were fasted overnight for fourteen hours. Fasted animals were treated with formulations of nutritive polypeptides dissolved in water by oral gavage (see Table E21A), and fifteen minutes after the treatment gavage were challenged with an oral gavage of glucose (2 g/kg). Blood glucose was measured and blood was collected at seven time points (−15, 0, 15, 30, 60, 90, 120 minutes relative to the glucose challenge). Blood was collected in an EDTA collection tube containing plasma stabilizers (a DPP4 inhibitor and a protease cocktail inhibitor). One additional rat was sacrificed and bled out to provide naïve blood for analytical standards. Glucose was measured using small drops of blood collected via the JVC using a glucometer (AlphaTrak 2, Abbott).

TABLE E21A

Group         Dose (mg/kg)  Conc. (mg/mL)  Dose Volume (mL/kg)  Average BW (kg)  Total Protein (mg)  Glucose Volume (mL/kg)  Glucose Dose (g/kg)  Glucose Conc. (mg/mL)
Vehicle       NA            NA             11.4                 0.35             NA                  4                       2                    500
SEQID-00105   2850          250            11.4                 0.35             5,985               4                       2                    500
Arginine HCl  1000          87.7           11.4                 0.35             2,100               4                       2                    500
SEQID-00338   2850          250            11.4                 0.35             5,985*              4                       2                    500
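The entries in Table E21A are related by simple arithmetic, sketched below. The function names are illustrative. The group size of six animals is an assumption taken from the n=6 per treatment group reported for the insulin data; with it, the total protein figures in the table are reproduced.

```python
def dose_mg_per_kg(conc_mg_per_ml, volume_ml_per_kg):
    """Dose delivered by gavage from formulation concentration and dosing volume."""
    return conc_mg_per_ml * volume_ml_per_kg

def total_protein_mg(dose, avg_bw_kg, n_animals):
    """Total protein needed to dose a group of animals of a given average weight."""
    return dose * avg_bw_kg * n_animals

N_PER_GROUP = 6  # assumption: n=6 rats per treatment group

protein_dose = dose_mg_per_kg(250, 11.4)           # SEQID rows of the table
arginine_dose = dose_mg_per_kg(87.7, 11.4)         # close to the nominal 1000 mg/kg
glucose_dose_g = dose_mg_per_kg(500, 4) / 1000     # glucose challenge in g/kg

print(round(protein_dose), round(arginine_dose, 1), glucose_dose_g)
print(round(total_protein_mg(2850, 0.35, N_PER_GROUP)))
print(round(total_protein_mg(1000, 0.35, N_PER_GROUP)))
```

The arginine concentration of 87.7 mg/mL gives about 999.8 mg/kg, so the tabulated 1000 mg/kg dose is a rounded value.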

Approximately 300 μL of blood was collected from the JVC of all rats in Groups 1-4 at six time points (−15, 0, 15, 30, 60, and 120 minutes). Glucose gavages and blood collections were timed to take the same amount of time per animal so that sample times were accurate for each animal. All time points were collected within 5% of the target time. Blood was collected into pre-chilled (0-4° C.) K2EDTA blood collection tubes containing Protease Inhibitor Cocktail (Catalog number D8340, Sigma-Aldrich, St. Louis, Mo.) and DPPIV inhibitor (Millipore, Billerica, Mass.) added to the tubes at 1:100 prior to collection. After blood collection, blood samples were kept chilled (2-6° C.) and centrifuged within 30 minutes. The collected plasma was placed in sample tubes and immediately stored at −80° C.

Prior to immunoassay analysis, samples were thawed on ice for 1 hour, mixed thoroughly by pipette, and rearrayed to 96 well microplates. Separate aliquots were prepared for insulin immunoassay and GLP-1 immunoassay. Master plate and aliquots were stored frozen at −80° C.

FIG. 7 shows the average blood glucose values during the OGTT described herein. The error bars shown are the standard errors of the mean. All groups showed significant differences in blood glucose from fasting at times t=15 and t=30 min after the glucose challenge (p<0.05, Dunnett multiple comparison test). The groups that received SEQID-00105 and SEQID-00338 showed a significant difference in blood glucose relative to vehicle at times t=15 and t=30 min after the glucose challenge (p<0.05, Tukey-Kramer multiple comparison test, comparing treatment groups to each other at each time point). These data indicated that acute ingestion of SEQID-00105 and SEQID-00338 can significantly improve glycemic control after a glucose challenge in outbred male Sprague-Dawley rats.

The area under curve for blood glucose was calculated using the Linear-Log Trapezoidal Method in Microsoft Excel, and statistical analysis was conducted on GraphPad Prism 6.

The area under curve for blood glucose integrated from 0-120 minutes and from 0-60 minutes (FIG. 8) shows that acute dosing of SEQID-00105 and SEQID-00338 improves blood glucose control by reducing blood glucose excursion in the context of an oral glucose tolerance test. Between 0 and 60 minutes both SEQID-00105 and SEQID-00338 have a significantly smaller change in blood glucose in comparison to vehicle (P<0.05, Dunnett's multiple comparisons test).
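The area under curve calculation can be sketched as follows. The source names a Linear-Log Trapezoidal Method but does not spell it out; this sketch assumes the conventional linear-up/log-down rule, in which rising or flat segments use the linear trapezoid and falling segments use the logarithmic trapezoid. The function name and the glucose values are hypothetical.

```python
import math

def auc_linear_log(times, values):
    """Linear trapezoid on rising or flat segments, log trapezoid on falling ones."""
    auc = 0.0
    points = list(zip(times, values))
    for (t1, c1), (t2, c2) in zip(points, points[1:]):
        dt = t2 - t1
        if c2 < c1 and c2 > 0:
            # Log trapezoid: assumes exponential decline between the two samples.
            auc += (c1 - c2) * dt / math.log(c1 / c2)
        else:
            auc += 0.5 * (c1 + c2) * dt
    return auc

# Hypothetical blood glucose readings (mg/dL) at the OGTT time points (minutes).
times = [0, 15, 30, 60, 90, 120]
glucose = [90, 160, 150, 120, 100, 95]

print(auc_linear_log(times, glucose))          # 0-120 minutes
print(auc_linear_log(times[:4], glucose[:4]))  # 0-60 minutes
```

A baseline-subtracted (incremental) area could be obtained by subtracting the fasting value from each reading before integrating, at which point the log branch no longer applies to non-positive values and the linear rule is used instead.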

Rat Insulin Enzyme Linked Immunosorbent Assay (ELISA).

An Ultra-Sensitive Rat Insulin ELISA Kit was obtained from Crystal Chem, Inc. (Catalog number 90060, Downers Grove, Ill.). Plates were washed using a BioTek ELx50 microplate strip washer (BioTek, Winooski, Vt.). Absorbance was read on a Synergy Mx monochromator-based microplate reader (BioTek, Winooski, Vt.). Data was analyzed using Microsoft Excel version 14.0.7128.5000 (Microsoft Corporation, Redmond, Wash.) and GraphPad Prism version 6.03 for Windows (GraphPad Software, La Jolla, Calif.).

The ELISA kit was prewarmed to room temperature for 30 minutes prior to beginning the assay set up. The standard curve dilutions were prepared in accordance with the manufacturer's instructions for running the assay in Wide Range format.

Plasma matrix from the naïve group and sample plasma were thawed on ice and then centrifuged at approximately 1000×rcf for 10 minutes at 4° C. to pellet any insoluble material.

Matrix Assay Buffer for running the insulin standard was prepared using plasma matrix from the naïve group to a concentration of 5.26% in 95 μL. 95 μL of Assay Buffer was added to all sample wells, and 95 μL of Matrix Assay Buffer was added to all standard wells. 5 μL of each sample and standard were added in duplicate. The plates were incubated at 4° C. for 2 hours. The plates were then washed five times with 300 μL/well 1× Wash Buffer. The plates were tapped sharply several times on paper towels to remove any residual wash buffer.

Anti-Insulin Enzyme Conjugate Working Solution was prepared by combining 2 volumes Anti-Insulin Enzyme Conjugate Stock with 1 volume Enzyme Conjugate Diluent, and mixing by pipetting up and down and gently vortexing. 100 μL/well of Anti-Insulin Enzyme Conjugate Working Solution was added to all wells. The plates were sealed and incubated at room temperature for 30 minutes and then washed seven times with 300 μL/well 1× Wash Buffer. The plates were tapped sharply several times on paper towels to remove any residual wash buffer. 100 μL/well Enzyme Substrate Solution was then added to each well and incubated in the dark at room temperature for 40 minutes. 100 μL/well Stop Solution was added to all wells.

The absorbance was read on the Synergy Mx plate reader at 450 nm and 630 nm. Final values obtained were the A450 nm-A630 nm values.

The insulin standard curve was corrected for matrix concentration of insulin by subtracting the mean of the 0 ng/mL insulin standard from each of the standard well A450 nm-A630 nm values in Excel. Duplicate sample concentrations were determined by non-linear regression using a 4 parameter logistic model of the background corrected standard following an x=log(x) transformation of insulin concentration in GraphPad Prism 6. ANOVA and multiple comparison tests were conducted on GraphPad Prism 6. The area under curve was integrated using the Linear-Log Trapezoidal Method on Microsoft Excel, with post hoc testing conducted in GraphPad.
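The two correction steps described for the insulin standards, dual-wavelength subtraction followed by subtraction of the zero-standard mean, can be sketched as below. The function names and absorbance readings are hypothetical and serve only to show the order of operations.

```python
def net_absorbance(a450, a630):
    """Dual-wavelength reading: A450 nm minus A630 nm."""
    return a450 - a630

def background_corrected(standard_nets, zero_standard_nets):
    """Subtract the mean net signal of the 0 ng/mL standard from each standard well."""
    blank = sum(zero_standard_nets) / len(zero_standard_nets)
    return [value - blank for value in standard_nets]

# Hypothetical duplicate readings for the 0 ng/mL standard and two other standards.
zero_wells = [net_absorbance(0.110, 0.040), net_absorbance(0.114, 0.040)]
standard_wells = [net_absorbance(0.350, 0.042), net_absorbance(0.910, 0.045)]

corrected = background_corrected(standard_wells, zero_wells)
print([round(v, 3) for v in corrected])
```

The corrected standard values are then the input to the 4 parameter logistic fit, while sample wells run in plain Assay Buffer are interpolated against that corrected curve.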

FIG. 9 shows the average plasma insulin concentration for n=6 rats per treatment group over the course of the experiment. The error bars show the standard error of the mean. All treatment groups had statistically significant increases in plasma insulin relative to their treatment or vehicle gavage at 15, 30 and 60 minutes following the glucose challenge (Dunnett's multiple comparisons test). Only SEQID-00105 had a statistically significant increase in plasma insulin concentration at the time of the glucose challenge (0) relative to the plasma insulin at the time of the treatment gavage (P<0.0001, Dunnett's multiple comparisons test). Both SEQID-00105 and SEQID-00338 had statistically significant greater insulin concentrations at 120 minutes following the glucose challenge (P<0.05 and P<0.01, respectively; Dunnett's multiple comparisons test).

In comparison to the vehicle, only SEQID-00105 showed a statistically significantly greater increase in plasma insulin concentration at the time of the glucose challenge (P<0.001, Dunnett's multiple comparisons test).

These data indicated that an acute ingestion of SEQID-00105 can stimulate insulin release within 15 minutes of ingestion in outbred male Sprague-Dawley rats.

FIG. 10 shows the area under curve integrated between 0-240 and 0-60 minutes for all treatment groups. No treatment group was statistically significantly greater than vehicle.

Total GLP-1 Enzyme Linked Immunosorbent Assay (ELISA).

A rat GLP-1 ELISA Kit was obtained from Crystal Chem, Inc. (Catalog number 81507, Downers Grove, Ill.). Plates were washed using a BioTek ELx50 microplate strip washer (BioTek, Winooski, Vt.). Absorbance was read on a Synergy Mx monochromator-based microplate reader (BioTek, Winooski, Vt.). Data was analyzed using Microsoft Excel version 14.0.7128.5000 (Microsoft Corporation, Redmond, Wash.) and GraphPad Prism version 6.03 for Windows (GraphPad Software, La Jolla, Calif.).

The ELISA kit was prewarmed to room temperature for 30 minutes prior to beginning the assay set up. The standard curve dilutions were prepared in accordance with the manufacturer's instructions.

Plasma matrix from Group 5 and sample plasma were thawed on ice and then centrifuged at approximately 1000×rcf for 10 minutes at 4° C. to pellet any insoluble material.

Matrix Assay Buffer for running the GLP-1 standard was prepared using plasma matrix from the naïve group to a concentration of 25% in 100 μL. The ELISA microplates were washed three times with 350 μL/well 1× Wash Buffer and tapped sharply on paper towels to remove residual wash buffer. 100 μL of Assay Buffer was added to all sample wells, and 100 μL of Matrix Assay Buffer was added to all standard wells. 25 μL of each sample and standard were added in duplicate. Wells were mixed by pipetting. The plates were covered with adhesive foil and incubated for 18 hours at room temperature on a horizontal plate shaker at 100 rpm. The plates were then washed three times with 350 μL/well 1× Wash Buffer then tapped sharply several times on paper towels to remove any residual wash buffer. 100 μL/well of Biotin Labeled Antibody Solution was added to all wells. The plates were then sealed and incubated at room temperature for 1 hour on a horizontal plate shaker at 100 rpm. The plates were then washed three times with 350 μL/well 1× Wash Buffer then tapped sharply several times on paper towels to remove any residual wash buffer. 100 μL/well SA-HRP Solution was added to all wells. The plates were then sealed and incubated at room temperature for 30 minutes on a horizontal plate shaker at 100 rpm. The plates were then washed three times with 350 μL/well 1× Wash Buffer then tapped sharply several times on paper towels to remove any residual wash buffer. 100 μL/well Enzyme Substrate Solution was added to all wells. The plates were then sealed and incubated in the dark, without shaking, for 30 minutes at room temperature. Following substrate incubation, 100 μL Stop Solution was added to all wells.

The absorbance values were read on the Synergy Mx plate reader at 450 nm and 630 nm. Final values obtained were the A450 nm-A630 nm values.

The standard curve was corrected for matrix concentration of total GLP-1 by subtracting the mean of the 0 pM GLP-1 standard from each of the standard well A450 nm-A630 nm values in Excel. Duplicate sample concentrations were determined by non-linear regression using a 4 parameter logistic model of the background corrected standard following an x=log(x) transformation of GLP-1 concentration. ANOVA and multiple comparison tests were conducted using GraphPad Prism 6. The area under curve was integrated using the Linear-Log Trapezoidal Method on Microsoft Excel, with post hoc testing conducted in GraphPad.
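The two correction steps described above (dual-wavelength subtraction, then subtraction of the mean zero-standard signal) can be sketched as follows. This is an illustrative sketch only: the source performed these steps in Excel, and the function name, data layout, and plate readings below are hypothetical.

```python
from statistics import mean

def correct_standards(reads_by_conc):
    """Return background-corrected standard signals.

    reads_by_conc maps each standard concentration to a list of
    (A450, A630) tuples, one tuple per replicate well. It must contain
    the zero standard under the key 0.0.
    """
    # Step 1: dual-wavelength subtraction (A450 - A630) for every well.
    deltas = {
        conc: [a450 - a630 for a450, a630 in wells]
        for conc, wells in reads_by_conc.items()
    }
    # Step 2: mean signal of the zero standard, i.e. the matrix background.
    background = mean(deltas[0.0])
    # Step 3: subtract that background from every standard well.
    return {
        conc: [d - background for d in values]
        for conc, values in deltas.items()
    }

# Hypothetical plate readings, for illustration only.
example_reads = {
    0.0: [(0.12, 0.04), (0.14, 0.04)],
    1.0: [(0.50, 0.05), (0.54, 0.05)],
}
corrected = correct_standards(example_reads)
```

After correction the zero standard averages to zero by construction, so any endogenous analyte in the plasma matrix no longer offsets the curve.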

FIG. 11 shows average plasma GLP-1 concentration for n=6 rats per treatment group over the course of the experiment. The error bars shown here correspond to the standard error of the mean. The SEQID-00338 treatment group shows a statistically significant greater concentration of GLP-1 at the time of the glucose challenge than vehicle (p<0.0005, Dunnett's multiple comparisons test).

Example 22: Use of Nutritive Polypeptides to Improve Glycemic Control in Zucker Fatty (Fa/Fa) Rats

An OGTT was performed in 18, fasted, male Zucker fatty rats with indwelling jugular vein catheters (JVC) to assess the acute effects of nutritive polypeptide dosing on glycemic control as well as insulin and GLP-1 levels.

Oral Glucose Tolerance Test.

All rodents were approximately 10-11 weeks old, weighed an average of approximately 450 g, and were acclimated for 4 days prior to testing. Animals were housed singly with bedding and fed a regular rodent chow diet (Lab Diet 5001) prior to the study. Housing temperature was kept at 22±2° C., humidity at 50±20%, and a 12 hour light/12 hour dark cycle was implemented. Air circulation was ten or more air changes per hour with 100% fresh air. Prior to treatment, all rats were fasted overnight for fourteen hours. Fasted animals were treated with formulations of nutritive polypeptides dissolved in water by oral gavage (see Table E22A), and fifteen minutes after treatment gavage were then challenged with an oral gavage of glucose (5 g/kg). Blood glucose was measured and blood was collected at seven time points (−15, 0, 15, 30, 60, 90, 120 minutes relative to the glucose challenge). Blood was collected in an EDTA collection tube containing plasma stabilizers (a DPP4 inhibitor and a protease cocktail inhibitor). One additional rat was sacrificed and bled out to provide naïve blood for analytical standards. Glucose was measured using small drops of blood collected via the JVC using a glucometer (AlphaTrak 2, Abbott).

TABLE E22A
Group | Body weight (average, g) | Polypeptide SEQID | Polypeptide dose (mg/kg) | Volume (ml/kg) | Glucose dose (g/kg)
1 | 450 | Vehicle | 0 | 11.4 | 5
2 | 450 | 00105 | 2850 | 11.4 | 5
3 | 450 | 00338 | 2850 | 11.4 | 5
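The per-animal quantities implied by Table E22A can be worked out with a short calculation. The sketch below is illustrative only: the function and key names are hypothetical, and it assumes the volume column refers to the treatment gavage in which the polypeptide dose is delivered.

```python
def dosing_plan(body_weight_g, dose_mg_per_kg, volume_ml_per_kg):
    """Convert per-kg dosing figures into per-animal amounts."""
    body_weight_kg = body_weight_g / 1000.0
    return {
        # Mass of polypeptide delivered to one animal.
        "dose_mg": dose_mg_per_kg * body_weight_kg,
        # Gavage volume delivered to one animal.
        "volume_ml": volume_ml_per_kg * body_weight_kg,
        # Formulation concentration implied by dose and volume.
        "concentration_mg_per_ml": dose_mg_per_kg / volume_ml_per_kg,
    }

# Group 2 of Table E22A: 450 g average body weight, 2850 mg/kg, 11.4 ml/kg.
plan = dosing_plan(450, 2850, 11.4)
```

For a 450 g rat this gives roughly 1282.5 mg of polypeptide in about 5.13 mL, which corresponds to a formulation of about 250 mg/mL.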

Approximately 300 μL of blood was collected from the JVC of all rats in Groups 1-3 at seven time points (−15, 0, 15, 30, 60, 90, and 120 minutes). Glucose gavages and blood collections were timed to take the same amount of time per animal so that sample times were accurate for each animal. All time points were collected within 5% of the target time. Blood was collected into pre-chilled (0-4° C.) K2EDTA blood collection tubes containing Protease Inhibitor Cocktail (Catalog number D8340, Sigma-Aldrich, St. Louis, Mo.) and DPPIV inhibitor (Millipore, Billerica, Mass.) added to the tubes at 1:100 prior to collection. After blood collection, blood samples were maintained chilled (2-6° C.) and centrifuged within 30 minutes. The collected plasma was placed in sample tubes and immediately stored at −80° C.

Prior to immunoassay analysis, samples were thawed on ice for 1 hour, mixed thoroughly by pipette, and rearrayed to 96 well microplates. Separate aliquots were prepared for insulin immunoassay and GLP-1 immunoassay. Master plate and aliquots were stored frozen at −80° C.

FIG. 12 shows the average blood glucose values during the OGTT described herein. The error bars shown are the standard errors of the mean.
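The group mean and standard error of the mean plotted at each time point can be computed as in the sketch below. The function name and the glucose readings are hypothetical and serve only to illustrate the calculation for a group of n=6 animals.

```python
import math
from statistics import mean, stdev

def mean_and_sem(values):
    """Return (mean, standard error of the mean) for one group at one time point."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed to estimate the SEM")
    # SEM = sample standard deviation / sqrt(n)
    return mean(values), stdev(values) / math.sqrt(n)

# Hypothetical blood glucose readings (mg/dL) for six rats at one time point.
group_mean, group_sem = mean_and_sem([110, 120, 130, 140, 150, 160])
```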

The integrated area under curve (AUC) was calculated on Microsoft Excel using the Linear-Log Trapezoidal Method and statistical testing was conducted on GraphPad Prism 6.03.
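The source names the Linear-Log Trapezoidal Method without spelling out its rule. The sketch below assumes the common pharmacokinetic convention: the linear trapezoid is used while the concentration is rising or flat, and the logarithmic trapezoid is used while it is falling. The function name and the example values are hypothetical, and the original calculation was carried out in Excel.

```python
import math

def auc_linear_log(times, concs):
    """Area under a concentration-time curve by the linear-log trapezoidal rule."""
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need two or more matched time/concentration points")
    total = 0.0
    for i in range(len(times) - 1):
        t1, t2 = times[i], times[i + 1]
        c1, c2 = concs[i], concs[i + 1]
        dt = t2 - t1
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        if 0 < c2 < c1:
            # Falling segment: logarithmic trapezoid.
            total += dt * (c1 - c2) / math.log(c1 / c2)
        else:
            # Rising or flat segment (or a zero value): linear trapezoid.
            total += dt * (c1 + c2) / 2.0
    return total

# Hypothetical glucose excursion (mg/dL) at 0, 15 and 30 minutes.
example_auc = auc_linear_log([0, 15, 30], [100, 200, 100])
```

The logarithmic branch assumes roughly exponential decline between samples, so it yields a slightly smaller area on falling segments than the plain linear rule would.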

FIG. 13 shows the integrated AUC for each treatment group between the time of glucose challenge (0 min.) and 60 minutes, and between time 0 and 120 minutes. No treatment showed a statistically significant difference from vehicle between time 0 and 60 minutes. Between time 0 and 120 minutes, SEQID-00105 but not SEQID-00338 shows a statistically significant decrease in integrated area under curve compared to vehicle (P<0.005, Dunnett's multiple comparisons test). These data show that acute dosing of SEQID-00105 can reduce glucose excursion due to an oral glucose challenge in a Zucker Fatty (fa/fa) rodent model.

Rat Insulin Enzyme Linked Immunosorbent Assay (ELISA).

An Ultra-Sensitive Rat Insulin ELISA Kit was obtained from Crystal Chem, Inc. (Catalog number 90060, Downers Grove, Ill.). Plates were washed using a BioTek ELx50 microplate strip washer (BioTek, Winooski, Vt.). Absorbance was read on a Synergy Mx monochromator-based microplate reader (BioTek, Winooski, Vt.). Data was analyzed using Microsoft Excel version 14.0.7128.5000 (Microsoft Corporation, Redmond, Wash.) and GraphPad Prism version 6.03 for Windows (GraphPad Software, La Jolla, Calif.).

The ELISA kit was prewarmed to room temperature for 30 minutes prior to beginning the assay set up. The standard curve dilutions were prepared in accordance with the manufacturer's instructions for running the assay in Wide Range format.

Plasma matrix from the naïve group and sample plasma were thawed on ice and then centrifuged at approximately 1000×rcf for 10 minutes at 4° C. to pellet any insoluble material.

Matrix Assay Buffer for running the insulin standard was prepared using plasma matrix from the naïve group to a concentration of 5.26% in 95 μL. 95 μL of Assay Buffer was added to all sample wells, and 95 μL of Matrix Assay Buffer was added to all standard wells. 5 μL of each sample and standard were added in duplicate. The plates were incubated at 4° C. for 2 hours. The plates were then washed five times with 300 μL/well 1× Wash Buffer. The plates were tapped sharply several times on paper towels to remove any residual wash buffer.
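The 5.26% figure is consistent with matching the plasma content of the standard wells to that of the sample wells. The sketch below shows that arithmetic under two stated assumptions: the percentage is read as the sample volume divided by the buffer volume (5 μL of matrix per 95 μL of buffer), and the added standard itself contains no plasma. The function names are hypothetical.

```python
def matrix_percent_in_buffer(sample_ul, buffer_ul):
    """Percent plasma matrix to spike into the buffer used for standard wells."""
    return 100.0 * sample_ul / buffer_ul

def plasma_percent_in_well(sample_ul, buffer_ul):
    """Percent plasma in a sample well after the sample is added to plain buffer."""
    return 100.0 * sample_ul / (sample_ul + buffer_ul)

# Insulin ELISA: 5 uL of sample or standard added to 95 uL of buffer.
insulin_matrix_pct = matrix_percent_in_buffer(5, 95)   # about 5.26
insulin_well_pct = plasma_percent_in_well(5, 95)       # 5.0
```

Under the same reading, 25 μL added to 100 μL of buffer gives the 25% matrix stated for the total GLP-1 assay, and 20 μL added to 100 μL gives the 20% stated for undiluted samples in the active GLP-1 assay.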

Anti-Insulin Enzyme Conjugate Working Solution was prepared by combining 2 volumes Anti-Insulin Enzyme Conjugate Stock with 1 volume Enzyme Conjugate Diluent, and mixing by pipetting up and down and gently vortexing. 100 μL/well of Anti-Insulin Enzyme Conjugate Working Solution was added to all wells. The plates were sealed and incubated at room temperature for 30 minutes and then washed seven times with 300 μL/well 1× Wash Buffer. The plates were tapped sharply several times on paper towels to remove any residual wash buffer. 100 μL/well Enzyme Substrate Solution was then added to each well and incubated in the dark at room temperature for 40 minutes. 100 μL/well Stop Solution was added to all wells.

The absorbance was read on the Synergy Mx plate reader at 450 nm and 630 nm. Final values obtained were the A450 nm-A630 nm values. Samples in which the value exceeded the standard curve were rerun at 1:1 dilution against a 2.5% matrix standard.

The insulin standard curve was corrected for matrix concentration of insulin by subtracting the mean of the 0 ng/mL insulin standard from each of the standard well A450 nm-A630 nm values in Excel. Duplicate sample concentrations were determined by non-linear regression using a 4 parameter logistic model of the background corrected standard following an x=log(x) transformation of insulin concentration in GraphPad Prism 6. ANOVA and multiple comparison tests were conducted on GraphPad Prism 6. The area under curve was integrated using the Linear-Log Trapezoidal Method on Microsoft Excel, with post hoc testing conducted in GraphPad.

FIG. 14 shows the average plasma insulin concentration for n=6 rats per treatment group in Vehicle & SEQID-00105 and n=5 rats per treatment group in the case of SEQID-00338 over the course of the experiment. One-way ANOVA with Dunnett's multiple comparisons tests were used to compare within each treatment to time 0 and between treatments at the same time point to vehicle. Vehicle had no statistically significant change in plasma insulin compared to the time of vehicle gavage. SEQID-00105 had statistically significantly greater plasma insulin compared to the time of treatment gavage (time −15) at the time of the glucose challenge (time 0) and at 15 minutes and 90 minutes following glucose challenge (P=0.0002, P<0.05 and P<0.05, respectively). SEQID-00338 had statistically significantly greater plasma insulin concentration compared to the time of treatment gavage at all subsequent sampled time points (P<0.0001, P<0.05, P<0.005, P<0.005, and P=0.0005, respectively).

In comparisons between treatments, neither SEQID-00105 nor SEQID-00338 was significantly different from vehicle at the time of treatment or vehicle gavage. At the time of the glucose challenge and all subsequent time points sampled (0, 15, 30, 60 and 90 minutes), both SEQID-00105 and SEQID-00338 had a significantly greater plasma insulin concentration than vehicle. This shows that both treatments stimulate insulin secretion in the Zucker Fatty (fa/fa) model.

FIG. 15 shows the integrated area under the curve for each group. Only SEQID-00338 treatment integrated between 0 and 90 minutes was statistically significantly greater than vehicle (P<0.005).

Active GLP-1 Enzyme Linked Immunosorbent Assay (ELISA).

A rat active GLP-1 ELISA Kit was obtained from Eagle Biosciences (Catalog number GP121-K01, Nashua, N.H.). Plates were washed using a BioTek ELx50 microplate strip washer (BioTek, Winooski, Vt.). Absorbance was read on a Synergy Mx monochromator-based microplate reader (BioTek, Winooski, Vt.). Data was analyzed using Microsoft Excel version 14.0.7128.5000 (Microsoft Corporation, Redmond, Wash.) and GraphPad Prism version 6.03 for Windows (GraphPad Software, La Jolla, Calif.).

The ELISA kit was prewarmed to room temperature for at least 30 minutes prior to beginning the assay set up. The standard curve dilutions were prepared in accordance with the manufacturer's instructions.

Plasma matrix from Group 5 and sample plasma were thawed on ice and then centrifuged at approximately 1000×rcf for 10 minutes at 4° C. to pellet any insoluble material.

Matrix Assay Buffer for running the active GLP-1 standard was prepared using plasma matrix from the naïve group to a concentration of 10% in 100 μL for samples tested at 1:1 dilution or 20% in 100 μL for samples tested undiluted. 20 μL of standards and samples were added to the pre-coated microplate and 100 μL of the appropriate assay buffer was added to the standard and sample wells. The plates were covered with adhesive foil and incubated for 24 hours at 4° C. protected from light. The plates were then washed five times with 350 μL/well 1× Wash Buffer then tapped sharply several times on paper towels to remove any residual wash buffer. 100 μL/well of ELISA HRP Substrate was added to all wells. The plates were then sealed and incubated at room temperature for 20 minutes protected from light. 100 μL/well ELISA Stop Solution was added to all wells and gently mixed.

The absorbance values were read on the Synergy Mx plate reader at 450 nm and 620 nm. Final values obtained were the A450 nm-A620 nm values.

The standard curve was corrected for matrix concentration of active GLP-1 by subtracting the mean of the 0 pM GLP-1 standard from each of the standard well A450 nm-A620 nm values in Excel. Duplicate sample concentrations were determined by non-linear regression using a 4 parameter logistic model of the background corrected standard following an x=log(x) transformation of GLP-1 concentration. ANOVA and multiple comparison tests were conducted using GraphPad Prism 6. The area under curve was integrated using the Linear-Log Trapezoidal Method on Microsoft Excel, with post hoc testing conducted in GraphPad.

FIG. 16 shows average plasma GLP-1 concentration for n=6 rats per vehicle and SEQID-00105, and n=5 rats per SEQID-00338 treatment group, over the course of the experiment. One-way ANOVA with Dunnett's multiple comparisons tests were used to compare within each treatment to time 0 and between treatments at the same time point to vehicle. Vehicle showed no significant difference in GLP-1 (7-36) concentration at any time point after vehicle gavage. SEQID-00105 showed significantly greater GLP-1 (7-36) concentration at 15 minutes and 30 minutes after glucose challenge (P<0.0001 and P<0.05, respectively). SEQID-00338 had a significantly greater GLP-1 concentration at 15 minutes following glucose challenge (P<0.005).

When compared to vehicle at each time point only SEQID-00105 had a significantly greater GLP-1 (7-36) concentration than vehicle at 15 minutes following glucose challenge (P<0.005).

FIG. 17 shows the area under curve for GLP-1 (7-36) for each treatment group integrated to 0-90 and 0-60 minutes. No treatment had a significantly different GLP-1 (7-36) integrated AUC compared to vehicle at either 0-90 or 0-60 minutes.

In another experiment, an OGTT was performed in 24 fasted, male Zucker fatty rats with indwelling jugular vein catheters (JVC) to assess the acute effects of nutritive protein dosing on glycemic control as well as insulin and GLP-1 levels. The capacity of SEQID-00105 to improve glucose control was tested by comparing the blood glucose excursions following treatment with SEQID-00105, alogliptin, SEQID-00105+alogliptin, or vehicle in the context of an oral glucose tolerance test. All rodents were approximately 10-11 weeks old, weighed an average of approximately 430 g, and were acclimated for 4-5 days prior to testing. Prior to treatment, all rats were fasted overnight for fourteen hours. Fasted animals were treated with formulations of nutritive proteins dissolved in water by oral gavage (see Table E22B), and fifteen minutes after treatment gavage were then challenged with an oral gavage of glucose (2 g/kg). Blood glucose was measured and blood was collected at nine time points (−15, 0, 15, 30, 60, 90, 120, 180 and 240 minutes relative to the glucose challenge).

TABLE E22B
Group | Body weight (average, g) | Treatment ID | Polypeptide dose (mg/kg) | Volume (ml/kg) | DPP IV inhibitor dose (mg/kg) | Glucose dose (g/kg)
1 | 430.5 | Vehicle | 0 | 11.4 | 0 | 2
2 | 432.3 | SEQID-00105 | 2850 | 11.4 | 0 | 2
3 | 431.7 | Alogliptin | 0 | 11.4 | 0.3 | 2
4 | 435.5 | SEQID-00105 + Alogliptin | 2850 | 11.4 | 0.3 | 2

FIG. 18 shows the average blood glucose values during the OGTT described herein. N=6 rats per treatment group. The error bars shown are the standard errors of the mean. Each group was compared post hoc on GraphPad Prism 6, two-way ANOVA, Dunnett's multiple comparisons test first comparing within each group to the fasting glucose concentration, followed by a comparison of each time point to vehicle. No treatment group blood glucose was significantly different from fasting at the time of the glucose challenge.

Fasting glucose and blood glucose at the time of the glucose challenge were not significantly different between each group and vehicle. At 15 minutes following glucose challenge, SEQID-00105 alone and SEQID-00105+Alogliptin had significantly lower blood glucose compared to vehicle (P=0.0003 & P<0.0001, respectively). At 30 minutes following the glucose challenge, Alogliptin alone and SEQID-00105+Alogliptin had significantly lower blood glucose than vehicle (P=0.0424 & P=0.0021, respectively). At 60 minutes following the glucose challenge, only Alogliptin alone was significantly lower than vehicle (P<0.0001). No group after 60 minutes had significantly different blood glucose than vehicle.

The integrated area under curve (AUC) was calculated on Microsoft Excel using the Linear-Log Trapezoidal Method, and two-way ANOVA and Dunnett's multiple comparisons tests were conducted on GraphPad Prism 6.03. No treatment showed a statistically significant difference in glucose AUC from vehicle between time 0 and 60 minutes. Between time 0 and 120 minutes and between time 0 and 240 minutes, only the alogliptin alone treatment was significantly less than vehicle (P=0.0051 & P=0.0054, respectively).

Example 23: Use of Nutritive Polypeptides to Improve Glycemic Control and Fasting Glucose in Diet Induced Obese Mice

The effects of chronic dosing of therapeutic nutritive polypeptides described herein are evaluated by oral glucose tolerance tests in diet induced obese (DIO) mice. In particular, chronic dosing in this animal model of metabolic disease can affect glycemic control, insulin resistance, fasting glucose and insulin levels, and comparison of fasting levels as well as glucose area under the curve (AUC) during an OGTT before and after a period of daily dosing provides a measure of compound efficacy.

Four groups of 10 male C57BL/6 mice are fed a high fat diet ad libitum for 14 weeks to ensure that they develop elevated fasting glucose levels (hyperglycemia), elevated fasting insulin levels (hyperinsulinemia), and impaired glucose control (Xu. H. et al. Chronic inflammation in fat plays a crucial role in the development of obesity-related insulin resistance. J. Clin. Invest. (2003) 112: 1821-1830). A single group of 10 male C57BL/6 mice (group 1) is fed a normal diet for comparison of treatment arms on HFD to normal diet at the same age, handling, and housing conditions. All mice are housed and acclimated for three days prior to the start of the study.

All mice are fasted overnight for fourteen hours before OGTT treatment, and a blood sample is collected the morning of the study prior to dosing for analysis of fasting glucose and hormone levels. Fasted animals are treated with a formulation of the provided nutritive polypeptides by oral gavage fifteen minutes before glucose challenge. Glucose (2 g/kg) is orally administered at time zero. Blood glucose is measured at seven time points (−15, 0, 15, 30, 60, 90, 120 minutes). Age-matched normal mice under regular diet are used as an internal standard for the analytics. In groups 2-5, mice receive a daily dose of the provided therapeutic nutritive polypeptides (2.85 g/kg) (groups 4 and 5) or vehicle control (groups 2 and 3) for 15-30 days via daily gavage or formulation into chow. All blood is collected in an EDTA collection tube containing plasma stabilizers (a DPP4 inhibitor and a protease cocktail inhibitor), processed for plasma, and stored frozen at −80° C.

An OGTT as described above is performed again on the final day of the study with either vehicle (groups 2 and 4) or test article (groups 1, 3, and 5). Prior to the final test article or vehicle dose, a blood sample is collected for analysis of fasting glucose and hormone levels. At the completion of the final OGTT, all mice are sacrificed and the entire blood volume is collected via cardiac puncture. Analysis of insulin levels is performed using an ELISA method described herein.

Comparisons of end of study OGTT glucose AUC across groups 2 and 4 describe the effect of chronic test article administration without the acute effects of test article administration on the OGTT. Comparisons of end of study OGTT glucose AUC across groups 3 and 5 describe the effect of chronic test article administration and include the acute effects of test article administration on the OGTT. Comparisons of fasting glucose and insulin levels at the beginning and end of the study within group 4 or within group 5 describe the effect of chronic dosing on fasting glucose and insulin levels.

Example 24: Use of Nutritive Polypeptides to Induce Insulin Secretion in Rodents

Ingested amino acids have been shown to have the capacity to induce insulin secretion (Gannon M. C. and Nuttall F. Q. 2010. Amino Acid Ingestion and Glucose Metabolism—A Review. IUBMB Life, 62(9): 660-668). Protein ingestion increases plasma insulin significantly in comparison to ingestion of glucose alone (Nuttall, et al. 1984. Effect of protein ingestion on the glucose and insulin response to a standardized oral glucose load. Diabetes Care. 7(5):465-470). Ingested protein increases insulin secretion in part via the action of incretin hormones, i.e., glucagon-like peptide-1 (GLP-1) and glucose-dependent insulinotropic peptide (GIP), which are secreted by endocrine cells upon luminal exposure to nutrients (Baggio L L & DJ Drucker. 2007. Biology of incretins: GLP-1 and GIP. Gastroenterology. 132:2131-2157). Amino acids such as leucine and arginine have also been shown to directly stimulate insulin release (Newsholme P. et al. New insights into amino acid metabolism, β-cell function and diabetes. Clin. Sci. (2005) 108: 185-194). In these studies, the insulin response after acute dosing of various nutritive proteins was measured in rodents.

Animals were treated and acutely dosed with various test articles as part of pharmacokinetic rodent experiments according to methods described herein. Unless otherwise stated all dosing was at 2.85 g/kg body weight. All treatment groups except for SEQID-00240 which had n=5 rats, and SEQID-00587 which had n=3, contained n=4 rats.

Quantification of Plasma Insulin Using AlphaLISA® Insulin Immunoassay

Plasma samples were thawed on ice and centrifuged for 10 minutes at 1109×g to pellet insoluble material. AlphaLISA® Insulin Immunoassay Kit (PerkinElmer, AL204C) was removed from 4° C. cold room and kept on ice during assay set up. 1× Assay Buffer was prepared by diluting 10× Assay Buffer with milliQ water.

Standard buffer for dilution of insulin was prepared at 25.9% fasted rat plasma in 1× Assay Buffer and used to prepare a 16 point standard curve. Plasma samples were prepared by diluting 7:20 in 1× Assay Buffer in a 96-well PCR microplate.

Acceptor bead mix was prepared by diluting Acceptor Beads and Biotinylated Anti-Insulin Antibody 1/400 in 1× Assay Buffer. Acceptor bead mix was pipetted, 20 μL/well, into a white opaque 384 well microplate (PerkinElmer, OptiPlate-384). To this mixture was added 10 μL/well of insulin standard in fasted rat plasma or sample rat plasma in duplicate. The plate was sealed with a foil plate seal and incubated on a horizontal shaker at 600 rpm for 60 minutes at room temperature.

It was necessary to protect the Streptavidin Coated Donor Beads from light; as such, in a darkened room, donor bead mix was prepared by diluting the streptavidin coated donor beads 1/125 in 1× Assay Buffer. After the first incubation step, 20 μL/well donor bead mix was added to the standard and samples. The assay plate was sealed and returned to the horizontal shaker at 600 rpm, for 30 minutes at room temperature. Following the donor bead incubation, luminescence was read on the EnSpire Alpha plate reader.

Data were analyzed with Microsoft Excel version 14.0.7128.5000 (32-bit) and GraphPad Prism version 6.03. A standard curve was generated against log(Insulin (μIU)). Rat plasma insulin concentrations were interpolated using a sigmoidal, four parameter logistic equation. The mean insulin concentration of the duplicate technical replicates for each rat was plotted against time. Area under curve was integrated for 0-240 minutes and 0-60 minutes using the Linear-Log Linear Method in Microsoft Excel.

Quantification of Plasma Insulin Using a Rat Insulin Enzyme Linked Immunosorbent Assay (ELISA).

An Ultra-Sensitive Rat Insulin ELISA Kit was obtained from Crystal Chem, Inc. (Catalog number 90060, Downers Grove, Ill.). Plates were washed using a BioTek ELx50 microplate strip washer (BioTek, Winooski, Vt.). Absorbance was read on a Synergy Mx monochromator-based microplate reader (BioTek, Winooski, Vt.). Data was analyzed using Microsoft Excel version 14.0.7128.5000 (Microsoft Corporation, Redmond, Wash.) and GraphPad Prism version 6.03 for Windows (GraphPad Software, La Jolla, Calif.).

The ELISA kit was pre-warmed to room temperature for 30 minutes prior to beginning the assay set up. The standard curve dilutions were prepared in accordance with the manufacturer's instructions for running the assay in Wide Range format.

Plasma matrix from the naïve group and sample plasma were thawed on ice and then centrifuged at approximately 1000×rcf for 10 minutes at 4° C. to pellet any insoluble material.

Matrix Assay Buffer for running the insulin standard was prepared using plasma matrix from the naïve group to a concentration of 5.26% in 95 μL. 95 μL of Assay Buffer was added to all sample wells, and 95 μL of Matrix Assay Buffer was added to all standard wells. 5 μL of each sample and standard were added in duplicate. The plates were incubated at 4° C. for 2 hours. The plates were then washed five times with 300 μL/well 1× Wash Buffer. The plates were tapped sharply several times on paper towels to remove any residual wash buffer.

Anti-Insulin Enzyme Conjugate Working Solution was prepared by combining 2 volumes Anti-Insulin Enzyme Conjugate Stock with 1 volume Enzyme Conjugate Diluent, and mixing by pipetting up and down and gently vortexing. 100 μL/well of Anti-Insulin Enzyme Conjugate Working Solution was added to all wells. The plates were sealed and incubated at room temperature for 30 minutes and then washed seven times with 300 μL/well 1× Wash Buffer. The plates were tapped sharply several times on paper towels to remove any residual wash buffer. 100 μL/well Enzyme Substrate Solution was then added to each well and incubated in the dark at room temperature for 40 minutes. 100 μL/well Stop Solution was added to all wells. The absorbance was read on the Synergy Mx plate reader at 450 nm and 630 nm. Final values obtained were the A450 nm-A630 nm values.

The insulin standard curve was corrected for matrix concentration of insulin by subtracting the mean of the 0 ng/mL insulin standard from each of the standard well A450 nm-A630 nm values in Excel. Duplicate sample concentrations were determined by non-linear regression using a 4 parameter logistic model of the background corrected standard following an x=log(x) transformation of insulin concentration in GraphPad Prism 6. ANOVA and multiple comparison tests were conducted on GraphPad Prism 6. Area under curve was integrated for 0-240 minutes and 0-60 minutes using the Linear-Log Linear Method in Microsoft Excel.

In Vivo Plasma Insulin Concentrations

FIGS. 19 and 20 show the combined biological replicate data for a study of vehicle and SEQID-00105 administered at three different doses and SEQID-00426, SEQID-00338, SEQID-00341 administered at one dose, where plasma insulin was measured using the AlphaLISA Insulin kit. All error bars represent the standard error of the mean. A one-way ANOVA with Dunnett's multiple comparisons tests were used to compare within each treatment to time 0 and between treatments at the same time point to vehicle.

In FIG. 19, SEQID-00105 at 2.85 g/kg had a statistically significant increase in plasma insulin concentration at 15, 30 and 60 minutes following gavage (P<0.0001, P<0.0001, and P<0.05, respectively); SEQID-00105 at 1.78 g/kg had a statistically significant increase in plasma insulin concentration at 15 and 30 minutes following gavage (P=0.0005 and P<0.05, respectively); at the lowest SEQID-00105 dose, plasma insulin concentration over the time course was not significantly different from time 0. When compared to the plasma insulin concentration of the vehicle control at each time point, SEQID-00105 at 2.85 g/kg showed statistically significantly greater plasma insulin than vehicle at 15 minutes and 30 minutes following oral gavage (P<0.0001 and P<0.001, respectively), and SEQID-00105 at 1.78 g/kg showed a statistically significant increase in plasma insulin concentration at 15 minutes following oral gavage (P=0.0005). In FIG. 20, only SEQID-00338 had a statistically significant increase in plasma insulin concentration from time 0 at 15 and 30 minutes following oral gavage (P=0.005 and P<0.05, respectively). When compared to vehicle plasma insulin concentration at each time point, SEQID-00338 was significantly greater than vehicle at 15 minutes following oral gavage (P<0.01).

FIGS. 21 and 22 show the integrated area under curves for plasma insulin concentrations measured for vehicle, SEQID-00105, SEQID-00426, SEQID-00338, and SEQID-00341; error bars are the standard error of the mean. One-way ANOVA with Dunnett's multiple comparisons tests were used to compare the AUCs to vehicle. SEQID-00105 at 2.85 g/kg had a statistically significantly greater plasma insulin AUC than vehicle when integrated from 0-240 minutes and 0-60 minutes (P=0.0005 and P<0.05, respectively).

FIGS. 23 and 24 show the combined biological replicate data for a study of vehicle and SEQID-00423, SEQID-00587, SEQID-00105, SEQID-00424, SEQID-00425, and SEQID-00429, where plasma insulin was measured using the AlphaLISA Insulin kit. All error bars represent the standard error of the mean. One-way ANOVA with Dunnett's multiple comparisons tests were used to compare within each treatment to time 0 and between treatments at the same time point to vehicle.

FIGS. 25 and 26 show the integrated area under curves for plasma insulin concentrations shown in FIGS. 23 and 24. One-way ANOVA with Dunnett's multiple comparisons tests were used to compare the AUCs to vehicle. SEQID-00587 had a significantly greater plasma insulin AUC when integrated at 0-240 minutes (P<0.005).

FIG. 27 shows the combined biological replicate data for a study of vehicle and SEQID-00105, SEQID-00240, and SEQID-00559, where plasma insulin was measured using the Rat Insulin ELISA kit. All error bars represent the standard error of the mean. One-way ANOVA with Dunnett's multiple comparisons tests were used to compare within each treatment to time 0. SEQID-00105 and SEQID-00559 both had a statistically significant increase in plasma insulin concentration at 15 minutes post gavage compared to time 0 (P<0.05, both). SEQID-00240 had a statistically significant increase in plasma insulin at 15 and 30 minutes post gavage compared to time 0 (P<0.05 and P<0.01, respectively). The vehicle had no statistically significant change in plasma insulin concentration compared to time 0.

FIG. 28 shows the integrated area under curve from 0-240 and 0-60 minutes post gavage for each treatment, error bars are the standard error of the mean. No treatment showed a significantly greater AUC compared to vehicle at 0-60 or 0-240 minutes by a Dunnett's multiple comparisons test.

Example 25: Nutritive Polypeptide Stimulation of Glucagon Like Peptide 2 Secretion in Healthy, Fasted Rats

Glucagon-like peptide-2 (GLP-2) is a thirty three amino acid peptide produced by the post-translational cleavage of proglucagon. GLP-2 is secreted by the intestinal enteroendocrine L-cells of humans and rodents along with GLP-1 in response to exposure to nutrients in the gut lumen. GLP-2 has been shown previously to improve outcomes in the treatment of short bowel syndrome (Brinkman A S, Murali S G, Hitt S, Solverson P M, Holst J J, Ney D M. 2012. Enteral nutrients potentiate glucagon-like peptide-2 action and reduce dependence on parenteral nutrition in a rat model of human intestinal failure. Am. J. Physiol. Gastrointest. Liver Physiol. 303(5):G610-G622) by supporting intestinal growth (Liu X, Murali S G, Holst J J, Ney D M. 2008. Enteral nutrients potentiate the intestinotrophic action of glucagon-like peptide-2 in association with increased insulin-like growth factor-I responses in rats. Am. J. Physiol. Gastrointest. Liver Physiol. 295(6):R1794-R1802).

Animals were treated and acutely dosed with test articles as part of pharmacokinetic rodent experiments according to methods described herein. In this experiment, two test articles were analyzed: vehicle and a nutritive formulation of SEQID-00240, dosed at 1.54 g/kg.

Total GLP-2 Enzyme-Linked Immunosorbent Assay (ELISA)

Total GLP-2 was measured with Millipore Total GLP-2 ELISA Kit (Millipore, EZGLP2-37K). The ELISA kit was equilibrated to room temperature for a minimum of 30 minutes prior to running the assay. The kit GLP-2 Standard, Quality Control 1 and Quality Control 2 were reconstituted with 500 μL MilliQ water, inverted 5 times and incubated at room temperature for 5 minutes, then gently vortexed to mix. The standard curve was prepared by diluting the GLP-2 Standard serially 1:1 in kit Assay Buffer to generate an eight point standard curve including 0 ng/mL GLP-2. Plasma samples were thawed on ice and centrifuged for 10 minutes at approximately 1109×rcf to pellet insoluble material. Fasted plasma matrix from untreated, fasted rat was prepared for running the standard and quality controls in duplicate at 20% (2×).
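The serial dilution described above can be sketched as follows. This is a minimal illustration, assuming that "1:1" means equal volumes of standard and Assay Buffer (a two-fold step) and that the 0 ng/mL blank counts as one of the eight points. The top concentration used here is hypothetical; the actual value depends on the kit standard.

```python
def serial_dilution(top_ng_per_ml, n_points=8, fold=2.0):
    """Return n_points standard concentrations: (n_points - 1) serial
    dilutions starting at top_ng_per_ml, followed by a 0 ng/mL blank."""
    nonzero = [top_ng_per_ml / fold ** i for i in range(n_points - 1)]
    return nonzero + [0.0]

# HYPOTHETICAL top standard (ng/mL); not taken from the kit documentation.
standards = serial_dilution(top_ng_per_ml=64.0)
print(standards)  # [64.0, 32.0, 16.0, 8.0, 4.0, 2.0, 1.0, 0.0]
```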

1× Wash Buffer was prepared in a clean 500 mL glass bottle by combining 50 mL 10× Wash Buffer with 450 mL MilliQ water. Strips were prepared by washing 3 times with 300 μL 1× Wash Buffer applied with a 30-300 μL 8-channel pipette. Between washes the wash buffer was decanted into a waste receptacle and the plate tapped sharply on a stack of paper towels to remove remaining wash buffer.

Following the first wash, 90 μL of Assay Buffer was added to sample wells, and 50 μL of 20% matrix in Assay Buffer was added to all standard and quality control wells. 10 μL of sample was added to each sample well and 50 μL of standard and quality control were added to matrix containing wells. Samples, standards and quality control wells were run in duplicate, except for 0 ng/mL GLP-2 standard which was run in quadruplicate.
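The well volumes above appear chosen so that sample wells and standard wells end up with the same plasma content. The short check below is an illustrative sketch only, reading the Assay Buffer volume in sample wells as 90 μL; the function name is hypothetical.

```python
def final_fraction(component_ul, component_fraction, total_ul):
    """Fraction of plasma (or plasma matrix) in the well after mixing."""
    return component_ul * component_fraction / total_ul

# Sample wells: 90 uL Assay Buffer + 10 uL undiluted plasma.
sample_total = 90 + 10
sample_plasma = final_fraction(10, 1.0, sample_total)

# Standard/QC wells: 50 uL of 20% matrix + 50 uL standard or QC.
std_total = 50 + 50
std_matrix = final_fraction(50, 0.20, std_total)

print(sample_total, std_total)    # both wells hold 100 uL
print(sample_plasma, std_matrix)  # both wells contain about 10% plasma
```

Under these assumptions, every well holds 100 μL at roughly 10% plasma, so the standards are read in a matrix comparable to the samples.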

The plate was sealed with a plastic plate seal and incubated at room temperature on a horizontal plate shaker at 450 rpm for 2 hours. Following the first incubation, the plate was washed three times with 1× Wash Buffer, decanting into a waste receptacle and tapping the inverted plate sharply on a stack of paper towels to remove excess wash buffer after each wash.

100 μL of Detection Antibody was added to each well and the plate was sealed with a plastic plate seal and incubated at room temperature on a horizontal plate shaker at 450 rpm for 1 hour. Following the second incubation, the plate was washed three times with 1× Wash Buffer, decanting into a waste receptacle and tapping the inverted plate sharply on a stack of paper towels to remove excess wash buffer after each wash.

100 μL Enzyme Solution was added to each well and the plate was sealed with a plastic plate seal and incubated at room temperature on a horizontal plate shaker at 450 rpm for 30 minutes. Following the third incubation, the plate was washed three times with 1× Wash Buffer, decanting into a waste receptacle and tapping the inverted plate sharply on a stack of paper towels to remove excess wash buffer after each wash.

100 μL Substrate was added to each well and the plate was sealed with a plastic plate seal and an opaque foil seal and incubated at room temperature on a horizontal plate shaker at 450 rpm for 20 minutes. Following substrate reaction, 100 μL of Stop Solution was added to each well and the plate gently shaken to mix.

Absorbance was measured on a SynergyMX Plate Reader at 450 nm and 590 nm. Measured values were the difference between 450 nm and 590 nm absorbance values.
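The dual-wavelength correction can be sketched as below. The readings are hypothetical and the helper names are introduced for illustration only.

```python
def dual_wavelength(a450, a590):
    """Reference-corrected reading: absorbance at 450 nm minus 590 nm."""
    return a450 - a590

def replicate_mean(readings):
    """Mean corrected absorbance over replicate wells.

    readings is a list of (A450, A590) pairs, one pair per well.
    """
    corrected = [dual_wavelength(a450, a590) for a450, a590 in readings]
    return sum(corrected) / len(corrected)

# HYPOTHETICAL duplicate wells for one sample.
sample_wells = [(0.80, 0.05), (0.84, 0.05)]
print(replicate_mean(sample_wells))
```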

Data were analyzed with GraphPad Prism 6.03. A standard curve was fit against log(Total GLP-2 (ng/mL)) after subtracting the 0 ng/mL background value. Rat plasma GLP-2 concentrations were interpolated using a sigmoidal, four parameter logistic equation. For each rat, the mean GLP-2 concentration of the duplicate technical replicates was plotted against time. The integrated area under the curve was calculated in Microsoft Excel for each biological replicate as the sum of the areas between consecutive time points, using the Linear-Log Trapezoidal Method.
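The two calculations in this paragraph can be sketched as follows. This is an illustration under stated assumptions, not the analysis actually run in Prism or Excel. The four parameter logistic curve is written in the common log-concentration form (bottom, top, log EC50, Hill slope), and the fitted parameter values would come from the curve fit, which is not reproduced here. The text does not spell out which variant of the linear-log trapezoidal rule was used; the sketch assumes the linear rule when concentration is rising or flat and the logarithmic rule when it is falling.

```python
import math

def four_pl(log_conc, bottom, top, log_ec50, hill):
    """Sigmoidal four parameter logistic response at log10(concentration)."""
    return bottom + (top - bottom) / (1 + 10 ** ((log_ec50 - log_conc) * hill))

def interpolate_conc(signal, bottom, top, log_ec50, hill):
    """Invert the 4PL curve: concentration (ng/mL) that gives this signal."""
    if not bottom < signal < top:
        raise ValueError("signal lies outside the fitted curve range")
    ratio = (top - bottom) / (signal - bottom) - 1
    return 10 ** (log_ec50 - math.log10(ratio) / hill)

def auc_linear_log(times, concs):
    """AUC by the trapezoidal rule: linear when rising or flat, log when falling.

    ASSUMPTION: this is one common variant of the linear-log method.
    """
    total = 0.0
    points = list(zip(times, concs))
    for (t1, c1), (t2, c2) in zip(points, points[1:]):
        dt = t2 - t1
        if 0 < c2 < c1:
            total += (c1 - c2) * dt / math.log(c1 / c2)  # log trapezoid
        else:
            total += (c1 + c2) * dt / 2                  # linear trapezoid
    return total

# HYPOTHETICAL time course (minutes, ng/mL) for one animal.
times = [0, 15, 30, 60, 120, 240]
concs = [1.0, 2.0, 4.0, 2.0, 1.0, 1.0]
print(auc_linear_log(times, concs))
```

Passing only the first four time points to `auc_linear_log` would give the 0-60 minute area, and the full list gives the 0-240 minute area.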

FIG. 29 shows the calculated total GLP-2 concentration over a 4 hour time course for SEQID-00240 and vehicle control. Vehicle GLP-2 concentration did not change significantly at any of the time points sampled compared to time 0, whereas SEQID-00240 showed a statistically significant increase in GLP-2 concentration relative to time 0 at 15 (P<0.00), 30 (P<0.0001) and 60 minutes (P<0.01) following SEQID-00240 gavage (Dunnett's multiple comparison test). SEQID-00240 was compared to vehicle at each time point by ordinary One-Way ANOVA with a Dunnett's multiple comparisons test post hoc analysis. GLP-2 concentration was not significantly different between treatments at time 0. GLP-2 concentrations were statistically significantly greater than vehicle at 15 (P<0.001), 30 (P<0.0001), 60 (P<0.0001) and 120 minutes (P<0.05) following treatment. These data show that an acute dose of SEQID-00240 but not vehicle induces secretion of GLP-2 in healthy fasted rodents.

FIG. 30 shows the integrated GLP-2 area under the curve over the first hour and the full 4 hours. The area under the curve for GLP-2 was significantly greater in the SEQID-00240 treatment compared to vehicle when integrated over 0-60 minutes and over 0-240 minutes (P<0.005 and P<0.01, respectively, unpaired 2-tailed Student's t-test). These data indicate that acute dosing of SEQID-00240 significantly stimulated GLP-2 secretion in comparison to vehicle within the first hour of acute dosing in healthy fasted rodents.
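For reference, the test statistic of an unpaired Student's t-test on per-animal AUC values can be sketched as below. The AUC values are hypothetical, and the sketch assumes the classic pooled-variance form. It returns only the t statistic and degrees of freedom; converting these to a two-tailed P value requires the t distribution, typically from a statistics package, and is not shown.

```python
import math
from statistics import mean, variance

def students_t(group_a, group_b):
    """Pooled-variance (Student's) t statistic and degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df

# HYPOTHETICAL per-animal AUC values (treatment vs. vehicle).
treatment_auc = [410.0, 455.0, 390.0, 470.0]
vehicle_auc = [250.0, 270.0, 240.0, 265.0]
t_stat, dof = students_t(treatment_auc, vehicle_auc)
print(t_stat, dof)
```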

Example 26: Effect of Orally Delivered Nutritive Polypeptides on Plasma Insulin and Incretin Levels in Humans

The insulin and incretin response to protein ingestion is predicated on the delivery of amino acids. The purpose of this study was to examine the changes in plasma insulin concentrations in response to SEQID-00426 and SEQID-00105 over a period of 240 minutes. Two groups of four apparently healthy subjects between the ages of 18 and 50 received 20 grams of the nutritive polypeptide formulations orally. All subjects were fasted overnight (>8 hrs) before starting the study. Venous blood samples were collected at specified time points (i.e., 0, 15, 30, 60, 90, 120, 150, 180, 210 and 240 minutes) following the oral ingestion of nutritive polypeptide to assess changes in plasma insulin and incretin concentrations. FIG. 31 shows the average insulin response of all subjects to SEQID-00105, and FIG. 32 shows the average fold response over baseline (time 0 min), measured as described herein. The error bars on all figures correspond to the standard error of the mean. The first phase insulin response occurs between time 0 min and 90 min. The second phase insulin response occurs between 90 min and 210 min. A one-way ANOVA comparison across time indicates that there is a significant change in plasma insulin over time (p=0.003). A Dunnett's multiple comparisons test indicates that the insulin values at the 15 and 30 min time points are significantly different from that at time 0 min (p<0.05).

FIG. 33 shows the average insulin response of all patients to SEQID-00426, and FIG. 34 shows the average fold response over baseline (time 0 min), measured as described herein. The error bars on all figures correspond to the standard error of the mean.

FIG. 35 shows the average total Gastric Inhibitory Polypeptide (GIP) response of all patients to SEQID-00426, and FIG. 36 shows the average fold response over baseline (time 0 min), measured as described herein. The error bars on all figures correspond to the standard error of the mean.
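The fold-over-baseline summaries and standard-error bars described for FIGS. 31-36 can be sketched as follows. The values are hypothetical, and the sketch assumes that fold change is computed per subject against that subject's own time 0 value before averaging; the source does not state the order of these operations.

```python
import math
from statistics import mean, stdev

def fold_over_baseline(series):
    """Divide each time point by the time 0 value of the same subject."""
    baseline = series[0]
    return [value / baseline for value in series]

def mean_and_sem(values):
    """Mean and standard error of the mean across subjects."""
    return mean(values), stdev(values) / math.sqrt(len(values))

# HYPOTHETICAL analyte values for four subjects at 0, 15 and 30 minutes.
subjects = [
    [5.0, 15.0, 10.0],
    [4.0, 8.0, 8.0],
    [10.0, 20.0, 15.0],
    [8.0, 24.0, 12.0],
]
folds = [fold_over_baseline(s) for s in subjects]
# Transpose so that each entry groups all subjects at one time point.
summary = [mean_and_sem(at_time) for at_time in zip(*folds)]
for label, (avg, sem) in zip([0, 15, 30], summary):
    print(label, avg, sem)
```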

Example 27. In Vitro Demonstration of Skeletal Muscle Cell Growth and Signaling Using Nutritive Polypeptide Amino Acid Compositions Containing Tyrosine, Arginine, and/or Leucine

The mammalian target of rapamycin (mTOR) is a protein kinase and a key regulator of cell growth, notably via protein synthesis. mTOR acts as a master regulator of cellular metabolism that nucleates two complexes, mTORC1 and mTORC2, that have different kinase specificity and distinct protein partners (citations from ESS-020).

mTOR drives protein synthesis across tissues. mTORC1 mediated response to growth signaling is gated by amino acids. The localization of the response to lysosomes couples mTOR activation to muscle protein catabolism. mTORC1 can be gated by essential amino acids (EAAs), leucine, and glutamine. Amino acids must be present for any upstream signal, including growth factors, to activate mTORC1 (citations from ESS-020).

These experiments demonstrated the capacity of arginine, tyrosine as well as leucine to modulate mTORC1 activation by measuring downstream phosphorylation of the ribosomal protein S6 (rps6) in response to stimulation by single amino acids in vitro.

Primary Rat Skeletal Muscle Cell (RSKMC) culture medium was purchased from Cell Applications (Catalog number: R150-500, San Diego, Calif.). Starvation medium DMEM/F12 was bought from Sigma (Catalog number: D9785, St. Louis, Mo.). Customized starvation medium Mod.4, which does not contain any amino acids, phenol red, or glucose, was purchased from Life Technologies (Catalog number: 12500062, Grand Island, N.Y.). Fetal bovine serum (FBS) and other growth factors were obtained from Cell Applications (Catalog number: R151-GS, San Diego, Calif.). Tissue culture flasks and clear bottom 96-well tissue culture plates were purchased from Corning Incorporated (Catalog number: 430641 and 353072, respectively, Corning, N.Y.). Trypsin/EDTA was obtained from Life Technologies (Catalog number: 25200, Grand Island, N.Y.). DPBS and HBSS were also purchased from Life Technologies (Catalog number: 14190, 14175, respectively). AlphaScreen® SureFire® Ribosomal Protein S6 Assay Kits were obtained from Perkin Elmer (Catalog number: TGRS6P2S10K).

Primary Rat Skeletal Muscle Cell (RSKMC) culture. RSKMC were isolated using the protocol described herein and cryopreserved in liquid nitrogen. The cells were also maintained in RSKMC medium (Cell Applications) in T75 tissue flasks in a 37° C., 5% CO2 tissue culture incubator (Model 3110, Thermo Fisher Scientific). The cells were split every three days when they reached 90% confluency. RSKMC cells were cultured in RSKMC medium in T75 tissue flasks to 100% confluency. The culture medium was aspirated from the culture flask, the cells were rinsed once with 10 ml of DPBS, and then 1.5 ml of 0.25% trypsin/EDTA was added to the cells. After the cells were detached from the flask, 10 ml of culture medium was added. The medium was pipetted up and down with a 10 ml pipet to detach the cells from the flask. The cells were then seeded into clear bottom 96-well tissue culture plates at a density of 50,000 cells per well. Following overnight culture in a 37° C., 5% CO2 incubator, the cells were starved for 4 hours in starvation DME/F12 medium without FBS and leucine in a 37° C., 5% CO2 tissue culture incubator, then starved for another hour in Hank's Buffered Salt Solution (HBSS). The cells were stimulated with different concentrations of leucine in starvation medium for 15 and 30 minutes. The cells were also treated with 5 nM of Rapamycin (R0395, Sigma) or 100 nM of Insulin (19278, Sigma) for 15 and 30 minutes. The cells were lysed in 20 μL of Lysis Buffer (Perkin Elmer) for 10 minutes at room temperature with shaking at 725 rpm. The cell lysates were stored at −80° C. and the AlphaScreen® assay was performed the next day. The AlphaScreen® SureFire® Ribosomal Protein S6 Assay was performed according to the manufacturer's manual.

FIG. 37 shows the relative AlphaScreen signal (y-axis) measured at different leucine concentrations, demonstrating that leucine stimulates phosphorylation of rps6 in primary RSKMC in a dose-dependent manner. This stimulation was inhibited in a dose-dependent manner by the mTOR inhibitor, rapamycin.

FIG. 38 shows that leucine stimulates phosphorylation of rps6 in primary RSKMC in a dose-dependent manner both in a complete amino acid medium (Arg, His, Lys, Asp, Glu, Ser, Thr, Asn, Gln, Cys, Gly, Pro, Ala, Val, Ile, Met, Phe, Tyr, Trp) and in a minimal 12 amino acid mixture containing only Arg, His, Lys, Thr, Gln, Cys, Val, Ile, Met, Phe, Tyr, and Trp at their DME/F12 concentrations (see Table E27B).

Primary skeletal muscle cells were obtained from the soleus (Sol), gastrocnemius (GS) and extensor digitorum longus (EDL) of two Sprague-Dawley rats. FIGS. 39, 40, and 41 show that leucine stimulates the mTOR RPS6 pathway in these isolated primary cells in a dose-dependent manner.

Arginine, tyrosine and leucine are required to fully stimulate the mTOR pathway. Cells were starved as described above in Mod.4 medium without fetal bovine serum and then stimulated in Mod.4 medium lacking each of the respective single amino acids (see table E27A and E27B for composition of Mod.4 medium and DME/F12 amino acid levels, respectively).

TABLE E27A
Vitamins                    μM         Other                     mM         Inorganic Salts                    mM
Choline chloride            64.1       D-Glucose (Dextrose)      17.5       Calcium chloride, anhydrous        1.05
D-Calcium pantothenate      4.7        Sodium Pyruvate           0.5        Copper (II) Sulfate Pentahydrate   5.21E−06
Folic Acid                  6.01       HEPES                     15         Magnesium Sulfate (anhyd.)         0.407
Niacinamide                 16.56      Hypoxanthine              0.018      Magnesium Chloride                 0.301
Pyridoxine hydrochloride    9.88       Linoleic Acid             1.50E−04   Potassium Chloride                 4.157
Riboflavin                  0.58       Putrescine Hydrochloride  5.03E−04   Sodium Bicarbonate                 0.014
Thiamine hydrochloride      6.44       Thioctic Acid             5.10E−04   Sodium Chloride                    120.6
i-inositol                  70         Thymidine                 1.51E−03   Sodium Phosphate Monobasic         0.521
D-biotin                    1.43E−02   Phenol Red (%)            5.00E−04   Sodium Phosphate Dibasic           0.5
Vitamin B-12                0.5                                             Iron (III) Nitrate Nonahydrate     1.24E−04
                                                                            Iron (II) Sulfate Heptahydrate     1.50E−03
                                                                            Zinc Sulfate heptahydrate          1.50E−03

TABLE E27B
Amino Acids         μM
Glycine             250
L-Alanine           50
L-Arginine          700
L-Asparagine        57
L-Aspartic Acid     50
L-Cysteine          100
L-Glutamic Acid     100
L-Glutamine         2500
L-Histidine         150
L-Isoleucine        416
L-Leucine           451
L-Lysine            500
L-Methionine        116
L-Phenylalanine     215
L-Proline           150
L-Serine            250
L-Threonine         449
L-Tryptophan        44
L-Tyrosine          214
L-Valine            452

Primary muscle cells were starved for 2 hours, and then stimulated with 0 μM or 500 μM single amino acid in a 37° C., 5% CO2 tissue culture incubator for 30 minutes. The treatment was performed in triplicate. FIG. 42 demonstrates that the combination of leucine, arginine, and tyrosine is necessary and sufficient to activate the mTOR pathway in RSKMC to the same degree as a full complement of all 20 amino acids at their DME/F12 concentrations, and that none of the individual or paired treatments of Leu, Arg, or Tyr was capable of a similar response.

FIGS. 43, 44, and 45 show the effect of a dose response of each amino acid (Leu, Arg, Tyr) in the background of all other 19 amino acids vs the other 2 amino acids (e.g. dose response of Arg in a background of Tyr and Leu) on rps6 phosphorylation. These data indicate the synergy between Leu, Arg, and Tyr is dose dependent. Comparing the 20 amino acids response at low doses of Arg to that in a comparable high Leu and Tyr background, there is a reduction in the degree of stimulation caused by the other 17 amino acids. At high doses of Arg, the response in both backgrounds equalizes. Comparing the 20 amino acids response at low doses of Leu to that in a comparable high Arg and Tyr background, there is no difference in rps6 phosphorylation response. Comparing the 20 amino acid response at low doses of Tyr to that in a comparable high Leu and Arg background, the other 17 amino acids can further potentiate the mTOR response.

Example 28. Determination of Safety and Lack of Toxicity of Leucine-Enriched Nutritive Polypeptides Following Oral Consumption by Rodents

An acute toxicology study was completed to confirm the expected safety of nutritive polypeptides SEQID-00105, SEQID-00363, and SEQID-00426 in rodents.

Each study group contained 5 male rats and 5 female rats (10 Wistar rats, 6-7 weeks old; males 220-250 g, females 180-200 g). Test formulations were 350 g/L nutritive polypeptide, with aqueous buffer as a control. Animals were acclimated for 1 week upon arrival and given regular chow ad libitum. Before dosing, animals were weighed and pre-bleeds were taken. A single dose of 10 ml/kg was administered via oral gavage.
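The delivered dose implied by the formulation concentration and gavage volume can be worked out as below. This is simple arithmetic on the stated values (350 g/L at 10 ml/kg); the function names and the example body weight are chosen for illustration.

```python
def dose_g_per_kg(conc_g_per_l, volume_ml_per_kg):
    """Polypeptide dose per kg body weight for a given gavage volume."""
    return conc_g_per_l * volume_ml_per_kg / 1000.0

def gavage_volume_ml(body_weight_g, volume_ml_per_kg):
    """Volume to administer to one animal of the given body weight."""
    return body_weight_g * volume_ml_per_kg / 1000.0

print(dose_g_per_kg(350, 10))     # 3.5 g/kg
print(gavage_volume_ml(220, 10))  # 2.2 mL for a 220 g rat
```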