Safe Intelligent Content Modification
A computer-implemented method for deflecting abnormal computer interactions includes receiving, at a computer server system and from a client computer device that is remote from the computer server system, a request for web content; identifying, by computer analysis of mark-up code content that is responsive to the request, executable code that is separate from, but programmatically related to, the mark-up code content; generating groups of elements in the mark-up code content and the related executable code by determining that the elements within particular groups are programmatically related to each other; modifying elements within particular ones of the groups consistently so as to prevent third-party code written to interoperate with the elements from interoperating with the modified elements, while maintaining an ability of the modified elements within each group to interoperate with each other; and recoding the mark-up code content and the executable code to include the modified elements.
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority under 35 U.S.C. §119(e)(1), to U.S. Provisional Application Ser. No. 61/800,907, filed on Mar. 15, 2013, the entire contents of which are incorporated herein by reference.
This document generally relates to computer security that involves modifying content served to client computers so as to prevent malicious activity by those computers.
Computer fraud is big business both for the fraudsters and the people who try to stop them. One common area of computer fraud involves attempts by organizations to infiltrate computers of ordinary people, and by that action to trick those people into giving up confidential information, such as credit card information and access codes. For example, via an exploit commonly termed “Man in the Browser,” a user's computer can be provided with code that intercepts legitimate communications by the user, such as with the user's bank, and does so after the communications have been decrypted, e.g., by a web browser on the computer. Such code may alter the interface that the user sees, such as by generating an interface that looks to the user like their bank is requesting particular information (e.g., a PIN number) when in fact the bank would never request such information via a web page. Alternatively, the code may generate an interface that indicates to a user that a banking or shopping transaction was executed as the user requested, when in fact, the illegal organization altered the transaction so as to send the user's money to an entity associated with the organization.
Various approaches have been taken to identify and prevent such malicious activity. For example, programs have been developed for operation on client computers or at the servers of the organizations that own and operate the client computer to detect improper activity.
In general, creating a moving, unpredictable target by modifying aspects of web code each time it is served can prevent or deter a wide variety of computer attacks. For example, such techniques can be used to combat credential stuffing, in which malicious parties obtain leaked or cracked user credentials for a given web service and then use automated bots to perform credential testing at other websites or services based on the illicitly obtained credentials. By changing the content and structure of the web code each time it is served, bots that seek to either listen for user credentials or to perform automated credential testing may be thwarted by random changes in the web code that significantly complicate the bot's task of determining how to effectively interact with the web code.
Likewise, other forms of computer attacks can also be prevented or deterred by the web code transformations described in this document. Some of these attacks include: (a) denial of service attacks, and particularly advanced application denial of service attacks, where a malicious party targets a particular functionality of a website (e.g., a widget or other web application) and floods the server with requests for that functionality until the server can no longer respond to requests from legitimate users; (b) rating manipulation schemes in which fraudulent parties use automated scripts to generate a large number of positive or negative reviews of some entity such as a marketed product or business in order to artificially skew the average rating for the entity up or down; (c) fake account creation in which malicious parties use automated scripts to establish and use fake accounts on one or more web services to engage in attacks ranging from content spam, e-mail spam, identity theft, phishing, ratings manipulation, fraudulent reviews, and countless others; (d) fraudulent reservation of rival goods, where a malicious party exploits flaws in a merchant's website to engage in a form of online scalping by purchasing all or a substantial amount of the merchant's inventory and quickly turning around to sell the inventory at a significant markup; (e) ballot stuffing, where automated bots are used to register a large number of fraudulent poll responses; (f) website scraping, where both malicious parties and others (e.g., commercial competitors), use automated programs to obtain and collect data such as user reviews, articles, or technical information published by a website, and where the scraped data is used for commercial purposes that may threaten to undercut the origin website's investment in the scraped content; and (g) web vulnerability assessments in which malicious parties scan any number of websites for security vulnerabilities by analyzing the web code and structure of each site.
The systems, methods, and techniques for web code modifications described in this paper can prevent or deter each of these types of attacks. For example, by randomizing the implicit references in web code that may be used for making requests to a web server or by randomly injecting distractor fields into the code that were not originally part of the code provided by the web server, the effectiveness of bots and other malicious automated scripts is substantially diminished.
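As a minimal sketch of the distractor-field idea, and not a definitive implementation, the following Python fragment inserts randomly named fields at random positions among a form's original fields and later strips them from a submitted request. The function names add_distractors and strip_distractors, the "d_" prefix, and the representation of a form as a list of field names are all assumptions made for illustration.

```python
import random
import secrets


def add_distractors(fields, count=2, rng=None):
    """Return (served_fields, distractor_names) for one page load.

    Hypothetical helper: `fields` is the list of field names supplied by
    the web server; distractors are extra names that were not part of it.
    """
    rng = rng or random.SystemRandom()
    served = list(fields)
    distractors = set()
    for _ in range(count):
        name = "d_" + secrets.token_hex(6)  # fresh random name per load
        distractors.add(name)
        # Insert at a random position; the original fields keep their order.
        served.insert(rng.randrange(len(served) + 1), name)
    return served, distractors


def strip_distractors(submitted, distractors):
    """Remove distractor fields before a request reaches the web server."""
    return {k: v for k, v in submitted.items() if k not in distractors}
```

Because the distractor names and positions differ on every load, a script that relies on a fixed field layout would have no stable pattern to target, while the web server still receives only the fields it originally defined.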
Such modification of the served code can help to prevent bots or other malicious code from exploiting or even detecting weaknesses in the web server system. For example, the names of functions or variables may be changed in various random manners each time a server system serves the code. As noted above, such constantly changing modifications may interfere with the ability of malicious parties to identify how the server system operates and web pages are structured, so that the malicious party cannot generate code to automatically exploit that structure in dishonest manners. In referring to random modification, this document refers to changes between different sessions or page loads that prevent someone at an end terminal or controlling an end terminal from identifying a pattern in the server-generated activity. For example, a reversible function may change the names when serving the code, and may interpret any HTTP requests received back from a client by changing the names in an opposite direction (so that the responses can be interpreted properly by the web servers even though the responses are submitted by the clients with labels that are different than those that the web servers originally used in the code). Such techniques may create a moving target that can prevent malicious organizations from reverse-engineering the operation of a web site so as to build automated bots that can interact with the web site, and potentially carry out Man-in-the-Browser and other Man-in-the-Middle operations and attacks.
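The reversible renaming described above can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the implementation of any particular system: the class name NameTranslator, the "f_" alias prefix, and the use of an in-memory table kept per session are all hypothetical choices.

```python
import secrets


class NameTranslator:
    """Per-session reversible mapping between original names and aliases."""

    def __init__(self):
        self._forward = {}  # original name -> random alias
        self._reverse = {}  # random alias -> original name

    def encode(self, name):
        """Return the alias to serve in place of `name` for this session."""
        if name not in self._forward:
            alias = "f_" + secrets.token_hex(8)
            self._forward[name] = alias
            self._reverse[alias] = name
        return self._forward[name]

    def decode_request(self, params):
        """Translate request parameters back to the web server's names.

        Keys that were never aliased are passed through unchanged.
        """
        return {self._reverse.get(k, k): v for k, v in params.items()}
```

A new translator would be created for each session or page load, so the same field (for example, "password") is served under a different alias each time, while a request returned by the client is translated back before it reaches the web servers.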
The techniques discussed here may be carried out by a server subsystem that acts as an adjunct to a web server system that is commonly employed by a provider of web content. For example, as discussed in more detail below, an internet retailer may have an existing system by which it presents a web storefront at a web site (e.g., www.examplestore.com), interacts with customers to show them information about items available for purchase through the storefront, and processes order and payment information through that same storefront. The techniques discussed here may be carried out by the retailer adding a separate server subsystem (either physical or virtualized) that stands between the prior system and the internet. The new subsystem may act to receive web code from the web servers (or from a traffic management system that receives the code from the web servers), may translate that code in random manners before serving it to clients, may receive responses from clients and translate them in the opposite direction, and then provide that information to the web servers using the original names and other data. In addition, such a system may provide the retailer or a third party with whom the retailer contracts (e.g., a web security company that monitors data from many different clients and helps them identify suspect or malicious activity) with information that identifies suspicious transactions. For example, the security subsystem may keep a log of abnormal interactions, may refer particular interactions to a human administrator for later analysis or for real-time intervention, may cause a financial system to act as if a transaction occurred (so as to fool code operating on a client computer) but to stop such a transaction, or any number of other techniques that may be used to deal with attempted fraudulent transactions.
In one implementation, a computer-implemented method for deflecting abnormal computer interactions is disclosed. The method comprises receiving, at a computer server system and from a client computer device that is remote from the computer server system, a request for web content; identifying, by computer analysis of mark-up code content that is responsive to the request, executable code that is separate from, but programmatically related to, the mark-up code content; generating groups of elements in the mark-up code content and the related executable code by determining that the elements within particular groups are programmatically related to each other; modifying elements within particular ones of the groups consistently so as to prevent third-party code written to interoperate with the elements from interoperating with the modified elements, while maintaining an ability of the modified elements within each group to interoperate with each other; and recoding the mark-up code content and the executable code to include the modified elements. The method can also include serving the recoded mark-up code content and executable code to the client computer device. Moreover, the method can comprise performing the steps of receiving, identifying, generating, modifying, and recoding repeatedly for each of multiple different requests from different client computers, wherein the elements within particular ones of the groups are modified in different manners for each of the requests. The method can also comprise generating instrumentation code configured to monitor interaction with the recoded mark-up code, executable code, or both, and to report to the computer server system information that identifies abnormalities in the interaction.
In addition, the method may comprise receiving, at the computer server system and from the instrumentation code executing on the client computing device, a report of activity by alien code attempting to interoperate with the recoded mark-up code, executable code, or both.
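The grouping and consistent modification steps of the method can be illustrated with a deliberately simplified Python sketch. It assumes that "programmatically related" elements are identifiers declared in id or name attributes of the mark-up that also appear as quoted string literals in the script; a real analysis would parse both languages rather than use regular expressions. The function names group_related and recode are hypothetical.

```python
import re
import secrets

# Matches id="..." or name="..." attributes in mark-up (simplified).
ID_ATTR = re.compile(r'\b(id|name)="([A-Za-z_][\w-]*)"')


def _string_literal(ident):
    # Matches 'ident' or "ident" as a quoted string literal in script text.
    return re.compile(r"""(['"])""" + re.escape(ident) + r"\1")


def group_related(markup, script):
    """Identifiers declared in the mark-up and referenced by the script."""
    declared = {m.group(2) for m in ID_ATTR.finditer(markup)}
    return {i for i in declared if _string_literal(i).search(script)}


def recode(markup, script):
    """Rename each related identifier consistently in both documents."""
    mapping = {i: "x" + secrets.token_hex(6) for i in group_related(markup, script)}
    for old, new in mapping.items():
        attr = re.compile(r'\b(id|name)="' + re.escape(old) + '"')
        markup = attr.sub(lambda m, n=new: '%s="%s"' % (m.group(1), n), markup)
        script = _string_literal(old).sub(
            lambda m, n=new: m.group(1) + n + m.group(1), script)
    return markup, script, mapping
```

Calling recode on the same page for two different requests yields two different mappings, so the served mark-up and script remain consistent with each other while third-party code written against the original identifiers no longer finds them.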
In another implementation, a computer system for recoding web content served to client computers is disclosed. The system can include a web server system configured to provide computer code in multiple different formats in response to requests from client computing devices; and a security intermediary that is arranged to (i) receive the computer code from the web server before the resource is provided to the client computing devices, (ii) identify common elements in the different formats of the computer code by determining that the common elements interoperate with each other when the code is executed; (iii) modify the common elements in a consistent manner across the different formats of the computer code; and (iv) recode the computer code using the modified common elements. The system can be further configured to serve the recoded computer code to particular client computing devices that requested the code. Also, the security intermediary can be programmed to perform actions (i) through (iv) in response to each request for content, and to modify the common elements in different manners for different requests for the same computer code. The system may additionally include an instrumentation module programmed to generate instrumentation code configured to monitor interaction with the recoded mark-up code, executable code, or both, and to report to the computer server system information that identifies abnormalities in the interaction. The system can include a computer interface configured to receive resources from a web server that have been served in the form of computer code to client computing devices in response to requests from the client computing devices.
Other features and advantages will be apparent from the description and drawings, and from the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
Like reference numbers and designations in the various drawings indicate like elements.
Referring to a general system overview in
The system 100 can include a security intermediary 102 that is logically located between the web server system 104 and one or more client devices 114. The security intermediary 102 can receive a portion or all of the traffic, including web code, transmitted between the various client devices 112 and the web server system 104 (and vice-versa). In compliance with a governing security policy, when the web server system 104 provides a resource such as a web page in response to a client computer 112 request, the web server system 104 can forward the response to the security intermediary 102 (perhaps through a load balancer 106 or other data management devices or applications) so that the web code for the response can be modified and also supplemented with instrumentation code. Modification of the web code may be random in certain manners, and can differ each time a response is served to a client to prevent malware from learning the structure or operation of the web server, or from being developed by a malicious organization that learns the structure or operation. Additionally, the web code can be supplemented with instrumentation code that is executable on the client computer 112. The instrumentation code may detect when abnormal behavior occurs on the client computer 112, such as possible unauthorized activity by the malware, and can report the occurrence of such activity to the security intermediary 102.
When security intermediary 102 receives requests (e.g., HTTP requests) from clients in response to previously-served modified web code being processed on those clients, it can apply reverse modifications to the requests before forwarding the requests to the web server system 104. Additionally, the security intermediary 102 can receive reports from the instrumentation code that indicate abnormal behavior on the client computer 112, and the security intermediary 102 can log these events, alert the web server system 104 to possible malicious activity, and send reports about such events to a central security server (not shown). The central security server may, in some implementations, analyze reports in the aggregate from multiple security intermediaries 102, and/or reports from multiple client devices 114 and multiple computing sessions and page loads. In performing such activities, the security intermediary may rely on external resources 104, such as when the security intermediary 102 is located at a customer premise or data center, and the resources are available from a central security provider, such as a company that supplied the security intermediary 102 to the customer.
Referring now to particular components of the system, a content decoding, analyzing and re-encoding module 120 sits at the middle of the system and may be adjacent to or implement the structures identified in the circle shown to interact with the content decoding, analyzing and re-encoding module 120. The content decoding, analyzing and re-encoding module 120 may receive a request aimed at a web server system (e.g., system 104 in
Referring now to
The next stage of policy application has to do with matching content to actions. Content may be identified within a DOM for the content to be served using XPATH, regular expressions, or by other means. Actions include substitutions, the addition of content and other actions that may be provided as extensions to the system. These operations are represented by the Substitution 136, Additional Content 138, and Other Actions 134 subsystems in
Once a policy and a logical HTTP transaction are received by the executor 126, the HTTP response and the portion of the policy that identifies content to be acted upon are forwarded to a content interpreter 124 (
If the decoding process identifies the need to resolve external references, those references are resolved by the HTTP client 122. External references include script or style tags within HTML content that reference content to be delivered as part of another HTTP request. If the content is static and reported as not modified, the content interpreter 124 will attempt to locate previously processed and analyzed versions of content within an interpreted content representation cache 124, 144.
To ensure that the analysis completes, the system 200 imposes limits on the level of analysis that will be performed. Limits may be based on complexity or on clock time, or another appropriate measure. Complexity limits may consist of how deep to examine the various syntax trees that are created during the decoding phase or how many iterations of loops that are encountered should be unrolled. Time-based limits impose soft real-time limits on the computing time to perform analysis. Time-based limits may allow subsequent requests involving identical content to succeed where initial requests failed, as some analysis results may be cached.
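The complexity and time limits described above might be applied to a syntax-tree walk along the following lines. This Python fragment is a sketch only: the tree is assumed to be a nested dictionary with hypothetical "type", "name", and "children" keys, and the names analyze and AnalysisLimitExceeded are illustrative.

```python
import time


class AnalysisLimitExceeded(Exception):
    """Raised when the time budget for analysis has been used up."""


def analyze(node, max_depth, deadline, depth=0, found=None):
    """Collect identifier names from a syntax tree within fixed limits.

    `max_depth` is the complexity limit (how deep to examine the tree);
    `deadline` is a time.monotonic() value after which analysis stops.
    """
    found = [] if found is None else found
    if time.monotonic() > deadline:
        raise AnalysisLimitExceeded("time budget exhausted")
    if depth > max_depth:
        return found  # complexity limit: do not examine deeper levels
    if node.get("type") == "identifier":
        found.append(node["name"])
    for child in node.get("children", []):
        analyze(child, max_depth, deadline, depth + 1, found)
    return found
```

A caller would typically set the deadline as time.monotonic() plus a small budget and treat AnalysisLimitExceeded as an unsuccessful analysis, leaving the decision about whether to modify the content anyway to policy.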
Because analysis may or may not be successful, it is subject to policies about whether the desired modification should be performed regardless of the success of the analysis. Regardless of whether or not a particular policy is applied, the executor 126 reports information about its analysis to the policy engine 128. Analysis results are logged for review and analysis by operators of the system. Policy application status is also reported for such review and analysis.
Where content modifications such as substitutions result in changes to the web content's implicit API, information about the substitution may be returned to the policy engine 128 to associate with a session or to be encoded directly into the modified content. Substitution information is used to translate subsequent requests. As noted further below, the information may be stored by the system or may be encrypted and provided to the requesting client computer, to be stored as a cookie or other component at that computer, and to be returned by the computer with any subsequent requests. The appropriateness of a particular strategy depends on the specific application of content modification.
Once a response is modified, the content renderer translates the system's internal representation of the modified content using the content re-encoders 148. Re-encoded modified content can then be transmitted to the client computer that made the request using the HTTP handler 126. Such operations may be repeated for each request/answer between a client computer and a web server system, and the modifications can be different each time, even when the same or substantially the same content is requested (e.g., the same page is served).
As a particular example of the processing of a request from a client computer, consider the HTTP POST in relation to
The security intermediary 102 may include one or more computing devices that are separate from computing devices of the web server 104. In such implementations, the security intermediary 102 may communicate directly with the web server through a networking cable such as an Ethernet cable or fiber optic line (and typically through many such connections). The intermediary 102 can also communicate with the web server 104 through a network such as a local area network (“LAN”). In some instances, the intermediary 102 can be installed on the same premises as the web server 104 so that operators or administrators of the web server 104 can also maintain the intermediary 102 locally (particularly for large-volume applications). Installing the intermediary 102 in close proximity to the web server 104 can have one or more additional benefits including reduced transmission latency between the intermediary 102 and the web server 104 and increased security that stems from a private connection between the intermediary 102 and the web server 104 that is isolated from public networks such as the internet 110. This configuration can also avoid any need to encrypt communication between the intermediary 102 and the web server 104, which can be computationally expensive and slow.
In some implementations, the security intermediary 102 may include one or more computing devices that are separate from the computing devices of the web server 104, and that are connected to the web server 104 through a public network such as the internet 110. For example, a third-party security company may maintain one or more security intermediaries 102 on the security company's premises. The security company may offer services to protect websites and/or web servers 104 from exploitation according to the techniques described herein. The security intermediary 102 could then act as a reverse proxy for the web server 104, receiving outbound traffic from the web server 104 over the internet 110, processing the traffic, and forwarding the processed traffic to one or more requesting client computers 112. Likewise, the intermediary 102 may receive incoming traffic from client computers 112 over the internet 110, process the incoming traffic, and forward the processed traffic to the web server 104 over the internet 110. In this configuration, communication between the security intermediary 102 and the web server 104 may be encrypted and secured using protocols such as HTTPS to authenticate the communication and protect against interception or unauthorized listeners over the internet 110. In some embodiments, a private line or network may connect the web server 104 to the remote security intermediary 102, in which case the system 100 may use unencrypted protocols to communicate between the intermediary 102 and web server 104.
In some implementations, security intermediary 102 may be a virtual subsystem of web server 104. For example, the one or more computing devices that implement web server 104 may also include software and/or firmware for the security intermediary 102. The system 100 may include the security intermediary 102 as software that interfaces with, and/or is integrated with, software for the web server 104. For example, when the web server 104 receives a request over the internet 110, the software for the security intermediary 102 can first process the request and then submit the processed request to the web server 104 through an API for the web server 104 software. Similarly, when the web server 104 responds to a request, the response can be submitted to the security intermediary 102 software through an API for processing by security intermediary 102 before the response is transmitted over the internet 110.
In some configurations of the system 100, two or more security intermediaries 102 may serve the web server 104. Redundant security intermediaries 102 can be used to reduce the load on any individual intermediary 102 and to protect against failures in one or more security intermediaries. The system 100 can also balance traffic among two or more security intermediaries 102. For example, the system 100 may categorize traffic into shards that represent a logical portion of traffic to or from a website. Shards may be categorized according to client identity, network information, URL, the domain or host name in an HTTP request, identity of resources requested from the web server 104, location of resources requested from the web server 104, and/or the content of a request or the requested resource 104.
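One way to assign shards to intermediaries is sketched below in Python. The choice of host name and URL path as the shard key, the use of a SHA-256 hash, and the function name shard_for are assumptions made for illustration; any of the other categorization criteria listed above could be used as the key instead.

```python
import hashlib


def shard_for(host, path, intermediaries):
    """Pick the intermediary that handles traffic for a host and path.

    The same (host, path) pair always maps to the same intermediary as
    long as the list of intermediaries is unchanged.
    """
    key = (host.lower() + "|" + path).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(intermediaries)
    return intermediaries[index]
```

Because the mapping is deterministic, requests and responses that belong to the same logical portion of a website are consistently processed by the same intermediary, which keeps any per-shard state in one place.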
By this system then, content to be served by a web server system to a client computer (and to many thousands of client computers via many thousands of requests) can be altered and appended: altered to prevent malware from interacting with it in a malicious manner, and appended to provide instrumentation code that monitors the operation of the code on the client device and reports any abnormal actions so that a central system can analyze those actions to identify the presence of malware in a system. As described in more detail in
The system 200 in this example is a system that is operated by or for a large number of different businesses that serve web pages and other content over the internet, such as banks and retailers that have on-line presences (e.g., on-line stores, or on-line account management tools). The main server systems operated by those organizations or their agents are designated as web servers 204a-204n, and could include a broad array of web servers, content servers, database servers, financial servers, load balancers, and other necessary components (either as physical or virtual servers).
A set of security server systems 202a to 202n are shown connected between the web servers 204a to 204n and a network 210 such as the internet. Although both extend to n, the actual number of sub-systems could vary. For example, certain of the customers could install two separate security server systems to serve all of their web server systems (which could be one or more), such as for redundancy purposes. The particular security server systems 202a-202n may be matched to particular ones of the web server systems 204a-204n, or they may be at separate sites, and all of the web servers for various different customers may be provided with services by a single common set of security servers 202a-202n (e.g., when all of the server systems are at a single co-location facility so that bandwidth issues are minimized).
A key for the function that encodes and decodes such strings can be maintained by the security server system 202 along with an identifier for the particular client computer so that the system 202 may know which key or function to apply, and may otherwise maintain a state for the client computer and its session. A stateless approach may also be employed, whereby the security server system 202 encrypts the state and stores it in a cookie that is saved at the relevant client computer. The client computer may then pass that cookie data back when it passes the information that needs to be decoded back to its original status. With the cookie data, the system 202 may use a private key to decrypt the state information and use that state information in real-time to decode the information from the client computer. Such a stateless implementation may create benefits such as less management overhead for the server system 202 (e.g., for tracking state, for storing state, and for performing clean-up of stored state information as sessions time out or otherwise end) and as a result, higher overall throughput.
An instrumentation module 226 is programmed to add active code to the content that is served from a web server. The instrumentation is code that is programmed to monitor the operation of other code that is served. For example, the instrumentation may be programmed to identify when certain methods are called, where those methods have been identified as likely to be called by malicious software. When such actions are observed by the instrumentation code to occur, the instrumentation code may be programmed to send a communication to the security server reporting on the type of action that occurred and other metadata that is helpful in characterizing the activity. Such information can be used to help determine whether the action was malicious or benign.
The instrumentation code may also analyze the DOM on a client computer in predetermined manners that are likely to identify the presence of and operation of malicious software, and to report to the security servers 202 or a related system. For example, the instrumentation code may be programmed to characterize a portion of the DOM when a user takes a particular action, such as clicking on a particular on-page button, so as to identify a change in the DOM before and after the click (where the click is expected to cause a particular change to the DOM if there is benign code operating with respect to the click, as opposed to malicious code operating with respect to the click). Data that characterizes the DOM may also be hashed, either at the client computer or the server system 202, to produce a representation of the DOM that is easy to compare against corresponding representations of DOMs from other client computers. Other techniques may also be used by the instrumentation code to generate a compact representation of the DOM or other structure expected to be affected by malicious code in an identifiable manner.
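The hashing of DOM-characterizing data can be sketched as follows. The Python fragment below shows the server-side variant and assumes, purely for illustration, that the instrumentation code reports a portion of the DOM as nested dictionaries; the function names dom_fingerprint and click_changed_as_expected are hypothetical.

```python
import hashlib
import json


def dom_fingerprint(node):
    """Hash a simplified DOM subtree into a compact, comparable digest.

    Keys are sorted so that equivalent subtrees produce the same digest
    regardless of the order in which attributes were reported.
    """
    canonical = json.dumps(node, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def click_changed_as_expected(before, after, expected_after):
    """True if the click changed the DOM, and changed it to the expected form."""
    return (dom_fingerprint(before) != dom_fingerprint(after)
            and dom_fingerprint(after) == dom_fingerprint(expected_after))
```

Digests produced this way are short fixed-length strings, so representations gathered from many client computers can be compared or grouped without transmitting or storing the full DOM for each one.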
Instrumentation code may also be used to gather information about the entity interacting with the content. This information may be helpful in distinguishing between human and non-human actors. For example, particular interactions or patterns of interaction with content on the client computers may be analyzed to determine whether the interactions are more likely the result of a legitimate user interaction with the content, a malicious or otherwise unwanted human interaction with the content from a remote user operating in the background, or a non-human actor such as an automated bot (malicious) or a browser plug-in (benign).
As noted, the content from web servers 204a-204n, as encoded by decode, analysis, and re-encode module 224, may be rendered on web browsers of various client computers. Uninfected client computers 212a-212n represent computers that do not have malicious code programmed to interfere with a particular site a user visits or to otherwise perform malicious activity. Infected client computers 214a-214n represent computers that do have malicious code (218a-218n, respectively) programmed to interfere with a particular site a user visits or to otherwise perform malicious activity. In certain implementations, the client computers 212, 214 may also store the encrypted cookies discussed above and pass such cookies back through the network 210. The client computers 212, 214 will, once they obtain the served content, implement DOMs for managing the displayed web pages, and instrumentation code may monitor the DOM as discussed above. Reports of illogical activity (e.g., software on the client device calling a method that does not exist in the downloaded and rendered content) can then be sent back to the security server system.
The reports from the instrumentation code may be analyzed and processed in various manners in order to determine how to respond to particular abnormal events, and to track down malicious code via analysis of multiple different similar interactions. For small-scale analysis, each web site operator may be provided with a single security console 207 that provides analytical tools for a single site or group of sites. For example, the console 207 may include software for showing groups of abnormal activities, or reports that indicate the type of code served by the web site that generates the most abnormal activity. For example, a security officer for a bank may determine that defensive actions are needed if most of the reported abnormal activity for its web site relates to content elements corresponding to money transfer operations, which is an indication that stale malicious code may be trying to access such elements surreptitiously.
A central security console 208 may connect to a large number of web content providers, and may be run, for example, by an organization that provides the software for operating the security server systems 202a-202n. Such console 208 may access complex analytical and data analysis tools, such as tools that identify clustering of abnormal activities across thousands of client computers and sessions, so that an operator of the console 208 can focus on those clusters in order to diagnose them as malicious or benign, and then take steps to thwart any malicious activity.
In certain other implementations, the console 208 may have access to software for analyzing telemetry data received from a very large number of client computers that execute instrumentation code provided by the system 200. Such data may result from forms being re-written across a large number of web pages and web sites to include content that collects system information such as browser version, installed plug-ins, screen resolution, window size and position, operating system, network information, and the like. In addition, user interaction with served content may be characterized by such code, such as the speed with which a user interacts with a page, the path of a pointer over the page, and the like. Such collected telemetry data, across many thousands of sessions, may be used by the console 208 to identify what is “natural” interaction with a particular page and what is “unnatural” interaction that is likely the result of a bot interacting with the content. Statistical and machine learning methods may be used to identify patterns in such telemetry data, and to resolve bot candidates to particular client computers. Such client computers may then be handled in special manners by the system 200, may be blocked from interaction, or may have their operators notified that their computer is running bad software.
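The text does not specify which statistical method separates "natural" from "unnatural" interaction. As one possible illustration only, the Python sketch below flags sessions whose form-fill time deviates strongly from the median, using the median absolute deviation. The metric, the threshold, and all names are assumptions made for this example.

```python
import statistics


def flag_unnatural_sessions(fill_times: dict[str, float], threshold: float = 3.0) -> list[str]:
    """Return session IDs whose fill time is a robust outlier.

    fill_times maps a session ID to the seconds taken to fill in a form
    (an assumed telemetry metric).
    """
    values = list(fill_times.values())
    median = statistics.median(values)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # No spread at all, so nothing can be called an outlier.
    return [
        session_id
        for session_id, value in fill_times.items()
        if abs(value - median) / mad > threshold
    ]


# Hypothetical telemetry: five human-paced sessions and one very fast one.
telemetry = {"s1": 12.0, "s2": 15.0, "s3": 11.0, "s4": 14.0, "s5": 13.0, "bot": 0.2}
candidates = flag_unnatural_sessions(telemetry)
```

A production system would likely combine many signals (pointer paths, plug-in lists, window geometry) rather than a single timing value.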
The process begins at box 302, where a request for web content is received, such as from a client computer operated by an individual seeking to perform a banking transaction at a website for the individual's bank. The request may be in the form of an HTTP request and may be received by a load balancer operated by, or for, the bank. The load balancer may recognize the form of the request and understand that it is to be handled by a security system that the bank has installed to operate along with its web server system. The load balancer may thus provide the request to the security system, which may forward it to the web server system after analyzing the request (e.g., to open a tracking session based on the request), or may provide the request to the web server system and also provide information about the request to the security system in parallel.
At box 308, the process generates groups from such programmatically related elements. For example, the process may flag portions of the code that was to be served, may copy portions of the code into a cache for further processing, or may otherwise identify the programmatically related code across the different formats of code so that it can be analyzed and recoded.
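A minimal Python sketch of such grouping follows, assuming that "programmatically related" is approximated by a shared identifier that appears in more than one code format. The sample HTML, CSS, and JavaScript strings, the regular expressions, and the function name are hypothetical; a real implementation would presumably parse each format rather than pattern-match it.

```python
import re
from collections import defaultdict


def group_related_elements(sources: dict[str, str]) -> dict[str, list[tuple[str, int]]]:
    """Map each shared name to the (format, offset) locations where it occurs."""
    # Candidate names come from id and class attributes in the mark-up.
    names: set[str] = set()
    for match in re.finditer(r'(?:id|class)="([^"]+)"', sources["html"]):
        names.update(match.group(1).split())

    groups: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for name in names:
        # Match the name only as a whole token, not inside a longer identifier.
        pattern = re.compile(r"(?<![\w-])" + re.escape(name) + r"(?![\w-])")
        for fmt, text in sources.items():
            for match in pattern.finditer(text):
                groups[name].append((fmt, match.start()))

    # Keep only names that tie two or more formats together.
    return {
        name: locations
        for name, locations in groups.items()
        if len({fmt for fmt, _ in locations}) > 1
    }


# Hypothetical page content in three formats.
sources = {
    "html": '<form><input id="demoinput1" class="democss"></form>',
    "css": ".democss { color: blue; }",
    "js": 'document.getElementById("demoinput1").focus();',
}
groups = group_related_elements(sources)
```

The recorded offsets play the role of a mapping of element locations, which lets later steps rewrite the same positions without re-analyzing the page.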
At box 310, the process modifies the groups of elements in a consistent manner across the different types of code. For example, the security server system may be programmed to identify names of parameters, methods, or other items in the code, and to change those names consistently throughout the code so that, for example, calls to a particular method will be processed properly by that renamed method. Such renaming, as described above, may involve generating a random new name for content that will not be displayed to the user, where randomness is exhibited in making selections that thwart a malicious party from being able to predict what names will be used in any particular page load or session.
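The consistent renaming described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the described system's implementation: the sample sources, the name length, and the token-matching regular expression are all hypothetical. Python's `secrets` module is used so that generated names are not predictable across page loads.

```python
import re
import secrets
import string


def random_name(length: int = 10) -> str:
    """Return an unpredictable identifier that starts with a letter."""
    first = secrets.choice(string.ascii_lowercase)
    rest = "".join(
        secrets.choice(string.ascii_lowercase + string.digits)
        for _ in range(length - 1)
    )
    return first + rest


def rename_consistently(sources: dict[str, str], names: list[str]):
    """Replace every listed name with one fresh random name in all formats."""
    if not names:
        return dict(sources), {}
    mapping = {name: random_name() for name in names}
    # Longest names first so that a short name never shadows a longer one.
    alternatives = "|".join(re.escape(n) for n in sorted(names, key=len, reverse=True))
    pattern = re.compile(r"(?<![\w-])(" + alternatives + r")(?![\w-])")
    recoded = {
        fmt: pattern.sub(lambda m: mapping[m.group(1)], text)
        for fmt, text in sources.items()
    }
    return recoded, mapping


# Hypothetical page content; a new mapping would be drawn for each serving.
sources = {
    "html": '<form><input id="demoinput1" class="democss"></form>',
    "css": ".democss { color: blue; }",
    "js": 'document.getElementById("demoinput1").focus();',
}
recoded, mapping = rename_consistently(sources, ["demoinput1", "democss"])
```

Because the same mapping is applied to every format, the script's lookup still resolves to the renamed input, while code written against the original names no longer finds its targets.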
In addition, the code that is served by the security system may be supplemented with instrumentation code that runs on the computer browser and monitors interaction with the web page. For example, the instrumentation code may look for particular method calls or other calls to be made, such as when the calls or actions relate to a field in a form that is deemed to be subject to malicious activity, such as a client ID number field, a transaction account number field, or a transaction amount field. When the instrumentation code observes such activity on the client device, it will report that activity along with metadata that helps to characterize the activity, and at box 314, the process receives such reports from the instrumentation code and processes them, such as by forwarding them to a central security system that may analyze them to determine whether such activity is benign or malicious.
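How a receiving system might make a first-pass classification of such reports can be sketched as follows. The report fields and the three labels are hypothetical. The sketch assumes the server remembers, per session, which names it actually served and which original names they replaced, so that an access to an original name suggests code written against the unmodified page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    session_id: str
    accessed_name: str   # Name that client-side code tried to use.
    field_kind: str      # For example "transaction_amount" (assumed label).


def triage(report: Report, served_names: set[str], original_names: set[str]) -> str:
    """Give a coarse label to one instrumentation report."""
    if report.accessed_name in served_names:
        return "expected"        # Name exists in the content as served.
    if report.accessed_name in original_names:
        return "stale-name"      # Caller targets the pre-transformation name.
    return "unknown-name"        # Name never existed in either version.


# Hypothetical session state and report.
served = {"dsjafhg897s", "ksjfhg098"}
originals = {"demoinput1", "demoinput2"}
label = triage(Report("session-1", "demoinput1", "transaction_amount"), served, originals)
```

Reports labeled other than "expected" would be the ones forwarded for further analysis of whether the activity is benign or malicious.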
For purposes of additional illustration, particular cases of transforming code for delivery through a security server system are illustrated.
In the first example, an original page is shown with human-recognizable labels of “democss,” “demoinput1,” “demoinput2,” and “blue”:
In the following transformed page, those labels have been replaced with randomly generated text, and the page will perform for a user in the same way as before the transformations. In this example, each of the input elements (demoinput1, demoinput2) in the original page now has a set of input elements (introduced by Shape's safe-intelligent-content-modification engine) to confuse bots. Shape's client-side library will determine which element (dsjafhg897s or dssd8mfn77) pertinent to demoinput1, and which element (ksjfhg098 or dsfkjh9877) pertinent to demoinput2, will be marked for display. The CSS property will be chosen dynamically based on the rule set by the safe-intelligent-content-modification engine.
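The idea of replacing one input with a set of randomly named inputs, only one of which is displayed, can be illustrated with the Python sketch below. It is a simplification: here the server picks the visible element and emits a static CSS rule, whereas the text assigns that decision to a client-side library. All function names and the generated mark-up are hypothetical.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits


def random_name(length: int = 10) -> str:
    """Return an unpredictable identifier that starts with a letter."""
    return secrets.choice(string.ascii_lowercase) + "".join(
        secrets.choice(ALPHABET) for _ in range(length - 1)
    )


def build_input_set(decoy_count: int = 1) -> tuple[list[str], str]:
    """Create names for one real input plus decoys and pick the visible one."""
    names: set[str] = set()
    while len(names) < decoy_count + 1:   # The set guarantees distinct names.
        names.add(random_name())
    ordered = sorted(names)               # Order reveals nothing about the choice.
    return ordered, secrets.choice(ordered)


def render(names: list[str], visible: str) -> tuple[str, str]:
    """Emit mark-up for all inputs and CSS that displays only one of them."""
    html = "\n".join(f'<input type="text" id="{n}" name="{n}">' for n in names)
    css = "\n".join(
        f'#{n} {{ display: {"inline" if n == visible else "none"}; }}' for n in names
    )
    return html, css


names, visible = build_input_set(decoy_count=1)
html, css = render(names, visible)
```

A bot that fills in every input, or that guesses among them, would interact with a decoy that a human user never sees.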
In the above example, the security server system will detect that some origin-generated content collides with a subset of the SICM (safe-intelligent-content-modification) transformations. In such a case, the SICM algorithm will regenerate the value to avoid the collision before sending the bits to the visitor's web page. The regenerated code without the collision:
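Separately from the patent's own listing, the regenerate-on-collision step can be illustrated with a minimal Python sketch. The function names, the substring test used to detect a collision, and the retry limit are assumptions made for this example.

```python
import secrets
import string
from typing import Callable


def random_name(length: int = 10) -> str:
    """Return an unpredictable identifier that starts with a letter."""
    alphabet = string.ascii_lowercase + string.digits
    return secrets.choice(string.ascii_lowercase) + "".join(
        secrets.choice(alphabet) for _ in range(length - 1)
    )


def generate_without_collision(
    origin_content: str,
    make_name: Callable[[], str] = random_name,
    max_attempts: int = 100,
) -> str:
    """Draw names until one does not already occur in the origin's content."""
    for _ in range(max_attempts):
        candidate = make_name()
        if candidate not in origin_content:   # Simple substring collision test.
            return candidate
    raise RuntimeError("could not find a collision-free name")


# Deterministic demonstration: the first candidate collides, the second does not.
candidates = iter(["abc123", "zzz999"])
chosen = generate_without_collision('<div id="abc123"></div>', lambda: next(candidates))
```

The injected `make_name` parameter exists only to make the demonstration repeatable; by default the sketch draws fresh random names.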
When the security server system determines that content is not safely modifiable, it marks the page as not modifiable and passes the form through unchanged, so that the functionality and style of the original website are not broken. The system makes this determination according to policies that indicate complexity limits for modifications: if, for a given policy, the content is too complex for safe modification and thus exceeds the policy's complexity limits, the page is marked as not modifiable. An original page:
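Separately from the patent's own listing, a policy check of this kind can be sketched in Python. The text does not say how complexity is measured, so the two limits used here (the number of named elements, and the number of element lookups whose argument is computed at run time) are assumptions, as are all names in the sketch.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ModificationPolicy:
    max_named_elements: int = 200   # Assumed limit on id/name attributes.
    max_dynamic_lookups: int = 0    # Assumed limit on computed element names.


ANY_LOOKUP = re.compile(r"getElementById\(")
# A lookup whose whole argument is a single quoted string literal.
LITERAL_LOOKUP = re.compile(r"getElementById\(\s*([\"'])[^\"']*\1\s*\)")


def is_safely_modifiable(html: str, script: str, policy: ModificationPolicy) -> bool:
    """Return False when the page exceeds the policy's complexity limits."""
    named = len(re.findall(r'\b(?:id|name)="', html))
    total = len(ANY_LOOKUP.findall(script))
    literal = len(LITERAL_LOOKUP.findall(script))
    dynamic = total - literal       # Names assembled at run time.
    return named <= policy.max_named_elements and dynamic <= policy.max_dynamic_lookups


policy = ModificationPolicy()
page = '<input id="username">'
static_ok = is_safely_modifiable(page, 'document.getElementById("username").value', policy)
dynamic_ok = is_safely_modifiable(page, 'document.getElementById("username" + fib(n)).value', policy)
```

When the check fails, the page would be served as the origin produced it. A name built at run time cannot be rewritten statically without risking a broken lookup.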
The origin website creates a unique session ID and the form elements are generated by appending a Fibonacci number to username and password. The regenerated code:
The system 400 includes a processor 410, a memory 420, a storage device 430, and an input/output device 440. Each of the components 410, 420, 430, and 440 is interconnected using a system bus 450. The processor 410 is capable of processing instructions for execution within the system 400. The processor may be designed using any of a number of architectures. For example, the processor 410 may be a CISC (Complex Instruction Set Computer) processor, a RISC (Reduced Instruction Set Computer) processor, or a MISC (Minimal Instruction Set Computer) processor.
In one implementation, the processor 410 is a single-threaded processor. In another implementation, the processor 410 is a multi-threaded processor. The processor 410 is capable of processing instructions stored in the memory 420 or on the storage device 430 to display graphical information for a user interface on the input/output device 440.
The memory 420 stores information within the system 400. In one implementation, the memory 420 is a computer-readable medium. In one implementation, the memory 420 is a volatile memory unit. In another implementation, the memory 420 is a non-volatile memory unit.
The storage device 430 is capable of providing mass storage for the system 400. In one implementation, the storage device 430 is a computer-readable medium. In various different implementations, the storage device 430 may be a floppy disk device, a hard disk device, an optical disk device, or a tape device.
The input/output device 440 provides input/output operations for the system 400. In one implementation, the input/output device 440 includes a keyboard and/or pointing device. In another implementation, the input/output device 440 includes a display unit for displaying graphical user interfaces.
The features described can be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. The apparatus can be implemented in a computer program product tangibly embodied in an information carrier, e.g., in a machine-readable storage device for execution by a programmable processor; and method steps can be performed by a programmable processor executing a program of instructions to perform functions of the described implementations by operating on input data and generating output. The described features can be implemented advantageously in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. A computer program is a set of instructions that can be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result. A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
Suitable processors for the execution of a program of instructions include, by way of example, both general and special purpose microprocessors, and the sole processor or one of multiple processors of any kind of computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. The essential elements of a computer are a processor for executing instructions and one or more memories for storing instructions and data. Generally, a computer will also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
To provide for interaction with a user, the features can be implemented on a computer having a display device such as a CRT (cathode ray tube) or LCD (liquid crystal display) monitor for displaying information to the user and a keyboard and a pointing device such as a mouse or a trackball by which the user can provide input to the computer. Additionally, such activities can be implemented via touchscreen flat-panel displays and other appropriate mechanisms.
The features can be implemented in a computer system that includes a back-end component, such as a data server, or that includes a middleware component, such as an application server or an Internet server, or that includes a front-end component, such as a client computer having a graphical user interface or an Internet browser, or any combination of them. The components of the system can be connected by any form or medium of digital data communication such as a communication network. Examples of communication networks include a local area network (“LAN”), a wide area network (“WAN”), peer-to-peer networks (having ad-hoc or static members), grid computing infrastructures, and the Internet.
The computer system can include clients and servers. A client and server are generally remote from each other and typically interact through a network, such as the described one. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any inventions or of what may be claimed, but rather as descriptions of features specific to particular implementations of particular inventions. Certain features that are described in this specification in the context of separate implementations can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of various system components in the implementations described above should not be understood as requiring such separation in all implementations, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.
Thus, particular implementations of the subject matter have been described. Other implementations are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking and parallel processing may be advantageous. In some implementations, the subject matter may be embodied as methods, systems, devices, and/or as an article or computer program product. The article or computer program product may comprise one or more computer-readable media or computer-readable storage devices, which may be tangible and non-transitory, that include instructions that may be executable by one or more machines such as computer processors.
1. A computer-implemented method for deflecting abnormal computer interactions, the method comprising:
- identifying, by computer analysis of mark-up code content at a computer server system, executable code that is separate from, but programmatically related to, the mark-up code content;
- analyzing the mark-up code content and executable code related to the mark-up code content, to identify elements that can be altered without changing a manner in which code containing the identified elements is presented on a computer;
- generating one or more groups of elements in the mark-up code content and the related executable code by determining that particular elements within particular groups are programmatically related to each other;
- generating a mapping that identifies locations, in the mark-up code content, the executable code, or both, of the identified elements that are in the groups of elements;
- repeatedly modifying elements within particular ones of the groups in response to different requests for the mark-up code content, the modifying being performed (a) consistently across multiple elements for a particular request, but (b) differently as between different requests, so as to prevent third-party code written to interoperate with the elements from interoperating with the modified elements, while maintaining an ability of the modified elements within each group of the one or more groups to interoperate with each other, the modifying using the generated mapping for modifying elements in response to multiple separate requests for the mark-up code content; and
- repeatedly recoding the mark-up code content and the executable code to include the modified elements, in response to requests for the mark-up code from multiple client devices.
2. The computer-implemented method of claim 1, further comprising serving the recoded mark-up code content and executable code to client computer devices that request the mark-up code.
3. The computer-implemented method of claim 1, further comprising performing the steps of identifying, generating, modifying, and recoding repeatedly for each of multiple different requests from the multiple client devices, wherein the elements within particular ones of the groups are modified, using the mapping, in different manners for each of the multiple different requests.
4. The computer-implemented method of claim 1, further comprising generating instrumentation code configured to monitor interaction with the recoded mark-up code, executable code, or both, and to report to the computer server system information that identifies abnormalities in the monitored interaction.
5. The computer-implemented method of claim 4, further comprising receiving, at the computer server system and from instrumentation code executing on one of the client devices, a report of activity by alien code attempting to interoperate with the recoded mark-up code, executable code, or both.
6. The computer-implemented method of claim 5, wherein the attempt to interoperate comprises an attempt to alter a document object model for a web browser on the one of the client devices.
7. The computer-implemented method of claim 1, wherein generating the one or more groups of elements comprises identifying elements that address or are addressed by a common name.
8. The computer-implemented method of claim 7, wherein the common name is a common name of an element, method, function, or object.
9. The computer-implemented method of claim 7, wherein modifying the elements comprises changing the common name in a consistent manner across the elements.
10. The computer-implemented method of claim 9, wherein changing the common name comprises changing the common name to a random string of characters.
12. The computer-implemented method of claim 1, further comprising modifying elements in cascading style sheet (CSS) code identified as being programmatically related to the HTML code.
13. A computer system for recoding web content served to client computers, the system comprising:
- an interface, executed on one or more processors from code stored on one or more non-transitory media, for receiving information from a web server system configured to provide computer code in multiple different formats in response to requests from client computing devices; and
- a security intermediary, executed on the one or more processors from the code stored on the one or more non-transitory media, that is arranged to (i) receive the computer code from the interface before the resource is provided to the client computing devices, (ii) identify common elements in the different formats of the computer code by determining that the common elements interoperate with each other when the code is executed and generate a mapping that identifies locations of the identified common elements in the different formats of the computer code; (iii) use the mapping to modify the common elements in a consistent manner across the different formats of the computer code within particular servings of the computer code, and in different manners between different servings of the computer code; and (iv) recode the computer code using the modified common elements in manners that differ, with respect to particular elements, for different requests for the computer code, so as to interfere with attempts by malware to interoperate with the code.
14. The computer-implemented system of claim 13, wherein the system is further configured to serve the recoded computer code to particular client computing devices that requested the code.
15. The computer-implemented system of claim 13, wherein the security intermediary is programmed to perform actions (i) through (iv) in response to each request for content, and to modify the multiple different common elements in different manners as compared to how the corresponding elements were modified for different requests for the same computer code.
16. The computer-implemented system of claim 13, further comprising an instrumentation module programmed to generate instrumentation code configured to monitor interaction with the recoded mark-up code, executable code, or both, and to report to the computer server system information that identifies abnormalities in the monitored interaction.
17. The computer-implemented system of claim 16, wherein the system is further programmed to receive from the instrumentation code executing on a client computing device, a report of activity by alien code attempting to interoperate unsuccessfully with the recoded computer code by attempts to interoperate with a version of the computer code that has not been recoded, and to generate an alert related to such attempting.
18. The computer-implemented system of claim 17, wherein the attempt to interoperate comprises an attempt to alter a document object model for a web browser on the client computer.
19. The computer-implemented system of claim 13, wherein identifying the common elements comprises identifying elements that address or are addressed by a common name in the computer code.
20. The computer-implemented system of claim 19, wherein the common name is a common name of an element, method, function, or object.
21. The computer-implemented system of claim 20, wherein modifying the elements comprises changing the common name in a consistent manner across the elements.
22. The computer-implemented system of claim 21, wherein changing the common name comprises changing the common name to a random string of characters.
Filed: Oct 16, 2013
Publication Date: Sep 18, 2014
Applicant: Shape Security Inc. (Palo Alto, CA)
Inventors: Justin D. Call (Santa Clara, CA), Xiaoming Zhou (Sunnyvale, CA), Xiaohan Huang (Cupertino, CA), Subramanian Varadarajan (San Jose, CA), Roger S. Hoover (Granite Canon, WY)
Application Number: 14/055,704
International Classification: H04L 29/06 (20060101);