diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000000000000000000000000000000000000..8b400c7ab81b7b18baff3f81d597f5e511883134 --- /dev/null +++ b/LICENSE @@ -0,0 +1,347 @@ +The GNU General Public License (GPL) + +Version 2, June 1991 + +Copyright (C) 1989, 1991 Free Software Foundation, Inc. +51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + +Everyone is permitted to copy and distribute verbatim copies of this license +document, but changing it is not allowed. + +Preamble + +The licenses for most software are designed to take away your freedom to share +and change it. By contrast, the GNU General Public License is intended to +guarantee your freedom to share and change free software--to make sure the +software is free for all its users. This General Public License applies to +most of the Free Software Foundation's software and to any other program whose +authors commit to using it. (Some other Free Software Foundation software is +covered by the GNU Library General Public License instead.) You can apply it to +your programs, too. + +When we speak of free software, we are referring to freedom, not price. Our +General Public Licenses are designed to make sure that you have the freedom to +distribute copies of free software (and charge for this service if you wish), +that you receive source code or can get it if you want it, that you can change +the software or use pieces of it in new free programs; and that you know you +can do these things. + +To protect your rights, we need to make restrictions that forbid anyone to deny +you these rights or to ask you to surrender the rights. These restrictions +translate to certain responsibilities for you if you distribute copies of the +software, or if you modify it. + +For example, if you distribute copies of such a program, whether gratis or for +a fee, you must give the recipients all the rights that you have. You must +make sure that they, too, receive or can get the source code. And you must +show them these terms so they know their rights. + +We protect your rights with two steps: (1) copyright the software, and (2) +offer you this license which gives you legal permission to copy, distribute +and/or modify the software. + +Also, for each author's protection and ours, we want to make certain that +everyone understands that there is no warranty for this free software. If the +software is modified by someone else and passed on, we want its recipients to +know that what they have is not the original, so that any problems introduced +by others will not reflect on the original authors' reputations. + +Finally, any free program is threatened constantly by software patents. We +wish to avoid the danger that redistributors of a free program will +individually obtain patent licenses, in effect making the program proprietary. +To prevent this, we have made it clear that any patent must be licensed for +everyone's free use or not licensed at all. + +The precise terms and conditions for copying, distribution and modification +follow. + +TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + +0. This License applies to any program or other work which contains a notice +placed by the copyright holder saying it may be distributed under the terms of +this General Public License. 
The "Program", below, refers to any such program +or work, and a "work based on the Program" means either the Program or any +derivative work under copyright law: that is to say, a work containing the +Program or a portion of it, either verbatim or with modifications and/or +translated into another language. (Hereinafter, translation is included +without limitation in the term "modification".) Each licensee is addressed as +"you". + +Activities other than copying, distribution and modification are not covered by +this License; they are outside its scope. The act of running the Program is +not restricted, and the output from the Program is covered only if its contents +constitute a work based on the Program (independent of having been made by +running the Program). Whether that is true depends on what the Program does. + +1. You may copy and distribute verbatim copies of the Program's source code as +you receive it, in any medium, provided that you conspicuously and +appropriately publish on each copy an appropriate copyright notice and +disclaimer of warranty; keep intact all the notices that refer to this License +and to the absence of any warranty; and give any other recipients of the +Program a copy of this License along with the Program. + +You may charge a fee for the physical act of transferring a copy, and you may +at your option offer warranty protection in exchange for a fee. + +2. You may modify your copy or copies of the Program or any portion of it, thus +forming a work based on the Program, and copy and distribute such modifications +or work under the terms of Section 1 above, provided that you also meet all of +these conditions: + + a) You must cause the modified files to carry prominent notices stating + that you changed the files and the date of any change. + + b) You must cause any work that you distribute or publish, that in whole or + in part contains or is derived from the Program or any part thereof, to be + licensed as a whole at no charge to all third parties under the terms of + this License. + + c) If the modified program normally reads commands interactively when run, + you must cause it, when started running for such interactive use in the + most ordinary way, to print or display an announcement including an + appropriate copyright notice and a notice that there is no warranty (or + else, saying that you provide a warranty) and that users may redistribute + the program under these conditions, and telling the user how to view a copy + of this License. (Exception: if the Program itself is interactive but does + not normally print such an announcement, your work based on the Program is + not required to print an announcement.) + +These requirements apply to the modified work as a whole. If identifiable +sections of that work are not derived from the Program, and can be reasonably +considered independent and separate works in themselves, then this License, and +its terms, do not apply to those sections when you distribute them as separate +works. But when you distribute the same sections as part of a whole which is a +work based on the Program, the distribution of the whole must be on the terms +of this License, whose permissions for other licensees extend to the entire +whole, and thus to each and every part regardless of who wrote it. 
+ +Thus, it is not the intent of this section to claim rights or contest your +rights to work written entirely by you; rather, the intent is to exercise the +right to control the distribution of derivative or collective works based on +the Program. + +In addition, mere aggregation of another work not based on the Program with the +Program (or with a work based on the Program) on a volume of a storage or +distribution medium does not bring the other work under the scope of this +License. + +3. You may copy and distribute the Program (or a work based on it, under +Section 2) in object code or executable form under the terms of Sections 1 and +2 above provided that you also do one of the following: + + a) Accompany it with the complete corresponding machine-readable source + code, which must be distributed under the terms of Sections 1 and 2 above + on a medium customarily used for software interchange; or, + + b) Accompany it with a written offer, valid for at least three years, to + give any third party, for a charge no more than your cost of physically + performing source distribution, a complete machine-readable copy of the + corresponding source code, to be distributed under the terms of Sections 1 + and 2 above on a medium customarily used for software interchange; or, + + c) Accompany it with the information you received as to the offer to + distribute corresponding source code. (This alternative is allowed only + for noncommercial distribution and only if you received the program in + object code or executable form with such an offer, in accord with + Subsection b above.) + +The source code for a work means the preferred form of the work for making +modifications to it. For an executable work, complete source code means all +the source code for all modules it contains, plus any associated interface +definition files, plus the scripts used to control compilation and installation +of the executable. However, as a special exception, the source code +distributed need not include anything that is normally distributed (in either +source or binary form) with the major components (compiler, kernel, and so on) +of the operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering access to copy +from a designated place, then offering equivalent access to copy the source +code from the same place counts as distribution of the source code, even though +third parties are not compelled to copy the source along with the object code. + +4. You may not copy, modify, sublicense, or distribute the Program except as +expressly provided under this License. Any attempt otherwise to copy, modify, +sublicense or distribute the Program is void, and will automatically terminate +your rights under this License. However, parties who have received copies, or +rights, from you under this License will not have their licenses terminated so +long as such parties remain in full compliance. + +5. You are not required to accept this License, since you have not signed it. +However, nothing else grants you permission to modify or distribute the Program +or its derivative works. These actions are prohibited by law if you do not +accept this License. Therefore, by modifying or distributing the Program (or +any work based on the Program), you indicate your acceptance of this License to +do so, and all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + +6. 
Each time you redistribute the Program (or any work based on the Program), +the recipient automatically receives a license from the original licensor to +copy, distribute or modify the Program subject to these terms and conditions. +You may not impose any further restrictions on the recipients' exercise of the +rights granted herein. You are not responsible for enforcing compliance by +third parties to this License. + +7. If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), conditions +are imposed on you (whether by court order, agreement or otherwise) that +contradict the conditions of this License, they do not excuse you from the +conditions of this License. If you cannot distribute so as to satisfy +simultaneously your obligations under this License and any other pertinent +obligations, then as a consequence you may not distribute the Program at all. +For example, if a patent license would not permit royalty-free redistribution +of the Program by all those who receive copies directly or indirectly through +you, then the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under any +particular circumstance, the balance of the section is intended to apply and +the section as a whole is intended to apply in other circumstances. + +It is not the purpose of this section to induce you to infringe any patents or +other property right claims or to contest validity of any such claims; this +section has the sole purpose of protecting the integrity of the free software +distribution system, which is implemented by public license practices. Many +people have made generous contributions to the wide range of software +distributed through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing to +distribute software through any other system and a licensee cannot impose that +choice. + +This section is intended to make thoroughly clear what is believed to be a +consequence of the rest of this License. + +8. If the distribution and/or use of the Program is restricted in certain +countries either by patents or by copyrighted interfaces, the original +copyright holder who places the Program under this License may add an explicit +geographical distribution limitation excluding those countries, so that +distribution is permitted only in or among countries not thus excluded. In +such case, this License incorporates the limitation as if written in the body +of this License. + +9. The Free Software Foundation may publish revised and/or new versions of the +General Public License from time to time. Such new versions will be similar in +spirit to the present version, but may differ in detail to address new problems +or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and "any later +version", you have the option of following the terms and conditions either of +that version or of any later version published by the Free Software Foundation. +If the Program does not specify a version number of this License, you may +choose any version ever published by the Free Software Foundation. + +10. 
If you wish to incorporate parts of the Program into other free programs +whose distribution conditions are different, write to the author to ask for +permission. For software which is copyrighted by the Free Software Foundation, +write to the Free Software Foundation; we sometimes make exceptions for this. +Our decision will be guided by the two goals of preserving the free status of +all derivatives of our free software and of promoting the sharing and reuse of +software generally. + +NO WARRANTY + +11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR +THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE +STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE +PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, +INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND +FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND +PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, +YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + +12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL +ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE +PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY +GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR +INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA +BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A +FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER +OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. + +END OF TERMS AND CONDITIONS + +How to Apply These Terms to Your New Programs + +If you develop a new program, and you want it to be of the greatest possible +use to the public, the best way to achieve this is to make it free software +which everyone can redistribute and change under these terms. + +To do so, attach the following notices to the program. It is safest to attach +them to the start of each source file to most effectively convey the exclusion +of warranty; and each file should have at least the "copyright" line and a +pointer to where the full notice is found. + + One line to give the program's name and a brief idea of what it does. + + Copyright (C) <year> <name of author> + + This program is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by the Free + Software Foundation; either version 2 of the License, or (at your option) + any later version. + + This program is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + more details. + + You should have received a copy of the GNU General Public License along + with this program; if not, write to the Free Software Foundation, Inc., + 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this when it +starts in an interactive mode: + + Gnomovision version 69, Copyright (C) year name of author Gnomovision comes + with ABSOLUTELY NO WARRANTY; for details type 'show w'. 
This is free + software, and you are welcome to redistribute it under certain conditions; + type 'show c' for details. + +The hypothetical commands 'show w' and 'show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may be +called something other than 'show w' and 'show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your school, +if any, to sign a "copyright disclaimer" for the program, if necessary. Here +is a sample; alter the names: + + Yoyodyne, Inc., hereby disclaims all copyright interest in the program + 'Gnomovision' (which makes passes at compilers) written by James Hacker. + + signature of Ty Coon, 1 April 1989 + + Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General Public +License instead of this License. + + +"CLASSPATH" EXCEPTION TO THE GPL + +Certain source files distributed by Oracle America and/or its affiliates are +subject to the following clarification and special exception to the GPL, but +only where Oracle has expressly included in the particular source file's header +the words "Oracle designates this particular file as subject to the "Classpath" +exception as provided by Oracle in the LICENSE file that accompanied this code." + + Linking this library statically or dynamically with other modules is making + a combined work based on this library. Thus, the terms and conditions of + the GNU General Public License cover the whole combination. + + As a special exception, the copyright holders of this library give you + permission to link this library with independent modules to produce an + executable, regardless of the license terms of these independent modules, + and to copy and distribute the resulting executable under terms of your + choice, provided that you also meet, for each linked independent module, + the terms and conditions of the license of that module. An independent + module is a module which is not derived from or based on this library. If + you modify this library, you may extend this exception to your version of + the library, but you are not obligated to do so. If you do not wish to do + so, delete this exception statement from your version. diff --git a/LoongArch64-support.patch b/LoongArch64-support.patch new file mode 100644 index 0000000000000000000000000000000000000000..c249aab3133977dfdf6443254b999fac53436995 --- /dev/null +++ b/LoongArch64-support.patch @@ -0,0 +1,83276 @@ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/make/autoconf/jvm-features.m4 b/make/autoconf/jvm-features.m4 +--- a/make/autoconf/jvm-features.m4 2024-01-17 09:43:20.000000000 +0800 ++++ b/make/autoconf/jvm-features.m4 2024-02-20 10:42:35.822197048 +0800 +@@ -23,6 +23,12 @@ + # questions. + # + ++# ++# This file has been modified by Loongson Technology in 2022. These ++# modifications are Copyright (c) 2020, 2022, Loongson Technology, and are made ++# available on the same license terms set forth above. 
++# ++ + ############################################################################### + # Terminology used in this file: + # +@@ -283,6 +289,8 @@ + AC_MSG_RESULT([yes]) + elif test "x$OPENJDK_TARGET_CPU" = "xriscv64"; then + AC_MSG_RESULT([yes]) ++ elif test "x$OPENJDK_TARGET_CPU" = "xloongarch64"; then ++ AC_MSG_RESULT([yes]) + else + AC_MSG_RESULT([no, $OPENJDK_TARGET_CPU]) + AVAILABLE=false +@@ -300,7 +308,8 @@ + if test "x$OPENJDK_TARGET_CPU_ARCH" = "xx86" || \ + test "x$OPENJDK_TARGET_CPU" = "xaarch64" || \ + test "x$OPENJDK_TARGET_CPU" = "xppc64le" || \ +- test "x$OPENJDK_TARGET_CPU" = "xriscv64"; then ++ test "x$OPENJDK_TARGET_CPU" = "xriscv64" || \ ++ test "x$OPENJDK_TARGET_CPU" = "xloongarch64"; then + AC_MSG_RESULT([yes]) + else + AC_MSG_RESULT([no, $OPENJDK_TARGET_CPU]) +@@ -355,6 +364,13 @@ + if test "x$OPENJDK_TARGET_OS" = "xlinux"; then + AC_MSG_RESULT([yes]) + else ++ AC_MSG_RESULT([no, $OPENJDK_TARGET_OS-$OPENJDK_TARGET_CPU]) ++ AVAILABLE=false ++ fi ++ elif test "x$OPENJDK_TARGET_CPU" = "xloongarch64"; then ++ if test "x$OPENJDK_TARGET_OS" = "xlinux"; then ++ AC_MSG_RESULT([yes]) ++ else + AC_MSG_RESULT([no, $OPENJDK_TARGET_OS-$OPENJDK_TARGET_CPU]) + AVAILABLE=false + fi +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/make/autoconf/platform.m4 b/make/autoconf/platform.m4 +--- a/make/autoconf/platform.m4 2024-01-17 09:43:20.000000000 +0800 ++++ b/make/autoconf/platform.m4 2024-02-20 10:42:35.825530378 +0800 +@@ -23,6 +23,12 @@ + # questions. + # + ++# ++# This file has been modified by Loongson Technology in 2022. These ++# modifications are Copyright (c) 2018, 2022, Loongson Technology, and are made ++# available on the same license terms set forth above. ++# ++ + # Support macro for PLATFORM_EXTRACT_TARGET_AND_BUILD. + # Converts autoconf style CPU name to OpenJDK style, into + # VAR_CPU, VAR_CPU_ARCH, VAR_CPU_BITS and VAR_CPU_ENDIAN. +@@ -545,11 +551,20 @@ + HOTSPOT_$1_CPU=ppc_64 + elif test "x$OPENJDK_$1_CPU" = xppc64le; then + HOTSPOT_$1_CPU=ppc_64 ++ elif test "x$OPENJDK_$1_CPU" = xloongarch; then ++ HOTSPOT_$1_CPU=loongarch_64 ++ elif test "x$OPENJDK_$1_CPU" = xloongarch64; then ++ HOTSPOT_$1_CPU=loongarch_64 + fi + AC_SUBST(HOTSPOT_$1_CPU) + + # This is identical with OPENJDK_*, but define anyway for consistency. + HOTSPOT_$1_CPU_ARCH=${OPENJDK_$1_CPU_ARCH} ++ # Override hotspot cpu definitions for LOONGARCH platforms ++ if test "x$OPENJDK_$1_CPU" = xloongarch64; then ++ HOTSPOT_TARGET_CPU_ARCH=loongarch ++ fi ++ + AC_SUBST(HOTSPOT_$1_CPU_ARCH) + + # Setup HOTSPOT_$1_CPU_DEFINE +@@ -569,6 +584,8 @@ + HOTSPOT_$1_CPU_DEFINE=PPC64 + elif test "x$OPENJDK_$1_CPU" = xriscv64; then + HOTSPOT_$1_CPU_DEFINE=RISCV64 ++ elif test "x$OPENJDK_$1_CPU" = xloongarch64; then ++ HOTSPOT_$1_CPU_DEFINE=LOONGARCH64 + + # The cpu defines below are for zero, we don't support them directly. + elif test "x$OPENJDK_$1_CPU" = xsparc; then +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/abstractInterpreter_loongarch.cpp b/src/hotspot/cpu/loongarch/abstractInterpreter_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/abstractInterpreter_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/abstractInterpreter_loongarch.cpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,155 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. 
++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "ci/ciMethod.hpp" ++#include "interpreter/interpreter.hpp" ++#include "oops/klass.inline.hpp" ++#include "runtime/frame.inline.hpp" ++ ++int AbstractInterpreter::BasicType_as_index(BasicType type) { ++ int i = 0; ++ switch (type) { ++ case T_BOOLEAN: i = 0; break; ++ case T_CHAR : i = 1; break; ++ case T_BYTE : i = 2; break; ++ case T_SHORT : i = 3; break; ++ case T_INT : i = 4; break; ++ case T_LONG : ++ case T_VOID : ++ case T_FLOAT : ++ case T_DOUBLE : i = 5; break; ++ case T_OBJECT : ++ case T_ARRAY : i = 6; break; ++ default : ShouldNotReachHere(); ++ } ++ assert(0 <= i && i < AbstractInterpreter::number_of_result_handlers, ++ "index out of bounds"); ++ return i; ++} ++ ++// How much stack a method activation needs in words. ++int AbstractInterpreter::size_top_interpreter_activation(Method* method) { ++ const int entry_size = frame::interpreter_frame_monitor_size(); ++ ++ // total overhead size: entry_size + (saved fp thru expr stack ++ // bottom). be sure to change this if you add/subtract anything ++ // to/from the overhead area ++ const int overhead_size = ++ -(frame::interpreter_frame_initial_sp_offset) + entry_size; ++ ++ const int stub_code = frame::entry_frame_after_call_words; ++ assert(method != nullptr, "invalid method"); ++ const int method_stack = (method->max_locals() + method->max_stack()) * ++ Interpreter::stackElementWords; ++ return (overhead_size + method_stack + stub_code); ++} ++ ++// asm based interpreter deoptimization helpers ++int AbstractInterpreter::size_activation(int max_stack, ++ int temps, ++ int extra_args, ++ int monitors, ++ int callee_params, ++ int callee_locals, ++ bool is_top_frame) { ++ // Note: This calculation must exactly parallel the frame setup ++ // in AbstractInterpreterGenerator::generate_method_entry. ++ ++ // fixed size of an interpreter frame: ++ int overhead = frame::sender_sp_offset - ++ frame::interpreter_frame_initial_sp_offset; ++ // Our locals were accounted for by the caller (or last_frame_adjust ++ // on the transition) Since the callee parameters already account ++ // for the callee's params we only need to account for the extra ++ // locals. 
++ int size = overhead + ++ (callee_locals - callee_params)*Interpreter::stackElementWords + ++ monitors * frame::interpreter_frame_monitor_size() + ++ temps* Interpreter::stackElementWords + extra_args; ++ ++ return size; ++} ++ ++void AbstractInterpreter::layout_activation(Method* method, ++ int tempcount, ++ int popframe_extra_args, ++ int moncount, ++ int caller_actual_parameters, ++ int callee_param_count, ++ int callee_locals, ++ frame* caller, ++ frame* interpreter_frame, ++ bool is_top_frame, ++ bool is_bottom_frame) { ++ // Note: This calculation must exactly parallel the frame setup ++ // in AbstractInterpreterGenerator::generate_method_entry. ++ // If interpreter_frame!=nullptr, set up the method, locals, and monitors. ++ // The frame interpreter_frame, if not null, is guaranteed to be the ++ // right size, as determined by a previous call to this method. ++ // It is also guaranteed to be walkable even though it is in a skeletal state ++ ++ // fixed size of an interpreter frame: ++ ++ int max_locals = method->max_locals() * Interpreter::stackElementWords; ++ int extra_locals = (method->max_locals() - method->size_of_parameters()) * Interpreter::stackElementWords; ++ ++#ifdef ASSERT ++ assert(caller->sp() == interpreter_frame->sender_sp(), "Frame not properly walkable(2)"); ++#endif ++ ++ interpreter_frame->interpreter_frame_set_method(method); ++ // NOTE the difference in using sender_sp and interpreter_frame_sender_sp ++ // interpreter_frame_sender_sp is the original sp of the caller (the unextended_sp) ++ // and sender_sp is fp+8 ++ intptr_t* locals = interpreter_frame->sender_sp() + max_locals - 1; ++ ++#ifdef ASSERT ++ if (caller->is_interpreted_frame()) { ++ assert(locals < caller->fp() + frame::interpreter_frame_initial_sp_offset, "bad placement"); ++ } ++#endif ++ ++ interpreter_frame->interpreter_frame_set_locals(locals); ++ BasicObjectLock* montop = interpreter_frame->interpreter_frame_monitor_begin(); ++ BasicObjectLock* monbot = montop - moncount; ++ interpreter_frame->interpreter_frame_set_monitor_end(montop - moncount); ++ ++ //set last sp; ++ intptr_t* esp = (intptr_t*) monbot - tempcount*Interpreter::stackElementWords - ++ popframe_extra_args; ++ interpreter_frame->interpreter_frame_set_last_sp(esp); ++ // All frames but the initial interpreter frame we fill in have a ++ // value for sender_sp that allows walking the stack but isn't ++ // truly correct. Correct the value here. ++ // ++ if (extra_locals != 0 && ++ interpreter_frame->sender_sp() == interpreter_frame->interpreter_frame_sender_sp() ) { ++ interpreter_frame->set_interpreter_frame_sender_sp(caller->sp() + extra_locals); ++ } ++ *interpreter_frame->interpreter_frame_cache_addr() = method->constants()->cache(); ++ *interpreter_frame->interpreter_frame_mirror_addr() = method->method_holder()->java_mirror(); ++} ++ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/assembler_loongarch.cpp b/src/hotspot/cpu/loongarch/assembler_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/assembler_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/assembler_loongarch.cpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,820 @@ ++/* ++ * Copyright (c) 1997, 2014, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/assembler.hpp" ++#include "asm/assembler.inline.hpp" ++#include "gc/shared/cardTableBarrierSet.hpp" ++#include "gc/shared/collectedHeap.inline.hpp" ++#include "interpreter/interpreter.hpp" ++#include "memory/resourceArea.hpp" ++#include "prims/methodHandles.hpp" ++#include "runtime/objectMonitor.hpp" ++#include "runtime/os.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "utilities/macros.hpp" ++ ++#ifdef PRODUCT ++#define BLOCK_COMMENT(str) /* nothing */ ++#define STOP(error) stop(error) ++#else ++#define BLOCK_COMMENT(str) block_comment(str) ++#define STOP(error) block_comment(error); stop(error) ++#endif ++ ++#define BIND(label) bind(label); BLOCK_COMMENT(#label ":") ++// Implementation of AddressLiteral ++ ++AddressLiteral::AddressLiteral(address target, relocInfo::relocType rtype) { ++ _target = target; ++ _rspec = rspec_from_rtype(rtype, target); ++} ++ ++bool Assembler::is_vec_imm(float val) { ++ juint x = *reinterpret_cast<juint*>(&val); ++ juint masked = x & 0x7e07ffff; ++ ++ return (masked == 0x3e000000 || masked == 0x40000000); ++} ++ ++bool Assembler::is_vec_imm(double val) { ++ julong x = *reinterpret_cast<julong*>(&val); ++ julong masked = x & 0x7fc0ffffffffffff; ++ ++ return (masked == 0x3fc0000000000000 || masked == 0x4000000000000000); ++} ++ ++int Assembler::get_vec_imm(float val) { ++ juint x = *reinterpret_cast<juint*>(&val); ++ ++ return simm13((0b11011 << 8) | (((x >> 24) & 0xc0) ^ 0x40) | ((x >> 19) & 0x3f)); ++} ++ ++int Assembler::get_vec_imm(double val) { ++ julong x = *reinterpret_cast<julong*>(&val); ++ ++ return simm13((0b11100 << 8) | (((x >> 56) & 0xc0) ^ 0x40) | ((x >> 48) & 0x3f)); ++} ++ ++int AbstractAssembler::code_fill_byte() { ++ return 0x00; // illegal instruction 0x00000000 ++} ++ ++// Now the Assembler instruction (identical for 32/64 bits) ++void Assembler::ld_b(Register rd, const Address &src) { ++ Register dst = rd; ++ Register base = src.base(); ++ Register index = src.index(); ++ ++ int scale = src.scale(); ++ int disp = src.disp(); ++ ++ if (index != noreg) { ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ ldx_b(dst, base, index); ++ } else { ++ add_d(AT, base, index); ++ ld_b(dst, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ ld_b(dst, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ ldx_b(dst, base, AT); ++ } ++ } else { 
++ if (is_simm(disp, 12)) { ++ ld_b(dst, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ldx_b(dst, base, AT); ++ } ++ } ++} ++ ++void Assembler::ld_bu(Register rd, const Address &src) { ++ Register dst = rd; ++ Register base = src.base(); ++ Register index = src.index(); ++ ++ int scale = src.scale(); ++ int disp = src.disp(); ++ ++ if (index != noreg) { ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ ldx_bu(dst, base, index); ++ } else { ++ add_d(AT, base, index); ++ ld_bu(dst, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ ld_bu(dst, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ ldx_bu(dst, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ ld_bu(dst, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ldx_bu(dst, base, AT); ++ } ++ } ++} ++ ++void Assembler::ld_d(Register rd, const Address &src) { ++ Register dst = rd; ++ Register base = src.base(); ++ Register index = src.index(); ++ ++ int scale = src.scale(); ++ int disp = src.disp(); ++ ++ if (index != noreg) { ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ ldx_d(dst, base, index); ++ } else { ++ add_d(AT, base, index); ++ ld_d(dst, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ ld_d(dst, AT, disp); ++ } ++ } else if (is_simm(disp, 16) && !(disp & 3)) { ++ if (scale == 0) { ++ add_d(AT, base, index); ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ } ++ ldptr_d(dst, AT, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ ldx_d(dst, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ ld_d(dst, base, disp); ++ } else if (is_simm(disp, 16) && !(disp & 3)) { ++ ldptr_d(dst, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ldx_d(dst, base, AT); ++ } ++ } ++} ++ ++void Assembler::ld_h(Register rd, const Address &src) { ++ Register dst = rd; ++ Register base = src.base(); ++ Register index = src.index(); ++ ++ int scale = src.scale(); ++ int disp = src.disp(); ++ ++ if (index != noreg) { ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ ldx_h(dst, base, index); ++ } else { ++ add_d(AT, base, index); ++ ld_h(dst, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ ld_h(dst, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ ldx_h(dst, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ ld_h(dst, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ldx_h(dst, base, AT); ++ } ++ } ++} ++ ++void Assembler::ld_hu(Register rd, const Address &src) { ++ Register dst = rd; ++ Register base = src.base(); ++ Register index = src.index(); ++ ++ int scale = src.scale(); ++ int disp = src.disp(); ++ ++ if (index != noreg) { ++ if 
(is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ ldx_hu(dst, base, index); ++ } else { ++ add_d(AT, base, index); ++ ld_hu(dst, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ ld_hu(dst, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ ldx_hu(dst, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ ld_hu(dst, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ldx_hu(dst, base, AT); ++ } ++ } ++} ++ ++void Assembler::ll_w(Register rd, const Address &src) { ++ assert(src.index() == NOREG, "index is unimplemented"); ++ ll_w(rd, src.base(), src.disp()); ++} ++ ++void Assembler::ll_d(Register rd, const Address &src) { ++ assert(src.index() == NOREG, "index is unimplemented"); ++ ll_d(rd, src.base(), src.disp()); ++} ++ ++void Assembler::ld_w(Register rd, const Address &src) { ++ Register dst = rd; ++ Register base = src.base(); ++ Register index = src.index(); ++ ++ int scale = src.scale(); ++ int disp = src.disp(); ++ ++ if (index != noreg) { ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ ldx_w(dst, base, index); ++ } else { ++ add_d(AT, base, index); ++ ld_w(dst, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ ld_w(dst, AT, disp); ++ } ++ } else if (is_simm(disp, 16) && !(disp & 3)) { ++ if (scale == 0) { ++ add_d(AT, base, index); ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ } ++ ldptr_w(dst, AT, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ ldx_w(dst, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ ld_w(dst, base, disp); ++ } else if (is_simm(disp, 16) && !(disp & 3)) { ++ ldptr_w(dst, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ldx_w(dst, base, AT); ++ } ++ } ++} ++ ++void Assembler::ld_wu(Register rd, const Address &src) { ++ Register dst = rd; ++ Register base = src.base(); ++ Register index = src.index(); ++ ++ int scale = src.scale(); ++ int disp = src.disp(); ++ ++ if (index != noreg) { ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ ldx_wu(dst, base, index); ++ } else { ++ add_d(AT, base, index); ++ ld_wu(dst, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ ld_wu(dst, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ ldx_wu(dst, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ ld_wu(dst, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ldx_wu(dst, base, AT); ++ } ++ } ++} ++ ++void Assembler::st_b(Register rd, const Address &dst) { ++ Register src = rd; ++ Register base = dst.base(); ++ Register index = dst.index(); ++ ++ int scale = dst.scale(); ++ int disp = dst.disp(); ++ ++ if (index != noreg) { ++ assert_different_registers(src, AT); ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ stx_b(src, base, index); 
++ } else { ++ add_d(AT, base, index); ++ st_b(src, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ st_b(src, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ stx_b(src, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ st_b(src, base, disp); ++ } else { ++ assert_different_registers(src, AT); ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ stx_b(src, base, AT); ++ } ++ } ++} ++ ++void Assembler::sc_w(Register rd, const Address &dst) { ++ assert(dst.index() == NOREG, "index is unimplemented"); ++ sc_w(rd, dst.base(), dst.disp()); ++} ++ ++void Assembler::sc_d(Register rd, const Address &dst) { ++ assert(dst.index() == NOREG, "index is unimplemented"); ++ sc_d(rd, dst.base(), dst.disp()); ++} ++ ++void Assembler::st_d(Register rd, const Address &dst) { ++ Register src = rd; ++ Register base = dst.base(); ++ Register index = dst.index(); ++ ++ int scale = dst.scale(); ++ int disp = dst.disp(); ++ ++ if (index != noreg) { ++ assert_different_registers(src, AT); ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ stx_d(src, base, index); ++ } else { ++ add_d(AT, base, index); ++ st_d(src, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ st_d(src, AT, disp); ++ } ++ } else if (is_simm(disp, 16) && !(disp & 3)) { ++ if (scale == 0) { ++ add_d(AT, base, index); ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ } ++ stptr_d(src, AT, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ stx_d(src, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ st_d(src, base, disp); ++ } else if (is_simm(disp, 16) && !(disp & 3)) { ++ stptr_d(src, base, disp); ++ } else { ++ assert_different_registers(src, AT); ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ stx_d(src, base, AT); ++ } ++ } ++} ++ ++void Assembler::st_h(Register rd, const Address &dst) { ++ Register src = rd; ++ Register base = dst.base(); ++ Register index = dst.index(); ++ ++ int scale = dst.scale(); ++ int disp = dst.disp(); ++ ++ if (index != noreg) { ++ assert_different_registers(src, AT); ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ stx_h(src, base, index); ++ } else { ++ add_d(AT, base, index); ++ st_h(src, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ st_h(src, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ stx_h(src, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ st_h(src, base, disp); ++ } else { ++ assert_different_registers(src, AT); ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ stx_h(src, base, AT); ++ } ++ } ++} ++ ++void Assembler::st_w(Register rd, const Address &dst) { ++ Register src = rd; ++ Register base = dst.base(); ++ Register index = dst.index(); ++ ++ int scale = dst.scale(); ++ int disp = dst.disp(); ++ ++ if (index != noreg) { ++ assert_different_registers(src, AT); 
++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ stx_w(src, base, index); ++ } else { ++ add_d(AT, base, index); ++ st_w(src, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ st_w(src, AT, disp); ++ } ++ } else if (is_simm(disp, 16) && !(disp & 3)) { ++ if (scale == 0) { ++ add_d(AT, base, index); ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ } ++ stptr_w(src, AT, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ stx_w(src, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ st_w(src, base, disp); ++ } else if (is_simm(disp, 16) && !(disp & 3)) { ++ stptr_w(src, base, disp); ++ } else { ++ assert_different_registers(src, AT); ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ stx_w(src, base, AT); ++ } ++ } ++} ++ ++void Assembler::fld_s(FloatRegister fd, const Address &src) { ++ Register base = src.base(); ++ Register index = src.index(); ++ ++ int scale = src.scale(); ++ int disp = src.disp(); ++ ++ if (index != noreg) { ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ fldx_s(fd, base, index); ++ } else { ++ add_d(AT, base, index); ++ fld_s(fd, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ fld_s(fd, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ fldx_s(fd, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ fld_s(fd, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ fldx_s(fd, base, AT); ++ } ++ } ++} ++ ++void Assembler::fld_d(FloatRegister fd, const Address &src) { ++ Register base = src.base(); ++ Register index = src.index(); ++ ++ int scale = src.scale(); ++ int disp = src.disp(); ++ ++ if (index != noreg) { ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ fldx_d(fd, base, index); ++ } else { ++ add_d(AT, base, index); ++ fld_d(fd, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ fld_d(fd, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ fldx_d(fd, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ fld_d(fd, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ fldx_d(fd, base, AT); ++ } ++ } ++} ++ ++void Assembler::fst_s(FloatRegister fd, const Address &dst) { ++ Register base = dst.base(); ++ Register index = dst.index(); ++ ++ int scale = dst.scale(); ++ int disp = dst.disp(); ++ ++ if (index != noreg) { ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ fstx_s(fd, base, index); ++ } else { ++ add_d(AT, base, index); ++ fst_s(fd, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ fst_s(fd, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ 
fstx_s(fd, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ fst_s(fd, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ fstx_s(fd, base, AT); ++ } ++ } ++} ++ ++void Assembler::fst_d(FloatRegister fd, const Address &dst) { ++ Register base = dst.base(); ++ Register index = dst.index(); ++ ++ int scale = dst.scale(); ++ int disp = dst.disp(); ++ ++ if (index != noreg) { ++ if (is_simm(disp, 12)) { ++ if (scale == 0) { ++ if (disp == 0) { ++ fstx_d(fd, base, index); ++ } else { ++ add_d(AT, base, index); ++ fst_d(fd, AT, disp); ++ } ++ } else { ++ alsl_d(AT, index, base, scale - 1); ++ fst_d(fd, AT, disp); ++ } ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ ++ if (scale == 0) { ++ add_d(AT, AT, index); ++ } else { ++ alsl_d(AT, index, AT, scale - 1); ++ } ++ fstx_d(fd, base, AT); ++ } ++ } else { ++ if (is_simm(disp, 12)) { ++ fst_d(fd, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ fstx_d(fd, base, AT); ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/assembler_loongarch.hpp b/src/hotspot/cpu/loongarch/assembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/assembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/assembler_loongarch.hpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,3213 @@ ++/* ++ * Copyright (c) 1997, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_ASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_ASSEMBLER_LOONGARCH_HPP ++ ++#include "asm/register.hpp" ++#include "runtime/vm_version.hpp" ++ ++// Calling convention ++class Argument { ++ public: ++ enum { ++ n_int_register_parameters_c = 8, // A0, ... A7 (c_rarg0, c_rarg1, ...) ++ n_float_register_parameters_c = 8, // FA0, ... FA7 (c_farg0, c_farg1, ...) ++ ++ n_int_register_parameters_j = 9, // T0, A0, ... A7 (j_rarg0, j_rarg1, ...) ++ n_float_register_parameters_j = 8 // FA0, ... FA7 (j_farg0, j_farg1, ...) 
++ }; ++}; ++ ++constexpr Register c_rarg0 = A0; ++constexpr Register c_rarg1 = A1; ++constexpr Register c_rarg2 = A2; ++constexpr Register c_rarg3 = A3; ++constexpr Register c_rarg4 = A4; ++constexpr Register c_rarg5 = A5; ++constexpr Register c_rarg6 = A6; ++constexpr Register c_rarg7 = A7; ++ ++constexpr FloatRegister c_farg0 = FA0; ++constexpr FloatRegister c_farg1 = FA1; ++constexpr FloatRegister c_farg2 = FA2; ++constexpr FloatRegister c_farg3 = FA3; ++constexpr FloatRegister c_farg4 = FA4; ++constexpr FloatRegister c_farg5 = FA5; ++constexpr FloatRegister c_farg6 = FA6; ++constexpr FloatRegister c_farg7 = FA7; ++ ++constexpr Register j_rarg0 = T0; ++constexpr Register j_rarg1 = A0; ++constexpr Register j_rarg2 = A1; ++constexpr Register j_rarg3 = A2; ++constexpr Register j_rarg4 = A3; ++constexpr Register j_rarg5 = A4; ++constexpr Register j_rarg6 = A5; ++constexpr Register j_rarg7 = A6; ++constexpr Register j_rarg8 = A7; ++ ++constexpr FloatRegister j_farg0 = FA0; ++constexpr FloatRegister j_farg1 = FA1; ++constexpr FloatRegister j_farg2 = FA2; ++constexpr FloatRegister j_farg3 = FA3; ++constexpr FloatRegister j_farg4 = FA4; ++constexpr FloatRegister j_farg5 = FA5; ++constexpr FloatRegister j_farg6 = FA6; ++constexpr FloatRegister j_farg7 = FA7; ++ ++constexpr Register Rnext = S1; ++constexpr Register Rmethod = S3; ++constexpr Register Rsender = S4; ++constexpr Register Rdispatch = S8; ++ ++constexpr Register V0 = A0; ++constexpr Register V1 = A1; ++ ++// bytecode pointer register ++constexpr Register BCP = S0; ++// local variable pointer register ++constexpr Register LVP = S7; ++// temporary callee saved register ++constexpr Register TSR = S2; ++ ++constexpr Register TREG = S6; ++ ++constexpr Register S5_heapbase = S5; ++ ++constexpr Register FSR = V0; ++constexpr FloatRegister FSF = FA0; ++ ++constexpr Register RECEIVER = T0; ++constexpr Register IC_Klass = T1; ++ ++// ---------- Scratch Register ---------- ++constexpr Register AT = T7; ++constexpr FloatRegister fscratch = F23; ++ ++constexpr Register SCR1 = T7; ++// SCR2 is allocable in C2 Compiler ++constexpr Register SCR2 = T4; ++ ++class Address { ++ public: ++ enum ScaleFactor { ++ no_scale = 0, ++ times_2 = 1, ++ times_4 = 2, ++ times_8 = 3, ++ times_ptr = times_8 ++ }; ++ static ScaleFactor times(int size) { ++ assert(size >= 1 && size <= 8 && is_power_of_2(size), "bad scale size"); ++ if (size == 8) return times_8; ++ if (size == 4) return times_4; ++ if (size == 2) return times_2; ++ return no_scale; ++ } ++ ++ private: ++ Register _base; ++ Register _index; ++ ScaleFactor _scale; ++ int _disp; ++ ++ public: ++ ++ // creation ++ Address() ++ : _base(noreg), ++ _index(noreg), ++ _scale(no_scale), ++ _disp(0) { ++ } ++ ++ // No default displacement otherwise Register can be implicitly ++ // converted to 0(Register) which is quite a different animal. ++ ++ Address(Register base, int disp = 0) ++ : _base(base), ++ _index(noreg), ++ _scale(no_scale), ++ _disp(disp) { ++ assert_different_registers(_base, AT); ++ } ++ ++ Address(Register base, Register index, ScaleFactor scale, int disp = 0) ++ : _base (base), ++ _index(index), ++ _scale(scale), ++ _disp (disp) { ++ assert_different_registers(_base, _index, AT); ++ } ++ ++ // The following overloads are used in connection with the ++ // ByteSize type (see sizes.hpp). They simplify the use of ++ // ByteSize'd arguments in assembly code. 
++ ++ Address(Register base, ByteSize disp) ++ : Address(base, in_bytes(disp)) {} ++ ++ Address(Register base, Register index, ScaleFactor scale, ByteSize disp) ++ : Address(base, index, scale, in_bytes(disp)) {} ++ ++ // accessors ++ bool uses(Register reg) const { return _base == reg || _index == reg; } ++ Register base() const { return _base; } ++ Register index() const { return _index; } ++ ScaleFactor scale() const { return _scale; } ++ int disp() const { return _disp; } ++ ++ friend class Assembler; ++ friend class MacroAssembler; ++ friend class LIR_Assembler; // base/index/scale/disp ++}; ++ ++// ++// AddressLiteral has been split out from Address because operands of this type ++// need to be treated specially on 32bit vs. 64bit platforms. By splitting it out ++// the few instructions that need to deal with address literals are unique and the ++// MacroAssembler does not have to implement every instruction in the Assembler ++// in order to search for address literals that may need special handling depending ++// on the instruction and the platform. As small step on the way to merging i486/amd64 ++// directories. ++// ++class AddressLiteral { ++ RelocationHolder _rspec; ++ ++ // If the target is far we'll need to load the ea of this to ++ // a register to reach it. Otherwise if near we can do rip ++ // relative addressing. ++ ++ address _target; ++ ++ protected: ++ // creation ++ AddressLiteral() ++ : _target(nullptr) ++ {} ++ ++ public: ++ ++ ++ AddressLiteral(address target, relocInfo::relocType rtype); ++ ++ AddressLiteral(address target, RelocationHolder const& rspec) ++ : _rspec(rspec), ++ _target(target) ++ {} ++ ++ private: ++ ++ address target() { return _target; } ++ ++ relocInfo::relocType reloc() const { return _rspec.type(); } ++ const RelocationHolder& rspec() const { return _rspec; } ++ ++ friend class Assembler; ++ friend class MacroAssembler; ++ friend class Address; ++ friend class LIR_Assembler; ++ RelocationHolder rspec_from_rtype(relocInfo::relocType rtype, address addr) { ++ switch (rtype) { ++ case relocInfo::external_word_type: ++ return external_word_Relocation::spec(addr); ++ case relocInfo::internal_word_type: ++ return internal_word_Relocation::spec(addr); ++ case relocInfo::opt_virtual_call_type: ++ return opt_virtual_call_Relocation::spec(); ++ case relocInfo::static_call_type: ++ return static_call_Relocation::spec(); ++ case relocInfo::runtime_call_type: ++ return runtime_call_Relocation::spec(); ++ case relocInfo::poll_type: ++ case relocInfo::poll_return_type: ++ return Relocation::spec_simple(rtype); ++ case relocInfo::none: ++ case relocInfo::oop_type: ++ // Oops are a special case. Normally they would be their own section ++ // but in cases like icBuffer they are literals in the code stream that ++ // we don't have a section for. We use none so that we get a literal address ++ // which is always patchable. 
++ return RelocationHolder(); ++ default: ++ ShouldNotReachHere(); ++ return RelocationHolder(); ++ } ++ } ++ ++}; ++ ++// Convenience classes ++class RuntimeAddress: public AddressLiteral { ++ ++ public: ++ ++ RuntimeAddress(address target) : AddressLiteral(target, relocInfo::runtime_call_type) {} ++ ++}; ++ ++class OopAddress: public AddressLiteral { ++ ++ public: ++ ++ OopAddress(address target) : AddressLiteral(target, relocInfo::oop_type){} ++ ++}; ++ ++class ExternalAddress: public AddressLiteral { ++ ++ public: ++ ++ ExternalAddress(address target) : AddressLiteral(target, relocInfo::external_word_type){} ++ ++}; ++ ++class InternalAddress: public AddressLiteral { ++ ++ public: ++ ++ InternalAddress(address target) : AddressLiteral(target, relocInfo::internal_word_type) {} ++ ++}; ++ ++// The LoongArch Assembler: Pure assembler doing NO optimizations on the instruction ++// level ; i.e., what you write is what you get. The Assembler is generating code into ++// a CodeBuffer. ++ ++class Assembler : public AbstractAssembler { ++ friend class AbstractAssembler; // for the non-virtual hack ++ friend class LIR_Assembler; // as_Address() ++ friend class StubGenerator; ++ ++ public: ++ // 22-bit opcode, highest 22 bits: bits[31...10] ++ enum ops22 { ++ clo_w_op = 0b0000000000000000000100, ++ clz_w_op = 0b0000000000000000000101, ++ cto_w_op = 0b0000000000000000000110, ++ ctz_w_op = 0b0000000000000000000111, ++ clo_d_op = 0b0000000000000000001000, ++ clz_d_op = 0b0000000000000000001001, ++ cto_d_op = 0b0000000000000000001010, ++ ctz_d_op = 0b0000000000000000001011, ++ revb_2h_op = 0b0000000000000000001100, ++ revb_4h_op = 0b0000000000000000001101, ++ revb_2w_op = 0b0000000000000000001110, ++ revb_d_op = 0b0000000000000000001111, ++ revh_2w_op = 0b0000000000000000010000, ++ revh_d_op = 0b0000000000000000010001, ++ bitrev_4b_op = 0b0000000000000000010010, ++ bitrev_8b_op = 0b0000000000000000010011, ++ bitrev_w_op = 0b0000000000000000010100, ++ bitrev_d_op = 0b0000000000000000010101, ++ ext_w_h_op = 0b0000000000000000010110, ++ ext_w_b_op = 0b0000000000000000010111, ++ rdtimel_w_op = 0b0000000000000000011000, ++ rdtimeh_w_op = 0b0000000000000000011001, ++ rdtime_d_op = 0b0000000000000000011010, ++ cpucfg_op = 0b0000000000000000011011, ++ fabs_s_op = 0b0000000100010100000001, ++ fabs_d_op = 0b0000000100010100000010, ++ fneg_s_op = 0b0000000100010100000101, ++ fneg_d_op = 0b0000000100010100000110, ++ flogb_s_op = 0b0000000100010100001001, ++ flogb_d_op = 0b0000000100010100001010, ++ fclass_s_op = 0b0000000100010100001101, ++ fclass_d_op = 0b0000000100010100001110, ++ fsqrt_s_op = 0b0000000100010100010001, ++ fsqrt_d_op = 0b0000000100010100010010, ++ frecip_s_op = 0b0000000100010100010101, ++ frecip_d_op = 0b0000000100010100010110, ++ frsqrt_s_op = 0b0000000100010100011001, ++ frsqrt_d_op = 0b0000000100010100011010, ++ fmov_s_op = 0b0000000100010100100101, ++ fmov_d_op = 0b0000000100010100100110, ++ movgr2fr_w_op = 0b0000000100010100101001, ++ movgr2fr_d_op = 0b0000000100010100101010, ++ movgr2frh_w_op = 0b0000000100010100101011, ++ movfr2gr_s_op = 0b0000000100010100101101, ++ movfr2gr_d_op = 0b0000000100010100101110, ++ movfrh2gr_s_op = 0b0000000100010100101111, ++ movgr2fcsr_op = 0b0000000100010100110000, ++ movfcsr2gr_op = 0b0000000100010100110010, ++ movfr2cf_op = 0b0000000100010100110100, ++ movcf2fr_op = 0b0000000100010100110101, ++ movgr2cf_op = 0b0000000100010100110110, ++ movcf2gr_op = 0b0000000100010100110111, ++ fcvt_s_d_op = 0b0000000100011001000110, ++ fcvt_d_s_op = 
0b0000000100011001001001, ++ ftintrm_w_s_op = 0b0000000100011010000001, ++ ftintrm_w_d_op = 0b0000000100011010000010, ++ ftintrm_l_s_op = 0b0000000100011010001001, ++ ftintrm_l_d_op = 0b0000000100011010001010, ++ ftintrp_w_s_op = 0b0000000100011010010001, ++ ftintrp_w_d_op = 0b0000000100011010010010, ++ ftintrp_l_s_op = 0b0000000100011010011001, ++ ftintrp_l_d_op = 0b0000000100011010011010, ++ ftintrz_w_s_op = 0b0000000100011010100001, ++ ftintrz_w_d_op = 0b0000000100011010100010, ++ ftintrz_l_s_op = 0b0000000100011010101001, ++ ftintrz_l_d_op = 0b0000000100011010101010, ++ ftintrne_w_s_op = 0b0000000100011010110001, ++ ftintrne_w_d_op = 0b0000000100011010110010, ++ ftintrne_l_s_op = 0b0000000100011010111001, ++ ftintrne_l_d_op = 0b0000000100011010111010, ++ ftint_w_s_op = 0b0000000100011011000001, ++ ftint_w_d_op = 0b0000000100011011000010, ++ ftint_l_s_op = 0b0000000100011011001001, ++ ftint_l_d_op = 0b0000000100011011001010, ++ ffint_s_w_op = 0b0000000100011101000100, ++ ffint_s_l_op = 0b0000000100011101000110, ++ ffint_d_w_op = 0b0000000100011101001000, ++ ffint_d_l_op = 0b0000000100011101001010, ++ frint_s_op = 0b0000000100011110010001, ++ frint_d_op = 0b0000000100011110010010, ++ iocsrrd_b_op = 0b0000011001001000000000, ++ iocsrrd_h_op = 0b0000011001001000000001, ++ iocsrrd_w_op = 0b0000011001001000000010, ++ iocsrrd_d_op = 0b0000011001001000000011, ++ iocsrwr_b_op = 0b0000011001001000000100, ++ iocsrwr_h_op = 0b0000011001001000000101, ++ iocsrwr_w_op = 0b0000011001001000000110, ++ iocsrwr_d_op = 0b0000011001001000000111, ++ vclo_b_op = 0b0111001010011100000000, ++ vclo_h_op = 0b0111001010011100000001, ++ vclo_w_op = 0b0111001010011100000010, ++ vclo_d_op = 0b0111001010011100000011, ++ vclz_b_op = 0b0111001010011100000100, ++ vclz_h_op = 0b0111001010011100000101, ++ vclz_w_op = 0b0111001010011100000110, ++ vclz_d_op = 0b0111001010011100000111, ++ vpcnt_b_op = 0b0111001010011100001000, ++ vpcnt_h_op = 0b0111001010011100001001, ++ vpcnt_w_op = 0b0111001010011100001010, ++ vpcnt_d_op = 0b0111001010011100001011, ++ vneg_b_op = 0b0111001010011100001100, ++ vneg_h_op = 0b0111001010011100001101, ++ vneg_w_op = 0b0111001010011100001110, ++ vneg_d_op = 0b0111001010011100001111, ++ vseteqz_v_op = 0b0111001010011100100110, ++ vsetnez_v_op = 0b0111001010011100100111, ++ vsetanyeqz_b_op = 0b0111001010011100101000, ++ vsetanyeqz_h_op = 0b0111001010011100101001, ++ vsetanyeqz_w_op = 0b0111001010011100101010, ++ vsetanyeqz_d_op = 0b0111001010011100101011, ++ vsetallnez_b_op = 0b0111001010011100101100, ++ vsetallnez_h_op = 0b0111001010011100101101, ++ vsetallnez_w_op = 0b0111001010011100101110, ++ vsetallnez_d_op = 0b0111001010011100101111, ++ vfclass_s_op = 0b0111001010011100110101, ++ vfclass_d_op = 0b0111001010011100110110, ++ vfsqrt_s_op = 0b0111001010011100111001, ++ vfsqrt_d_op = 0b0111001010011100111010, ++ vfrint_s_op = 0b0111001010011101001101, ++ vfrint_d_op = 0b0111001010011101001110, ++ vfrintrm_s_op = 0b0111001010011101010001, ++ vfrintrm_d_op = 0b0111001010011101010010, ++ vfrintrp_s_op = 0b0111001010011101010101, ++ vfrintrp_d_op = 0b0111001010011101010110, ++ vfrintrz_s_op = 0b0111001010011101011001, ++ vfrintrz_d_op = 0b0111001010011101011010, ++ vfrintrne_s_op = 0b0111001010011101011101, ++ vfrintrne_d_op = 0b0111001010011101011110, ++ vfcvtl_s_h_op = 0b0111001010011101111010, ++ vfcvth_s_h_op = 0b0111001010011101111011, ++ vfcvtl_d_s_op = 0b0111001010011101111100, ++ vfcvth_d_s_op = 0b0111001010011101111101, ++ vffint_s_w_op = 0b0111001010011110000000, ++ vffint_s_wu_op = 
0b0111001010011110000001, ++ vffint_d_l_op = 0b0111001010011110000010, ++ vffint_d_lu_op = 0b0111001010011110000011, ++ vffintl_d_w_op = 0b0111001010011110000100, ++ vffinth_d_w_op = 0b0111001010011110000101, ++ vftint_w_s_op = 0b0111001010011110001100, ++ vftint_l_d_op = 0b0111001010011110001101, ++ vftintrm_w_s_op = 0b0111001010011110001110, ++ vftintrm_l_d_op = 0b0111001010011110001111, ++ vftintrp_w_s_op = 0b0111001010011110010000, ++ vftintrp_l_d_op = 0b0111001010011110010001, ++ vftintrz_w_s_op = 0b0111001010011110010010, ++ vftintrz_l_d_op = 0b0111001010011110010011, ++ vftintrne_w_s_op = 0b0111001010011110010100, ++ vftintrne_l_d_op = 0b0111001010011110010101, ++ vftint_wu_s = 0b0111001010011110010110, ++ vftint_lu_d = 0b0111001010011110010111, ++ vftintrz_wu_f = 0b0111001010011110011100, ++ vftintrz_lu_d = 0b0111001010011110011101, ++ vftintl_l_s_op = 0b0111001010011110100000, ++ vftinth_l_s_op = 0b0111001010011110100001, ++ vftintrml_l_s_op = 0b0111001010011110100010, ++ vftintrmh_l_s_op = 0b0111001010011110100011, ++ vftintrpl_l_s_op = 0b0111001010011110100100, ++ vftintrph_l_s_op = 0b0111001010011110100101, ++ vftintrzl_l_s_op = 0b0111001010011110100110, ++ vftintrzh_l_s_op = 0b0111001010011110100111, ++ vftintrnel_l_s_op = 0b0111001010011110101000, ++ vftintrneh_l_s_op = 0b0111001010011110101001, ++ vreplgr2vr_b_op = 0b0111001010011111000000, ++ vreplgr2vr_h_op = 0b0111001010011111000001, ++ vreplgr2vr_w_op = 0b0111001010011111000010, ++ vreplgr2vr_d_op = 0b0111001010011111000011, ++ xvclo_b_op = 0b0111011010011100000000, ++ xvclo_h_op = 0b0111011010011100000001, ++ xvclo_w_op = 0b0111011010011100000010, ++ xvclo_d_op = 0b0111011010011100000011, ++ xvclz_b_op = 0b0111011010011100000100, ++ xvclz_h_op = 0b0111011010011100000101, ++ xvclz_w_op = 0b0111011010011100000110, ++ xvclz_d_op = 0b0111011010011100000111, ++ xvpcnt_b_op = 0b0111011010011100001000, ++ xvpcnt_h_op = 0b0111011010011100001001, ++ xvpcnt_w_op = 0b0111011010011100001010, ++ xvpcnt_d_op = 0b0111011010011100001011, ++ xvneg_b_op = 0b0111011010011100001100, ++ xvneg_h_op = 0b0111011010011100001101, ++ xvneg_w_op = 0b0111011010011100001110, ++ xvneg_d_op = 0b0111011010011100001111, ++ xvseteqz_v_op = 0b0111011010011100100110, ++ xvsetnez_v_op = 0b0111011010011100100111, ++ xvsetanyeqz_b_op = 0b0111011010011100101000, ++ xvsetanyeqz_h_op = 0b0111011010011100101001, ++ xvsetanyeqz_w_op = 0b0111011010011100101010, ++ xvsetanyeqz_d_op = 0b0111011010011100101011, ++ xvsetallnez_b_op = 0b0111011010011100101100, ++ xvsetallnez_h_op = 0b0111011010011100101101, ++ xvsetallnez_w_op = 0b0111011010011100101110, ++ xvsetallnez_d_op = 0b0111011010011100101111, ++ xvfclass_s_op = 0b0111011010011100110101, ++ xvfclass_d_op = 0b0111011010011100110110, ++ xvfsqrt_s_op = 0b0111011010011100111001, ++ xvfsqrt_d_op = 0b0111011010011100111010, ++ xvfrint_s_op = 0b0111011010011101001101, ++ xvfrint_d_op = 0b0111011010011101001110, ++ xvfrintrm_s_op = 0b0111011010011101010001, ++ xvfrintrm_d_op = 0b0111011010011101010010, ++ xvfrintrp_s_op = 0b0111011010011101010101, ++ xvfrintrp_d_op = 0b0111011010011101010110, ++ xvfrintrz_s_op = 0b0111011010011101011001, ++ xvfrintrz_d_op = 0b0111011010011101011010, ++ xvfrintrne_s_op = 0b0111011010011101011101, ++ xvfrintrne_d_op = 0b0111011010011101011110, ++ xvfcvtl_s_h_op = 0b0111011010011101111010, ++ xvfcvth_s_h_op = 0b0111011010011101111011, ++ xvfcvtl_d_s_op = 0b0111011010011101111100, ++ xvfcvth_d_s_op = 0b0111011010011101111101, ++ xvffint_s_w_op = 0b0111011010011110000000, ++ xvffint_s_wu_op 
= 0b0111011010011110000001, ++ xvffint_d_l_op = 0b0111011010011110000010, ++ xvffint_d_lu_op = 0b0111011010011110000011, ++ xvffintl_d_w_op = 0b0111011010011110000100, ++ xvffinth_d_w_op = 0b0111011010011110000101, ++ xvftint_w_s_op = 0b0111011010011110001100, ++ xvftint_l_d_op = 0b0111011010011110001101, ++ xvftintrm_w_s_op = 0b0111011010011110001110, ++ xvftintrm_l_d_op = 0b0111011010011110001111, ++ xvftintrp_w_s_op = 0b0111011010011110010000, ++ xvftintrp_l_d_op = 0b0111011010011110010001, ++ xvftintrz_w_s_op = 0b0111011010011110010010, ++ xvftintrz_l_d_op = 0b0111011010011110010011, ++ xvftintrne_w_s_op = 0b0111011010011110010100, ++ xvftintrne_l_d_op = 0b0111011010011110010101, ++ xvftint_wu_s = 0b0111011010011110010110, ++ xvftint_lu_d = 0b0111011010011110010111, ++ xvftintrz_wu_f = 0b0111011010011110011100, ++ xvftintrz_lu_d = 0b0111011010011110011101, ++ xvftintl_l_s_op = 0b0111011010011110100000, ++ xvftinth_l_s_op = 0b0111011010011110100001, ++ xvftintrml_l_s_op = 0b0111011010011110100010, ++ xvftintrmh_l_s_op = 0b0111011010011110100011, ++ xvftintrpl_l_s_op = 0b0111011010011110100100, ++ xvftintrph_l_s_op = 0b0111011010011110100101, ++ xvftintrzl_l_s_op = 0b0111011010011110100110, ++ xvftintrzh_l_s_op = 0b0111011010011110100111, ++ xvftintrnel_l_s_op = 0b0111011010011110101000, ++ xvftintrneh_l_s_op = 0b0111011010011110101001, ++ xvreplgr2vr_b_op = 0b0111011010011111000000, ++ xvreplgr2vr_h_op = 0b0111011010011111000001, ++ xvreplgr2vr_w_op = 0b0111011010011111000010, ++ xvreplgr2vr_d_op = 0b0111011010011111000011, ++ vext2xv_h_b_op = 0b0111011010011111000100, ++ vext2xv_w_b_op = 0b0111011010011111000101, ++ vext2xv_d_b_op = 0b0111011010011111000110, ++ vext2xv_w_h_op = 0b0111011010011111000111, ++ vext2xv_d_h_op = 0b0111011010011111001000, ++ vext2xv_d_w_op = 0b0111011010011111001001, ++ vext2xv_hu_bu_op = 0b0111011010011111001010, ++ vext2xv_wu_bu_op = 0b0111011010011111001011, ++ vext2xv_du_bu_op = 0b0111011010011111001100, ++ vext2xv_wu_hu_op = 0b0111011010011111001101, ++ vext2xv_du_hu_op = 0b0111011010011111001110, ++ vext2xv_du_wu_op = 0b0111011010011111001111, ++ xvreplve0_b_op = 0b0111011100000111000000, ++ xvreplve0_h_op = 0b0111011100000111100000, ++ xvreplve0_w_op = 0b0111011100000111110000, ++ xvreplve0_d_op = 0b0111011100000111111000, ++ xvreplve0_q_op = 0b0111011100000111111100, ++ ++ unknow_ops22 = 0b1111111111111111111111 ++ }; ++ ++ // 21-bit opcode, highest 21 bits: bits[31...11] ++ enum ops21 { ++ vinsgr2vr_d_op = 0b011100101110101111110, ++ vpickve2gr_d_op = 0b011100101110111111110, ++ vpickve2gr_du_op = 0b011100101111001111110, ++ vreplvei_d_op = 0b011100101111011111110, ++ ++ unknow_ops21 = 0b111111111111111111111 ++ }; ++ ++ // 20-bit opcode, highest 20 bits: bits[31...12] ++ enum ops20 { ++ vinsgr2vr_w_op = 0b01110010111010111110, ++ vpickve2gr_w_op = 0b01110010111011111110, ++ vpickve2gr_wu_op = 0b01110010111100111110, ++ vreplvei_w_op = 0b01110010111101111110, ++ xvinsgr2vr_d_op = 0b01110110111010111110, ++ xvpickve2gr_d_op = 0b01110110111011111110, ++ xvpickve2gr_du_op = 0b01110110111100111110, ++ xvinsve0_d_op = 0b01110110111111111110, ++ xvpickve_d_op = 0b01110111000000111110, ++ ++ unknow_ops20 = 0b11111111111111111111 ++ }; ++ ++ // 19-bit opcode, highest 19 bits: bits[31...13] ++ enum ops19 { ++ vrotri_b_op = 0b0111001010100000001, ++ vinsgr2vr_h_op = 0b0111001011101011110, ++ vpickve2gr_h_op = 0b0111001011101111110, ++ vpickve2gr_hu_op = 0b0111001011110011110, ++ vreplvei_h_op = 0b0111001011110111110, ++ vbitclri_b_op = 0b0111001100010000001, 
++ vbitseti_b_op = 0b0111001100010100001, ++ vbitrevi_b_op = 0b0111001100011000001, ++ vslli_b_op = 0b0111001100101100001, ++ vsrli_b_op = 0b0111001100110000001, ++ vsrai_b_op = 0b0111001100110100001, ++ xvrotri_b_op = 0b0111011010100000001, ++ xvinsgr2vr_w_op = 0b0111011011101011110, ++ xvpickve2gr_w_op = 0b0111011011101111110, ++ xvpickve2gr_wu_op = 0b0111011011110011110, ++ xvinsve0_w_op = 0b0111011011111111110, ++ xvpickve_w_op = 0b0111011100000011110, ++ xvbitclri_b_op = 0b0111011100010000001, ++ xvbitseti_b_op = 0b0111011100010100001, ++ xvbitrevi_b_op = 0b0111011100011000001, ++ xvslli_b_op = 0b0111011100101100001, ++ xvsrli_b_op = 0b0111011100110000001, ++ xvsrai_b_op = 0b0111011100110100001, ++ ++ unknow_ops19 = 0b1111111111111111111 ++ }; ++ ++ // 18-bit opcode, highest 18 bits: bits[31...14] ++ enum ops18 { ++ vrotri_h_op = 0b011100101010000001, ++ vinsgr2vr_b_op = 0b011100101110101110, ++ vpickve2gr_b_op = 0b011100101110111110, ++ vpickve2gr_bu_op = 0b011100101111001110, ++ vreplvei_b_op = 0b011100101111011110, ++ vbitclri_h_op = 0b011100110001000001, ++ vbitseti_h_op = 0b011100110001010001, ++ vbitrevi_h_op = 0b011100110001100001, ++ vslli_h_op = 0b011100110010110001, ++ vsrli_h_op = 0b011100110011000001, ++ vsrai_h_op = 0b011100110011010001, ++ vsrlni_b_h_op = 0b011100110100000001, ++ xvrotri_h_op = 0b011101101010000001, ++ xvbitclri_h_op = 0b011101110001000001, ++ xvbitseti_h_op = 0b011101110001010001, ++ xvbitrevi_h_op = 0b011101110001100001, ++ xvslli_h_op = 0b011101110010110001, ++ xvsrli_h_op = 0b011101110011000001, ++ xvsrai_h_op = 0b011101110011010001, ++ ++ unknow_ops18 = 0b111111111111111111 ++ }; ++ ++ // 17-bit opcode, highest 17 bits: bits[31...15] ++ enum ops17 { ++ asrtle_d_op = 0b00000000000000010, ++ asrtgt_d_op = 0b00000000000000011, ++ add_w_op = 0b00000000000100000, ++ add_d_op = 0b00000000000100001, ++ sub_w_op = 0b00000000000100010, ++ sub_d_op = 0b00000000000100011, ++ slt_op = 0b00000000000100100, ++ sltu_op = 0b00000000000100101, ++ maskeqz_op = 0b00000000000100110, ++ masknez_op = 0b00000000000100111, ++ nor_op = 0b00000000000101000, ++ and_op = 0b00000000000101001, ++ or_op = 0b00000000000101010, ++ xor_op = 0b00000000000101011, ++ orn_op = 0b00000000000101100, ++ andn_op = 0b00000000000101101, ++ sll_w_op = 0b00000000000101110, ++ srl_w_op = 0b00000000000101111, ++ sra_w_op = 0b00000000000110000, ++ sll_d_op = 0b00000000000110001, ++ srl_d_op = 0b00000000000110010, ++ sra_d_op = 0b00000000000110011, ++ rotr_w_op = 0b00000000000110110, ++ rotr_d_op = 0b00000000000110111, ++ mul_w_op = 0b00000000000111000, ++ mulh_w_op = 0b00000000000111001, ++ mulh_wu_op = 0b00000000000111010, ++ mul_d_op = 0b00000000000111011, ++ mulh_d_op = 0b00000000000111100, ++ mulh_du_op = 0b00000000000111101, ++ mulw_d_w_op = 0b00000000000111110, ++ mulw_d_wu_op = 0b00000000000111111, ++ div_w_op = 0b00000000001000000, ++ mod_w_op = 0b00000000001000001, ++ div_wu_op = 0b00000000001000010, ++ mod_wu_op = 0b00000000001000011, ++ div_d_op = 0b00000000001000100, ++ mod_d_op = 0b00000000001000101, ++ div_du_op = 0b00000000001000110, ++ mod_du_op = 0b00000000001000111, ++ crc_w_b_w_op = 0b00000000001001000, ++ crc_w_h_w_op = 0b00000000001001001, ++ crc_w_w_w_op = 0b00000000001001010, ++ crc_w_d_w_op = 0b00000000001001011, ++ crcc_w_b_w_op = 0b00000000001001100, ++ crcc_w_h_w_op = 0b00000000001001101, ++ crcc_w_w_w_op = 0b00000000001001110, ++ crcc_w_d_w_op = 0b00000000001001111, ++ break_op = 0b00000000001010100, ++ fadd_s_op = 0b00000001000000001, ++ fadd_d_op = 
0b00000001000000010, ++ fsub_s_op = 0b00000001000000101, ++ fsub_d_op = 0b00000001000000110, ++ fmul_s_op = 0b00000001000001001, ++ fmul_d_op = 0b00000001000001010, ++ fdiv_s_op = 0b00000001000001101, ++ fdiv_d_op = 0b00000001000001110, ++ fmax_s_op = 0b00000001000010001, ++ fmax_d_op = 0b00000001000010010, ++ fmin_s_op = 0b00000001000010101, ++ fmin_d_op = 0b00000001000010110, ++ fmaxa_s_op = 0b00000001000011001, ++ fmaxa_d_op = 0b00000001000011010, ++ fmina_s_op = 0b00000001000011101, ++ fmina_d_op = 0b00000001000011110, ++ fscaleb_s_op = 0b00000001000100001, ++ fscaleb_d_op = 0b00000001000100010, ++ fcopysign_s_op = 0b00000001000100101, ++ fcopysign_d_op = 0b00000001000100110, ++ ldx_b_op = 0b00111000000000000, ++ ldx_h_op = 0b00111000000001000, ++ ldx_w_op = 0b00111000000010000, ++ ldx_d_op = 0b00111000000011000, ++ stx_b_op = 0b00111000000100000, ++ stx_h_op = 0b00111000000101000, ++ stx_w_op = 0b00111000000110000, ++ stx_d_op = 0b00111000000111000, ++ ldx_bu_op = 0b00111000001000000, ++ ldx_hu_op = 0b00111000001001000, ++ ldx_wu_op = 0b00111000001010000, ++ fldx_s_op = 0b00111000001100000, ++ fldx_d_op = 0b00111000001101000, ++ fstx_s_op = 0b00111000001110000, ++ fstx_d_op = 0b00111000001111000, ++ vldx_op = 0b00111000010000000, ++ vstx_op = 0b00111000010001000, ++ xvldx_op = 0b00111000010010000, ++ xvstx_op = 0b00111000010011000, ++ amcas_b_op = 0b00111000010110000, ++ amcas_h_op = 0b00111000010110001, ++ amcas_w_op = 0b00111000010110010, ++ amcas_d_op = 0b00111000010110011, ++ amcas_db_b_op = 0b00111000010110100, ++ amcas_db_h_op = 0b00111000010110101, ++ amcas_db_w_op = 0b00111000010110110, ++ amcas_db_d_op = 0b00111000010110111, ++ amswap_b_op = 0b00111000010111000, ++ amswap_h_op = 0b00111000010111001, ++ amadd_b_op = 0b00111000010111010, ++ amadd_h_op = 0b00111000010111011, ++ amswap_db_b_op = 0b00111000010111100, ++ amswap_db_h_op = 0b00111000010111101, ++ amadd_db_b_op = 0b00111000010111110, ++ amadd_db_h_op = 0b00111000010111111, ++ amswap_w_op = 0b00111000011000000, ++ amswap_d_op = 0b00111000011000001, ++ amadd_w_op = 0b00111000011000010, ++ amadd_d_op = 0b00111000011000011, ++ amand_w_op = 0b00111000011000100, ++ amand_d_op = 0b00111000011000101, ++ amor_w_op = 0b00111000011000110, ++ amor_d_op = 0b00111000011000111, ++ amxor_w_op = 0b00111000011001000, ++ amxor_d_op = 0b00111000011001001, ++ ammax_w_op = 0b00111000011001010, ++ ammax_d_op = 0b00111000011001011, ++ ammin_w_op = 0b00111000011001100, ++ ammin_d_op = 0b00111000011001101, ++ ammax_wu_op = 0b00111000011001110, ++ ammax_du_op = 0b00111000011001111, ++ ammin_wu_op = 0b00111000011010000, ++ ammin_du_op = 0b00111000011010001, ++ amswap_db_w_op = 0b00111000011010010, ++ amswap_db_d_op = 0b00111000011010011, ++ amadd_db_w_op = 0b00111000011010100, ++ amadd_db_d_op = 0b00111000011010101, ++ amand_db_w_op = 0b00111000011010110, ++ amand_db_d_op = 0b00111000011010111, ++ amor_db_w_op = 0b00111000011011000, ++ amor_db_d_op = 0b00111000011011001, ++ amxor_db_w_op = 0b00111000011011010, ++ amxor_db_d_op = 0b00111000011011011, ++ ammax_db_w_op = 0b00111000011011100, ++ ammax_db_d_op = 0b00111000011011101, ++ ammin_db_w_op = 0b00111000011011110, ++ ammin_db_d_op = 0b00111000011011111, ++ ammax_db_wu_op = 0b00111000011100000, ++ ammax_db_du_op = 0b00111000011100001, ++ ammin_db_wu_op = 0b00111000011100010, ++ ammin_db_du_op = 0b00111000011100011, ++ dbar_op = 0b00111000011100100, ++ ibar_op = 0b00111000011100101, ++ fldgt_s_op = 0b00111000011101000, ++ fldgt_d_op = 0b00111000011101001, ++ fldle_s_op = 0b00111000011101010, ++ 
fldle_d_op = 0b00111000011101011, ++ fstgt_s_op = 0b00111000011101100, ++ fstgt_d_op = 0b00111000011101101, ++ fstle_s_op = 0b00111000011101110, ++ fstle_d_op = 0b00111000011101111, ++ ldgt_b_op = 0b00111000011110000, ++ ldgt_h_op = 0b00111000011110001, ++ ldgt_w_op = 0b00111000011110010, ++ ldgt_d_op = 0b00111000011110011, ++ ldle_b_op = 0b00111000011110100, ++ ldle_h_op = 0b00111000011110101, ++ ldle_w_op = 0b00111000011110110, ++ ldle_d_op = 0b00111000011110111, ++ stgt_b_op = 0b00111000011111000, ++ stgt_h_op = 0b00111000011111001, ++ stgt_w_op = 0b00111000011111010, ++ stgt_d_op = 0b00111000011111011, ++ stle_b_op = 0b00111000011111100, ++ stle_h_op = 0b00111000011111101, ++ stle_w_op = 0b00111000011111110, ++ stle_d_op = 0b00111000011111111, ++ vseq_b_op = 0b01110000000000000, ++ vseq_h_op = 0b01110000000000001, ++ vseq_w_op = 0b01110000000000010, ++ vseq_d_op = 0b01110000000000011, ++ vsle_b_op = 0b01110000000000100, ++ vsle_h_op = 0b01110000000000101, ++ vsle_w_op = 0b01110000000000110, ++ vsle_d_op = 0b01110000000000111, ++ vsle_bu_op = 0b01110000000001000, ++ vsle_hu_op = 0b01110000000001001, ++ vsle_wu_op = 0b01110000000001010, ++ vsle_du_op = 0b01110000000001011, ++ vslt_b_op = 0b01110000000001100, ++ vslt_h_op = 0b01110000000001101, ++ vslt_w_op = 0b01110000000001110, ++ vslt_d_op = 0b01110000000001111, ++ vslt_bu_op = 0b01110000000010000, ++ vslt_hu_op = 0b01110000000010001, ++ vslt_wu_op = 0b01110000000010010, ++ vslt_du_op = 0b01110000000010011, ++ vadd_b_op = 0b01110000000010100, ++ vadd_h_op = 0b01110000000010101, ++ vadd_w_op = 0b01110000000010110, ++ vadd_d_op = 0b01110000000010111, ++ vsub_b_op = 0b01110000000011000, ++ vsub_h_op = 0b01110000000011001, ++ vsub_w_op = 0b01110000000011010, ++ vsub_d_op = 0b01110000000011011, ++ vhaddw_h_b_op = 0b01110000010101000, ++ vhaddw_w_h_op = 0b01110000010101001, ++ vhaddw_d_w_op = 0b01110000010101010, ++ vhaddw_q_d_op = 0b01110000010101011, ++ vhsubw_h_b_op = 0b01110000010101100, ++ vhsubw_w_h_op = 0b01110000010101101, ++ vhsubw_d_w_op = 0b01110000010101110, ++ vhsubw_q_d_op = 0b01110000010101111, ++ vhaddw_hu_bu_op = 0b01110000010110000, ++ vhaddw_wu_hu_op = 0b01110000010110001, ++ vhaddw_du_wu_op = 0b01110000010110010, ++ vhaddw_qu_du_op = 0b01110000010110011, ++ vhsubw_hu_bu_op = 0b01110000010110100, ++ vhsubw_wu_hu_op = 0b01110000010110101, ++ vhsubw_du_wu_op = 0b01110000010110110, ++ vhsubw_qu_du_op = 0b01110000010110111, ++ vabsd_b_op = 0b01110000011000000, ++ vabsd_h_op = 0b01110000011000001, ++ vabsd_w_op = 0b01110000011000010, ++ vabsd_d_op = 0b01110000011000011, ++ vmax_b_op = 0b01110000011100000, ++ vmax_h_op = 0b01110000011100001, ++ vmax_w_op = 0b01110000011100010, ++ vmax_d_op = 0b01110000011100011, ++ vmin_b_op = 0b01110000011100100, ++ vmin_h_op = 0b01110000011100101, ++ vmin_w_op = 0b01110000011100110, ++ vmin_d_op = 0b01110000011100111, ++ vmul_b_op = 0b01110000100001000, ++ vmul_h_op = 0b01110000100001001, ++ vmul_w_op = 0b01110000100001010, ++ vmul_d_op = 0b01110000100001011, ++ vmuh_b_op = 0b01110000100001100, ++ vmuh_h_op = 0b01110000100001101, ++ vmuh_w_op = 0b01110000100001110, ++ vmuh_d_op = 0b01110000100001111, ++ vmuh_bu_op = 0b01110000100010000, ++ vmuh_hu_op = 0b01110000100010001, ++ vmuh_wu_op = 0b01110000100010010, ++ vmuh_du_op = 0b01110000100010011, ++ vmulwev_h_b_op = 0b01110000100100000, ++ vmulwev_w_h_op = 0b01110000100100001, ++ vmulwev_d_w_op = 0b01110000100100010, ++ vmulwev_q_d_op = 0b01110000100100011, ++ vmulwod_h_b_op = 0b01110000100100100, ++ vmulwod_w_h_op = 0b01110000100100101, ++ 
vmulwod_d_w_op = 0b01110000100100110, ++ vmulwod_q_d_op = 0b01110000100100111, ++ vmadd_b_op = 0b01110000101010000, ++ vmadd_h_op = 0b01110000101010001, ++ vmadd_w_op = 0b01110000101010010, ++ vmadd_d_op = 0b01110000101010011, ++ vmsub_b_op = 0b01110000101010100, ++ vmsub_h_op = 0b01110000101010101, ++ vmsub_w_op = 0b01110000101010110, ++ vmsub_d_op = 0b01110000101010111, ++ vsll_b_op = 0b01110000111010000, ++ vsll_h_op = 0b01110000111010001, ++ vsll_w_op = 0b01110000111010010, ++ vsll_d_op = 0b01110000111010011, ++ vsrl_b_op = 0b01110000111010100, ++ vsrl_h_op = 0b01110000111010101, ++ vsrl_w_op = 0b01110000111010110, ++ vsrl_d_op = 0b01110000111010111, ++ vsra_b_op = 0b01110000111011000, ++ vsra_h_op = 0b01110000111011001, ++ vsra_w_op = 0b01110000111011010, ++ vsra_d_op = 0b01110000111011011, ++ vrotr_b_op = 0b01110000111011100, ++ vrotr_h_op = 0b01110000111011101, ++ vrotr_w_op = 0b01110000111011110, ++ vrotr_d_op = 0b01110000111011111, ++ vbitclr_b_op = 0b01110001000011000, ++ vbitclr_h_op = 0b01110001000011001, ++ vbitclr_w_op = 0b01110001000011010, ++ vbitclr_d_op = 0b01110001000011011, ++ vbitset_b_op = 0b01110001000011100, ++ vbitset_h_op = 0b01110001000011101, ++ vbitset_w_op = 0b01110001000011110, ++ vbitset_d_op = 0b01110001000011111, ++ vbitrev_b_op = 0b01110001000100000, ++ vbitrev_h_op = 0b01110001000100001, ++ vbitrev_w_op = 0b01110001000100010, ++ vbitrev_d_op = 0b01110001000100011, ++ vilvl_b_op = 0b01110001000110100, ++ vilvl_h_op = 0b01110001000110101, ++ vilvl_w_op = 0b01110001000110110, ++ vilvl_d_op = 0b01110001000110111, ++ vilvh_b_op = 0b01110001000111000, ++ vilvh_h_op = 0b01110001000111001, ++ vilvh_w_op = 0b01110001000111010, ++ vilvh_d_op = 0b01110001000111011, ++ vand_v_op = 0b01110001001001100, ++ vor_v_op = 0b01110001001001101, ++ vxor_v_op = 0b01110001001001110, ++ vnor_v_op = 0b01110001001001111, ++ vandn_v_op = 0b01110001001010000, ++ vorn_v_op = 0b01110001001010001, ++ vfrstp_b_op = 0b01110001001010110, ++ vfrstp_h_op = 0b01110001001010111, ++ vadd_q_op = 0b01110001001011010, ++ vsub_q_op = 0b01110001001011011, ++ vfadd_s_op = 0b01110001001100001, ++ vfadd_d_op = 0b01110001001100010, ++ vfsub_s_op = 0b01110001001100101, ++ vfsub_d_op = 0b01110001001100110, ++ vfmul_s_op = 0b01110001001110001, ++ vfmul_d_op = 0b01110001001110010, ++ vfdiv_s_op = 0b01110001001110101, ++ vfdiv_d_op = 0b01110001001110110, ++ vfmax_s_op = 0b01110001001111001, ++ vfmax_d_op = 0b01110001001111010, ++ vfmin_s_op = 0b01110001001111101, ++ vfmin_d_op = 0b01110001001111110, ++ vfcvt_h_s_op = 0b01110001010001100, ++ vfcvt_s_d_op = 0b01110001010001101, ++ vffint_s_l_op = 0b01110001010010000, ++ vftint_w_d_op = 0b01110001010010011, ++ vftintrm_w_d_op = 0b01110001010010100, ++ vftintrp_w_d_op = 0b01110001010010101, ++ vftintrz_w_d_op = 0b01110001010010110, ++ vftintrne_w_d_op = 0b01110001010010111, ++ vshuf_h_op = 0b01110001011110101, ++ vshuf_w_op = 0b01110001011110110, ++ vshuf_d_op = 0b01110001011110111, ++ vslti_b_op = 0b01110010100001100, ++ vslti_h_op = 0b01110010100001101, ++ vslti_w_op = 0b01110010100001110, ++ vslti_d_op = 0b01110010100001111, ++ vslti_bu_op = 0b01110010100010000, ++ vslti_hu_op = 0b01110010100010001, ++ vslti_wu_op = 0b01110010100010010, ++ vslti_du_op = 0b01110010100010011, ++ vaddi_bu_op = 0b01110010100010100, ++ vaddi_hu_op = 0b01110010100010101, ++ vaddi_wu_op = 0b01110010100010110, ++ vaddi_du_op = 0b01110010100010111, ++ vsubi_bu_op = 0b01110010100011000, ++ vsubi_hu_op = 0b01110010100011001, ++ vsubi_wu_op = 0b01110010100011010, ++ vsubi_du_op = 
0b01110010100011011, ++ vfrstpi_b_op = 0b01110010100110100, ++ vfrstpi_h_op = 0b01110010100110101, ++ vrotri_w_op = 0b01110010101000001, ++ vbitclri_w_op = 0b01110011000100001, ++ vbitseti_w_op = 0b01110011000101001, ++ vbitrevi_w_op = 0b01110011000110001, ++ vslli_w_op = 0b01110011001011001, ++ vsrli_w_op = 0b01110011001100001, ++ vsrai_w_op = 0b01110011001101001, ++ vsrlni_h_w_op = 0b01110011010000001, ++ xvseq_b_op = 0b01110100000000000, ++ xvseq_h_op = 0b01110100000000001, ++ xvseq_w_op = 0b01110100000000010, ++ xvseq_d_op = 0b01110100000000011, ++ xvsle_b_op = 0b01110100000000100, ++ xvsle_h_op = 0b01110100000000101, ++ xvsle_w_op = 0b01110100000000110, ++ xvsle_d_op = 0b01110100000000111, ++ xvsle_bu_op = 0b01110100000001000, ++ xvsle_hu_op = 0b01110100000001001, ++ xvsle_wu_op = 0b01110100000001010, ++ xvsle_du_op = 0b01110100000001011, ++ xvslt_b_op = 0b01110100000001100, ++ xvslt_h_op = 0b01110100000001101, ++ xvslt_w_op = 0b01110100000001110, ++ xvslt_d_op = 0b01110100000001111, ++ xvslt_bu_op = 0b01110100000010000, ++ xvslt_hu_op = 0b01110100000010001, ++ xvslt_wu_op = 0b01110100000010010, ++ xvslt_du_op = 0b01110100000010011, ++ xvadd_b_op = 0b01110100000010100, ++ xvadd_h_op = 0b01110100000010101, ++ xvadd_w_op = 0b01110100000010110, ++ xvadd_d_op = 0b01110100000010111, ++ xvsub_b_op = 0b01110100000011000, ++ xvsub_h_op = 0b01110100000011001, ++ xvsub_w_op = 0b01110100000011010, ++ xvsub_d_op = 0b01110100000011011, ++ xvhaddw_h_b_op = 0b01110100010101000, ++ xvhaddw_w_h_op = 0b01110100010101001, ++ xvhaddw_d_w_op = 0b01110100010101010, ++ xvhaddw_q_d_op = 0b01110100010101011, ++ xvhsubw_h_b_op = 0b01110100010101100, ++ xvhsubw_w_h_op = 0b01110100010101101, ++ xvhsubw_d_w_op = 0b01110100010101110, ++ xvhsubw_q_d_op = 0b01110100010101111, ++ xvhaddw_hu_bu_op = 0b01110100010110000, ++ xvhaddw_wu_hu_op = 0b01110100010110001, ++ xvhaddw_du_wu_op = 0b01110100010110010, ++ xvhaddw_qu_du_op = 0b01110100010110011, ++ xvhsubw_hu_bu_op = 0b01110100010110100, ++ xvhsubw_wu_hu_op = 0b01110100010110101, ++ xvhsubw_du_wu_op = 0b01110100010110110, ++ xvhsubw_qu_du_op = 0b01110100010110111, ++ xvabsd_b_op = 0b01110100011000000, ++ xvabsd_h_op = 0b01110100011000001, ++ xvabsd_w_op = 0b01110100011000010, ++ xvabsd_d_op = 0b01110100011000011, ++ xvmax_b_op = 0b01110100011100000, ++ xvmax_h_op = 0b01110100011100001, ++ xvmax_w_op = 0b01110100011100010, ++ xvmax_d_op = 0b01110100011100011, ++ xvmin_b_op = 0b01110100011100100, ++ xvmin_h_op = 0b01110100011100101, ++ xvmin_w_op = 0b01110100011100110, ++ xvmin_d_op = 0b01110100011100111, ++ xvmul_b_op = 0b01110100100001000, ++ xvmul_h_op = 0b01110100100001001, ++ xvmul_w_op = 0b01110100100001010, ++ xvmul_d_op = 0b01110100100001011, ++ xvmuh_b_op = 0b01110100100001100, ++ xvmuh_h_op = 0b01110100100001101, ++ xvmuh_w_op = 0b01110100100001110, ++ xvmuh_d_op = 0b01110100100001111, ++ xvmuh_bu_op = 0b01110100100010000, ++ xvmuh_hu_op = 0b01110100100010001, ++ xvmuh_wu_op = 0b01110100100010010, ++ xvmuh_du_op = 0b01110100100010011, ++ xvmulwev_h_b_op = 0b01110100100100000, ++ xvmulwev_w_h_op = 0b01110100100100001, ++ xvmulwev_d_w_op = 0b01110100100100010, ++ xvmulwev_q_d_op = 0b01110100100100011, ++ xvmulwod_h_b_op = 0b01110100100100100, ++ xvmulwod_w_h_op = 0b01110100100100101, ++ xvmulwod_d_w_op = 0b01110100100100110, ++ xvmulwod_q_d_op = 0b01110100100100111, ++ xvmadd_b_op = 0b01110100101010000, ++ xvmadd_h_op = 0b01110100101010001, ++ xvmadd_w_op = 0b01110100101010010, ++ xvmadd_d_op = 0b01110100101010011, ++ xvmsub_b_op = 0b01110100101010100, ++ 
xvmsub_h_op = 0b01110100101010101, ++ xvmsub_w_op = 0b01110100101010110, ++ xvmsub_d_op = 0b01110100101010111, ++ xvsll_b_op = 0b01110100111010000, ++ xvsll_h_op = 0b01110100111010001, ++ xvsll_w_op = 0b01110100111010010, ++ xvsll_d_op = 0b01110100111010011, ++ xvsrl_b_op = 0b01110100111010100, ++ xvsrl_h_op = 0b01110100111010101, ++ xvsrl_w_op = 0b01110100111010110, ++ xvsrl_d_op = 0b01110100111010111, ++ xvsra_b_op = 0b01110100111011000, ++ xvsra_h_op = 0b01110100111011001, ++ xvsra_w_op = 0b01110100111011010, ++ xvsra_d_op = 0b01110100111011011, ++ xvrotr_b_op = 0b01110100111011100, ++ xvrotr_h_op = 0b01110100111011101, ++ xvrotr_w_op = 0b01110100111011110, ++ xvrotr_d_op = 0b01110100111011111, ++ xvbitclr_b_op = 0b01110101000011000, ++ xvbitclr_h_op = 0b01110101000011001, ++ xvbitclr_w_op = 0b01110101000011010, ++ xvbitclr_d_op = 0b01110101000011011, ++ xvbitset_b_op = 0b01110101000011100, ++ xvbitset_h_op = 0b01110101000011101, ++ xvbitset_w_op = 0b01110101000011110, ++ xvbitset_d_op = 0b01110101000011111, ++ xvbitrev_b_op = 0b01110101000100000, ++ xvbitrev_h_op = 0b01110101000100001, ++ xvbitrev_w_op = 0b01110101000100010, ++ xvbitrev_d_op = 0b01110101000100011, ++ xvilvl_b_op = 0b01110101000110100, ++ xvilvl_h_op = 0b01110101000110101, ++ xvilvl_w_op = 0b01110101000110110, ++ xvilvl_d_op = 0b01110101000110111, ++ xvilvh_b_op = 0b01110101000111000, ++ xvilvh_h_op = 0b01110101000111001, ++ xvilvh_w_op = 0b01110101000111010, ++ xvilvh_d_op = 0b01110101000111011, ++ xvand_v_op = 0b01110101001001100, ++ xvor_v_op = 0b01110101001001101, ++ xvxor_v_op = 0b01110101001001110, ++ xvnor_v_op = 0b01110101001001111, ++ xvandn_v_op = 0b01110101001010000, ++ xvorn_v_op = 0b01110101001010001, ++ xvfrstp_b_op = 0b01110101001010110, ++ xvfrstp_h_op = 0b01110101001010111, ++ xvadd_q_op = 0b01110101001011010, ++ xvsub_q_op = 0b01110101001011011, ++ xvfadd_s_op = 0b01110101001100001, ++ xvfadd_d_op = 0b01110101001100010, ++ xvfsub_s_op = 0b01110101001100101, ++ xvfsub_d_op = 0b01110101001100110, ++ xvfmul_s_op = 0b01110101001110001, ++ xvfmul_d_op = 0b01110101001110010, ++ xvfdiv_s_op = 0b01110101001110101, ++ xvfdiv_d_op = 0b01110101001110110, ++ xvfmax_s_op = 0b01110101001111001, ++ xvfmax_d_op = 0b01110101001111010, ++ xvfmin_s_op = 0b01110101001111101, ++ xvfmin_d_op = 0b01110101001111110, ++ xvfcvt_h_s_op = 0b01110101010001100, ++ xvfcvt_s_d_op = 0b01110101010001101, ++ xvffint_s_l_op = 0b01110101010010000, ++ xvftint_w_d_op = 0b01110101010010011, ++ xvftintrm_w_d_op = 0b01110101010010100, ++ xvftintrp_w_d_op = 0b01110101010010101, ++ xvftintrz_w_d_op = 0b01110101010010110, ++ xvftintrne_w_d_op = 0b01110101010010111, ++ xvshuf_h_op = 0b01110101011110101, ++ xvshuf_w_op = 0b01110101011110110, ++ xvshuf_d_op = 0b01110101011110111, ++ xvperm_w_op = 0b01110101011111010, ++ xvslti_b_op = 0b01110110100001100, ++ xvslti_h_op = 0b01110110100001101, ++ xvslti_w_op = 0b01110110100001110, ++ xvslti_d_op = 0b01110110100001111, ++ xvslti_bu_op = 0b01110110100010000, ++ xvslti_hu_op = 0b01110110100010001, ++ xvslti_wu_op = 0b01110110100010010, ++ xvslti_du_op = 0b01110110100010011, ++ xvaddi_bu_op = 0b01110110100010100, ++ xvaddi_hu_op = 0b01110110100010101, ++ xvaddi_wu_op = 0b01110110100010110, ++ xvaddi_du_op = 0b01110110100010111, ++ xvsubi_bu_op = 0b01110110100011000, ++ xvsubi_hu_op = 0b01110110100011001, ++ xvsubi_wu_op = 0b01110110100011010, ++ xvsubi_du_op = 0b01110110100011011, ++ xvfrstpi_b_op = 0b01110110100110100, ++ xvfrstpi_h_op = 0b01110110100110101, ++ xvrotri_w_op = 0b01110110101000001, ++ 
xvbitclri_w_op = 0b01110111000100001, ++ xvbitseti_w_op = 0b01110111000101001, ++ xvbitrevi_w_op = 0b01110111000110001, ++ xvslli_w_op = 0b01110111001011001, ++ xvsrli_w_op = 0b01110111001100001, ++ xvsrai_w_op = 0b01110111001101001, ++ ++ unknow_ops17 = 0b11111111111111111 ++ }; ++ ++ // 16-bit opcode, highest 16 bits: bits[31...16] ++ enum ops16 { ++ vrotri_d_op = 0b0111001010100001, ++ vbitclri_d_op = 0b0111001100010001, ++ vbitseti_d_op = 0b0111001100010101, ++ vbitrevi_d_op = 0b0111001100011001, ++ vslli_d_op = 0b0111001100101101, ++ vsrli_d_op = 0b0111001100110001, ++ vsrai_d_op = 0b0111001100110101, ++ vsrlni_w_d_op = 0b0111001101000001, ++ xvrotri_d_op = 0b0111011010100001, ++ xvbitclri_d_op = 0b0111011100010001, ++ xvbitseti_d_op = 0b0111011100010101, ++ xvbitrevi_d_op = 0b0111011100011001, ++ xvslli_d_op = 0b0111011100101101, ++ xvsrli_d_op = 0b0111011100110001, ++ xvsrai_d_op = 0b0111011100110101, ++ ++ unknow_ops16 = 0b1111111111111111 ++ }; ++ ++ // 15-bit opcode, highest 15 bits: bits[31...17] ++ enum ops15 { ++ vsrlni_d_q_op = 0b011100110100001, ++ ++ unknow_ops15 = 0b111111111111111 ++ }; ++ ++ // 14-bit opcode, highest 14 bits: bits[31...18] ++ enum ops14 { ++ alsl_w_op = 0b00000000000001, ++ bytepick_w_op = 0b00000000000010, ++ bytepick_d_op = 0b00000000000011, ++ alsl_d_op = 0b00000000001011, ++ slli_op = 0b00000000010000, ++ srli_op = 0b00000000010001, ++ srai_op = 0b00000000010010, ++ rotri_op = 0b00000000010011, ++ lddir_op = 0b00000110010000, ++ ldpte_op = 0b00000110010001, ++ vshuf4i_b_op = 0b01110011100100, ++ vshuf4i_h_op = 0b01110011100101, ++ vshuf4i_w_op = 0b01110011100110, ++ vshuf4i_d_op = 0b01110011100111, ++ vandi_b_op = 0b01110011110100, ++ vori_b_op = 0b01110011110101, ++ vxori_b_op = 0b01110011110110, ++ vnori_b_op = 0b01110011110111, ++ vldi_op = 0b01110011111000, ++ vpermi_w_op = 0b01110011111001, ++ xvshuf4i_b_op = 0b01110111100100, ++ xvshuf4i_h_op = 0b01110111100101, ++ xvshuf4i_w_op = 0b01110111100110, ++ xvshuf4i_d_op = 0b01110111100111, ++ xvandi_b_op = 0b01110111110100, ++ xvori_b_op = 0b01110111110101, ++ xvxori_b_op = 0b01110111110110, ++ xvnori_b_op = 0b01110111110111, ++ xvldi_op = 0b01110111111000, ++ xvpermi_w_op = 0b01110111111001, ++ xvpermi_d_op = 0b01110111111010, ++ xvpermi_q_op = 0b01110111111011, ++ ++ unknow_ops14 = 0b11111111111111 ++ }; ++ ++ // 13-bit opcode, highest 13 bits: bits[31...19] ++ enum ops13 { ++ vldrepl_d_op = 0b0011000000010, ++ xvldrepl_d_op = 0b0011001000010, ++ ++ unknow_ops13 = 0b1111111111111 ++ }; ++ ++ // 12-bit opcode, highest 12 bits: bits[31...20] ++ enum ops12 { ++ fmadd_s_op = 0b000010000001, ++ fmadd_d_op = 0b000010000010, ++ fmsub_s_op = 0b000010000101, ++ fmsub_d_op = 0b000010000110, ++ fnmadd_s_op = 0b000010001001, ++ fnmadd_d_op = 0b000010001010, ++ fnmsub_s_op = 0b000010001101, ++ fnmsub_d_op = 0b000010001110, ++ vfmadd_s_op = 0b000010010001, ++ vfmadd_d_op = 0b000010010010, ++ vfmsub_s_op = 0b000010010101, ++ vfmsub_d_op = 0b000010010110, ++ vfnmadd_s_op = 0b000010011001, ++ vfnmadd_d_op = 0b000010011010, ++ vfnmsub_s_op = 0b000010011101, ++ vfnmsub_d_op = 0b000010011110, ++ xvfmadd_s_op = 0b000010100001, ++ xvfmadd_d_op = 0b000010100010, ++ xvfmsub_s_op = 0b000010100101, ++ xvfmsub_d_op = 0b000010100110, ++ xvfnmadd_s_op = 0b000010101001, ++ xvfnmadd_d_op = 0b000010101010, ++ xvfnmsub_s_op = 0b000010101101, ++ xvfnmsub_d_op = 0b000010101110, ++ fcmp_cond_s_op = 0b000011000001, ++ fcmp_cond_d_op = 0b000011000010, ++ vfcmp_cond_s_op = 0b000011000101, ++ vfcmp_cond_d_op = 0b000011000110, ++ 
xvfcmp_cond_s_op = 0b000011001001, ++ xvfcmp_cond_d_op = 0b000011001010, ++ fsel_op = 0b000011010000, ++ vbitsel_v_op = 0b000011010001, ++ xvbitsel_v_op = 0b000011010010, ++ vshuf_b_op = 0b000011010101, ++ xvshuf_b_op = 0b000011010110, ++ vldrepl_w_op = 0b001100000010, ++ xvldrepl_w_op = 0b001100100010, ++ ++ unknow_ops12 = 0b111111111111 ++ }; ++ ++ // 11-bit opcode, highest 11 bits: bits[31...21] ++ enum ops11 { ++ vldrepl_h_op = 0b00110000010, ++ xvldrepl_h_op = 0b00110010010, ++ ++ unknow_ops11 = 0b11111111111 ++ }; ++ ++ // 10-bit opcode, highest 10 bits: bits[31...22] ++ enum ops10 { ++ bstr_w_op = 0b0000000001, ++ bstrins_d_op = 0b0000000010, ++ bstrpick_d_op = 0b0000000011, ++ slti_op = 0b0000001000, ++ sltui_op = 0b0000001001, ++ addi_w_op = 0b0000001010, ++ addi_d_op = 0b0000001011, ++ lu52i_d_op = 0b0000001100, ++ andi_op = 0b0000001101, ++ ori_op = 0b0000001110, ++ xori_op = 0b0000001111, ++ ld_b_op = 0b0010100000, ++ ld_h_op = 0b0010100001, ++ ld_w_op = 0b0010100010, ++ ld_d_op = 0b0010100011, ++ st_b_op = 0b0010100100, ++ st_h_op = 0b0010100101, ++ st_w_op = 0b0010100110, ++ st_d_op = 0b0010100111, ++ ld_bu_op = 0b0010101000, ++ ld_hu_op = 0b0010101001, ++ ld_wu_op = 0b0010101010, ++ preld_op = 0b0010101011, ++ fld_s_op = 0b0010101100, ++ fst_s_op = 0b0010101101, ++ fld_d_op = 0b0010101110, ++ fst_d_op = 0b0010101111, ++ vld_op = 0b0010110000, ++ vst_op = 0b0010110001, ++ xvld_op = 0b0010110010, ++ xvst_op = 0b0010110011, ++ ldl_w_op = 0b0010111000, ++ ldr_w_op = 0b0010111001, ++ vldrepl_b_op = 0b0011000010, ++ xvldrepl_b_op = 0b0011001010, ++ ++ unknow_ops10 = 0b1111111111 ++ }; ++ ++ // 8-bit opcode, highest 8 bits: bits[31...22] ++ enum ops8 { ++ ll_w_op = 0b00100000, ++ sc_w_op = 0b00100001, ++ ll_d_op = 0b00100010, ++ sc_d_op = 0b00100011, ++ ldptr_w_op = 0b00100100, ++ stptr_w_op = 0b00100101, ++ ldptr_d_op = 0b00100110, ++ stptr_d_op = 0b00100111, ++ csr_op = 0b00000100, ++ ++ unknow_ops8 = 0b11111111 ++ }; ++ ++ // 7-bit opcode, highest 7 bits: bits[31...25] ++ enum ops7 { ++ lu12i_w_op = 0b0001010, ++ lu32i_d_op = 0b0001011, ++ pcaddi_op = 0b0001100, ++ pcalau12i_op = 0b0001101, ++ pcaddu12i_op = 0b0001110, ++ pcaddu18i_op = 0b0001111, ++ ++ unknow_ops7 = 0b1111111 ++ }; ++ ++ // 6-bit opcode, highest 6 bits: bits[31...25] ++ enum ops6 { ++ addu16i_d_op = 0b000100, ++ beqz_op = 0b010000, ++ bnez_op = 0b010001, ++ bccondz_op = 0b010010, ++ jirl_op = 0b010011, ++ b_op = 0b010100, ++ bl_op = 0b010101, ++ beq_op = 0b010110, ++ bne_op = 0b010111, ++ blt_op = 0b011000, ++ bge_op = 0b011001, ++ bltu_op = 0b011010, ++ bgeu_op = 0b011011, ++ ++ unknow_ops6 = 0b111111 ++ }; ++ ++ enum fcmp_cond { ++ fcmp_caf = 0x00, ++ fcmp_cun = 0x08, ++ fcmp_ceq = 0x04, ++ fcmp_cueq = 0x0c, ++ fcmp_clt = 0x02, ++ fcmp_cult = 0x0a, ++ fcmp_cle = 0x06, ++ fcmp_cule = 0x0e, ++ fcmp_cne = 0x10, ++ fcmp_cor = 0x14, ++ fcmp_cune = 0x18, ++ fcmp_saf = 0x01, ++ fcmp_sun = 0x09, ++ fcmp_seq = 0x05, ++ fcmp_sueq = 0x0d, ++ fcmp_slt = 0x03, ++ fcmp_sult = 0x0b, ++ fcmp_sle = 0x07, ++ fcmp_sule = 0x0f, ++ fcmp_sne = 0x11, ++ fcmp_sor = 0x15, ++ fcmp_sune = 0x19 ++ }; ++ ++ enum Condition { ++ zero , ++ notZero , ++ equal , ++ notEqual , ++ less , ++ lessEqual , ++ greater , ++ greaterEqual , ++ below , ++ belowEqual , ++ above , ++ aboveEqual ++ }; ++ ++ static const int LogInstructionSize = 2; ++ static const int InstructionSize = 1 << LogInstructionSize; ++ ++ enum WhichOperand { ++ // input to locate_operand, and format code for relocations ++ imm_operand = 0, // embedded 32-bit|64-bit immediate 
operand ++ disp32_operand = 1, // embedded 32-bit displacement or address ++ call32_operand = 2, // embedded 32-bit self-relative displacement ++ narrow_oop_operand = 3, // embedded 32-bit immediate narrow oop ++ _WhichOperand_limit = 4 ++ }; ++ ++ static int low (int x, int l) { return bitfield(x, 0, l); } ++ static int low16(int x) { return low(x, 16); } ++ static int low26(int x) { return low(x, 26); } ++ ++ static int high (int x, int l) { return bitfield(x, 32-l, l); } ++ static int high16(int x) { return high(x, 16); } ++ static int high6 (int x) { return high(x, 6); } ++ ++ ++ static ALWAYSINLINE void patch(address a, int length, uint32_t val) { ++ guarantee(val < (1ULL << length), "Field too big for insn"); ++ guarantee(length > 0, "length > 0"); ++ unsigned target = *(unsigned *)a; ++ target = (target >> length) << length; ++ target |= val; ++ *(unsigned *)a = target; ++ } ++ ++ protected: ++ // helper methods for instruction emission ++ ++ // 2R-type ++ // 31 10 9 5 4 0 ++ // | opcode | rj | rd | ++ static inline int insn_RR (int op, int rj, int rd) { return (op<<10) | (rj<<5) | rd; } ++ ++ // 3R-type ++ // 31 15 14 10 9 5 4 0 ++ // | opcode | rk | rj | rd | ++ static inline int insn_RRR (int op, int rk, int rj, int rd) { return (op<<15) | (rk<<10) | (rj<<5) | rd; } ++ ++ // 4R-type ++ // 31 20 19 15 14 10 9 5 4 0 ++ // | opcode | ra | rk | rj | rd | ++ static inline int insn_RRRR (int op, int ra, int rk, int rj, int rd) { return (op<<20) | (ra << 15) | (rk<<10) | (rj<<5) | rd; } ++ ++ // 2RI1-type ++ // 31 11 10 9 5 4 0 ++ // | opcode | I1 | vj | rd | ++ static inline int insn_I1RR (int op, int ui1, int vj, int rd) { assert(is_uimm(ui1, 1), "not a unsigned 1-bit int"); return (op<<11) | (low(ui1, 1)<<10) | (vj<<5) | rd; } ++ ++ // 2RI2-type ++ // 31 12 11 10 9 5 4 0 ++ // | opcode | I2 | vj | rd | ++ static inline int insn_I2RR (int op, int ui2, int vj, int rd) { assert(is_uimm(ui2, 2), "not a unsigned 2-bit int"); return (op<<12) | (low(ui2, 2)<<10) | (vj<<5) | rd; } ++ ++ // 2RI3-type ++ // 31 13 12 10 9 5 4 0 ++ // | opcode | I3 | vj | vd | ++ static inline int insn_I3RR (int op, int ui3, int vj, int vd) { assert(is_uimm(ui3, 3), "not a unsigned 3-bit int"); return (op<<13) | (low(ui3, 3)<<10) | (vj<<5) | vd; } ++ ++ // 2RI4-type ++ // 31 14 13 10 9 5 4 0 ++ // | opcode | I4 | vj | vd | ++ static inline int insn_I4RR (int op, int ui4, int vj, int vd) { assert(is_uimm(ui4, 4), "not a unsigned 4-bit int"); return (op<<14) | (low(ui4, 4)<<10) | (vj<<5) | vd; } ++ ++ // 2RI5-type ++ // 31 15 14 10 9 5 4 0 ++ // | opcode | I5 | vj | vd | ++ static inline int insn_I5RR (int op, int ui5, int vj, int vd) { assert(is_uimm(ui5, 5), "not a unsigned 5-bit int"); return (op<<15) | (low(ui5, 5)<<10) | (vj<<5) | vd; } ++ ++ // 2RI6-type ++ // 31 16 15 10 9 5 4 0 ++ // | opcode | I6 | vj | vd | ++ static inline int insn_I6RR (int op, int ui6, int vj, int vd) { assert(is_uimm(ui6, 6), "not a unsigned 6-bit int"); return (op<<16) | (low(ui6, 6)<<10) | (vj<<5) | vd; } ++ ++ // 2RI7-type ++ // 31 17 16 10 9 5 4 0 ++ // | opcode | I7 | vj | vd | ++ static inline int insn_I7RR (int op, int ui7, int vj, int vd) { assert(is_uimm(ui7, 7), "not a unsigned 7-bit int"); return (op<<17) | (low(ui7, 7)<<10) | (vj<<5) | vd; } ++ ++ // 2RI8-type ++ // 31 18 17 10 9 5 4 0 ++ // | opcode | I8 | rj | rd | ++ static inline int insn_I8RR (int op, int imm8, int rj, int rd) { /*assert(is_simm(imm8, 8), "not a signed 8-bit int");*/ return (op<<18) | (low(imm8, 8)<<10) | (rj<<5) | rd; } ++ ++ // 2RI9-type ++ // 31 19
18 10 9 5 4 0 ++ // | opcode | I9 | rj | vd | ++ static inline int insn_I9RR(int op, int imm9, int rj, int vd) { return (op<<19) | (low(imm9, 9)<<10) | (rj<<5) | vd; } ++ ++ // 2RI10-type ++ // 31 20 19 10 9 5 4 0 ++ // | opcode | I10 | rj | vd | ++ static inline int insn_I10RR(int op, int imm10, int rj, int vd) { return (op<<20) | (low(imm10, 10)<<10) | (rj<<5) | vd; } ++ ++ // 2RI11-type ++ // 31 21 20 10 9 5 4 0 ++ // | opcode | I11 | rj | vd | ++ static inline int insn_I11RR(int op, int imm11, int rj, int vd) { return (op<<21) | (low(imm11, 11)<<10) | (rj<<5) | vd; } ++ ++ // 2RI12-type ++ // 31 22 21 10 9 5 4 0 ++ // | opcode | I12 | rj | rd | ++ static inline int insn_I12RR(int op, int imm12, int rj, int rd) { return (op<<22) | (low(imm12, 12)<<10) | (rj<<5) | rd; } ++ ++ // 2RI14-type ++ // 31 24 23 10 9 5 4 0 ++ // | opcode | I14 | rj | rd | ++ static inline int insn_I14RR(int op, int imm14, int rj, int rd) { assert(is_simm(imm14, 14), "not a signed 14-bit int"); return (op<<24) | (low(imm14, 14)<<10) | (rj<<5) | rd; } ++ ++ // 2RI16-type ++ // 31 26 25 10 9 5 4 0 ++ // | opcode | I16 | rj | rd | ++ static inline int insn_I16RR(int op, int imm16, int rj, int rd) { assert(is_simm16(imm16), "not a signed 16-bit int"); return (op<<26) | (low16(imm16)<<10) | (rj<<5) | rd; } ++ ++ // 1RI13-type (?) ++ // 31 18 17 5 4 0 ++ // | opcode | I13 | vd | ++ static inline int insn_I13R (int op, int imm13, int vd) { assert(is_simm(imm13, 13), "not a signed 13-bit int"); return (op<<18) | (low(imm13, 13)<<5) | vd; } ++ ++ // 1RI20-type (?) ++ // 31 25 24 5 4 0 ++ // | opcode | I20 | rd | ++ static inline int insn_I20R (int op, int imm20, int rd) { assert(is_simm(imm20, 20), "not a signed 20-bit int"); return (op<<25) | (low(imm20, 20)<<5) | rd; } ++ ++ // 1RI21-type ++ // 31 26 25 10 9 5 4 0 ++ // | opcode | I21[15:0] | rj |I21[20:16]| ++ static inline int insn_IRI(int op, int imm21, int rj) { assert(is_simm(imm21, 21), "not a signed 21-bit int"); return (op << 26) | (low16(imm21) << 10) | (rj << 5) | low(imm21 >> 16, 5); } ++ ++ // I26-type ++ // 31 26 25 10 9 0 ++ // | opcode | I26[15:0] | I26[25:16] | ++ static inline int insn_I26(int op, int imm26) { assert(is_simm(imm26, 26), "not a signed 26-bit int"); return (op << 26) | (low16(imm26) << 10) | low(imm26 >> 16, 10); } ++ ++ // imm15 ++ // 31 15 14 0 ++ // | opcode | I15 | ++ static inline int insn_I15 (int op, int imm15) { assert(is_uimm(imm15, 15), "not a unsigned 15-bit int"); return (op<<15) | low(imm15, 15); } ++ ++ ++ // get the offset field of beq, bne, blt[u], bge[u] instruction ++ int offset16(address entry) { ++ assert(is_simm16((entry - pc()) / 4), "change this code"); ++ return (entry - pc()) / 4; ++ } ++ ++ // get the offset field of beqz, bnez instruction ++ int offset21(address entry) { ++ assert(is_simm((int)(entry - pc()) / 4, 21), "change this code"); ++ return (entry - pc()) / 4; ++ } ++ ++ // get the offset field of b instruction ++ int offset26(address entry) { ++ assert(is_simm((int)(entry - pc()) / 4, 26), "change this code"); ++ return (entry - pc()) / 4; ++ } ++ ++public: ++ using AbstractAssembler::offset; ++ ++ // zero/one counting ++ static inline int count_leading_zeros(jint x) { ++ int res; ++ asm ("clz.w %0, %1 \n\t" : "=r"(res) : "r"(x)); ++ return res; ++ } ++ ++ static inline int count_leading_zeros(jlong x) { ++ int res; ++ asm ("clz.d %0, %1 \n\t" : "=r"(res) : "r"(x)); ++ return res; ++ } ++ ++ static inline int count_leading_ones(jint x) { ++ int res; ++ asm ("clo.w %0, %1 \n\t" : "=r"(res) : "r"(x)); ++ 
return res; ++ } ++ ++ static inline int count_leading_ones(jlong x) { ++ int res; ++ asm ("clo.d %0, %1 \n\t" : "=r"(res) : "r"(x)); ++ return res; ++ } ++ ++ static inline int count_trailing_ones(jint x) { ++ int res; ++ asm ("cto.w %0, %1 \n\t" : "=r"(res) : "r"(x)); ++ return res; ++ } ++ ++ static inline int count_trailing_ones(jlong x) { ++ int res; ++ asm ("cto.d %0, %1 \n\t" : "=r"(res) : "r"(x)); ++ return res; ++ } ++ ++ static inline jint rotate_right(jint x, int s) { ++ jint res; ++ asm ("rotr.w %0, %1, %2 \n\t" : "=r"(res) : "r"(x), "r"(s)); ++ return res; ++ } ++ ++ static inline jlong rotate_right(jlong x, int s) { ++ jlong res; ++ asm ("rotr.d %0, %1, %2 \n\t" : "=r"(res) : "r"(x), "r"(s)); ++ return res; ++ } ++ ++ // sign-extend x, where h is the index of the sign bit ++ static int expand(int x, int h) { return -(x & (1<<h)) | x; } ++ ++ // If x is a non-negative mask, return true else return false. ++ template <typename T> ++ static bool is_nonneg_mask(T x) { ++ return is_power_of_2(x + 1); ++ } ++ ++ // If x is a zero-insertion mask, return true else return false. ++ template <typename T> ++ static bool is_zeroins_mask(T x) { ++ const int xbits = sizeof(x) * BitsPerByte; ++ int a, b, c; ++ ++ // regexp: 1+0+1* ++ a = count_leading_ones(x); ++ b = count_leading_zeros(rotate_right(x, xbits - a)); ++ c = count_trailing_ones(x); ++ if ((a > 0) && (b > 0) && (a + b + c) == xbits) ++ return true; ++ ++ return false; ++ } ++ ++ // If x is a vector loadable immediate, return true else return false. ++ static bool is_vec_imm(float x); ++ static bool is_vec_imm(double x); ++ // Return the encoded value. ++ static int get_vec_imm(float x); ++ static int get_vec_imm(double x); ++ ++ static int split_low16(int x) { ++ return (x & 0xffff); ++ } ++ ++ // Convert 16-bit x to a sign-extended 16-bit integer ++ static int simm16(int x) { ++ assert(x == (x & 0xFFFF), "must be 16-bit only"); ++ return (x << 16) >> 16; ++ } ++ ++ static int split_high16(int x) { ++ return ( (x >> 16) + ((x & 0x8000) != 0) ) & 0xffff; ++ } ++ ++ static int split_low20(int x) { ++ return (x & 0xfffff); ++ } ++ ++ // Convert 20-bit x to a sign-extended 20-bit integer ++ static int simm20(int x) { ++ assert(x == (x & 0xFFFFF), "must be 20-bit only"); ++ return (x << 12) >> 12; ++ } ++ ++ static int split_low12(int x) { ++ return (x & 0xfff); ++ } ++ ++ static inline void split_simm32(jlong si32, jint& si12, jint& si20) { ++ si12 = ((jint)(si32 & 0xfff) << 20) >> 20; ++ si32 += (si32 & 0x800) << 1; ++ si20 = si32 >> 12; ++ } ++ ++ static inline void split_simm38(jlong si38, jint& si18, jint& si20) { ++ si18 = ((jint)(si38 & 0x3ffff) << 14) >> 14; ++ si38 += (si38 & 0x20000) << 1; ++ si20 = si38 >> 18; ++ } ++ ++ // Convert 12-bit x to a sign-extended 12-bit integer ++ static int simm12(int x) { ++ assert(x == (x & 0xFFF), "must be 12-bit only"); ++ return (x << 20) >> 20; ++ } ++ ++ // Convert 13-bit x to a sign-extended 13-bit integer ++ static int simm13(int x) { ++ assert(x == (x & 0x1FFF), "must be 13-bit only"); ++ return (x << 19) >> 19; ++ } ++ ++ // Convert 26-bit x to a sign-extended 26-bit integer ++ static int simm26(int x) { ++ assert(x == (x & 0x3FFFFFF), "must be 26-bit only"); ++ return (x << 6) >> 6; ++ } ++ ++ static intptr_t merge(intptr_t x0, intptr_t x12) { ++ //lu12i, ori ++ return (((x12 << 12) | x0) << 32) >> 32; ++ } ++ ++ static intptr_t merge(intptr_t x0, intptr_t x12, intptr_t x32) { ++ //lu32i, lu12i, ori ++ return (((x32 << 32) | (x12 << 12) | x0) << 12) >> 12; ++ } ++ ++ static intptr_t merge(intptr_t x0, intptr_t x12, intptr_t x32, intptr_t x52) { ++ //lu52i, lu32i, lu12i, ori ++ return (x52 << 52) | (x32 << 32) |
(x12 << 12) | x0; ++ } ++ ++ // Test if x is within signed immediate range for nbits. ++ static bool is_simm (int x, unsigned int nbits) { ++ assert(0 < nbits && nbits < 32, "out of bounds"); ++ const int min = -( ((int)1) << nbits-1 ); ++ const int maxplus1 = ( ((int)1) << nbits-1 ); ++ return min <= x && x < maxplus1; ++ } ++ ++ static bool is_simm(jlong x, unsigned int nbits) { ++ assert(0 < nbits && nbits < 64, "out of bounds"); ++ const jlong min = -( ((jlong)1) << nbits-1 ); ++ const jlong maxplus1 = ( ((jlong)1) << nbits-1 ); ++ return min <= x && x < maxplus1; ++ } ++ ++ static bool is_simm16(int x) { return is_simm(x, 16); } ++ static bool is_simm16(long x) { return is_simm((jlong)x, (unsigned int)16); } ++ ++ // Test if x is within unsigned immediate range for nbits ++ static bool is_uimm(int x, unsigned int nbits) { ++ assert(0 < nbits && nbits < 32, "out of bounds"); ++ const int maxplus1 = ( ((int)1) << nbits ); ++ return 0 <= x && x < maxplus1; ++ } ++ ++ static bool is_uimm(jlong x, unsigned int nbits) { ++ assert(0 < nbits && nbits < 64, "out of bounds"); ++ const jlong maxplus1 = ( ((jlong)1) << nbits ); ++ return 0 <= x && x < maxplus1; ++ } ++ ++public: ++ ++ void flush() { ++ AbstractAssembler::flush(); ++ } ++ ++ inline void emit_int32(int x) { ++ AbstractAssembler::emit_int32(x); ++ } ++ ++ inline void emit_data(int x) { emit_int32(x); } ++ inline void emit_data(int x, relocInfo::relocType rtype) { ++ relocate(rtype); ++ emit_int32(x); ++ } ++ ++ inline void emit_data(int x, RelocationHolder const& rspec) { ++ relocate(rspec); ++ emit_int32(x); ++ } ++ ++ inline void emit_data64(jlong data, relocInfo::relocType rtype, int format = 0) { ++ if (rtype == relocInfo::none) { ++ emit_int64(data); ++ } else { ++ emit_data64(data, Relocation::spec_simple(rtype), format); ++ } ++ } ++ ++ inline void emit_data64(jlong data, RelocationHolder const& rspec, int format = 0) { ++ assert(inst_mark() != nullptr, "must be inside InstructionMark"); ++ // Do not use AbstractAssembler::relocate, which is not intended for ++ // embedded words. Instead, relocate to the enclosing instruction. ++ code_section()->relocate(inst_mark(), rspec, format); ++ emit_int64(data); ++ } ++ ++ //---< calculate length of instruction >--- ++ // With LoongArch being a RISC architecture, this always is BytesPerInstWord ++ // instruction must start at passed address ++ static unsigned int instr_len(unsigned char *instr) { return BytesPerInstWord; } ++ ++ //---< longest instructions >--- ++ static unsigned int instr_maxlen() { return BytesPerInstWord; } ++ ++ ++ // Generic instructions ++ // Does 32bit or 64bit as needed for the platform. 
In some sense these ++ // belong in macro assembler but there is no need for both varieties to exist ++ ++ void clo_w (Register rd, Register rj) { emit_int32(insn_RR(clo_w_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void clz_w (Register rd, Register rj) { emit_int32(insn_RR(clz_w_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void cto_w (Register rd, Register rj) { emit_int32(insn_RR(cto_w_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void ctz_w (Register rd, Register rj) { emit_int32(insn_RR(ctz_w_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void clo_d (Register rd, Register rj) { emit_int32(insn_RR(clo_d_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void clz_d (Register rd, Register rj) { emit_int32(insn_RR(clz_d_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void cto_d (Register rd, Register rj) { emit_int32(insn_RR(cto_d_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void ctz_d (Register rd, Register rj) { emit_int32(insn_RR(ctz_d_op, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void revb_2h(Register rd, Register rj) { emit_int32(insn_RR(revb_2h_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void revb_4h(Register rd, Register rj) { emit_int32(insn_RR(revb_4h_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void revb_2w(Register rd, Register rj) { emit_int32(insn_RR(revb_2w_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void revb_d (Register rd, Register rj) { emit_int32(insn_RR( revb_d_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void revh_2w(Register rd, Register rj) { emit_int32(insn_RR(revh_2w_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void revh_d (Register rd, Register rj) { emit_int32(insn_RR( revh_d_op, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void bitrev_4b(Register rd, Register rj) { emit_int32(insn_RR(bitrev_4b_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void bitrev_8b(Register rd, Register rj) { emit_int32(insn_RR(bitrev_8b_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void bitrev_w (Register rd, Register rj) { emit_int32(insn_RR(bitrev_w_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void bitrev_d (Register rd, Register rj) { emit_int32(insn_RR(bitrev_d_op, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void ext_w_h(Register rd, Register rj) { emit_int32(insn_RR(ext_w_h_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void ext_w_b(Register rd, Register rj) { emit_int32(insn_RR(ext_w_b_op, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void rdtimel_w(Register rd, Register rj) { emit_int32(insn_RR(rdtimel_w_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void rdtimeh_w(Register rd, Register rj) { emit_int32(insn_RR(rdtimeh_w_op, (int)rj->encoding(), (int)rd->encoding())); } ++ void rdtime_d(Register rd, Register rj) { emit_int32(insn_RR(rdtime_d_op, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void cpucfg(Register rd, Register rj) { emit_int32(insn_RR(cpucfg_op, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void asrtle_d (Register rj, Register rk) { emit_int32(insn_RRR(asrtle_d_op , (int)rk->encoding(), (int)rj->encoding(), 0)); } ++ void asrtgt_d (Register rj, Register rk) { emit_int32(insn_RRR(asrtgt_d_op , (int)rk->encoding(), (int)rj->encoding(), 0)); } ++ ++ void alsl_w(Register rd, Register rj, Register rk, int sa2) { assert(is_uimm(sa2, 2), "not a unsigned 2-bit int"); emit_int32(insn_I8RR(alsl_w_op, ( (0 << 7) | (sa2 << 5) | (int)rk->encoding() ), (int)rj->encoding(), (int)rd->encoding())); } ++ void alsl_wu(Register 
rd, Register rj, Register rk, int sa2) { assert(is_uimm(sa2, 2), "not a unsigned 2-bit int"); emit_int32(insn_I8RR(alsl_w_op, ( (1 << 7) | (sa2 << 5) | (int)rk->encoding() ), (int)rj->encoding(), (int)rd->encoding())); } ++ void bytepick_w(Register rd, Register rj, Register rk, int sa2) { assert(is_uimm(sa2, 2), "not a unsigned 2-bit int"); emit_int32(insn_I8RR(bytepick_w_op, ( (0 << 7) | (sa2 << 5) | (int)rk->encoding() ), (int)rj->encoding(), (int)rd->encoding())); } ++ void bytepick_d(Register rd, Register rj, Register rk, int sa3) { assert(is_uimm(sa3, 3), "not a unsigned 3-bit int"); emit_int32(insn_I8RR(bytepick_d_op, ( (sa3 << 5) | (int)rk->encoding() ), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void add_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(add_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void add_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(add_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void sub_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(sub_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void sub_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(sub_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void slt (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(slt_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void sltu (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(sltu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void maskeqz (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(maskeqz_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void masknez (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(masknez_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void nor (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(nor_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void AND (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(and_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void OR (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(or_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void XOR (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(xor_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void orn (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(orn_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void andn(Register rd, Register rj, Register rk) { emit_int32(insn_RRR(andn_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void sll_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(sll_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void srl_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(srl_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void sra_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(sra_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void sll_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(sll_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void srl_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(srl_d_op, 
(int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void sra_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(sra_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void rotr_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(rotr_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void rotr_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(rotr_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void mul_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mul_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mulh_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mulh_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mulh_wu (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mulh_wu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mul_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mul_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mulh_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mulh_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mulh_du (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mulh_du_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mulw_d_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mulw_d_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mulw_d_wu (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mulw_d_wu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void div_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(div_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mod_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mod_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void div_wu(Register rd, Register rj, Register rk) { emit_int32(insn_RRR(div_wu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mod_wu(Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mod_wu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void div_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(div_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mod_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mod_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void div_du(Register rd, Register rj, Register rk) { emit_int32(insn_RRR(div_du_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void mod_du(Register rd, Register rj, Register rk) { emit_int32(insn_RRR(mod_du_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void crc_w_b_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(crc_w_b_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void crc_w_h_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(crc_w_h_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void crc_w_w_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(crc_w_w_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void crc_w_d_w (Register rd, Register rj, 
Register rk) { emit_int32(insn_RRR(crc_w_d_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void crcc_w_b_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(crcc_w_b_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void crcc_w_h_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(crcc_w_h_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void crcc_w_w_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(crcc_w_w_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void crcc_w_d_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(crcc_w_d_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void brk(int code) { assert(is_uimm(code, 15), "not a unsigned 15-bit int"); emit_int32(insn_I15(break_op, code)); } ++ ++ void alsl_d(Register rd, Register rj, Register rk, int sa2) { assert(is_uimm(sa2, 2), "not a unsigned 2-bit int"); emit_int32(insn_I8RR(alsl_d_op, ( (sa2 << 5) | (int)rk->encoding() ), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void slli_w(Register rd, Register rj, int ui5) { assert(is_uimm(ui5, 5), "not a unsigned 5-bit int"); emit_int32(insn_I8RR(slli_op, ( (0b001 << 5) | ui5 ), (int)rj->encoding(), (int)rd->encoding())); } ++ void slli_d(Register rd, Register rj, int ui6) { assert(is_uimm(ui6, 6), "not a unsigned 6-bit int"); emit_int32(insn_I8RR(slli_op, ( (0b01 << 6) | ui6 ), (int)rj->encoding(), (int)rd->encoding())); } ++ void srli_w(Register rd, Register rj, int ui5) { assert(is_uimm(ui5, 5), "not a unsigned 5-bit int"); emit_int32(insn_I8RR(srli_op, ( (0b001 << 5) | ui5 ), (int)rj->encoding(), (int)rd->encoding())); } ++ void srli_d(Register rd, Register rj, int ui6) { assert(is_uimm(ui6, 6), "not a unsigned 6-bit int"); emit_int32(insn_I8RR(srli_op, ( (0b01 << 6) | ui6 ), (int)rj->encoding(), (int)rd->encoding())); } ++ void srai_w(Register rd, Register rj, int ui5) { assert(is_uimm(ui5, 5), "not a unsigned 5-bit int"); emit_int32(insn_I8RR(srai_op, ( (0b001 << 5) | ui5 ), (int)rj->encoding(), (int)rd->encoding())); } ++ void srai_d(Register rd, Register rj, int ui6) { assert(is_uimm(ui6, 6), "not a unsigned 6-bit int"); emit_int32(insn_I8RR(srai_op, ( (0b01 << 6) | ui6 ), (int)rj->encoding(), (int)rd->encoding())); } ++ void rotri_w(Register rd, Register rj, int ui5) { assert(is_uimm(ui5, 5), "not a unsigned 5-bit int"); emit_int32(insn_I8RR(rotri_op, ( (0b001 << 5) | ui5 ), (int)rj->encoding(), (int)rd->encoding())); } ++ void rotri_d(Register rd, Register rj, int ui6) { assert(is_uimm(ui6, 6), "not a unsigned 6-bit int"); emit_int32(insn_I8RR(rotri_op, ( (0b01 << 6) | ui6 ), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void bstrins_w (Register rd, Register rj, int msbw, int lsbw) { assert(is_uimm(msbw, 5) && is_uimm(lsbw, 5), "not a unsigned 5-bit int"); emit_int32(insn_I12RR(bstr_w_op, ( (1<<11) | (low(msbw, 5)<<6) | (0<<5) | low(lsbw, 5) ), (int)rj->encoding(), (int)rd->encoding())); } ++ void bstrpick_w (Register rd, Register rj, int msbw, int lsbw) { assert(is_uimm(msbw, 5) && is_uimm(lsbw, 5), "not a unsigned 5-bit int"); emit_int32(insn_I12RR(bstr_w_op, ( (1<<11) | (low(msbw, 5)<<6) | (1<<5) | low(lsbw, 5) ), (int)rj->encoding(), (int)rd->encoding())); } ++ void bstrins_d (Register rd, Register rj, int msbd, int lsbd) { assert(is_uimm(msbd, 6) && is_uimm(lsbd, 6), "not a unsigned 6-bit int"); emit_int32(insn_I12RR(bstrins_d_op, ( (low(msbd, 6)<<6) | low(lsbd, 6) ), 
(int)rj->encoding(), (int)rd->encoding())); } ++ void bstrpick_d (Register rd, Register rj, int msbd, int lsbd) { assert(is_uimm(msbd, 6) && is_uimm(lsbd, 6), "not a unsigned 6-bit int"); emit_int32(insn_I12RR(bstrpick_d_op, ( (low(msbd, 6)<<6) | low(lsbd, 6) ), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void fadd_s (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fadd_s_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fadd_d (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fadd_d_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fsub_s (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fsub_s_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fsub_d (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fsub_d_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmul_s (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fmul_s_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmul_d (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fmul_d_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fdiv_s (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fdiv_s_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fdiv_d (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fdiv_d_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmax_s (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fmax_s_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmax_d (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fmax_d_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmin_s (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fmin_s_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmin_d (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fmin_d_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmaxa_s (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fmaxa_s_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmaxa_d (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fmaxa_d_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmina_s (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fmina_s_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmina_d (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fmina_d_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ ++ void fscaleb_s (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fscaleb_s_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fscaleb_d (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fscaleb_d_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fcopysign_s (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fcopysign_s_op, (int)fk->encoding(), (int)fj->encoding(), 
(int)fd->encoding())); }
++ void fcopysign_d (FloatRegister fd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRR(fcopysign_d_op, (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); }
++
++ void fabs_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fabs_s_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void fabs_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fabs_d_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void fneg_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fneg_s_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void fneg_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fneg_d_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void flogb_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(flogb_s_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void flogb_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(flogb_d_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void fclass_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fclass_s_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void fclass_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fclass_d_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void fsqrt_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fsqrt_s_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void fsqrt_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fsqrt_d_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void frecip_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(frecip_s_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void frecip_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(frecip_d_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void frsqrt_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(frsqrt_s_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void frsqrt_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(frsqrt_d_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void fmov_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fmov_s_op, (int)fj->encoding(), (int)fd->encoding())); }
++ void fmov_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fmov_d_op, (int)fj->encoding(), (int)fd->encoding())); }
++
++ void movgr2fr_w (FloatRegister fd, Register rj) { emit_int32(insn_RR(movgr2fr_w_op, (int)rj->encoding(), (int)fd->encoding())); }
++ void movgr2fr_d (FloatRegister fd, Register rj) { emit_int32(insn_RR(movgr2fr_d_op, (int)rj->encoding(), (int)fd->encoding())); }
++ void movgr2frh_w(FloatRegister fd, Register rj) { emit_int32(insn_RR(movgr2frh_w_op, (int)rj->encoding(), (int)fd->encoding())); }
++ void movfr2gr_s (Register rd, FloatRegister fj) { emit_int32(insn_RR(movfr2gr_s_op, (int)fj->encoding(), (int)rd->encoding())); }
++ void movfr2gr_d (Register rd, FloatRegister fj) { emit_int32(insn_RR(movfr2gr_d_op, (int)fj->encoding(), (int)rd->encoding())); }
++ void movfrh2gr_s(Register rd, FloatRegister fj) { emit_int32(insn_RR(movfrh2gr_s_op, (int)fj->encoding(), (int)rd->encoding())); }
++ void movgr2fcsr (int fcsr, Register rj) { assert(is_uimm(fcsr, 2), "not an unsigned 2-bit int: fcsr0-fcsr3"); emit_int32(insn_RR(movgr2fcsr_op, (int)rj->encoding(), fcsr)); }
++ void movfcsr2gr (Register rd, int fcsr) { assert(is_uimm(fcsr, 2), "not an unsigned 2-bit int: fcsr0-fcsr3"); emit_int32(insn_RR(movfcsr2gr_op, fcsr, (int)rd->encoding())); }
++ void movfr2cf (ConditionalFlagRegister cd, FloatRegister fj) {
emit_int32(insn_RR(movfr2cf_op, (int)fj->encoding(), (int)cd->encoding())); } ++ void movcf2fr (FloatRegister fd, ConditionalFlagRegister cj) { emit_int32(insn_RR(movcf2fr_op, (int)cj->encoding(), (int)fd->encoding())); } ++ void movgr2cf (ConditionalFlagRegister cd, Register rj) { emit_int32(insn_RR(movgr2cf_op, (int)rj->encoding(), (int)cd->encoding())); } ++ void movcf2gr (Register rd, ConditionalFlagRegister cj) { emit_int32(insn_RR(movcf2gr_op, (int)cj->encoding(), (int)rd->encoding())); } ++ ++ void fcvt_s_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fcvt_s_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void fcvt_d_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(fcvt_d_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ ++ void ftintrm_w_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrm_w_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrm_w_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrm_w_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrm_l_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrm_l_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrm_l_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrm_l_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrp_w_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrp_w_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrp_w_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrp_w_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrp_l_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrp_l_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrp_l_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrp_l_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrz_w_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrz_w_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrz_w_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrz_w_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrz_l_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrz_l_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrz_l_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrz_l_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrne_w_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrne_w_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrne_w_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrne_w_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrne_l_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrne_l_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftintrne_l_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftintrne_l_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftint_w_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftint_w_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftint_w_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftint_w_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftint_l_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftint_l_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ftint_l_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ftint_l_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void 
ffint_s_w(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ffint_s_w_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ffint_s_l(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ffint_s_l_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ffint_d_w(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ffint_d_w_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void ffint_d_l(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(ffint_d_l_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void frint_s(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(frint_s_op, (int)fj->encoding(), (int)fd->encoding())); } ++ void frint_d(FloatRegister fd, FloatRegister fj) { emit_int32(insn_RR(frint_d_op, (int)fj->encoding(), (int)fd->encoding())); } ++ ++ void slti (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(slti_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void sltui (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(sltui_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void addi_w(Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(addi_w_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void addi_d(Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(addi_d_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void lu52i_d(Register rd, Register rj, int si12) { /*assert(is_simm(si12, 12), "not a signed 12-bit int");*/ emit_int32(insn_I12RR(lu52i_d_op, simm12(si12), (int)rj->encoding(), (int)rd->encoding())); } ++ void andi (Register rd, Register rj, int ui12) { assert(is_uimm(ui12, 12), "not a unsigned 12-bit int"); emit_int32(insn_I12RR(andi_op, ui12, (int)rj->encoding(), (int)rd->encoding())); } ++ void ori (Register rd, Register rj, int ui12) { assert(is_uimm(ui12, 12), "not a unsigned 12-bit int"); emit_int32(insn_I12RR(ori_op, ui12, (int)rj->encoding(), (int)rd->encoding())); } ++ void xori (Register rd, Register rj, int ui12) { assert(is_uimm(ui12, 12), "not a unsigned 12-bit int"); emit_int32(insn_I12RR(xori_op, ui12, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void fmadd_s (FloatRegister fd, FloatRegister fj, FloatRegister fk, FloatRegister fa) { emit_int32(insn_RRRR(fmadd_s_op , (int)fa->encoding(), (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmadd_d (FloatRegister fd, FloatRegister fj, FloatRegister fk, FloatRegister fa) { emit_int32(insn_RRRR(fmadd_d_op , (int)fa->encoding(), (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmsub_s (FloatRegister fd, FloatRegister fj, FloatRegister fk, FloatRegister fa) { emit_int32(insn_RRRR(fmsub_s_op , (int)fa->encoding(), (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fmsub_d (FloatRegister fd, FloatRegister fj, FloatRegister fk, FloatRegister fa) { emit_int32(insn_RRRR(fmsub_d_op , (int)fa->encoding(), (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fnmadd_s (FloatRegister fd, FloatRegister fj, FloatRegister fk, FloatRegister fa) { emit_int32(insn_RRRR(fnmadd_s_op , (int)fa->encoding(), (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fnmadd_d (FloatRegister fd, FloatRegister fj, FloatRegister fk, FloatRegister fa) { emit_int32(insn_RRRR(fnmadd_d_op , 
(int)fa->encoding(), (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fnmsub_s (FloatRegister fd, FloatRegister fj, FloatRegister fk, FloatRegister fa) { emit_int32(insn_RRRR(fnmsub_s_op , (int)fa->encoding(), (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ void fnmsub_d (FloatRegister fd, FloatRegister fj, FloatRegister fk, FloatRegister fa) { emit_int32(insn_RRRR(fnmsub_d_op , (int)fa->encoding(), (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ ++ void fcmp_caf_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_caf, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cun_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_cun , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_ceq_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_ceq , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cueq_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_cueq, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_clt_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_clt , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cult_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_cult, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cle_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_cle , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cule_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_cule, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cne_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_cne , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cor_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_cor , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cune_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_cune, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_saf_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_saf , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sun_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_sun , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_seq_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_seq , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sueq_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_sueq, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void 
fcmp_slt_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_slt , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sult_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_sult, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sle_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_sle , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sule_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_sule, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sne_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_sne , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sor_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_sor , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sune_s (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_s_op, fcmp_sune, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ ++ void fcmp_caf_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_caf, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cun_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_cun , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_ceq_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_ceq , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cueq_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_cueq, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_clt_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_clt , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cult_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_cult, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cle_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_cle , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cule_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_cule, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cne_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_cne , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cor_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_cor , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_cune_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_cune, 
(int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_saf_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_saf , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sun_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_sun , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_seq_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_seq , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sueq_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_sueq, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_slt_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_slt , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sult_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_sult, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sle_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_sle , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sule_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_sule, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sne_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_sne , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sor_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_sor , (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ void fcmp_sune_d (ConditionalFlagRegister cd, FloatRegister fj, FloatRegister fk) { emit_int32(insn_RRRR(fcmp_cond_d_op, fcmp_sune, (int)fk->encoding(), (int)fj->encoding(), (int)cd->encoding())); } ++ ++ void fsel (FloatRegister fd, FloatRegister fj, FloatRegister fk, ConditionalFlagRegister ca) { emit_int32(insn_RRRR(fsel_op, (int)ca->encoding(), (int)fk->encoding(), (int)fj->encoding(), (int)fd->encoding())); } ++ ++ void addu16i_d(Register rj, Register rd, int si16) { assert(is_simm(si16, 16), "not a signed 16-bit int"); emit_int32(insn_I16RR(addu16i_d_op, si16, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void lu12i_w(Register rj, int si20) { /*assert(is_simm(si20, 20), "not a signed 20-bit int");*/ emit_int32(insn_I20R(lu12i_w_op, simm20(si20), (int)rj->encoding())); } ++ void lu32i_d(Register rj, int si20) { /*assert(is_simm(si20, 20), "not a signed 20-bit int");*/ emit_int32(insn_I20R(lu32i_d_op, simm20(si20), (int)rj->encoding())); } ++ void pcaddi(Register rj, int si20) { assert(is_simm(si20, 20), "not a signed 20-bit int"); emit_int32(insn_I20R(pcaddi_op, si20, (int)rj->encoding())); } ++ void pcalau12i(Register rj, int si20) { assert(is_simm(si20, 20), "not a signed 20-bit int"); emit_int32(insn_I20R(pcalau12i_op, si20, (int)rj->encoding())); } ++ void pcaddu12i(Register rj, int si20) { assert(is_simm(si20, 20), "not a signed 20-bit int"); emit_int32(insn_I20R(pcaddu12i_op, si20, (int)rj->encoding())); } ++ void 
pcaddu18i(Register rj, int si20) { assert(is_simm(si20, 20), "not a signed 20-bit int"); emit_int32(insn_I20R(pcaddu18i_op, si20, (int)rj->encoding())); } ++ ++ void ll_w (Register rd, Register rj, int si16) { assert(is_simm(si16, 16) && ((si16 & 0x3) == 0), "not a signed 16-bit int"); emit_int32(insn_I14RR(ll_w_op, si16>>2, (int)rj->encoding(), (int)rd->encoding())); } ++ void sc_w (Register rd, Register rj, int si16) { assert(is_simm(si16, 16) && ((si16 & 0x3) == 0), "not a signed 16-bit int"); emit_int32(insn_I14RR(sc_w_op, si16>>2, (int)rj->encoding(), (int)rd->encoding())); } ++ void ll_d (Register rd, Register rj, int si16) { assert(is_simm(si16, 16) && ((si16 & 0x3) == 0), "not a signed 16-bit int"); emit_int32(insn_I14RR(ll_d_op, si16>>2, (int)rj->encoding(), (int)rd->encoding())); } ++ void sc_d (Register rd, Register rj, int si16) { assert(is_simm(si16, 16) && ((si16 & 0x3) == 0), "not a signed 16-bit int"); emit_int32(insn_I14RR(sc_d_op, si16>>2, (int)rj->encoding(), (int)rd->encoding())); } ++ void ldptr_w (Register rd, Register rj, int si16) { assert(is_simm(si16, 16) && ((si16 & 0x3) == 0), "not a signed 16-bit int"); emit_int32(insn_I14RR(ldptr_w_op, si16>>2, (int)rj->encoding(), (int)rd->encoding())); } ++ void stptr_w (Register rd, Register rj, int si16) { assert(is_simm(si16, 16) && ((si16 & 0x3) == 0), "not a signed 16-bit int"); emit_int32(insn_I14RR(stptr_w_op, si16>>2, (int)rj->encoding(), (int)rd->encoding())); } ++ void ldptr_d (Register rd, Register rj, int si16) { assert(is_simm(si16, 16) && ((si16 & 0x3) == 0), "not a signed 16-bit int"); emit_int32(insn_I14RR(ldptr_d_op, si16>>2, (int)rj->encoding(), (int)rd->encoding())); } ++ void stptr_d (Register rd, Register rj, int si16) { assert(is_simm(si16, 16) && ((si16 & 0x3) == 0), "not a signed 16-bit int"); emit_int32(insn_I14RR(stptr_d_op, si16>>2, (int)rj->encoding(), (int)rd->encoding())); } ++ void csrrd (Register rd, int csr) { emit_int32(insn_I14RR(csr_op, csr, 0, (int)rd->encoding())); } ++ void csrwr (Register rd, int csr) { emit_int32(insn_I14RR(csr_op, csr, 1, (int)rd->encoding())); } ++ ++ void ld_b (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(ld_b_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void ld_h (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(ld_h_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void ld_w (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(ld_w_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void ld_d (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(ld_d_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void st_b (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(st_b_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void st_h (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(st_h_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void st_w (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(st_w_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void st_d (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); 
emit_int32(insn_I12RR(st_d_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void ld_bu (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(ld_bu_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void ld_hu (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(ld_hu_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void ld_wu (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(ld_wu_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void preld (int hint, Register rj, int si12) { assert(is_uimm(hint, 5), "not a unsigned 5-bit int"); assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(preld_op, si12, (int)rj->encoding(), hint)); } ++ void fld_s (FloatRegister fd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(fld_s_op, si12, (int)rj->encoding(), (int)fd->encoding())); } ++ void fst_s (FloatRegister fd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(fst_s_op, si12, (int)rj->encoding(), (int)fd->encoding())); } ++ void fld_d (FloatRegister fd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(fld_d_op, si12, (int)rj->encoding(), (int)fd->encoding())); } ++ void fst_d (FloatRegister fd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(fst_d_op, si12, (int)rj->encoding(), (int)fd->encoding())); } ++ void ldl_w (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(ldl_w_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ void ldr_w (Register rd, Register rj, int si12) { assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(ldr_w_op, si12, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void ldx_b (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldx_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ldx_h (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldx_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ldx_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldx_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ldx_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldx_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void stx_b (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stx_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void stx_h (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stx_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void stx_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stx_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void stx_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stx_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ldx_bu (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldx_bu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ldx_hu (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldx_hu_op, (int)rk->encoding(), 
(int)rj->encoding(), (int)rd->encoding())); } ++ void ldx_wu (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldx_wu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void fldx_s (FloatRegister fd, Register rj, Register rk) { emit_int32(insn_RRR(fldx_s_op, (int)rk->encoding(), (int)rj->encoding(), (int)fd->encoding())); } ++ void fldx_d (FloatRegister fd, Register rj, Register rk) { emit_int32(insn_RRR(fldx_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)fd->encoding())); } ++ void fstx_s (FloatRegister fd, Register rj, Register rk) { emit_int32(insn_RRR(fstx_s_op, (int)rk->encoding(), (int)rj->encoding(), (int)fd->encoding())); } ++ void fstx_d (FloatRegister fd, Register rj, Register rk) { emit_int32(insn_RRR(fstx_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)fd->encoding())); } ++ ++ void ld_b (Register rd, const Address &src); ++ void ld_bu (Register rd, const Address &src); ++ void ld_d (Register rd, const Address &src); ++ void ld_h (Register rd, const Address &src); ++ void ld_hu (Register rd, const Address &src); ++ void ll_w (Register rd, const Address &src); ++ void ll_d (Register rd, const Address &src); ++ void ld_wu (Register rd, const Address &src); ++ void ld_w (Register rd, const Address &src); ++ void st_b (Register rd, const Address &dst); ++ void st_d (Register rd, const Address &dst); ++ void st_w (Register rd, const Address &dst); ++ void sc_w (Register rd, const Address &dst); ++ void sc_d (Register rd, const Address &dst); ++ void st_h (Register rd, const Address &dst); ++ void fld_s (FloatRegister fd, const Address &src); ++ void fld_d (FloatRegister fd, const Address &src); ++ void fst_s (FloatRegister fd, const Address &dst); ++ void fst_d (FloatRegister fd, const Address &dst); ++ ++ void amcas_b (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amcas_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amcas_h (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amcas_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amcas_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amcas_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amcas_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amcas_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amswap_b (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amswap_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amswap_h (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amswap_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amswap_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amswap_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amswap_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); 
emit_int32(insn_RRR(amswap_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amadd_b (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amadd_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amadd_h (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amadd_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amadd_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amadd_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amadd_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amadd_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amand_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amand_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amand_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amand_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amor_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amor_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amor_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amor_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amxor_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amxor_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amxor_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amxor_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ammax_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammax_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ammax_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammax_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ammin_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammin_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ammin_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammin_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ammax_wu (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammax_wu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ammax_du 
(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammax_du_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ammin_wu (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammin_wu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void ammin_du (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammin_du_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amcas_db_b(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amcas_db_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amcas_db_h(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amcas_db_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amcas_db_w(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amcas_db_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amcas_db_d(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amcas_db_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amswap_db_b(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amswap_db_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amswap_db_h(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amswap_db_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amswap_db_w(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amswap_db_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amswap_db_d(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amswap_db_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amadd_db_b (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amadd_db_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amadd_db_h (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amadd_db_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amadd_db_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amadd_db_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amadd_db_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amadd_db_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ void amand_db_w (Register rd, Register rk, Register rj) 
{ assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amand_db_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void amand_db_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amand_db_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void amor_db_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amor_db_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void amor_db_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amor_db_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void amxor_db_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amxor_db_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void amxor_db_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(amxor_db_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ammax_db_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammax_db_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ammax_db_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammax_db_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ammin_db_w (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammin_db_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ammin_db_d (Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammin_db_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ammax_db_wu(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammax_db_wu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ammax_db_du(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammax_db_du_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ammin_db_wu(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammin_db_wu_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ammin_db_du(Register rd, Register rk, Register rj) { assert_different_registers(rd, rj); assert_different_registers(rd, rk); emit_int32(insn_RRR(ammin_db_du_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++
++ // Note: when UseActiveCoresMP is set, the data barrier is emitted as a no-op (andi R0, R0, 0) instead of a real dbar.
++ void dbar(int hint) {
++   assert(is_uimm(hint, 15), "not an unsigned 15-bit int");
++
++   if (UseActiveCoresMP)
++     andi(R0, R0, 0);
++   else
++     emit_int32(insn_I15(dbar_op, hint));
++ }
++ void ibar(int hint) { assert(is_uimm(hint, 15), "not an unsigned 15-bit int"); emit_int32(insn_I15(ibar_op, hint)); }
++
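++ // Usage sketch (illustrative only): the raw emitters above compose into the usual LL/SC
++ // retry loop for an atomic read-modify-write; the AM* forms do the same in one instruction
++ // but require the destination to differ from both the address and the operand register
++ // (hence the assert_different_registers checks). Register names AT/A0/A1 and bind() are
++ // assumed from the enclosing HotSpot assembler context, not defined here.
++ //
++ //   Label retry;
++ //   bind(retry);
++ //   ll_d(AT, A0, 0);       // load-linked the current 64-bit value at [A0]
++ //   add_d(AT, AT, A1);     // compute the updated value
++ //   sc_d(AT, A0, 0);       // store-conditional: AT <- 1 on success, 0 on failure
++ //   beq(AT, R0, retry);    // retry until the store-conditional succeeds
++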
++ void fldgt_s (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(fldgt_s_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void fldgt_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(fldgt_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void fldle_s (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(fldle_s_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void fldle_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(fldle_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void fstgt_s (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(fstgt_s_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void fstgt_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(fstgt_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void fstle_s (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(fstle_s_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void fstle_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(fstle_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++
++ void ldgt_b (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldgt_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ldgt_h (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldgt_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ldgt_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldgt_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ldgt_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldgt_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ldle_b (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldle_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ldle_h (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldle_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ldle_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldle_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void ldle_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(ldle_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void stgt_b (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stgt_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void stgt_h (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stgt_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void stgt_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stgt_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void stgt_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stgt_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void stle_b (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stle_b_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void stle_h (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stle_h_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void stle_w (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stle_w_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++ void stle_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stle_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); }
++
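++ // Usage sketch (illustrative only): the integer-offset branch emitters below take a raw
++ // displacement and only range-check it (jirl additionally requires 4-byte alignment and
++ // shifts it), while the address and Label overloads that follow derive the displacement
++ // via offset16/offset21/offset26/target. A typical backward branch, assuming bind() and
++ // the A0 register from the enclosing assembler context:
++ //
++ //   Label loop;
++ //   bind(loop);
++ //   addi_d(A0, A0, -1);    // decrement the assumed counter register
++ //   bne(A0, R0, loop);     // the Label overload resolves the displacement via target(L)
++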
(int)rd->encoding())); } ++ void stle_d (Register rd, Register rj, Register rk) { emit_int32(insn_RRR(stle_d_op, (int)rk->encoding(), (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void beqz(Register rj, int offs) { assert(is_simm(offs, 21), "not a signed 21-bit int"); emit_int32(insn_IRI(beqz_op, offs, (int)rj->encoding())); } ++ void bnez(Register rj, int offs) { assert(is_simm(offs, 21), "not a signed 21-bit int"); emit_int32(insn_IRI(bnez_op, offs, (int)rj->encoding())); } ++ void bceqz(ConditionalFlagRegister cj, int offs) { assert(is_simm(offs, 21), "not a signed 21-bit int"); emit_int32(insn_IRI(bccondz_op, offs, ( (0b00<<3) | (int)cj->encoding()))); } ++ void bcnez(ConditionalFlagRegister cj, int offs) { assert(is_simm(offs, 21), "not a signed 21-bit int"); emit_int32(insn_IRI(bccondz_op, offs, ( (0b01<<3) | (int)cj->encoding()))); } ++ ++ void jirl(Register rd, Register rj, int offs) { assert(is_simm(offs, 18) && ((offs & 3) == 0), "not a signed 18-bit int"); emit_int32(insn_I16RR(jirl_op, offs >> 2, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void b(int offs) { assert(is_simm(offs, 26), "not a signed 26-bit int"); emit_int32(insn_I26(b_op, offs)); } ++ void bl(int offs) { assert(is_simm(offs, 26), "not a signed 26-bit int"); emit_int32(insn_I26(bl_op, offs)); } ++ ++ ++ void beq(Register rj, Register rd, int offs) { assert(is_simm(offs, 16), "not a signed 16-bit int"); emit_int32(insn_I16RR(beq_op, offs, (int)rj->encoding(), (int)rd->encoding())); } ++ void bne(Register rj, Register rd, int offs) { assert(is_simm(offs, 16), "not a signed 16-bit int"); emit_int32(insn_I16RR(bne_op, offs, (int)rj->encoding(), (int)rd->encoding())); } ++ void blt(Register rj, Register rd, int offs) { assert(is_simm(offs, 16), "not a signed 16-bit int"); emit_int32(insn_I16RR(blt_op, offs, (int)rj->encoding(), (int)rd->encoding())); } ++ void bge(Register rj, Register rd, int offs) { assert(is_simm(offs, 16), "not a signed 16-bit int"); emit_int32(insn_I16RR(bge_op, offs, (int)rj->encoding(), (int)rd->encoding())); } ++ void bltu(Register rj, Register rd, int offs) { assert(is_simm(offs, 16), "not a signed 16-bit int"); emit_int32(insn_I16RR(bltu_op, offs, (int)rj->encoding(), (int)rd->encoding())); } ++ void bgeu(Register rj, Register rd, int offs) { assert(is_simm(offs, 16), "not a signed 16-bit int"); emit_int32(insn_I16RR(bgeu_op, offs, (int)rj->encoding(), (int)rd->encoding())); } ++ ++ void beq (Register rj, Register rd, address entry) { beq (rj, rd, offset16(entry)); } ++ void bne (Register rj, Register rd, address entry) { bne (rj, rd, offset16(entry)); } ++ void blt (Register rj, Register rd, address entry) { blt (rj, rd, offset16(entry)); } ++ void bge (Register rj, Register rd, address entry) { bge (rj, rd, offset16(entry)); } ++ void bltu (Register rj, Register rd, address entry) { bltu (rj, rd, offset16(entry)); } ++ void bgeu (Register rj, Register rd, address entry) { bgeu (rj, rd, offset16(entry)); } ++ void beqz (Register rj, address entry) { beqz (rj, offset21(entry)); } ++ void bnez (Register rj, address entry) { bnez (rj, offset21(entry)); } ++ void b(address entry) { b(offset26(entry)); } ++ void bl(address entry) { bl(offset26(entry)); } ++ void bceqz(ConditionalFlagRegister cj, address entry) { bceqz(cj, offset21(entry)); } ++ void bcnez(ConditionalFlagRegister cj, address entry) { bcnez(cj, offset21(entry)); } ++ ++ void beq (Register rj, Register rd, Label& L) { beq (rj, rd, target(L)); } ++ void bne (Register rj, Register rd, Label& L) { bne (rj, rd, target(L)); 
}
++  void blt (Register rj, Register rd, Label& L) { blt (rj, rd, target(L)); }
++  void bge (Register rj, Register rd, Label& L) { bge (rj, rd, target(L)); }
++  void bltu (Register rj, Register rd, Label& L) { bltu (rj, rd, target(L)); }
++  void bgeu (Register rj, Register rd, Label& L) { bgeu (rj, rd, target(L)); }
++  void beqz (Register rj, Label& L) { beqz (rj, target(L)); }
++  void bnez (Register rj, Label& L) { bnez (rj, target(L)); }
++  void b(Label& L) { b(target(L)); }
++  void bl(Label& L) { bl(target(L)); }
++  void bceqz(ConditionalFlagRegister cj, Label& L) { bceqz(cj, target(L)); }
++  void bcnez(ConditionalFlagRegister cj, Label& L) { bcnez(cj, target(L)); }
++
++  typedef enum {
++    // hint[4]
++    Completion = 0,
++    Ordering = (1 << 4),
++
++    // The bitwise-not of the constants below corresponds to the hint encoding,
++    // which makes them convenient to combine with OR.
++    // hint[3:2] and hint[1:0]
++    LoadLoad = ((1 << 3) | (1 << 1)),
++    LoadStore = ((1 << 3) | (1 << 0)),
++    StoreLoad = ((1 << 2) | (1 << 1)),
++    StoreStore = ((1 << 2) | (1 << 0)),
++    AnyAny = ((3 << 2) | (3 << 0)),
++  } Membar_mask_bits;
++
++  // Serializes memory and blows flags
++  void membar(Membar_mask_bits hint) {
++    assert((hint & (3 << 0)) != 0, "membar mask unsupported!");
++    assert((hint & (3 << 2)) != 0, "membar mask unsupported!");
++    dbar(Ordering | (~hint & 0xf));
++  }
++
++  // LSX and LASX
++#define ASSERT_LSX assert(UseLSX, "");
++#define ASSERT_LASX assert(UseLASX, "");
++
++  void vadd_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vadd_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); }
++  void vadd_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vadd_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); }
++  void vadd_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vadd_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); }
++  void vadd_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vadd_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); }
++  void vadd_q(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vadd_q_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); }
++  void xvadd_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvadd_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); }
++  void xvadd_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvadd_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); }
++  void xvadd_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvadd_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); }
++  void xvadd_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvadd_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); }
++  void xvadd_q(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvadd_q_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); }
++
++  void vsub_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsub_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); }
++  void vsub_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX
emit_int32(insn_RRR( vsub_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsub_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsub_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsub_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsub_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsub_q(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsub_q_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvsub_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsub_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsub_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsub_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsub_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsub_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsub_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsub_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsub_q(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsub_q_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vaddi_bu(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vaddi_bu_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vaddi_hu(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vaddi_hu_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vaddi_wu(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vaddi_wu_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vaddi_du(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vaddi_du_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvaddi_bu(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvaddi_bu_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvaddi_hu(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvaddi_hu_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvaddi_wu(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvaddi_wu_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvaddi_du(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvaddi_du_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vsubi_bu(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vsubi_bu_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsubi_hu(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vsubi_hu_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsubi_wu(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vsubi_wu_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsubi_du(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vsubi_du_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvsubi_bu(FloatRegister xd, FloatRegister 
xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvsubi_bu_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsubi_hu(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvsubi_hu_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsubi_wu(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvsubi_wu_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsubi_du(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvsubi_du_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vneg_b(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vneg_b_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vneg_h(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vneg_h_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vneg_w(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vneg_w_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vneg_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vneg_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvneg_b(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvneg_b_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvneg_h(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvneg_h_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvneg_w(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvneg_w_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvneg_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvneg_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vhaddw_h_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhaddw_h_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhaddw_w_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhaddw_w_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhaddw_d_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhaddw_d_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhaddw_q_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhaddw_q_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvhaddw_h_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhaddw_h_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhaddw_w_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhaddw_w_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhaddw_d_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhaddw_d_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhaddw_q_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhaddw_q_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vhaddw_hu_bu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhaddw_hu_bu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhaddw_wu_hu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX 
emit_int32(insn_RRR( vhaddw_wu_hu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhaddw_du_wu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhaddw_du_wu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhaddw_qu_du(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhaddw_qu_du_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvhaddw_hu_bu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhaddw_hu_bu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhaddw_wu_hu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhaddw_wu_hu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhaddw_du_wu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhaddw_du_wu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhaddw_qu_du(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhaddw_qu_du_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vhsubw_h_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhsubw_h_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhsubw_w_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhsubw_w_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhsubw_d_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhsubw_d_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhsubw_q_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhsubw_q_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvhsubw_h_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhsubw_h_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhsubw_w_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhsubw_w_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhsubw_d_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhsubw_d_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhsubw_q_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhsubw_q_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vhsubw_hu_bu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhsubw_hu_bu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhsubw_wu_hu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhsubw_wu_hu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhsubw_du_wu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhsubw_du_wu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vhsubw_qu_du(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vhsubw_qu_du_op, (int)vk->encoding(), (int)vj->encoding(), 
(int)vd->encoding())); } ++ void xvhsubw_hu_bu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhsubw_hu_bu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhsubw_wu_hu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhsubw_wu_hu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhsubw_du_wu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhsubw_du_wu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvhsubw_qu_du(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvhsubw_qu_du_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vabsd_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vabsd_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vabsd_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vabsd_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vabsd_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vabsd_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vabsd_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vabsd_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvabsd_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvabsd_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvabsd_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvabsd_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvabsd_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvabsd_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvabsd_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvabsd_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vmax_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmax_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmax_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmax_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmax_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmax_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmax_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmax_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvmax_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmax_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmax_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmax_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmax_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmax_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); 
} ++ void xvmax_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmax_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vmin_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmin_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmin_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmin_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmin_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmin_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmin_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmin_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvmin_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmin_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmin_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmin_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmin_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmin_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmin_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmin_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vmul_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmul_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmul_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmul_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmul_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmul_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmul_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmul_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvmul_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmul_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmul_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmul_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmul_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmul_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmul_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmul_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vmuh_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmuh_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmuh_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmuh_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmuh_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX 
emit_int32(insn_RRR( vmuh_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmuh_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmuh_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvmuh_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmuh_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmuh_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmuh_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmuh_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmuh_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmuh_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmuh_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vmuh_bu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmuh_bu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmuh_hu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmuh_hu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmuh_wu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmuh_wu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmuh_du(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmuh_du_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvmuh_bu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmuh_bu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmuh_hu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmuh_hu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmuh_wu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmuh_wu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmuh_du(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmuh_du_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vmulwev_h_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmulwev_h_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmulwev_w_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmulwev_w_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmulwev_d_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmulwev_d_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmulwev_q_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmulwev_q_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvmulwev_h_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmulwev_h_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmulwev_w_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX 
emit_int32(insn_RRR(xvmulwev_w_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmulwev_d_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmulwev_d_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmulwev_q_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmulwev_q_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vmulwod_h_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmulwod_h_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmulwod_w_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmulwod_w_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmulwod_d_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmulwod_d_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmulwod_q_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmulwod_q_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvmulwod_h_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmulwod_h_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmulwod_w_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmulwod_w_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmulwod_d_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmulwod_d_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmulwod_q_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmulwod_q_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vmadd_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmadd_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmadd_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmadd_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmadd_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmadd_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmadd_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmadd_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvmadd_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmadd_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmadd_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmadd_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmadd_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmadd_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmadd_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmadd_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vmsub_b(FloatRegister vd, FloatRegister 
vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmsub_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmsub_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmsub_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmsub_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmsub_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vmsub_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vmsub_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvmsub_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmsub_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmsub_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmsub_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmsub_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmsub_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvmsub_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvmsub_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vext2xv_h_b(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_h_b_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void vext2xv_w_b(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_w_b_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void vext2xv_d_b(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_d_b_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void vext2xv_w_h(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_w_h_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void vext2xv_d_h(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_d_h_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void vext2xv_d_w(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_d_w_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vext2xv_hu_bu(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_hu_bu_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void vext2xv_wu_bu(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_wu_bu_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void vext2xv_du_bu(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_du_bu_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void vext2xv_wu_hu(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_wu_hu_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void vext2xv_du_hu(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_du_hu_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void vext2xv_du_wu(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(vext2xv_du_wu_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vldi(FloatRegister vd, int i13) { ASSERT_LSX emit_int32(insn_I13R( vldi_op, i13, (int)vd->encoding())); } ++ void xvldi(FloatRegister xd, int i13) { ASSERT_LASX emit_int32(insn_I13R(xvldi_op, i13, (int)xd->encoding())); } ++ ++ void vand_v(FloatRegister vd, FloatRegister vj, 
FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vand_v_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvand_v(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvand_v_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vor_v(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vor_v_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvor_v(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvor_v_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vxor_v(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vxor_v_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvxor_v(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvxor_v_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vnor_v(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vnor_v_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvnor_v(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvnor_v_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vandn_v(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vandn_v_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvandn_v(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvandn_v_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vorn_v(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vorn_v_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvorn_v(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvorn_v_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vandi_b(FloatRegister vd, FloatRegister vj, int ui8) { ASSERT_LSX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR( vandi_b_op, ui8, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvandi_b(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvandi_b_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vori_b(FloatRegister vd, FloatRegister vj, int ui8) { ASSERT_LSX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR( vori_b_op, ui8, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvori_b(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvori_b_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vxori_b(FloatRegister vd, FloatRegister vj, int ui8) { ASSERT_LSX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR( vxori_b_op, ui8, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvxori_b(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvxori_b_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vnori_b(FloatRegister vd, FloatRegister vj, int ui8) { ASSERT_LSX assert(is_uimm(ui8, 8), "not a unsigned 8-bit 
int"); emit_int32(insn_I8RR( vnori_b_op, ui8, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvnori_b(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvnori_b_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vsll_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsll_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsll_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsll_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsll_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsll_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsll_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsll_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvsll_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsll_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsll_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsll_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsll_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsll_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsll_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsll_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vslli_b(FloatRegister vd, FloatRegister vj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR( vslli_b_op, ui3, (int)vj->encoding(), (int)vd->encoding())); } ++ void vslli_h(FloatRegister vd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vslli_h_op, ui4, (int)vj->encoding(), (int)vd->encoding())); } ++ void vslli_w(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vslli_w_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vslli_d(FloatRegister vd, FloatRegister vj, int ui6) { ASSERT_LSX emit_int32(insn_I6RR( vslli_d_op, ui6, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvslli_b(FloatRegister xd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvslli_b_op, ui3, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslli_h(FloatRegister xd, FloatRegister xj, int ui4) { ASSERT_LASX emit_int32(insn_I4RR(xvslli_h_op, ui4, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslli_w(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvslli_w_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslli_d(FloatRegister xd, FloatRegister xj, int ui6) { ASSERT_LASX emit_int32(insn_I6RR(xvslli_d_op, ui6, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vsrl_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsrl_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrl_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsrl_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrl_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsrl_w_op, (int)vk->encoding(), 
(int)vj->encoding(), (int)vd->encoding())); } ++ void vsrl_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsrl_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvsrl_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsrl_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsrl_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsrl_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsrl_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsrl_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsrl_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsrl_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vsrli_b(FloatRegister vd, FloatRegister vj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR( vsrli_b_op, ui3, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrli_h(FloatRegister vd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vsrli_h_op, ui4, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrli_w(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vsrli_w_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrli_d(FloatRegister vd, FloatRegister vj, int ui6) { ASSERT_LSX emit_int32(insn_I6RR( vsrli_d_op, ui6, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvsrli_b(FloatRegister xd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvsrli_b_op, ui3, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsrli_h(FloatRegister xd, FloatRegister xj, int ui4) { ASSERT_LASX emit_int32(insn_I4RR(xvsrli_h_op, ui4, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsrli_w(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvsrli_w_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsrli_d(FloatRegister xd, FloatRegister xj, int ui6) { ASSERT_LASX emit_int32(insn_I6RR(xvsrli_d_op, ui6, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vsra_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsra_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsra_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsra_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsra_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsra_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsra_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsra_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvsra_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsra_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsra_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsra_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsra_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsra_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsra_d(FloatRegister xd, 
FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsra_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vsrai_b(FloatRegister vd, FloatRegister vj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR( vsrai_b_op, ui3, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrai_h(FloatRegister vd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vsrai_h_op, ui4, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrai_w(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vsrai_w_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrai_d(FloatRegister vd, FloatRegister vj, int ui6) { ASSERT_LSX emit_int32(insn_I6RR( vsrai_d_op, ui6, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvsrai_b(FloatRegister xd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvsrai_b_op, ui3, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsrai_h(FloatRegister xd, FloatRegister xj, int ui4) { ASSERT_LASX emit_int32(insn_I4RR(xvsrai_h_op, ui4, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsrai_w(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvsrai_w_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsrai_d(FloatRegister xd, FloatRegister xj, int ui6) { ASSERT_LASX emit_int32(insn_I6RR(xvsrai_d_op, ui6, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vrotr_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vrotr_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vrotr_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vrotr_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vrotr_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vrotr_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vrotr_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vrotr_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvrotr_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvrotr_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvrotr_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvrotr_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvrotr_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvrotr_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvrotr_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvrotr_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vrotri_b(FloatRegister vd, FloatRegister vj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR( vrotri_b_op, ui3, (int)vj->encoding(), (int)vd->encoding())); } ++ void vrotri_h(FloatRegister vd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vrotri_h_op, ui4, (int)vj->encoding(), (int)vd->encoding())); } ++ void vrotri_w(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vrotri_w_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vrotri_d(FloatRegister vd, FloatRegister vj, int ui6) { ASSERT_LSX emit_int32(insn_I6RR( vrotri_d_op, ui6, (int)vj->encoding(), (int)vd->encoding())); } 
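++
++  // Note on the immediate shift/rotate forms (vslli/vsrli/vsrai/vrotri and their xv*
++  // counterparts): the immediate field width tracks the lane size (ui3 for byte, ui4 for
++  // halfword, ui5 for word, ui6 for doubleword lanes), so the encoded amount is always
++  // smaller than the element width. Illustrative use only; the register names below are
++  // placeholders, not part of this header:
++  //   vrotri_w(dst, src, 7);    // rotate each 32-bit lane of src right by 7 bits
++  //   xvslli_d(dst, src, 32);   // shift each 64-bit LASX lane left by 32 bits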
++ void xvrotri_b(FloatRegister xd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvrotri_b_op, ui3, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvrotri_h(FloatRegister xd, FloatRegister xj, int ui4) { ASSERT_LASX emit_int32(insn_I4RR(xvrotri_h_op, ui4, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvrotri_w(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvrotri_w_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvrotri_d(FloatRegister xd, FloatRegister xj, int ui6) { ASSERT_LASX emit_int32(insn_I6RR(xvrotri_d_op, ui6, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vsrlni_b_h(FloatRegister vd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vsrlni_b_h_op, ui4, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrlni_h_w(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vsrlni_h_w_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrlni_w_d(FloatRegister vd, FloatRegister vj, int ui6) { ASSERT_LSX emit_int32(insn_I6RR( vsrlni_w_d_op, ui6, (int)vj->encoding(), (int)vd->encoding())); } ++ void vsrlni_d_q(FloatRegister vd, FloatRegister vj, int ui7) { ASSERT_LSX emit_int32(insn_I7RR( vsrlni_d_q_op, ui7, (int)vj->encoding(), (int)vd->encoding())); } ++ ++ void vclo_b(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vclo_b_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vclo_h(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vclo_h_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vclo_w(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vclo_w_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vclo_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vclo_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvclo_b(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvclo_b_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvclo_h(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvclo_h_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvclo_w(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvclo_w_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvclo_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvclo_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vclz_b(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vclz_b_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vclz_h(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vclz_h_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vclz_w(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vclz_w_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vclz_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vclz_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvclz_b(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvclz_b_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvclz_h(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvclz_h_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvclz_w(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvclz_w_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvclz_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvclz_d_op, 
(int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vpcnt_b(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vpcnt_b_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vpcnt_h(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vpcnt_h_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vpcnt_w(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vpcnt_w_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vpcnt_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vpcnt_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvpcnt_b(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvpcnt_b_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvpcnt_h(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvpcnt_h_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvpcnt_w(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvpcnt_w_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvpcnt_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvpcnt_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vbitclr_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitclr_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitclr_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitclr_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitclr_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitclr_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitclr_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitclr_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvbitclr_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitclr_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitclr_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitclr_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitclr_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitclr_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitclr_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitclr_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vbitclri_b(FloatRegister vd, FloatRegister vj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR( vbitclri_b_op, ui3, (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitclri_h(FloatRegister vd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vbitclri_h_op, ui4, (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitclri_w(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vbitclri_w_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitclri_d(FloatRegister vd, FloatRegister vj, int ui6) { ASSERT_LSX emit_int32(insn_I6RR( vbitclri_d_op, ui6, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvbitclri_b(FloatRegister xd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvbitclri_b_op, ui3, (int)xj->encoding(), (int)xd->encoding())); } ++ void 
xvbitclri_h(FloatRegister xd, FloatRegister xj, int ui4) { ASSERT_LASX emit_int32(insn_I4RR(xvbitclri_h_op, ui4, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitclri_w(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvbitclri_w_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitclri_d(FloatRegister xd, FloatRegister xj, int ui6) { ASSERT_LASX emit_int32(insn_I6RR(xvbitclri_d_op, ui6, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vbitset_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitset_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitset_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitset_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitset_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitset_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitset_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitset_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvbitset_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitset_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitset_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitset_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitset_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitset_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitset_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitset_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vbitseti_b(FloatRegister vd, FloatRegister vj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR( vbitseti_b_op, ui3, (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitseti_h(FloatRegister vd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vbitseti_h_op, ui4, (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitseti_w(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vbitseti_w_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitseti_d(FloatRegister vd, FloatRegister vj, int ui6) { ASSERT_LSX emit_int32(insn_I6RR( vbitseti_d_op, ui6, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvbitseti_b(FloatRegister xd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvbitseti_b_op, ui3, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitseti_h(FloatRegister xd, FloatRegister xj, int ui4) { ASSERT_LASX emit_int32(insn_I4RR(xvbitseti_h_op, ui4, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitseti_w(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvbitseti_w_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitseti_d(FloatRegister xd, FloatRegister xj, int ui6) { ASSERT_LASX emit_int32(insn_I6RR(xvbitseti_d_op, ui6, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vbitrev_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitrev_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitrev_h(FloatRegister 
vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitrev_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitrev_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitrev_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitrev_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vbitrev_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvbitrev_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitrev_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitrev_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitrev_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitrev_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitrev_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitrev_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvbitrev_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vbitrevi_b(FloatRegister vd, FloatRegister vj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR( vbitrevi_b_op, ui3, (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitrevi_h(FloatRegister vd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vbitrevi_h_op, ui4, (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitrevi_w(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vbitrevi_w_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vbitrevi_d(FloatRegister vd, FloatRegister vj, int ui6) { ASSERT_LSX emit_int32(insn_I6RR( vbitrevi_d_op, ui6, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvbitrevi_b(FloatRegister xd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvbitrevi_b_op, ui3, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitrevi_h(FloatRegister xd, FloatRegister xj, int ui4) { ASSERT_LASX emit_int32(insn_I4RR(xvbitrevi_h_op, ui4, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitrevi_w(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvbitrevi_w_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvbitrevi_d(FloatRegister xd, FloatRegister xj, int ui6) { ASSERT_LASX emit_int32(insn_I6RR(xvbitrevi_d_op, ui6, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfrstp_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfrstp_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfrstp_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfrstp_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfrstp_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfrstp_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfrstp_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfrstp_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfrstpi_b(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vfrstpi_b_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void 
vfrstpi_h(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vfrstpi_h_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfrstpi_b(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvfrstpi_b_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfrstpi_h(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvfrstpi_h_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfadd_s(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfadd_s_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfadd_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfadd_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfadd_s(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfadd_s_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfadd_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfadd_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfsub_s(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfsub_s_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfsub_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfsub_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfsub_s(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfsub_s_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfsub_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfsub_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfmul_s(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfmul_s_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfmul_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfmul_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfmul_s(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfmul_s_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfmul_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfmul_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfdiv_s(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfdiv_s_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfdiv_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfdiv_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfdiv_s(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfdiv_s_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfdiv_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfdiv_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfmadd_s(FloatRegister vd, FloatRegister vj, FloatRegister vk, FloatRegister va) { ASSERT_LSX 
emit_int32(insn_RRRR( vfmadd_s_op, (int)va->encoding(), (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfmadd_d(FloatRegister vd, FloatRegister vj, FloatRegister vk, FloatRegister va) { ASSERT_LSX emit_int32(insn_RRRR( vfmadd_d_op, (int)va->encoding(), (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfmadd_s(FloatRegister xd, FloatRegister xj, FloatRegister xk, FloatRegister xa) { ASSERT_LASX emit_int32(insn_RRRR(xvfmadd_s_op, (int)xa->encoding(), (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfmadd_d(FloatRegister xd, FloatRegister xj, FloatRegister xk, FloatRegister xa) { ASSERT_LASX emit_int32(insn_RRRR(xvfmadd_d_op, (int)xa->encoding(), (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfmsub_s(FloatRegister vd, FloatRegister vj, FloatRegister vk, FloatRegister va) { ASSERT_LSX emit_int32(insn_RRRR( vfmsub_s_op, (int)va->encoding(), (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfmsub_d(FloatRegister vd, FloatRegister vj, FloatRegister vk, FloatRegister va) { ASSERT_LSX emit_int32(insn_RRRR( vfmsub_d_op, (int)va->encoding(), (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfmsub_s(FloatRegister xd, FloatRegister xj, FloatRegister xk, FloatRegister xa) { ASSERT_LASX emit_int32(insn_RRRR(xvfmsub_s_op, (int)xa->encoding(), (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfmsub_d(FloatRegister xd, FloatRegister xj, FloatRegister xk, FloatRegister xa) { ASSERT_LASX emit_int32(insn_RRRR(xvfmsub_d_op, (int)xa->encoding(), (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfnmadd_s(FloatRegister vd, FloatRegister vj, FloatRegister vk, FloatRegister va) { ASSERT_LSX emit_int32(insn_RRRR( vfnmadd_s_op, (int)va->encoding(), (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfnmadd_d(FloatRegister vd, FloatRegister vj, FloatRegister vk, FloatRegister va) { ASSERT_LSX emit_int32(insn_RRRR( vfnmadd_d_op, (int)va->encoding(), (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfnmadd_s(FloatRegister xd, FloatRegister xj, FloatRegister xk, FloatRegister xa) { ASSERT_LASX emit_int32(insn_RRRR(xvfnmadd_s_op, (int)xa->encoding(), (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfnmadd_d(FloatRegister xd, FloatRegister xj, FloatRegister xk, FloatRegister xa) { ASSERT_LASX emit_int32(insn_RRRR(xvfnmadd_d_op, (int)xa->encoding(), (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfnmsub_s(FloatRegister vd, FloatRegister vj, FloatRegister vk, FloatRegister va) { ASSERT_LSX emit_int32(insn_RRRR( vfnmsub_s_op, (int)va->encoding(), (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfnmsub_d(FloatRegister vd, FloatRegister vj, FloatRegister vk, FloatRegister va) { ASSERT_LSX emit_int32(insn_RRRR( vfnmsub_d_op, (int)va->encoding(), (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfnmsub_s(FloatRegister xd, FloatRegister xj, FloatRegister xk, FloatRegister xa) { ASSERT_LASX emit_int32(insn_RRRR(xvfnmsub_s_op, (int)xa->encoding(), (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfnmsub_d(FloatRegister xd, FloatRegister xj, FloatRegister xk, FloatRegister xa) { ASSERT_LASX emit_int32(insn_RRRR(xvfnmsub_d_op, (int)xa->encoding(), (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void 
vfmax_s(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfmax_s_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfmax_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfmax_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfmax_s(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfmax_s_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfmax_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfmax_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfmin_s(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfmin_s_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfmin_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfmin_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfmin_s(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfmin_s_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfmin_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfmin_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfclass_s(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfclass_s_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vfclass_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfclass_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfclass_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfclass_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfclass_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfclass_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfsqrt_s(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfsqrt_s_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vfsqrt_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfsqrt_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfsqrt_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfsqrt_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfsqrt_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfsqrt_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfcvtl_s_h(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vfcvtl_s_h_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vfcvtl_d_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vfcvtl_d_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvfcvtl_s_h(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfcvtl_s_h_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcvtl_d_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfcvtl_d_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfcvth_s_h(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vfcvth_s_h_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vfcvth_d_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vfcvth_d_s_op, (int)rj->encoding(), (int)vd->encoding())); } 
++ void xvfcvth_s_h(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfcvth_s_h_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcvth_d_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfcvth_d_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfcvt_h_s(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfcvt_h_s_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcvt_s_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vfcvt_s_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfcvt_h_s(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfcvt_h_s_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcvt_s_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvfcvt_s_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfrintrne_s(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfrintrne_s_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vfrintrne_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfrintrne_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfrintrne_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfrintrne_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfrintrne_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfrintrne_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfrintrz_s(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfrintrz_s_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vfrintrz_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfrintrz_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfrintrz_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfrintrz_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfrintrz_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfrintrz_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfrintrp_s(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfrintrp_s_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vfrintrp_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfrintrp_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfrintrp_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfrintrp_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfrintrp_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfrintrp_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfrintrm_s(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfrintrm_s_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vfrintrm_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfrintrm_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfrintrm_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfrintrm_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfrintrm_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfrintrm_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfrint_s(FloatRegister vd, FloatRegister vj) { ASSERT_LSX 
emit_int32(insn_RR( vfrint_s_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void vfrint_d(FloatRegister vd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vfrint_d_op, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvfrint_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfrint_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfrint_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvfrint_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrne_w_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrne_w_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vftintrne_l_d(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrne_l_d_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrne_w_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrne_w_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvftintrne_l_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrne_l_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrz_w_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrz_w_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vftintrz_l_d(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrz_l_d_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrz_w_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrz_w_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvftintrz_l_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrz_l_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrp_w_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrp_w_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vftintrp_l_d(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrp_l_d_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrp_w_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrp_w_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvftintrp_l_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrp_l_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrm_w_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrm_w_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vftintrm_l_d(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrm_l_d_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrm_w_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrm_w_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvftintrm_l_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrm_l_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftint_w_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftint_w_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vftint_l_d(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftint_l_d_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftint_w_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftint_w_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvftint_l_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX 
emit_int32(insn_RR(xvftint_l_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrne_w_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vftintrne_w_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvftintrne_w_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvftintrne_w_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrz_w_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vftintrz_w_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvftintrz_w_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvftintrz_w_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrp_w_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vftintrp_w_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvftintrp_w_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvftintrp_w_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrm_w_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vftintrm_w_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvftintrm_w_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvftintrm_w_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftint_w_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vftint_w_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvftint_w_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvftint_w_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrnel_l_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrnel_l_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrnel_l_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrnel_l_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrneh_l_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrneh_l_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrneh_l_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrneh_l_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrzl_l_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrzl_l_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrzl_l_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrzl_l_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrzh_l_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrzh_l_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrzh_l_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrzh_l_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrpl_l_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrpl_l_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrpl_l_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX 
emit_int32(insn_RR(xvftintrpl_l_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrph_l_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrph_l_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrph_l_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrph_l_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrml_l_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrml_l_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrml_l_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrml_l_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintrmh_l_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintrmh_l_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintrmh_l_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintrmh_l_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftintl_l_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftintl_l_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftintl_l_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftintl_l_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vftinth_l_s(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vftinth_l_s_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvftinth_l_s(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvftinth_l_s_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vffint_s_w(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vffint_s_w_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vffint_d_l(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vffint_d_l_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvffint_s_w(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvffint_s_w_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvffint_d_l(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvffint_d_l_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vffint_s_l(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vffint_s_l_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvffint_s_l(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvffint_s_l_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vffintl_d_w(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vffintl_d_w_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvffintl_d_w(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvffintl_d_w_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vffinth_d_w(FloatRegister vd, FloatRegister rj) { ASSERT_LSX emit_int32(insn_RR( vffinth_d_w_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void xvffinth_d_w(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvffinth_d_w_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vseq_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vseq_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vseq_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vseq_h_op, 
(int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vseq_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vseq_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vseq_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vseq_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvseq_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvseq_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvseq_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvseq_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvseq_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvseq_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvseq_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvseq_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vsle_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsle_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsle_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsle_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsle_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsle_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsle_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsle_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvsle_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsle_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsle_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsle_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsle_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsle_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsle_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsle_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vsle_bu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsle_bu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsle_hu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsle_hu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsle_wu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsle_wu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vsle_du(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vsle_du_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvsle_bu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsle_bu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void 
xvsle_hu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsle_hu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsle_wu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsle_wu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvsle_du(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvsle_du_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vslt_b(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vslt_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vslt_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vslt_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vslt_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vslt_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vslt_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vslt_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvslt_b(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvslt_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslt_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvslt_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslt_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvslt_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslt_d(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvslt_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vslt_bu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vslt_bu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vslt_hu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vslt_hu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vslt_wu(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vslt_wu_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vslt_du(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vslt_du_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvslt_bu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvslt_bu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslt_hu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvslt_hu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslt_wu(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvslt_wu_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslt_du(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvslt_du_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vslti_b(FloatRegister vd, FloatRegister vj, int si5) { 
ASSERT_LSX assert(is_simm(si5, 5), "not a signed 5-bit int"); emit_int32(insn_I5RR( vslti_b_op, si5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vslti_h(FloatRegister vd, FloatRegister vj, int si5) { ASSERT_LSX assert(is_simm(si5, 5), "not a signed 5-bit int"); emit_int32(insn_I5RR( vslti_h_op, si5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vslti_w(FloatRegister vd, FloatRegister vj, int si5) { ASSERT_LSX assert(is_simm(si5, 5), "not a signed 5-bit int"); emit_int32(insn_I5RR( vslti_w_op, si5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vslti_d(FloatRegister vd, FloatRegister vj, int si5) { ASSERT_LSX assert(is_simm(si5, 5), "not a signed 5-bit int"); emit_int32(insn_I5RR( vslti_d_op, si5, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvslti_b(FloatRegister xd, FloatRegister xj, int si5) { ASSERT_LASX assert(is_simm(si5, 5), "not a signed 5-bit int"); emit_int32(insn_I5RR(xvslti_b_op, si5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslti_h(FloatRegister xd, FloatRegister xj, int si5) { ASSERT_LASX assert(is_simm(si5, 5), "not a signed 5-bit int"); emit_int32(insn_I5RR(xvslti_h_op, si5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslti_w(FloatRegister xd, FloatRegister xj, int si5) { ASSERT_LASX assert(is_simm(si5, 5), "not a signed 5-bit int"); emit_int32(insn_I5RR(xvslti_w_op, si5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslti_d(FloatRegister xd, FloatRegister xj, int si5) { ASSERT_LASX assert(is_simm(si5, 5), "not a signed 5-bit int"); emit_int32(insn_I5RR(xvslti_d_op, si5, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vslti_bu(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vslti_bu_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vslti_hu(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vslti_hu_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vslti_wu(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vslti_wu_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void vslti_du(FloatRegister vd, FloatRegister vj, int ui5) { ASSERT_LSX emit_int32(insn_I5RR( vslti_du_op, ui5, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvslti_bu(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvslti_bu_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslti_hu(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvslti_hu_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslti_wu(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvslti_wu_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvslti_du(FloatRegister xd, FloatRegister xj, int ui5) { ASSERT_LASX emit_int32(insn_I5RR(xvslti_du_op, ui5, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vfcmp_caf_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_caf , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cun_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_cun , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_ceq_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_ceq , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void 
vfcmp_cueq_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_cueq, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_clt_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_clt , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cult_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_cult, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cle_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_cle , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cule_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_cule, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cne_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_cne , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cor_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_cor , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cune_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_cune, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_saf_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_saf , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sun_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_sun , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_seq_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_seq , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sueq_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_sueq, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_slt_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_slt , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sult_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_sult, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sle_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_sle , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sule_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_sule, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sne_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_sne , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sor_s (FloatRegister vd, FloatRegister vj, FloatRegister 
vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_sor , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sune_s (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_s_op, fcmp_sune, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ ++ void vfcmp_caf_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_caf , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cun_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_cun , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_ceq_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_ceq , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cueq_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_cueq, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_clt_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_clt , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cult_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_cult, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cle_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_cle , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cule_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_cule, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cne_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_cne , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cor_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_cor , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_cune_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_cune, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_saf_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_saf , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sun_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_sun , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_seq_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_seq , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sueq_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_sueq, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_slt_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, 
fcmp_slt , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sult_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_sult, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sle_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_sle , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sule_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_sule, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sne_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_sne , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sor_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_sor , (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vfcmp_sune_d (FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRRR( vfcmp_cond_d_op, fcmp_sune, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ ++ void xvfcmp_caf_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_caf , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cun_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_cun , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_ceq_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_ceq , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cueq_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_cueq, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_clt_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_clt , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cult_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_cult, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cle_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_cle , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cule_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_cule, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cne_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_cne , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cor_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_cor , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cune_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_cune, (int)xk->encoding(), 
(int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_saf_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_saf , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sun_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_sun , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_seq_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_seq , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sueq_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_sueq, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_slt_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_slt , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sult_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_sult, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sle_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_sle , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sule_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_sule, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sne_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_sne , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sor_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_sor , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sune_s (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_s_op, fcmp_sune, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void xvfcmp_caf_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_caf , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cun_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_cun , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_ceq_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_ceq , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cueq_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_cueq, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_clt_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_clt , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cult_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_cult, (int)xk->encoding(), (int)xj->encoding(), 
(int)xd->encoding())); } ++ void xvfcmp_cle_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_cle , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cule_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_cule, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cne_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_cne , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cor_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_cor , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_cune_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_cune, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_saf_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_saf , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sun_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_sun , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_seq_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_seq , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sueq_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_sueq, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_slt_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_slt , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sult_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_sult, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sle_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_sle , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sule_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_sule, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sne_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_sne , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sor_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_sor , (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvfcmp_sune_d (FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRRR(xvfcmp_cond_d_op, fcmp_sune, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vbitsel_v(FloatRegister vd, FloatRegister vj, FloatRegister vk, FloatRegister va) { ASSERT_LSX emit_int32(insn_RRRR( vbitsel_v_op, (int)va->encoding(), (int)vk->encoding(), (int)vj->encoding(), 
(int)vd->encoding())); } ++ void xvbitsel_v(FloatRegister xd, FloatRegister xj, FloatRegister xk, FloatRegister xa) { ASSERT_LASX emit_int32(insn_RRRR(xvbitsel_v_op, (int)xa->encoding(), (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vseteqz_v(ConditionalFlagRegister cd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vseteqz_v_op, (int)vj->encoding(), (int)cd->encoding())); } ++ void xvseteqz_v(ConditionalFlagRegister cd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvseteqz_v_op, (int)xj->encoding(), (int)cd->encoding())); } ++ ++ void vsetnez_v(ConditionalFlagRegister cd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vsetnez_v_op, (int)vj->encoding(), (int)cd->encoding())); } ++ void xvsetnez_v(ConditionalFlagRegister cd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvsetnez_v_op, (int)xj->encoding(), (int)cd->encoding())); } ++ ++ void vsetanyeqz_b(ConditionalFlagRegister cd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vsetanyeqz_b_op, (int)vj->encoding(), (int)cd->encoding())); } ++ void vsetanyeqz_h(ConditionalFlagRegister cd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vsetanyeqz_h_op, (int)vj->encoding(), (int)cd->encoding())); } ++ void vsetanyeqz_w(ConditionalFlagRegister cd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vsetanyeqz_w_op, (int)vj->encoding(), (int)cd->encoding())); } ++ void vsetanyeqz_d(ConditionalFlagRegister cd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vsetanyeqz_d_op, (int)vj->encoding(), (int)cd->encoding())); } ++ void xvsetanyeqz_b(ConditionalFlagRegister cd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvsetanyeqz_b_op, (int)xj->encoding(), (int)cd->encoding())); } ++ void xvsetanyeqz_h(ConditionalFlagRegister cd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvsetanyeqz_h_op, (int)xj->encoding(), (int)cd->encoding())); } ++ void xvsetanyeqz_w(ConditionalFlagRegister cd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvsetanyeqz_w_op, (int)xj->encoding(), (int)cd->encoding())); } ++ void xvsetanyeqz_d(ConditionalFlagRegister cd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvsetanyeqz_d_op, (int)xj->encoding(), (int)cd->encoding())); } ++ ++ void vsetallnez_b(ConditionalFlagRegister cd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vsetallnez_b_op, (int)vj->encoding(), (int)cd->encoding())); } ++ void vsetallnez_h(ConditionalFlagRegister cd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vsetallnez_h_op, (int)vj->encoding(), (int)cd->encoding())); } ++ void vsetallnez_w(ConditionalFlagRegister cd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vsetallnez_w_op, (int)vj->encoding(), (int)cd->encoding())); } ++ void vsetallnez_d(ConditionalFlagRegister cd, FloatRegister vj) { ASSERT_LSX emit_int32(insn_RR( vsetallnez_d_op, (int)vj->encoding(), (int)cd->encoding())); } ++ void xvsetallnez_b(ConditionalFlagRegister cd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvsetallnez_b_op, (int)xj->encoding(), (int)cd->encoding())); } ++ void xvsetallnez_h(ConditionalFlagRegister cd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvsetallnez_h_op, (int)xj->encoding(), (int)cd->encoding())); } ++ void xvsetallnez_w(ConditionalFlagRegister cd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvsetallnez_w_op, (int)xj->encoding(), (int)cd->encoding())); } ++ void xvsetallnez_d(ConditionalFlagRegister cd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvsetallnez_d_op, (int)xj->encoding(), (int)cd->encoding())); } ++ ++ void 
vinsgr2vr_b(FloatRegister vd, Register rj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vinsgr2vr_b_op, ui4, (int)rj->encoding(), (int)vd->encoding())); } ++ void vinsgr2vr_h(FloatRegister vd, Register rj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR( vinsgr2vr_h_op, ui3, (int)rj->encoding(), (int)vd->encoding())); } ++ void vinsgr2vr_w(FloatRegister vd, Register rj, int ui2) { ASSERT_LSX emit_int32(insn_I2RR( vinsgr2vr_w_op, ui2, (int)rj->encoding(), (int)vd->encoding())); } ++ void vinsgr2vr_d(FloatRegister vd, Register rj, int ui1) { ASSERT_LSX emit_int32(insn_I1RR( vinsgr2vr_d_op, ui1, (int)rj->encoding(), (int)vd->encoding())); } ++ ++ void xvinsgr2vr_w(FloatRegister xd, Register rj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvinsgr2vr_w_op, ui3, (int)rj->encoding(), (int)xd->encoding())); } ++ void xvinsgr2vr_d(FloatRegister xd, Register rj, int ui2) { ASSERT_LASX emit_int32(insn_I2RR(xvinsgr2vr_d_op, ui2, (int)rj->encoding(), (int)xd->encoding())); } ++ ++ void vpickve2gr_b(Register rd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vpickve2gr_b_op, ui4, (int)vj->encoding(), (int)rd->encoding())); } ++ void vpickve2gr_h(Register rd, FloatRegister vj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR( vpickve2gr_h_op, ui3, (int)vj->encoding(), (int)rd->encoding())); } ++ void vpickve2gr_w(Register rd, FloatRegister vj, int ui2) { ASSERT_LSX emit_int32(insn_I2RR( vpickve2gr_w_op, ui2, (int)vj->encoding(), (int)rd->encoding())); } ++ void vpickve2gr_d(Register rd, FloatRegister vj, int ui1) { ASSERT_LSX emit_int32(insn_I1RR( vpickve2gr_d_op, ui1, (int)vj->encoding(), (int)rd->encoding())); } ++ ++ void vpickve2gr_bu(Register rd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR( vpickve2gr_bu_op, ui4, (int)vj->encoding(), (int)rd->encoding())); } ++ void vpickve2gr_hu(Register rd, FloatRegister vj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR( vpickve2gr_hu_op, ui3, (int)vj->encoding(), (int)rd->encoding())); } ++ void vpickve2gr_wu(Register rd, FloatRegister vj, int ui2) { ASSERT_LSX emit_int32(insn_I2RR( vpickve2gr_wu_op, ui2, (int)vj->encoding(), (int)rd->encoding())); } ++ void vpickve2gr_du(Register rd, FloatRegister vj, int ui1) { ASSERT_LSX emit_int32(insn_I1RR( vpickve2gr_du_op, ui1, (int)vj->encoding(), (int)rd->encoding())); } ++ ++ void xvpickve2gr_w(Register rd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvpickve2gr_w_op, ui3, (int)xj->encoding(), (int)rd->encoding())); } ++ void xvpickve2gr_d(Register rd, FloatRegister xj, int ui2) { ASSERT_LASX emit_int32(insn_I2RR(xvpickve2gr_d_op, ui2, (int)xj->encoding(), (int)rd->encoding())); } ++ ++ void xvpickve2gr_wu(Register rd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvpickve2gr_wu_op, ui3, (int)xj->encoding(), (int)rd->encoding())); } ++ void xvpickve2gr_du(Register rd, FloatRegister xj, int ui2) { ASSERT_LASX emit_int32(insn_I2RR(xvpickve2gr_du_op, ui2, (int)xj->encoding(), (int)rd->encoding())); } ++ ++ void vreplgr2vr_b(FloatRegister vd, Register rj) { ASSERT_LSX emit_int32(insn_RR( vreplgr2vr_b_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vreplgr2vr_h(FloatRegister vd, Register rj) { ASSERT_LSX emit_int32(insn_RR( vreplgr2vr_h_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vreplgr2vr_w(FloatRegister vd, Register rj) { ASSERT_LSX emit_int32(insn_RR( vreplgr2vr_w_op, (int)rj->encoding(), (int)vd->encoding())); } ++ void vreplgr2vr_d(FloatRegister vd, Register rj) { ASSERT_LSX emit_int32(insn_RR( vreplgr2vr_d_op, (int)rj->encoding(), 
(int)vd->encoding())); } ++ void xvreplgr2vr_b(FloatRegister xd, Register rj) { ASSERT_LASX emit_int32(insn_RR(xvreplgr2vr_b_op, (int)rj->encoding(), (int)xd->encoding())); } ++ void xvreplgr2vr_h(FloatRegister xd, Register rj) { ASSERT_LASX emit_int32(insn_RR(xvreplgr2vr_h_op, (int)rj->encoding(), (int)xd->encoding())); } ++ void xvreplgr2vr_w(FloatRegister xd, Register rj) { ASSERT_LASX emit_int32(insn_RR(xvreplgr2vr_w_op, (int)rj->encoding(), (int)xd->encoding())); } ++ void xvreplgr2vr_d(FloatRegister xd, Register rj) { ASSERT_LASX emit_int32(insn_RR(xvreplgr2vr_d_op, (int)rj->encoding(), (int)xd->encoding())); } ++ ++ void vreplvei_b(FloatRegister vd, FloatRegister vj, int ui4) { ASSERT_LSX emit_int32(insn_I4RR(vreplvei_b_op, ui4, (int)vj->encoding(), (int)vd->encoding())); } ++ void vreplvei_h(FloatRegister vd, FloatRegister vj, int ui3) { ASSERT_LSX emit_int32(insn_I3RR(vreplvei_h_op, ui3, (int)vj->encoding(), (int)vd->encoding())); } ++ void vreplvei_w(FloatRegister vd, FloatRegister vj, int ui2) { ASSERT_LSX emit_int32(insn_I2RR(vreplvei_w_op, ui2, (int)vj->encoding(), (int)vd->encoding())); } ++ void vreplvei_d(FloatRegister vd, FloatRegister vj, int ui1) { ASSERT_LSX emit_int32(insn_I1RR(vreplvei_d_op, ui1, (int)vj->encoding(), (int)vd->encoding())); } ++ ++ void xvreplve0_b(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvreplve0_b_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvreplve0_h(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvreplve0_h_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvreplve0_w(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvreplve0_w_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvreplve0_d(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvreplve0_d_op, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvreplve0_q(FloatRegister xd, FloatRegister xj) { ASSERT_LASX emit_int32(insn_RR(xvreplve0_q_op, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void xvinsve0_w(FloatRegister xd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvinsve0_w_op, ui3, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvinsve0_d(FloatRegister xd, FloatRegister xj, int ui2) { ASSERT_LASX emit_int32(insn_I2RR(xvinsve0_d_op, ui2, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void xvpickve_w(FloatRegister xd, FloatRegister xj, int ui3) { ASSERT_LASX emit_int32(insn_I3RR(xvpickve_w_op, ui3, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvpickve_d(FloatRegister xd, FloatRegister xj, int ui2) { ASSERT_LASX emit_int32(insn_I2RR(xvpickve_d_op, ui2, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vilvl_b(FloatRegister vd, FloatRegister vj, FloatRegister vk){ ASSERT_LSX emit_int32(insn_RRR( vilvl_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vilvl_h(FloatRegister vd, FloatRegister vj, FloatRegister vk){ ASSERT_LSX emit_int32(insn_RRR( vilvl_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vilvl_w(FloatRegister vd, FloatRegister vj, FloatRegister vk){ ASSERT_LSX emit_int32(insn_RRR( vilvl_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vilvl_d(FloatRegister vd, FloatRegister vj, FloatRegister vk){ ASSERT_LSX emit_int32(insn_RRR( vilvl_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvilvl_b(FloatRegister xd, FloatRegister xj, FloatRegister xk){ ASSERT_LASX 
emit_int32(insn_RRR(xvilvl_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvilvl_h(FloatRegister xd, FloatRegister xj, FloatRegister xk){ ASSERT_LASX emit_int32(insn_RRR(xvilvl_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvilvl_w(FloatRegister xd, FloatRegister xj, FloatRegister xk){ ASSERT_LASX emit_int32(insn_RRR(xvilvl_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvilvl_d(FloatRegister xd, FloatRegister xj, FloatRegister xk){ ASSERT_LASX emit_int32(insn_RRR(xvilvl_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vilvh_b(FloatRegister vd, FloatRegister vj, FloatRegister vk){ ASSERT_LSX emit_int32(insn_RRR( vilvh_b_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vilvh_h(FloatRegister vd, FloatRegister vj, FloatRegister vk){ ASSERT_LSX emit_int32(insn_RRR( vilvh_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vilvh_w(FloatRegister vd, FloatRegister vj, FloatRegister vk){ ASSERT_LSX emit_int32(insn_RRR( vilvh_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vilvh_d(FloatRegister vd, FloatRegister vj, FloatRegister vk){ ASSERT_LSX emit_int32(insn_RRR( vilvh_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvilvh_b(FloatRegister xd, FloatRegister xj, FloatRegister xk){ ASSERT_LASX emit_int32(insn_RRR(xvilvh_b_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvilvh_h(FloatRegister xd, FloatRegister xj, FloatRegister xk){ ASSERT_LASX emit_int32(insn_RRR(xvilvh_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvilvh_w(FloatRegister xd, FloatRegister xj, FloatRegister xk){ ASSERT_LASX emit_int32(insn_RRR(xvilvh_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvilvh_d(FloatRegister xd, FloatRegister xj, FloatRegister xk){ ASSERT_LASX emit_int32(insn_RRR(xvilvh_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vshuf_b(FloatRegister vd, FloatRegister vj, FloatRegister vk, FloatRegister va) { ASSERT_LSX emit_int32(insn_RRRR( vshuf_b_op, (int)va->encoding(), (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void xvshuf_b(FloatRegister xd, FloatRegister xj, FloatRegister xk, FloatRegister xa) { ASSERT_LASX emit_int32(insn_RRRR(xvshuf_b_op, (int)xa->encoding(), (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vshuf_h(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vshuf_h_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vshuf_w(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vshuf_w_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ void vshuf_d(FloatRegister vd, FloatRegister vj, FloatRegister vk) { ASSERT_LSX emit_int32(insn_RRR( vshuf_d_op, (int)vk->encoding(), (int)vj->encoding(), (int)vd->encoding())); } ++ ++ void xvshuf_h(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvshuf_h_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvshuf_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvshuf_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ void xvshuf_d(FloatRegister xd, FloatRegister xj, 
FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvshuf_d_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void xvperm_w(FloatRegister xd, FloatRegister xj, FloatRegister xk) { ASSERT_LASX emit_int32(insn_RRR(xvperm_w_op, (int)xk->encoding(), (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vshuf4i_b(FloatRegister vd, FloatRegister vj, int ui8) { ASSERT_LSX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR( vshuf4i_b_op, ui8, (int)vj->encoding(), (int)vd->encoding())); } ++ void vshuf4i_h(FloatRegister vd, FloatRegister vj, int ui8) { ASSERT_LSX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR( vshuf4i_h_op, ui8, (int)vj->encoding(), (int)vd->encoding())); } ++ void vshuf4i_w(FloatRegister vd, FloatRegister vj, int ui8) { ASSERT_LSX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR( vshuf4i_w_op, ui8, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvshuf4i_b(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvshuf4i_b_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvshuf4i_h(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvshuf4i_h_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ void xvshuf4i_w(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvshuf4i_w_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vshuf4i_d(FloatRegister vd, FloatRegister vj, int ui8) { ASSERT_LSX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR( vshuf4i_d_op, ui8, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvshuf4i_d(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvshuf4i_d_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vpermi_w(FloatRegister vd, FloatRegister vj, int ui8) { ASSERT_LSX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR( vpermi_w_op, ui8, (int)vj->encoding(), (int)vd->encoding())); } ++ void xvpermi_w(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvpermi_w_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void xvpermi_d(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvpermi_d_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void xvpermi_q(FloatRegister xd, FloatRegister xj, int ui8) { ASSERT_LASX assert(is_uimm(ui8, 8), "not a unsigned 8-bit int"); emit_int32(insn_I8RR(xvpermi_q_op, ui8, (int)xj->encoding(), (int)xd->encoding())); } ++ ++ void vld(FloatRegister vd, Register rj, int si12) { ASSERT_LSX assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR( vld_op, si12, (int)rj->encoding(), (int)vd->encoding()));} ++ void xvld(FloatRegister xd, Register rj, int si12) { ASSERT_LASX assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(xvld_op, si12, (int)rj->encoding(), (int)xd->encoding()));} ++ ++ void vst(FloatRegister vd, Register rj, int si12) { ASSERT_LSX assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR( vst_op, si12, (int)rj->encoding(), (int)vd->encoding()));} ++ void 
xvst(FloatRegister xd, Register rj, int si12) { ASSERT_LASX assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(xvst_op, si12, (int)rj->encoding(), (int)xd->encoding()));} ++ ++ void vldx(FloatRegister vd, Register rj, Register rk) { ASSERT_LSX emit_int32(insn_RRR( vldx_op, (int)rk->encoding(), (int)rj->encoding(), (int)vd->encoding())); } ++ void xvldx(FloatRegister xd, Register rj, Register rk) { ASSERT_LASX emit_int32(insn_RRR(xvldx_op, (int)rk->encoding(), (int)rj->encoding(), (int)xd->encoding())); } ++ ++ void vstx(FloatRegister vd, Register rj, Register rk) { ASSERT_LSX emit_int32(insn_RRR( vstx_op, (int)rk->encoding(), (int)rj->encoding(), (int)vd->encoding())); } ++ void xvstx(FloatRegister xd, Register rj, Register rk) { ASSERT_LASX emit_int32(insn_RRR(xvstx_op, (int)rk->encoding(), (int)rj->encoding(), (int)xd->encoding())); } ++ ++ void vldrepl_d(FloatRegister vd, Register rj, int si9) { ASSERT_LSX assert(is_simm(si9, 9), "not a signed 9-bit int"); emit_int32(insn_I9RR( vldrepl_d_op, si9, (int)rj->encoding(), (int)vd->encoding()));} ++ void vldrepl_w(FloatRegister vd, Register rj, int si10) { ASSERT_LSX assert(is_simm(si10, 10), "not a signed 10-bit int"); emit_int32(insn_I10RR( vldrepl_w_op, si10, (int)rj->encoding(), (int)vd->encoding()));} ++ void vldrepl_h(FloatRegister vd, Register rj, int si11) { ASSERT_LSX assert(is_simm(si11, 11), "not a signed 11-bit int"); emit_int32(insn_I11RR( vldrepl_h_op, si11, (int)rj->encoding(), (int)vd->encoding()));} ++ void vldrepl_b(FloatRegister vd, Register rj, int si12) { ASSERT_LSX assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR( vldrepl_b_op, si12, (int)rj->encoding(), (int)vd->encoding()));} ++ void xvldrepl_d(FloatRegister xd, Register rj, int si9) { ASSERT_LASX assert(is_simm(si9, 9), "not a signed 9-bit int"); emit_int32(insn_I9RR(xvldrepl_d_op, si9, (int)rj->encoding(), (int)xd->encoding()));} ++ void xvldrepl_w(FloatRegister xd, Register rj, int si10) { ASSERT_LASX assert(is_simm(si10, 10), "not a signed 10-bit int"); emit_int32(insn_I10RR(xvldrepl_w_op, si10, (int)rj->encoding(), (int)xd->encoding()));} ++ void xvldrepl_h(FloatRegister xd, Register rj, int si11) { ASSERT_LASX assert(is_simm(si11, 11), "not a signed 11-bit int"); emit_int32(insn_I11RR(xvldrepl_h_op, si11, (int)rj->encoding(), (int)xd->encoding()));} ++ void xvldrepl_b(FloatRegister xd, Register rj, int si12) { ASSERT_LASX assert(is_simm(si12, 12), "not a signed 12-bit int"); emit_int32(insn_I12RR(xvldrepl_b_op, si12, (int)rj->encoding(), (int)xd->encoding()));} ++ ++#undef ASSERT_LSX ++#undef ASSERT_LASX ++ ++public: ++ enum operand_size { byte, halfword, word, dword }; ++ // Creation ++ Assembler(CodeBuffer* code) : AbstractAssembler(code) {} ++ ++ // Decoding ++ static address locate_operand(address inst, WhichOperand which); ++ static address locate_next_instruction(address inst); ++}; ++ ++#endif // CPU_LOONGARCH_ASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/assembler_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/assembler_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/assembler_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/assembler_loongarch.inline.hpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,33 @@ ++/* ++ * Copyright (c) 1997, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. 
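Illustrative sketch (not part of the patch): the immediate-offset vector loads and stores above guard their offsets with fits-in-N-bits asserts, and the vldrepl/xvldrepl family narrows the signed field from si12 for byte elements down to si9 for doubleword elements, consistent with an offset that the instruction scales by the element size. Assuming nothing beyond <cstdint>, the range checks those asserts presuppose look roughly like this (is_simm/is_uimm here are stand-ins, not the real HotSpot helpers):

#include <cstdint>

// Sketch of the immediate range checks behind asserts such as
// assert(is_simm(si12, 12), "not a signed 12-bit int").
static inline bool is_simm(int64_t x, unsigned nbits) {
  // signed nbits-bit field: -2^(nbits-1) <= x <= 2^(nbits-1) - 1
  const int64_t lo = -(int64_t(1) << (nbits - 1));
  const int64_t hi =  (int64_t(1) << (nbits - 1)) - 1;
  return lo <= x && x <= hi;
}

static inline bool is_uimm(uint64_t x, unsigned nbits) {
  // unsigned nbits-bit field: 0 <= x < 2^nbits
  return nbits >= 64 || x < (uint64_t(1) << nbits);
}

Under these checks, vldrepl_b accepts si12 offsets in [-2048, 2047], while vldrepl_d accepts si9 values in [-256, 255].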
++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_ASSEMBLER_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_ASSEMBLER_LOONGARCH_INLINE_HPP ++ ++#include "asm/assembler.inline.hpp" ++#include "asm/codeBuffer.hpp" ++#include "code/codeCache.hpp" ++ ++#endif // CPU_LOONGARCH_ASSEMBLER_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/bytes_loongarch.hpp b/src/hotspot/cpu/loongarch/bytes_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/bytes_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/bytes_loongarch.hpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,64 @@ ++/* ++ * Copyright (c) 1997, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_BYTES_LOONGARCH_HPP ++#define CPU_LOONGARCH_BYTES_LOONGARCH_HPP ++ ++#include "memory/allStatic.hpp" ++#include "utilities/byteswap.hpp" ++ ++class Bytes: AllStatic { ++ public: ++ // Returns true if the byte ordering used by Java is different from the native byte ordering ++ // of the underlying machine. For example, this is true for Intel x86, but false for Solaris ++ // on Sparc. 
++ // we use LoongArch, so return true ++ static inline bool is_Java_byte_ordering_different(){ return true; } ++ ++ ++ // Efficient reading and writing of unaligned unsigned data in platform-specific byte ordering ++ // (no special code is needed since LoongArch CPUs can access unaligned data) ++ static inline u2 get_native_u2(address p) { return *(u2*)p; } ++ static inline u4 get_native_u4(address p) { return *(u4*)p; } ++ static inline u8 get_native_u8(address p) { return *(u8*)p; } ++ ++ static inline void put_native_u2(address p, u2 x) { *(u2*)p = x; } ++ static inline void put_native_u4(address p, u4 x) { *(u4*)p = x; } ++ static inline void put_native_u8(address p, u8 x) { *(u8*)p = x; } ++ ++ ++ // Efficient reading and writing of unaligned unsigned data in Java ++ // byte ordering (i.e. big-endian ordering). Byte-order reversal is ++ // needed since LoongArch64 CPUs use little-endian format. ++ static inline u2 get_Java_u2(address p) { return byteswap(get_native_u2(p)); } ++ static inline u4 get_Java_u4(address p) { return byteswap(get_native_u4(p)); } ++ static inline u8 get_Java_u8(address p) { return byteswap(get_native_u8(p)); } ++ ++ static inline void put_Java_u2(address p, u2 x) { put_native_u2(p, byteswap(x)); } ++ static inline void put_Java_u4(address p, u4 x) { put_native_u4(p, byteswap(x)); } ++ static inline void put_Java_u8(address p, u8 x) { put_native_u8(p, byteswap(x)); } ++}; ++ ++#endif // CPU_LOONGARCH_BYTES_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_CodeStubs_loongarch_64.cpp b/src/hotspot/cpu/loongarch/c1_CodeStubs_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/c1_CodeStubs_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_CodeStubs_loongarch_64.cpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,338 @@ ++/* ++ * Copyright (c) 1999, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
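Illustrative sketch (not part of the patch), picking up the Bytes helpers above: on little-endian LoongArch64 a native load followed by a byte swap produces the same value as composing the big-endian bytes by hand, which is all get_Java_u2 and friends need; the swap form is only valid because the host is little-endian, which is precisely what is_Java_byte_ordering_different() returning true records. A self-contained version, independent of HotSpot's byteswap template:

#include <cassert>
#include <cstdint>
#include <cstring>

// Two equivalent ways to read a Java-ordered (big-endian) u2 on a
// little-endian host; the Bytes class above uses the swap form.
static inline uint16_t java_u2_by_swap(const unsigned char* p) {
  uint16_t v;
  std::memcpy(&v, p, sizeof(v));            // native (little-endian) load
  return (uint16_t)((v >> 8) | (v << 8));   // reverse the two bytes
}

static inline uint16_t java_u2_by_compose(const unsigned char* p) {
  return (uint16_t)((p[0] << 8) | p[1]);    // big-endian composition
}

int main() {
  const unsigned char bytes[2] = { 0x12, 0x34 };  // Java order: 0x1234
  // Assumes a little-endian host, as on LoongArch64.
  assert(java_u2_by_swap(bytes) == 0x1234);
  assert(java_u2_by_compose(bytes) == 0x1234);
  return 0;
}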
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "c1/c1_CodeStubs.hpp" ++#include "c1/c1_FrameMap.hpp" ++#include "c1/c1_LIRAssembler.hpp" ++#include "c1/c1_MacroAssembler.hpp" ++#include "c1/c1_Runtime1.hpp" ++#include "classfile/javaClasses.hpp" ++#include "nativeInst_loongarch.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "vmreg_loongarch.inline.hpp" ++ ++#define __ ce->masm()-> ++ ++void C1SafepointPollStub::emit_code(LIR_Assembler* ce) { ++ __ bind(_entry); ++ InternalAddress safepoint_pc(ce->masm()->pc() - ce->masm()->offset() + safepoint_offset()); ++ __ lea(SCR2, safepoint_pc); ++ __ st_d(SCR2, Address(TREG, JavaThread::saved_exception_pc_offset())); ++ ++ assert(SharedRuntime::polling_page_return_handler_blob() != nullptr, ++ "polling page return stub not created yet"); ++ address stub = SharedRuntime::polling_page_return_handler_blob()->entry_point(); ++ ++ __ jmp(stub, relocInfo::runtime_call_type); ++} ++ ++void CounterOverflowStub::emit_code(LIR_Assembler* ce) { ++ __ bind(_entry); ++ Metadata *m = _method->as_constant_ptr()->as_metadata(); ++ __ mov_metadata(SCR2, m); ++ ce->store_parameter(SCR2, 1); ++ ce->store_parameter(_bci, 0); ++ __ call(Runtime1::entry_for(Runtime1::counter_overflow_id), relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ ce->verify_oop_map(_info); ++ __ b(_continuation); ++} ++ ++void RangeCheckStub::emit_code(LIR_Assembler* ce) { ++ __ bind(_entry); ++ if (_info->deoptimize_on_exception()) { ++ address a = Runtime1::entry_for(Runtime1::predicate_failed_trap_id); ++ __ call(a, relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ ce->verify_oop_map(_info); ++ debug_only(__ should_not_reach_here()); ++ return; ++ } ++ ++ if (_index->is_cpu_register()) { ++ __ move(SCR1, _index->as_register()); ++ } else { ++ __ li(SCR1, _index->as_jint()); ++ } ++ Runtime1::StubID stub_id; ++ if (_throw_index_out_of_bounds_exception) { ++ stub_id = Runtime1::throw_index_exception_id; ++ } else { ++ assert(_array != LIR_Opr::nullOpr(), "sanity"); ++ __ move(SCR2, _array->as_pointer_register()); ++ stub_id = Runtime1::throw_range_check_failed_id; ++ } ++ __ call(Runtime1::entry_for(stub_id), relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ ce->verify_oop_map(_info); ++ debug_only(__ should_not_reach_here()); ++} ++ ++PredicateFailedStub::PredicateFailedStub(CodeEmitInfo* info) { ++ _info = new CodeEmitInfo(info); ++} ++ ++void PredicateFailedStub::emit_code(LIR_Assembler* ce) { ++ __ bind(_entry); ++ address a = Runtime1::entry_for(Runtime1::predicate_failed_trap_id); ++ __ call(a, relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ ce->verify_oop_map(_info); ++ debug_only(__ should_not_reach_here()); ++} ++ ++void DivByZeroStub::emit_code(LIR_Assembler* ce) { ++ if (_offset != -1) { ++ ce->compilation()->implicit_exception_table()->append(_offset, __ offset()); ++ } ++ __ bind(_entry); ++ __ call(Runtime1::entry_for(Runtime1::throw_div0_exception_id), relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ ce->verify_oop_map(_info); ++#ifdef ASSERT ++ __ should_not_reach_here(); ++#endif ++} ++ ++// Implementation of NewInstanceStub ++ ++NewInstanceStub::NewInstanceStub(LIR_Opr klass_reg, LIR_Opr result, ciInstanceKlass* klass, ++ CodeEmitInfo* info, Runtime1::StubID stub_id) { ++ _result = result; ++ _klass = klass; ++ _klass_reg = klass_reg; ++ _info = new CodeEmitInfo(info); ++ assert(stub_id == Runtime1::new_instance_id || ++ stub_id 
== Runtime1::fast_new_instance_id || ++ stub_id == Runtime1::fast_new_instance_init_check_id, ++ "need new_instance id"); ++ _stub_id = stub_id; ++} ++ ++void NewInstanceStub::emit_code(LIR_Assembler* ce) { ++ assert(__ rsp_offset() == 0, "frame size should be fixed"); ++ __ bind(_entry); ++ __ move(A3, _klass_reg->as_register()); ++ __ call(Runtime1::entry_for(_stub_id), relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ ce->verify_oop_map(_info); ++ assert(_result->as_register() == A0, "result must in A0"); ++ __ b(_continuation); ++} ++ ++// Implementation of NewTypeArrayStub ++ ++NewTypeArrayStub::NewTypeArrayStub(LIR_Opr klass_reg, LIR_Opr length, LIR_Opr result, ++ CodeEmitInfo* info) { ++ _klass_reg = klass_reg; ++ _length = length; ++ _result = result; ++ _info = new CodeEmitInfo(info); ++} ++ ++void NewTypeArrayStub::emit_code(LIR_Assembler* ce) { ++ assert(__ rsp_offset() == 0, "frame size should be fixed"); ++ __ bind(_entry); ++ assert(_length->as_register() == S0, "length must in S0,"); ++ assert(_klass_reg->as_register() == A3, "klass_reg must in A3"); ++ __ call(Runtime1::entry_for(Runtime1::new_type_array_id), relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ ce->verify_oop_map(_info); ++ assert(_result->as_register() == A0, "result must in A0"); ++ __ b(_continuation); ++} ++ ++// Implementation of NewObjectArrayStub ++ ++NewObjectArrayStub::NewObjectArrayStub(LIR_Opr klass_reg, LIR_Opr length, LIR_Opr result, ++ CodeEmitInfo* info) { ++ _klass_reg = klass_reg; ++ _result = result; ++ _length = length; ++ _info = new CodeEmitInfo(info); ++} ++ ++void NewObjectArrayStub::emit_code(LIR_Assembler* ce) { ++ assert(__ rsp_offset() == 0, "frame size should be fixed"); ++ __ bind(_entry); ++ assert(_length->as_register() == S0, "length must in S0,"); ++ assert(_klass_reg->as_register() == A3, "klass_reg must in A3"); ++ __ call(Runtime1::entry_for(Runtime1::new_object_array_id), relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ ce->verify_oop_map(_info); ++ assert(_result->as_register() == A0, "result must in A0"); ++ __ b(_continuation); ++} ++ ++void MonitorEnterStub::emit_code(LIR_Assembler* ce) { ++ assert(__ rsp_offset() == 0, "frame size should be fixed"); ++ __ bind(_entry); ++ ce->store_parameter(_obj_reg->as_register(), 1); ++ ce->store_parameter(_lock_reg->as_register(), 0); ++ Runtime1::StubID enter_id; ++ if (ce->compilation()->has_fpu_code()) { ++ enter_id = Runtime1::monitorenter_id; ++ } else { ++ enter_id = Runtime1::monitorenter_nofpu_id; ++ } ++ __ call(Runtime1::entry_for(enter_id), relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ ce->verify_oop_map(_info); ++ __ b(_continuation); ++} ++ ++void MonitorExitStub::emit_code(LIR_Assembler* ce) { ++ __ bind(_entry); ++ if (_compute_lock) { ++ // lock_reg was destroyed by fast unlocking attempt => recompute it ++ ce->monitor_address(_monitor_ix, _lock_reg); ++ } ++ ce->store_parameter(_lock_reg->as_register(), 0); ++ // note: non-blocking leaf routine => no call info needed ++ Runtime1::StubID exit_id; ++ if (ce->compilation()->has_fpu_code()) { ++ exit_id = Runtime1::monitorexit_id; ++ } else { ++ exit_id = Runtime1::monitorexit_nofpu_id; ++ } ++ __ lipc(RA, _continuation); ++ __ jmp(Runtime1::entry_for(exit_id), relocInfo::runtime_call_type); ++} ++ ++// Implementation of patching: ++// - Copy the code at given offset to an inlined buffer (first the bytes, then the number of bytes) ++// - Replace original code with a call to the stub ++// At 
Runtime: ++// - call to stub, jump to runtime ++// - in runtime: preserve all registers (rspecially objects, i.e., source and destination object) ++// - in runtime: after initializing class, restore original code, reexecute instruction ++ ++int PatchingStub::_patch_info_offset = -NativeGeneralJump::instruction_size; ++ ++void PatchingStub::align_patch_site(MacroAssembler* masm) { ++} ++ ++void PatchingStub::emit_code(LIR_Assembler* ce) { ++ assert(false, "LoongArch64 should not use C1 runtime patching"); ++} ++ ++void DeoptimizeStub::emit_code(LIR_Assembler* ce) { ++ __ bind(_entry); ++ ce->store_parameter(_trap_request, 0); ++ __ call(Runtime1::entry_for(Runtime1::deoptimize_id), relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ DEBUG_ONLY(__ should_not_reach_here()); ++} ++ ++void ImplicitNullCheckStub::emit_code(LIR_Assembler* ce) { ++ address a; ++ if (_info->deoptimize_on_exception()) { ++ // Deoptimize, do not throw the exception, because it is probably wrong to do it here. ++ a = Runtime1::entry_for(Runtime1::predicate_failed_trap_id); ++ } else { ++ a = Runtime1::entry_for(Runtime1::throw_null_pointer_exception_id); ++ } ++ ++ ce->compilation()->implicit_exception_table()->append(_offset, __ offset()); ++ __ bind(_entry); ++ __ call(a, relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ ce->verify_oop_map(_info); ++ debug_only(__ should_not_reach_here()); ++} ++ ++void SimpleExceptionStub::emit_code(LIR_Assembler* ce) { ++ assert(__ rsp_offset() == 0, "frame size should be fixed"); ++ ++ __ bind(_entry); ++ // pass the object in a scratch register because all other registers ++ // must be preserved ++ if (_obj->is_cpu_register()) { ++ __ move(SCR1, _obj->as_register()); ++ } ++ __ call(Runtime1::entry_for(_stub), relocInfo::runtime_call_type); ++ ce->add_call_info_here(_info); ++ debug_only(__ should_not_reach_here()); ++} ++ ++void ArrayCopyStub::emit_code(LIR_Assembler* ce) { ++ //---------------slow case: call to native----------------- ++ __ bind(_entry); ++ // Figure out where the args should go ++ // This should really convert the IntrinsicID to the Method* and signature ++ // but I don't know how to do that. 
++ // ++ VMRegPair args[5]; ++ BasicType signature[5] = { T_OBJECT, T_INT, T_OBJECT, T_INT, T_INT}; ++ SharedRuntime::java_calling_convention(signature, args, 5); ++ ++ // push parameters ++ // (src, src_pos, dest, destPos, length) ++ Register r[5]; ++ r[0] = src()->as_register(); ++ r[1] = src_pos()->as_register(); ++ r[2] = dst()->as_register(); ++ r[3] = dst_pos()->as_register(); ++ r[4] = length()->as_register(); ++ ++ // next registers will get stored on the stack ++ for (int i = 0; i < 5 ; i++ ) { ++ VMReg r_1 = args[i].first(); ++ if (r_1->is_stack()) { ++ int st_off = r_1->reg2stack() * wordSize; ++ __ stptr_d (r[i], SP, st_off); ++ } else { ++ assert(r[i] == args[i].first()->as_Register(), "Wrong register for arg "); ++ } ++ } ++ ++ ce->align_call(lir_static_call); ++ ++ ce->emit_static_call_stub(); ++ if (ce->compilation()->bailed_out()) { ++ return; // CodeCache is full ++ } ++ AddressLiteral resolve(SharedRuntime::get_resolve_static_call_stub(), ++ relocInfo::static_call_type); ++ address call = __ trampoline_call(resolve); ++ if (call == nullptr) { ++ ce->bailout("trampoline stub overflow"); ++ return; ++ } ++ ce->add_call_info_here(info()); ++ ++#ifndef PRODUCT ++ if (PrintC1Statistics) { ++ __ li(SCR2, (address)&Runtime1::_arraycopy_slowcase_cnt); ++ __ increment(Address(SCR2)); ++ } ++#endif ++ ++ __ b(_continuation); ++} ++ ++#undef __ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_Defs_loongarch.hpp b/src/hotspot/cpu/loongarch/c1_Defs_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/c1_Defs_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_Defs_loongarch.hpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,88 @@ ++/* ++ * Copyright (c) 2000, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
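Illustrative sketch (not part of the patch), referring back to the argument loop in ArrayCopyStub::emit_code above: java_calling_convention fills one VMRegPair per argument, and the stub only has to spill the arguments that were assigned stack slots, at reg2stack() * wordSize from SP; register-assigned arguments are expected to be in place already. A stripped-down model of that decision, using hypothetical stand-in types rather than the real VMReg API:

#include <cstdio>

// Hypothetical stand-in for a calling-convention slot assignment.
struct ArgLocation {
  bool in_register;  // true: value already sits in its argument register
  int  index;        // register number, or stack slot index
};

static const int wordSize = 8;  // 64-bit words on LoongArch64

static void place_argument(const ArgLocation& loc, int arg_no) {
  if (loc.in_register) {
    // Nothing to store: the stub asserts the value is already there.
    std::printf("arg %d stays in register %d\n", arg_no, loc.index);
  } else {
    // Mirrors: int st_off = r_1->reg2stack() * wordSize; __ stptr_d(r[i], SP, st_off);
    std::printf("arg %d stored to [SP + %d]\n", arg_no, loc.index * wordSize);
  }
}

int main() {
  // Five arraycopy arguments (src, src_pos, dst, dst_pos, length).
  // Register numbers and the single stack slot below are made up for illustration.
  ArgLocation args[5] = { {true, 4}, {true, 5}, {true, 6}, {true, 7}, {false, 0} };
  for (int i = 0; i < 5; i++) place_argument(args[i], i);
  return 0;
}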
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_C1_DEFS_LOONGARCH_HPP ++#define CPU_LOONGARCH_C1_DEFS_LOONGARCH_HPP ++ ++// native word offsets from memory address (little endian) ++enum { ++ pd_lo_word_offset_in_bytes = 0, ++ pd_hi_word_offset_in_bytes = BytesPerWord ++}; ++ ++// explicit rounding operations are required to implement the strictFP mode ++enum { ++ pd_strict_fp_requires_explicit_rounding = false ++}; ++ ++// FIXME: There are no callee-saved ++ ++// registers ++enum { ++ pd_nof_cpu_regs_frame_map = Register::number_of_registers, // number of registers used during code emission ++ pd_nof_fpu_regs_frame_map = FloatRegister::number_of_registers, // number of registers used during code emission ++ ++ pd_nof_caller_save_cpu_regs_frame_map = 15, // number of registers killed by calls ++ pd_nof_caller_save_fpu_regs_frame_map = 32, // number of registers killed by calls ++ ++ pd_first_callee_saved_reg = pd_nof_caller_save_cpu_regs_frame_map, ++ pd_last_callee_saved_reg = 21, ++ ++ pd_last_allocatable_cpu_reg = pd_nof_caller_save_cpu_regs_frame_map - 1, ++ ++ pd_nof_cpu_regs_reg_alloc = pd_nof_caller_save_cpu_regs_frame_map, // number of registers that are visible to register allocator ++ pd_nof_fpu_regs_reg_alloc = 32, // number of registers that are visible to register allocator ++ ++ pd_nof_cpu_regs_linearscan = 32, // number of registers visible to linear scan ++ pd_nof_fpu_regs_linearscan = pd_nof_fpu_regs_frame_map, // number of registers visible to linear scan ++ pd_nof_xmm_regs_linearscan = 0, // don't have vector registers ++ pd_first_cpu_reg = 0, ++ pd_last_cpu_reg = pd_nof_cpu_regs_reg_alloc - 1, ++ pd_first_byte_reg = 0, ++ pd_last_byte_reg = pd_nof_cpu_regs_reg_alloc - 1, ++ pd_first_fpu_reg = pd_nof_cpu_regs_frame_map, ++ pd_last_fpu_reg = pd_first_fpu_reg + 31, ++ ++ pd_first_callee_saved_fpu_reg = 24 + pd_first_fpu_reg, ++ pd_last_callee_saved_fpu_reg = 31 + pd_first_fpu_reg, ++}; ++ ++// Encoding of float value in debug info. This is true on x86 where ++// floats are extended to doubles when stored in the stack, false for ++// LoongArch64 where floats and doubles are stored in their native form. ++enum { ++ pd_float_saved_as_double = false ++}; ++ ++enum { ++ pd_two_operand_lir_form = false ++}; ++ ++// the number of stack required by ArrayCopyStub ++enum { ++ pd_arraycopystub_reserved_argument_area_size = 2 ++}; ++ ++#endif // CPU_LOONGARCH_C1_DEFS_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_FpuStackSim_loongarch_64.cpp b/src/hotspot/cpu/loongarch/c1_FpuStackSim_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/c1_FpuStackSim_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_FpuStackSim_loongarch_64.cpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,31 @@ ++/* ++ * Copyright (c) 2005, 2017, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++//-------------------------------------------------------- ++// FpuStackSim ++//-------------------------------------------------------- ++ ++// No FPU stack on LoongArch64 ++#include "precompiled.hpp" +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_FpuStackSim_loongarch.hpp b/src/hotspot/cpu/loongarch/c1_FpuStackSim_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/c1_FpuStackSim_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_FpuStackSim_loongarch.hpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,32 @@ ++/* ++ * Copyright (c) 2005, 2019, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_C1_FPUSTACKSIM_LOONGARCH_HPP ++#define CPU_LOONGARCH_C1_FPUSTACKSIM_LOONGARCH_HPP ++ ++// No FPU stack on LoongArch ++class FpuStackSim; ++ ++#endif // CPU_LOONGARCH_C1_FPUSTACKSIM_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_FrameMap_loongarch_64.cpp b/src/hotspot/cpu/loongarch/c1_FrameMap_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/c1_FrameMap_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_FrameMap_loongarch_64.cpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,345 @@ ++/* ++ * Copyright (c) 1999, 2019, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. 
++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "c1/c1_FrameMap.hpp" ++#include "c1/c1_LIR.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "vmreg_loongarch.inline.hpp" ++ ++ ++LIR_Opr FrameMap::map_to_opr(BasicType type, VMRegPair* reg, bool) { ++ LIR_Opr opr = LIR_OprFact::illegalOpr; ++ VMReg r_1 = reg->first(); ++ VMReg r_2 = reg->second(); ++ if (r_1->is_stack()) { ++ // Convert stack slot to an SP offset ++ // The calling convention does not count the SharedRuntime::out_preserve_stack_slots() value ++ // so we must add it in here. ++ int st_off = (r_1->reg2stack() + SharedRuntime::out_preserve_stack_slots()) * VMRegImpl::stack_slot_size; ++ opr = LIR_OprFact::address(new LIR_Address(sp_opr, st_off, type)); ++ } else if (r_1->is_Register()) { ++ Register reg = r_1->as_Register(); ++ if (r_2->is_Register() && (type == T_LONG || type == T_DOUBLE)) { ++ Register reg2 = r_2->as_Register(); ++ assert(reg2 == reg, "must be same register"); ++ opr = as_long_opr(reg); ++ } else if (is_reference_type(type)) { ++ opr = as_oop_opr(reg); ++ } else if (type == T_METADATA) { ++ opr = as_metadata_opr(reg); ++ } else if (type == T_ADDRESS) { ++ opr = as_address_opr(reg); ++ } else { ++ opr = as_opr(reg); ++ } ++ } else if (r_1->is_FloatRegister()) { ++ assert(type == T_DOUBLE || type == T_FLOAT, "wrong type"); ++ int num = r_1->as_FloatRegister()->encoding(); ++ if (type == T_FLOAT) { ++ opr = LIR_OprFact::single_fpu(num); ++ } else { ++ opr = LIR_OprFact::double_fpu(num); ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ return opr; ++} ++ ++LIR_Opr FrameMap::r0_opr; ++LIR_Opr FrameMap::ra_opr; ++LIR_Opr FrameMap::tp_opr; ++LIR_Opr FrameMap::sp_opr; ++LIR_Opr FrameMap::a0_opr; ++LIR_Opr FrameMap::a1_opr; ++LIR_Opr FrameMap::a2_opr; ++LIR_Opr FrameMap::a3_opr; ++LIR_Opr FrameMap::a4_opr; ++LIR_Opr FrameMap::a5_opr; ++LIR_Opr FrameMap::a6_opr; ++LIR_Opr FrameMap::a7_opr; ++LIR_Opr FrameMap::t0_opr; ++LIR_Opr FrameMap::t1_opr; ++LIR_Opr FrameMap::t2_opr; ++LIR_Opr FrameMap::t3_opr; ++LIR_Opr FrameMap::t4_opr; ++LIR_Opr FrameMap::t5_opr; ++LIR_Opr FrameMap::t6_opr; ++LIR_Opr FrameMap::t7_opr; ++LIR_Opr FrameMap::t8_opr; ++LIR_Opr FrameMap::rx_opr; ++LIR_Opr FrameMap::fp_opr; ++LIR_Opr FrameMap::s0_opr; ++LIR_Opr FrameMap::s1_opr; ++LIR_Opr FrameMap::s2_opr; ++LIR_Opr FrameMap::s3_opr; ++LIR_Opr FrameMap::s4_opr; ++LIR_Opr FrameMap::s5_opr; ++LIR_Opr FrameMap::s6_opr; ++LIR_Opr FrameMap::s7_opr; ++LIR_Opr FrameMap::s8_opr; ++ ++LIR_Opr FrameMap::receiver_opr; ++ ++LIR_Opr FrameMap::ra_oop_opr; ++LIR_Opr FrameMap::a0_oop_opr; ++LIR_Opr FrameMap::a1_oop_opr; ++LIR_Opr FrameMap::a2_oop_opr; ++LIR_Opr FrameMap::a3_oop_opr; ++LIR_Opr FrameMap::a4_oop_opr; ++LIR_Opr FrameMap::a5_oop_opr; ++LIR_Opr FrameMap::a6_oop_opr; ++LIR_Opr FrameMap::a7_oop_opr; ++LIR_Opr 
FrameMap::t0_oop_opr; ++LIR_Opr FrameMap::t1_oop_opr; ++LIR_Opr FrameMap::t2_oop_opr; ++LIR_Opr FrameMap::t3_oop_opr; ++LIR_Opr FrameMap::t4_oop_opr; ++LIR_Opr FrameMap::t5_oop_opr; ++LIR_Opr FrameMap::t6_oop_opr; ++LIR_Opr FrameMap::t7_oop_opr; ++LIR_Opr FrameMap::t8_oop_opr; ++LIR_Opr FrameMap::fp_oop_opr; ++LIR_Opr FrameMap::s0_oop_opr; ++LIR_Opr FrameMap::s1_oop_opr; ++LIR_Opr FrameMap::s2_oop_opr; ++LIR_Opr FrameMap::s3_oop_opr; ++LIR_Opr FrameMap::s4_oop_opr; ++LIR_Opr FrameMap::s5_oop_opr; ++LIR_Opr FrameMap::s6_oop_opr; ++LIR_Opr FrameMap::s7_oop_opr; ++LIR_Opr FrameMap::s8_oop_opr; ++ ++LIR_Opr FrameMap::scr1_opr; ++LIR_Opr FrameMap::scr2_opr; ++LIR_Opr FrameMap::scr1_long_opr; ++LIR_Opr FrameMap::scr2_long_opr; ++ ++LIR_Opr FrameMap::a0_metadata_opr; ++LIR_Opr FrameMap::a1_metadata_opr; ++LIR_Opr FrameMap::a2_metadata_opr; ++LIR_Opr FrameMap::a3_metadata_opr; ++LIR_Opr FrameMap::a4_metadata_opr; ++LIR_Opr FrameMap::a5_metadata_opr; ++ ++LIR_Opr FrameMap::long0_opr; ++LIR_Opr FrameMap::long1_opr; ++LIR_Opr FrameMap::fpu0_float_opr; ++LIR_Opr FrameMap::fpu0_double_opr; ++ ++LIR_Opr FrameMap::_caller_save_cpu_regs[] = {}; ++LIR_Opr FrameMap::_caller_save_fpu_regs[] = {}; ++ ++//-------------------------------------------------------- ++// FrameMap ++//-------------------------------------------------------- ++ ++void FrameMap::initialize() { ++ assert(!_init_done, "once"); ++ int i = 0; ++ ++ // caller save register ++ map_register(i, A0); a0_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, A1); a1_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, A2); a2_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, A3); a3_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, A4); a4_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, A5); a5_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, A6); a6_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, A7); a7_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, T0); t0_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, T1); t1_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, T2); t2_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, T3); t3_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, T5); t5_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, T6); t6_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, T8); t8_opr = LIR_OprFact::single_cpu(i); i++; ++ ++ // callee save register ++ map_register(i, S0); s0_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, S1); s1_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, S2); s2_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, S3); s3_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, S4); s4_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, S7); s7_opr = LIR_OprFact::single_cpu(i); i++; ++ map_register(i, S8); s8_opr = LIR_OprFact::single_cpu(i); i++; ++ ++ // special register ++ map_register(i, S5); s5_opr = LIR_OprFact::single_cpu(i); i++; // heapbase ++ map_register(i, S6); s6_opr = LIR_OprFact::single_cpu(i); i++; // thread ++ map_register(i, TP); tp_opr = LIR_OprFact::single_cpu(i); i++; // tp ++ map_register(i, FP); fp_opr = LIR_OprFact::single_cpu(i); i++; // fp ++ map_register(i, RA); ra_opr = LIR_OprFact::single_cpu(i); i++; // ra ++ map_register(i, SP); sp_opr = LIR_OprFact::single_cpu(i); i++; // sp ++ ++ // tmp register ++ map_register(i, T7); t7_opr = LIR_OprFact::single_cpu(i); i++; // scr1 ++ map_register(i, T4); t4_opr = LIR_OprFact::single_cpu(i); 
i++; // scr2 ++ ++ scr1_opr = t7_opr; ++ scr2_opr = t4_opr; ++ scr1_long_opr = LIR_OprFact::double_cpu(t7_opr->cpu_regnr(), t7_opr->cpu_regnr()); ++ scr2_long_opr = LIR_OprFact::double_cpu(t4_opr->cpu_regnr(), t4_opr->cpu_regnr()); ++ ++ long0_opr = LIR_OprFact::double_cpu(a0_opr->cpu_regnr(), a0_opr->cpu_regnr()); ++ long1_opr = LIR_OprFact::double_cpu(a1_opr->cpu_regnr(), a1_opr->cpu_regnr()); ++ ++ fpu0_float_opr = LIR_OprFact::single_fpu(0); ++ fpu0_double_opr = LIR_OprFact::double_fpu(0); ++ ++ // scr1, scr2 not included ++ _caller_save_cpu_regs[0] = a0_opr; ++ _caller_save_cpu_regs[1] = a1_opr; ++ _caller_save_cpu_regs[2] = a2_opr; ++ _caller_save_cpu_regs[3] = a3_opr; ++ _caller_save_cpu_regs[4] = a4_opr; ++ _caller_save_cpu_regs[5] = a5_opr; ++ _caller_save_cpu_regs[6] = a6_opr; ++ _caller_save_cpu_regs[7] = a7_opr; ++ _caller_save_cpu_regs[8] = t0_opr; ++ _caller_save_cpu_regs[9] = t1_opr; ++ _caller_save_cpu_regs[10] = t2_opr; ++ _caller_save_cpu_regs[11] = t3_opr; ++ _caller_save_cpu_regs[12] = t5_opr; ++ _caller_save_cpu_regs[13] = t6_opr; ++ _caller_save_cpu_regs[14] = t8_opr; ++ ++ for (int i = 0; i < 8; i++) { ++ _caller_save_fpu_regs[i] = LIR_OprFact::single_fpu(i); ++ } ++ ++ _init_done = true; ++ ++ ra_oop_opr = as_oop_opr(RA); ++ a0_oop_opr = as_oop_opr(A0); ++ a1_oop_opr = as_oop_opr(A1); ++ a2_oop_opr = as_oop_opr(A2); ++ a3_oop_opr = as_oop_opr(A3); ++ a4_oop_opr = as_oop_opr(A4); ++ a5_oop_opr = as_oop_opr(A5); ++ a6_oop_opr = as_oop_opr(A6); ++ a7_oop_opr = as_oop_opr(A7); ++ t0_oop_opr = as_oop_opr(T0); ++ t1_oop_opr = as_oop_opr(T1); ++ t2_oop_opr = as_oop_opr(T2); ++ t3_oop_opr = as_oop_opr(T3); ++ t4_oop_opr = as_oop_opr(T4); ++ t5_oop_opr = as_oop_opr(T5); ++ t6_oop_opr = as_oop_opr(T6); ++ t7_oop_opr = as_oop_opr(T7); ++ t8_oop_opr = as_oop_opr(T8); ++ fp_oop_opr = as_oop_opr(FP); ++ s0_oop_opr = as_oop_opr(S0); ++ s1_oop_opr = as_oop_opr(S1); ++ s2_oop_opr = as_oop_opr(S2); ++ s3_oop_opr = as_oop_opr(S3); ++ s4_oop_opr = as_oop_opr(S4); ++ s5_oop_opr = as_oop_opr(S5); ++ s6_oop_opr = as_oop_opr(S6); ++ s7_oop_opr = as_oop_opr(S7); ++ s8_oop_opr = as_oop_opr(S8); ++ ++ a0_metadata_opr = as_metadata_opr(A0); ++ a1_metadata_opr = as_metadata_opr(A1); ++ a2_metadata_opr = as_metadata_opr(A2); ++ a3_metadata_opr = as_metadata_opr(A3); ++ a4_metadata_opr = as_metadata_opr(A4); ++ a5_metadata_opr = as_metadata_opr(A5); ++ ++ sp_opr = as_pointer_opr(SP); ++ fp_opr = as_pointer_opr(FP); ++ ++ VMRegPair regs; ++ BasicType sig_bt = T_OBJECT; ++ SharedRuntime::java_calling_convention(&sig_bt, ®s, 1); ++ receiver_opr = as_oop_opr(regs.first()->as_Register()); ++ ++ for (int i = 0; i < nof_caller_save_fpu_regs; i++) { ++ _caller_save_fpu_regs[i] = LIR_OprFact::single_fpu(i); ++ } ++} ++ ++Address FrameMap::make_new_address(ByteSize sp_offset) const { ++ // for sp, based address use this: ++ // return Address(sp, in_bytes(sp_offset) - (framesize() - 2) * 4); ++ return Address(SP, in_bytes(sp_offset)); ++} ++ ++// ----------------mapping----------------------- ++// all mapping is based on fp addressing, except for simple leaf methods where we access ++// the locals sp based (and no frame is built) ++ ++// Frame for simple leaf methods (quick entries) ++// ++// +----------+ ++// | ret addr | <- TOS ++// +----------+ ++// | args | ++// | ...... 
| ++ ++// Frame for standard methods ++// ++// | .........| <- TOS ++// | locals | ++// +----------+ ++// | old fp, | <- RFP ++// +----------+ ++// | ret addr | ++// +----------+ ++// | args | ++// | .........| ++ ++// For OopMaps, map a local variable or spill index to an VMRegImpl name. ++// This is the offset from sp() in the frame of the slot for the index, ++// skewed by VMRegImpl::stack0 to indicate a stack location (vs.a register.) ++// ++// framesize + ++// stack0 stack0 0 <- VMReg ++// | | | ++// ...........|..............|.............| ++// 0 1 2 3 x x 4 5 6 ... | <- local indices ++// ^ ^ sp() ( x x indicate link ++// | | and return addr) ++// arguments non-argument locals ++ ++VMReg FrameMap::fpu_regname(int n) { ++ // Return the OptoReg name for the fpu stack slot "n" ++ // A spilled fpu stack slot comprises to two single-word OptoReg's. ++ return as_FloatRegister(n)->as_VMReg(); ++} ++ ++LIR_Opr FrameMap::stack_pointer() { ++ return FrameMap::sp_opr; ++} ++ ++// JSR 292 ++LIR_Opr FrameMap::method_handle_invoke_SP_save_opr() { ++ return LIR_OprFact::illegalOpr; // Not needed on LoongArch64 ++} ++ ++bool FrameMap::validate_frame() { ++ return true; ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_FrameMap_loongarch.hpp b/src/hotspot/cpu/loongarch/c1_FrameMap_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/c1_FrameMap_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_FrameMap_loongarch.hpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,143 @@ ++/* ++ * Copyright (c) 1999, 2019, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_C1_FRAMEMAP_LOONGARCH_HPP ++#define CPU_LOONGARCH_C1_FRAMEMAP_LOONGARCH_HPP ++ ++// On LoongArch64 the frame looks as follows: ++// ++// +-----------------------------+---------+----------------------------------------+----------------+----------- ++// | size_arguments-nof_reg_args | 2 words | size_locals-size_arguments+numreg_args | _size_monitors | spilling . 
++// +-----------------------------+---------+----------------------------------------+----------------+----------- ++ ++ public: ++ static const int pd_c_runtime_reserved_arg_size; ++ ++ enum { ++ first_available_sp_in_frame = 0, ++ frame_pad_in_bytes = 16, ++ nof_reg_args = 8 ++ }; ++ ++ public: ++ static LIR_Opr receiver_opr; ++ ++ static LIR_Opr r0_opr; ++ static LIR_Opr ra_opr; ++ static LIR_Opr tp_opr; ++ static LIR_Opr sp_opr; ++ static LIR_Opr a0_opr; ++ static LIR_Opr a1_opr; ++ static LIR_Opr a2_opr; ++ static LIR_Opr a3_opr; ++ static LIR_Opr a4_opr; ++ static LIR_Opr a5_opr; ++ static LIR_Opr a6_opr; ++ static LIR_Opr a7_opr; ++ static LIR_Opr t0_opr; ++ static LIR_Opr t1_opr; ++ static LIR_Opr t2_opr; ++ static LIR_Opr t3_opr; ++ static LIR_Opr t4_opr; ++ static LIR_Opr t5_opr; ++ static LIR_Opr t6_opr; ++ static LIR_Opr t7_opr; ++ static LIR_Opr t8_opr; ++ static LIR_Opr rx_opr; ++ static LIR_Opr fp_opr; ++ static LIR_Opr s0_opr; ++ static LIR_Opr s1_opr; ++ static LIR_Opr s2_opr; ++ static LIR_Opr s3_opr; ++ static LIR_Opr s4_opr; ++ static LIR_Opr s5_opr; ++ static LIR_Opr s6_opr; ++ static LIR_Opr s7_opr; ++ static LIR_Opr s8_opr; ++ ++ static LIR_Opr ra_oop_opr; ++ static LIR_Opr a0_oop_opr; ++ static LIR_Opr a1_oop_opr; ++ static LIR_Opr a2_oop_opr; ++ static LIR_Opr a3_oop_opr; ++ static LIR_Opr a4_oop_opr; ++ static LIR_Opr a5_oop_opr; ++ static LIR_Opr a6_oop_opr; ++ static LIR_Opr a7_oop_opr; ++ static LIR_Opr t0_oop_opr; ++ static LIR_Opr t1_oop_opr; ++ static LIR_Opr t2_oop_opr; ++ static LIR_Opr t3_oop_opr; ++ static LIR_Opr t4_oop_opr; ++ static LIR_Opr t5_oop_opr; ++ static LIR_Opr t6_oop_opr; ++ static LIR_Opr t7_oop_opr; ++ static LIR_Opr t8_oop_opr; ++ static LIR_Opr fp_oop_opr; ++ static LIR_Opr s0_oop_opr; ++ static LIR_Opr s1_oop_opr; ++ static LIR_Opr s2_oop_opr; ++ static LIR_Opr s3_oop_opr; ++ static LIR_Opr s4_oop_opr; ++ static LIR_Opr s5_oop_opr; ++ static LIR_Opr s6_oop_opr; ++ static LIR_Opr s7_oop_opr; ++ static LIR_Opr s8_oop_opr; ++ ++ static LIR_Opr scr1_opr; ++ static LIR_Opr scr2_opr; ++ static LIR_Opr scr1_long_opr; ++ static LIR_Opr scr2_long_opr; ++ ++ static LIR_Opr a0_metadata_opr; ++ static LIR_Opr a1_metadata_opr; ++ static LIR_Opr a2_metadata_opr; ++ static LIR_Opr a3_metadata_opr; ++ static LIR_Opr a4_metadata_opr; ++ static LIR_Opr a5_metadata_opr; ++ ++ static LIR_Opr long0_opr; ++ static LIR_Opr long1_opr; ++ static LIR_Opr fpu0_float_opr; ++ static LIR_Opr fpu0_double_opr; ++ ++ static LIR_Opr as_long_opr(Register r) { ++ return LIR_OprFact::double_cpu(cpu_reg2rnr(r), cpu_reg2rnr(r)); ++ } ++ static LIR_Opr as_pointer_opr(Register r) { ++ return LIR_OprFact::double_cpu(cpu_reg2rnr(r), cpu_reg2rnr(r)); ++ } ++ ++ // VMReg name for spilled physical FPU stack slot n ++ static VMReg fpu_regname (int n); ++ ++ static bool is_caller_save_register(LIR_Opr opr) { return true; } ++ static bool is_caller_save_register(Register r) { return true; } ++ ++ static int nof_caller_save_cpu_regs() { return pd_nof_caller_save_cpu_regs_frame_map; } ++ static int last_cpu_reg() { return pd_last_cpu_reg; } ++ static int last_byte_reg() { return pd_last_byte_reg; } ++ ++#endif // CPU_LOONGARCH_C1_FRAMEMAP_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_globals_loongarch.hpp b/src/hotspot/cpu/loongarch/c1_globals_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/c1_globals_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_globals_loongarch.hpp 
2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,64 @@ ++/* ++ * Copyright (c) 2000, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_C1_GLOBALS_LOONGARCH_HPP ++#define CPU_LOONGARCH_C1_GLOBALS_LOONGARCH_HPP ++ ++#include "utilities/globalDefinitions.hpp" ++#include "utilities/macros.hpp" ++ ++// Sets the default values for platform dependent flags used by the client compiler. ++// (see c1_globals.hpp) ++ ++#ifndef COMPILER2 ++define_pd_global(bool, BackgroundCompilation, true ); ++define_pd_global(bool, InlineIntrinsics, true ); ++define_pd_global(bool, PreferInterpreterNativeStubs, false); ++define_pd_global(bool, ProfileTraps, false); ++define_pd_global(bool, UseOnStackReplacement, true ); ++define_pd_global(bool, TieredCompilation, false); ++define_pd_global(intx, CompileThreshold, 1500 ); ++ ++define_pd_global(intx, OnStackReplacePercentage, 933 ); ++define_pd_global(intx, NewSizeThreadIncrease, 4*K ); ++define_pd_global(intx, InitialCodeCacheSize, 160*K); ++define_pd_global(intx, ReservedCodeCacheSize, 32*M ); ++define_pd_global(intx, NonProfiledCodeHeapSize, 13*M ); ++define_pd_global(intx, ProfiledCodeHeapSize, 14*M ); ++define_pd_global(intx, NonNMethodCodeHeapSize, 5*M ); ++define_pd_global(bool, ProfileInterpreter, false); ++define_pd_global(intx, CodeCacheExpansionSize, 32*K ); ++define_pd_global(uintx, CodeCacheMinBlockLength, 1); ++define_pd_global(uintx, CodeCacheMinimumUseSpace, 400*K); ++define_pd_global(bool, NeverActAsServerClassMachine, true ); ++define_pd_global(uint64_t,MaxRAM, 1ULL*G); ++define_pd_global(bool, CICompileOSR, true ); ++#endif // !COMPILER2 ++define_pd_global(bool, UseTypeProfile, false); ++ ++define_pd_global(bool, OptimizeSinglePrecision, true ); ++define_pd_global(bool, CSEArrayLength, false); ++ ++#endif // CPU_LOONGARCH_C1_GLOBALS_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_LinearScan_loongarch_64.cpp b/src/hotspot/cpu/loongarch/c1_LinearScan_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/c1_LinearScan_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_LinearScan_loongarch_64.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,33 @@ ++/* ++ * Copyright (c) 2005, 2011, Oracle and/or its affiliates. All rights reserved. All rights reserved. ++ * Copyright (c) 2021, Loongson Technology. All rights reserved. 
++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "c1/c1_Instruction.hpp" ++#include "c1/c1_LinearScan.hpp" ++#include "utilities/bitMap.inline.hpp" ++ ++void LinearScan::allocate_fpu_stack() { ++ // No FPU stack on LoongArch64 ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_LinearScan_loongarch.hpp b/src/hotspot/cpu/loongarch/c1_LinearScan_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/c1_LinearScan_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_LinearScan_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,70 @@ ++/* ++ * Copyright (c) 2005, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_C1_LINEARSCAN_LOONGARCH_HPP ++#define CPU_LOONGARCH_C1_LINEARSCAN_LOONGARCH_HPP ++ ++inline bool LinearScan::is_processed_reg_num(int reg_num) { ++ return reg_num <= FrameMap::last_cpu_reg() || reg_num >= pd_nof_cpu_regs_frame_map; ++} ++ ++inline int LinearScan::num_physical_regs(BasicType type) { ++ return 1; ++} ++ ++inline bool LinearScan::requires_adjacent_regs(BasicType type) { ++ return false; ++} ++ ++inline bool LinearScan::is_caller_save(int assigned_reg) { ++ assert(assigned_reg >= 0 && assigned_reg < nof_regs, "should call this only for registers"); ++ if (assigned_reg < pd_first_callee_saved_reg) ++ return true; ++ if (assigned_reg > pd_last_callee_saved_reg && assigned_reg < pd_first_callee_saved_fpu_reg) ++ return true; ++ if (assigned_reg > pd_last_callee_saved_fpu_reg && assigned_reg < pd_last_fpu_reg) ++ return true; ++ return false; ++} ++ ++inline void LinearScan::pd_add_temps(LIR_Op* op) {} ++ ++// Implementation of LinearScanWalker ++inline bool LinearScanWalker::pd_init_regs_for_alloc(Interval* cur) { ++ if (allocator()->gen()->is_vreg_flag_set(cur->reg_num(), LIRGenerator::callee_saved)) { ++ assert(cur->type() != T_FLOAT && cur->type() != T_DOUBLE, "cpu regs only"); ++ _first_reg = pd_first_callee_saved_reg; ++ _last_reg = pd_last_callee_saved_reg; ++ return true; ++ } else if (cur->type() == T_INT || cur->type() == T_LONG || cur->type() == T_OBJECT || ++ cur->type() == T_ADDRESS || cur->type() == T_METADATA) { ++ _first_reg = pd_first_cpu_reg; ++ _last_reg = pd_last_allocatable_cpu_reg; ++ return true; ++ } ++ return false; ++} ++ ++#endif // CPU_LOONGARCH_C1_LINEARSCAN_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_LIRAssembler_loongarch_64.cpp b/src/hotspot/cpu/loongarch/c1_LIRAssembler_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/c1_LIRAssembler_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_LIRAssembler_loongarch_64.cpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,3369 @@ ++/* ++ * Copyright (c) 2000, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "asm/assembler.hpp" ++#include "c1/c1_CodeStubs.hpp" ++#include "c1/c1_Compilation.hpp" ++#include "c1/c1_LIRAssembler.hpp" ++#include "c1/c1_MacroAssembler.hpp" ++#include "c1/c1_Runtime1.hpp" ++#include "c1/c1_ValueStack.hpp" ++#include "ci/ciArrayKlass.hpp" ++#include "ci/ciInstance.hpp" ++#include "code/compiledIC.hpp" ++#include "gc/shared/collectedHeap.hpp" ++#include "gc/shared/gc_globals.hpp" ++#include "nativeInst_loongarch.hpp" ++#include "oops/objArrayKlass.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "utilities/powerOfTwo.hpp" ++#include "vmreg_loongarch.inline.hpp" ++ ++#ifndef PRODUCT ++#define COMMENT(x) do { __ block_comment(x); } while (0) ++#else ++#define COMMENT(x) ++#endif ++ ++NEEDS_CLEANUP // remove this definitions? ++ ++#define __ _masm-> ++ ++static void select_different_registers(Register preserve, Register extra, ++ Register &tmp1, Register &tmp2) { ++ if (tmp1 == preserve) { ++ assert_different_registers(tmp1, tmp2, extra); ++ tmp1 = extra; ++ } else if (tmp2 == preserve) { ++ assert_different_registers(tmp1, tmp2, extra); ++ tmp2 = extra; ++ } ++ assert_different_registers(preserve, tmp1, tmp2); ++} ++ ++static void select_different_registers(Register preserve, Register extra, ++ Register &tmp1, Register &tmp2, ++ Register &tmp3) { ++ if (tmp1 == preserve) { ++ assert_different_registers(tmp1, tmp2, tmp3, extra); ++ tmp1 = extra; ++ } else if (tmp2 == preserve) { ++ assert_different_registers(tmp1, tmp2, tmp3, extra); ++ tmp2 = extra; ++ } else if (tmp3 == preserve) { ++ assert_different_registers(tmp1, tmp2, tmp3, extra); ++ tmp3 = extra; ++ } ++ assert_different_registers(preserve, tmp1, tmp2, tmp3); ++} ++ ++bool LIR_Assembler::is_small_constant(LIR_Opr opr) { Unimplemented(); return false; } ++ ++LIR_Opr LIR_Assembler::receiverOpr() { ++ return FrameMap::receiver_opr; ++} ++ ++LIR_Opr LIR_Assembler::osrBufferPointer() { ++ return FrameMap::as_pointer_opr(receiverOpr()->as_register()); ++} ++ ++//--------------fpu register translations----------------------- ++ ++address LIR_Assembler::float_constant(float f) { ++ address const_addr = __ float_constant(f); ++ if (const_addr == nullptr) { ++ bailout("const section overflow"); ++ return __ code()->consts()->start(); ++ } else { ++ return const_addr; ++ } ++} ++ ++address LIR_Assembler::double_constant(double d) { ++ address const_addr = __ double_constant(d); ++ if (const_addr == nullptr) { ++ bailout("const section overflow"); ++ return __ code()->consts()->start(); ++ } else { ++ return const_addr; ++ } ++} ++ ++void LIR_Assembler::breakpoint() { Unimplemented(); } ++ ++void LIR_Assembler::push(LIR_Opr opr) { Unimplemented(); } ++ ++void LIR_Assembler::pop(LIR_Opr opr) { Unimplemented(); } ++ ++bool LIR_Assembler::is_literal_address(LIR_Address* addr) { Unimplemented(); return false; } ++ ++static Register as_reg(LIR_Opr op) { ++ return op->is_double_cpu() ? 
op->as_register_lo() : op->as_register(); ++} ++ ++static jlong as_long(LIR_Opr data) { ++ jlong result; ++ switch (data->type()) { ++ case T_INT: ++ result = (data->as_jint()); ++ break; ++ case T_LONG: ++ result = (data->as_jlong()); ++ break; ++ default: ++ ShouldNotReachHere(); ++ result = 0; // unreachable ++ } ++ return result; ++} ++ ++Address LIR_Assembler::as_Address(LIR_Address* addr) { ++ Register base = addr->base()->as_pointer_register(); ++ LIR_Opr opr = addr->index(); ++ if (opr->is_cpu_register()) { ++ Register index; ++ if (opr->is_single_cpu()) ++ index = opr->as_register(); ++ else ++ index = opr->as_register_lo(); ++ assert(addr->disp() == 0, "must be"); ++ return Address(base, index, Address::ScaleFactor(addr->scale())); ++ } else { ++ assert(addr->scale() == 0, "must be"); ++ return Address(base, addr->disp()); ++ } ++ return Address(); ++} ++ ++Address LIR_Assembler::as_Address_hi(LIR_Address* addr) { ++ ShouldNotReachHere(); ++ return Address(); ++} ++ ++Address LIR_Assembler::as_Address_lo(LIR_Address* addr) { ++ return as_Address(addr); // Ouch ++ // FIXME: This needs to be much more clever. See x86. ++} ++ ++// Ensure a valid Address (base + offset) to a stack-slot. If stack access is ++// not encodable as a base + (immediate) offset, generate an explicit address ++// calculation to hold the address in a temporary register. ++Address LIR_Assembler::stack_slot_address(int index, uint size, int adjust) { ++ precond(size == 4 || size == 8); ++ Address addr = frame_map()->address_for_slot(index, adjust); ++ precond(addr.index() == noreg); ++ precond(addr.base() == SP); ++ precond(addr.disp() > 0); ++ uint mask = size - 1; ++ assert((addr.disp() & mask) == 0, "scaled offsets only"); ++ return addr; ++} ++ ++void LIR_Assembler::osr_entry() { ++ offsets()->set_value(CodeOffsets::OSR_Entry, code_offset()); ++ BlockBegin* osr_entry = compilation()->hir()->osr_entry(); ++ ValueStack* entry_state = osr_entry->state(); ++ int number_of_locks = entry_state->locks_size(); ++ ++ // we jump here if osr happens with the interpreter ++ // state set up to continue at the beginning of the ++ // loop that triggered osr - in particular, we have ++ // the following registers setup: ++ // ++ // A2: osr buffer ++ // ++ ++ // build frame ++ ciMethod* m = compilation()->method(); ++ __ build_frame(initial_frame_size_in_bytes(), bang_size_in_bytes()); ++ ++ // OSR buffer is ++ // ++ // locals[nlocals-1..0] ++ // monitors[0..number_of_locks] ++ // ++ // locals is a direct copy of the interpreter frame so in the osr buffer ++ // so first slot in the local array is the last local from the interpreter ++ // and last slot is local[0] (receiver) from the interpreter ++ // ++ // Similarly with locks. The first lock slot in the osr buffer is the nth lock ++ // from the interpreter frame, the nth lock slot in the osr buffer is 0th lock ++ // in the interpreter frame (the method lock if a sync method) ++ ++ // Initialize monitors in the compiled activation. ++ // A2: pointer to osr buffer ++ // ++ // All other registers are dead at this point and the locals will be ++ // copied into place by code emitted in the IR. 
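++  // For example (with BytesPerWord == 8): if max_locals == 2 and there are
++  // two locks, monitor_offset below is 8*2 + 16*1 == 32, so monitor 0 is
++  // copied from buffer offsets 32 (lock word) and 40 (oop), monitor 1 from
++  // offsets 16 and 24, and the locals occupy offsets 0..15 in front of them.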
++ ++ Register OSR_buf = osrBufferPointer()->as_pointer_register(); ++ { ++ assert(frame::interpreter_frame_monitor_size() == BasicObjectLock::size(), "adjust code below"); ++ int monitor_offset = BytesPerWord * method()->max_locals() + (2 * BytesPerWord) * (number_of_locks - 1); ++ // SharedRuntime::OSR_migration_begin() packs BasicObjectLocks in ++ // the OSR buffer using 2 word entries: first the lock and then ++ // the oop. ++ for (int i = 0; i < number_of_locks; i++) { ++ int slot_offset = monitor_offset - ((i * 2) * BytesPerWord); ++#ifdef ASSERT ++ // verify the interpreter's monitor has a non-null object ++ { ++ Label L; ++ __ ld_d(SCR1, Address(OSR_buf, slot_offset + 1 * BytesPerWord)); ++ __ bnez(SCR1, L); ++ __ stop("locked object is null"); ++ __ bind(L); ++ } ++#endif ++ __ ld_d(S0, Address(OSR_buf, slot_offset + 0)); ++ __ st_d(S0, frame_map()->address_for_monitor_lock(i)); ++ __ ld_d(S0, Address(OSR_buf, slot_offset + 1*BytesPerWord)); ++ __ st_d(S0, frame_map()->address_for_monitor_object(i)); ++ } ++ } ++} ++ ++// inline cache check; done before the frame is built. ++int LIR_Assembler::check_icache() { ++ Register receiver = FrameMap::receiver_opr->as_register(); ++ Register ic_klass = IC_Klass; ++ int start_offset = __ offset(); ++ Label dont; ++ ++ __ verify_oop(receiver); ++ ++ // explicit null check not needed since load from [klass_offset] causes a trap ++ // check against inline cache ++ assert(!MacroAssembler::needs_explicit_null_check(oopDesc::klass_offset_in_bytes()), ++ "must add explicit null check"); ++ ++ __ load_klass(SCR2, receiver); ++ __ beq(SCR2, ic_klass, dont); ++ ++ // if icache check fails, then jump to runtime routine ++ // Note: RECEIVER must still contain the receiver! ++ __ jmp(SharedRuntime::get_ic_miss_stub(), relocInfo::runtime_call_type); ++ ++ // We align the verified entry point unless the method body ++ // (including its inline cache check) will fit in a single 64-byte ++ // icache line. ++ if (!method()->is_accessor() || __ offset() - start_offset > 4 * 4) { ++ // force alignment after the cache check. 
++ __ align(CodeEntryAlignment); ++ } ++ ++ __ bind(dont); ++ return start_offset; ++} ++ ++void LIR_Assembler::clinit_barrier(ciMethod* method) { ++ assert(VM_Version::supports_fast_class_init_checks(), "sanity"); ++ assert(!method->holder()->is_not_initialized(), "initialization should have been started"); ++ Label L_skip_barrier; ++ ++ __ mov_metadata(SCR2, method->holder()->constant_encoding()); ++ __ clinit_barrier(SCR2, SCR1, &L_skip_barrier /*L_fast_path*/); ++ __ jmp(SharedRuntime::get_handle_wrong_method_stub(), relocInfo::runtime_call_type); ++ __ bind(L_skip_barrier); ++} ++ ++void LIR_Assembler::jobject2reg(jobject o, Register reg) { ++ if (o == nullptr) { ++ __ move(reg, R0); ++ } else { ++ __ movoop(reg, o); ++ } ++} ++ ++void LIR_Assembler::deoptimize_trap(CodeEmitInfo *info) { ++ address target = nullptr; ++ ++ switch (patching_id(info)) { ++ case PatchingStub::access_field_id: ++ target = Runtime1::entry_for(Runtime1::access_field_patching_id); ++ break; ++ case PatchingStub::load_klass_id: ++ target = Runtime1::entry_for(Runtime1::load_klass_patching_id); ++ break; ++ case PatchingStub::load_mirror_id: ++ target = Runtime1::entry_for(Runtime1::load_mirror_patching_id); ++ break; ++ case PatchingStub::load_appendix_id: ++ target = Runtime1::entry_for(Runtime1::load_appendix_patching_id); ++ break; ++ default: ShouldNotReachHere(); ++ } ++ ++ __ call(target, relocInfo::runtime_call_type); ++ add_call_info_here(info); ++} ++ ++void LIR_Assembler::jobject2reg_with_patching(Register reg, CodeEmitInfo *info) { ++ deoptimize_trap(info); ++} ++ ++// This specifies the rsp decrement needed to build the frame ++int LIR_Assembler::initial_frame_size_in_bytes() const { ++ // if rounding, must let FrameMap know! ++ return in_bytes(frame_map()->framesize_in_bytes()); ++} ++ ++int LIR_Assembler::emit_exception_handler() { ++ // generate code for exception handler ++ address handler_base = __ start_a_stub(exception_handler_size()); ++ if (handler_base == nullptr) { ++ // not enough space left for the handler ++ bailout("exception handler overflow"); ++ return -1; ++ } ++ ++ int offset = code_offset(); ++ ++ // the exception oop and pc are in A0, and A1 ++ // no other registers need to be preserved, so invalidate them ++ __ invalidate_registers(false, true, true, true, true, true); ++ ++ // check that there is really an exception ++ __ verify_not_null_oop(A0); ++ ++ // search an exception handler (A0: exception oop, A1: throwing pc) ++ __ call(Runtime1::entry_for(Runtime1::handle_exception_from_callee_id), relocInfo::runtime_call_type); ++ __ should_not_reach_here(); ++ guarantee(code_offset() - offset <= exception_handler_size(), "overflow"); ++ __ end_a_stub(); ++ ++ return offset; ++} ++ ++// Emit the code to remove the frame from the stack in the exception unwind path. 
++int LIR_Assembler::emit_unwind_handler() { ++#ifndef PRODUCT ++ if (CommentedAssembly) { ++ _masm->block_comment("Unwind handler"); ++ } ++#endif ++ ++ int offset = code_offset(); ++ ++ // Fetch the exception from TLS and clear out exception related thread state ++ __ ld_d(A0, Address(TREG, JavaThread::exception_oop_offset())); ++ __ st_d(R0, Address(TREG, JavaThread::exception_oop_offset())); ++ __ st_d(R0, Address(TREG, JavaThread::exception_pc_offset())); ++ ++ __ bind(_unwind_handler_entry); ++ __ verify_not_null_oop(V0); ++ if (method()->is_synchronized() || compilation()->env()->dtrace_method_probes()) { ++ __ move(S0, V0); // Preserve the exception ++ } ++ ++ // Perform needed unlocking ++ MonitorExitStub* stub = nullptr; ++ if (method()->is_synchronized()) { ++ monitor_address(0, FrameMap::a0_opr); ++ stub = new MonitorExitStub(FrameMap::a0_opr, true, 0); ++ if (LockingMode == LM_MONITOR) { ++ __ b(*stub->entry()); ++ } else { ++ __ unlock_object(A5, A4, A0, *stub->entry()); ++ } ++ __ bind(*stub->continuation()); ++ } ++ ++ if (compilation()->env()->dtrace_method_probes()) { ++ __ mov_metadata(A1, method()->constant_encoding()); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::dtrace_method_exit), TREG, A1); ++ } ++ ++ if (method()->is_synchronized() || compilation()->env()->dtrace_method_probes()) { ++ __ move(A0, S0); // Restore the exception ++ } ++ ++ // remove the activation and dispatch to the unwind handler ++ __ block_comment("remove_frame and dispatch to the unwind handler"); ++ __ remove_frame(initial_frame_size_in_bytes()); ++ __ jmp(Runtime1::entry_for(Runtime1::unwind_exception_id), relocInfo::runtime_call_type); ++ ++ // Emit the slow path assembly ++ if (stub != nullptr) { ++ stub->emit_code(this); ++ } ++ ++ return offset; ++} ++ ++int LIR_Assembler::emit_deopt_handler() { ++ // generate code for exception handler ++ address handler_base = __ start_a_stub(deopt_handler_size()); ++ if (handler_base == nullptr) { ++ // not enough space left for the handler ++ bailout("deopt handler overflow"); ++ return -1; ++ } ++ ++ int offset = code_offset(); ++ ++ __ call(SharedRuntime::deopt_blob()->unpack(), relocInfo::runtime_call_type); ++ guarantee(code_offset() - offset <= deopt_handler_size(), "overflow"); ++ __ end_a_stub(); ++ ++ return offset; ++} ++ ++void LIR_Assembler::add_debug_info_for_branch(address adr, CodeEmitInfo* info) { ++ _masm->code_section()->relocate(adr, relocInfo::poll_type); ++ int pc_offset = code_offset(); ++ flush_debug_info(pc_offset); ++ info->record_debug_info(compilation()->debug_info_recorder(), pc_offset); ++ if (info->exception_handlers() != nullptr) { ++ compilation()->add_exception_handlers_for_pco(pc_offset, info->exception_handlers()); ++ } ++} ++ ++void LIR_Assembler::return_op(LIR_Opr result, C1SafepointPollStub* code_stub) { ++ assert(result->is_illegal() || !result->is_single_cpu() || result->as_register() == V0, ++ "word returns are in V0,"); ++ ++ // Pop the stack before the safepoint code ++ __ remove_frame(initial_frame_size_in_bytes()); ++ ++ if (StackReservedPages > 0 && compilation()->has_reserved_stack_access()) { ++ __ reserved_stack_check(); ++ } ++ ++ code_stub->set_safepoint_offset(__ offset()); ++ __ relocate(relocInfo::poll_return_type); ++ __ safepoint_poll(*code_stub->entry(), TREG, true /* at_return */, false /* acquire */, true /* in_nmethod */); ++ ++ __ jr(RA); ++} ++ ++int LIR_Assembler::safepoint_poll(LIR_Opr tmp, CodeEmitInfo* info) { ++ guarantee(info != nullptr, "Shouldn't be null"); ++ __ 
ld_d(SCR1, Address(TREG, JavaThread::polling_page_offset())); ++ add_debug_info_for_branch(info); // This isn't just debug info: it's the oop map ++ __ relocate(relocInfo::poll_type); ++ __ ld_w(SCR1, SCR1, 0); ++ return __ offset(); ++} ++ ++void LIR_Assembler::move_regs(Register from_reg, Register to_reg) { ++ __ move(to_reg, from_reg); ++} ++ ++void LIR_Assembler::swap_reg(Register a, Register b) { Unimplemented(); } ++ ++void LIR_Assembler::const2reg(LIR_Opr src, LIR_Opr dest, LIR_PatchCode patch_code, CodeEmitInfo* info) { ++ assert(src->is_constant(), "should not call otherwise"); ++ assert(dest->is_register(), "should not call otherwise"); ++ LIR_Const* c = src->as_constant_ptr(); ++ ++ switch (c->type()) { ++ case T_INT: ++ assert(patch_code == lir_patch_none, "no patching handled here"); ++ __ li(dest->as_register(), c->as_jint()); ++ break; ++ case T_ADDRESS: ++ assert(patch_code == lir_patch_none, "no patching handled here"); ++ __ li(dest->as_register(), c->as_jint()); ++ break; ++ case T_LONG: ++ assert(patch_code == lir_patch_none, "no patching handled here"); ++ __ li(dest->as_register_lo(), (intptr_t)c->as_jlong()); ++ break; ++ case T_OBJECT: ++ if (patch_code == lir_patch_none) { ++ jobject2reg(c->as_jobject(), dest->as_register()); ++ } else { ++ jobject2reg_with_patching(dest->as_register(), info); ++ } ++ break; ++ case T_METADATA: ++ if (patch_code != lir_patch_none) { ++ klass2reg_with_patching(dest->as_register(), info); ++ } else { ++ __ mov_metadata(dest->as_register(), c->as_metadata()); ++ } ++ break; ++ case T_FLOAT: ++ __ lea(SCR1, InternalAddress(float_constant(c->as_jfloat()))); ++ __ fld_s(dest->as_float_reg(), SCR1, 0); ++ break; ++ case T_DOUBLE: ++ __ lea(SCR1, InternalAddress(double_constant(c->as_jdouble()))); ++ __ fld_d(dest->as_double_reg(), SCR1, 0); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void LIR_Assembler::const2stack(LIR_Opr src, LIR_Opr dest) { ++ LIR_Const* c = src->as_constant_ptr(); ++ switch (c->type()) { ++ case T_OBJECT: ++ if (!c->as_jobject()) ++ __ st_d(R0, frame_map()->address_for_slot(dest->single_stack_ix())); ++ else { ++ const2reg(src, FrameMap::scr1_opr, lir_patch_none, nullptr); ++ reg2stack(FrameMap::scr1_opr, dest, c->type(), false); ++ } ++ break; ++ case T_ADDRESS: ++ const2reg(src, FrameMap::scr1_opr, lir_patch_none, nullptr); ++ reg2stack(FrameMap::scr1_opr, dest, c->type(), false); ++ case T_INT: ++ case T_FLOAT: ++ if (c->as_jint_bits() == 0) ++ __ st_w(R0, frame_map()->address_for_slot(dest->single_stack_ix())); ++ else { ++ __ li(SCR2, c->as_jint_bits()); ++ __ st_w(SCR2, frame_map()->address_for_slot(dest->single_stack_ix())); ++ } ++ break; ++ case T_LONG: ++ case T_DOUBLE: ++ if (c->as_jlong_bits() == 0) ++ __ st_d(R0, frame_map()->address_for_slot(dest->double_stack_ix(), ++ lo_word_offset_in_bytes)); ++ else { ++ __ li(SCR2, (intptr_t)c->as_jlong_bits()); ++ __ st_d(SCR2, frame_map()->address_for_slot(dest->double_stack_ix(), ++ lo_word_offset_in_bytes)); ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void LIR_Assembler::const2mem(LIR_Opr src, LIR_Opr dest, BasicType type, ++ CodeEmitInfo* info, bool wide) { ++ assert(src->is_constant(), "should not call otherwise"); ++ LIR_Const* c = src->as_constant_ptr(); ++ LIR_Address* to_addr = dest->as_address_ptr(); ++ ++ void (Assembler::* insn)(Register Rt, const Address &adr); ++ ++ switch (type) { ++ case T_ADDRESS: ++ assert(c->as_jint() == 0, "should be"); ++ insn = &Assembler::st_d; ++ break; ++ case T_LONG: ++ 
assert(c->as_jlong() == 0, "should be"); ++ insn = &Assembler::st_d; ++ break; ++ case T_INT: ++ assert(c->as_jint() == 0, "should be"); ++ insn = &Assembler::st_w; ++ break; ++ case T_OBJECT: ++ case T_ARRAY: ++ assert(c->as_jobject() == 0, "should be"); ++ if (UseCompressedOops && !wide) { ++ insn = &Assembler::st_w; ++ } else { ++ insn = &Assembler::st_d; ++ } ++ break; ++ case T_CHAR: ++ case T_SHORT: ++ assert(c->as_jint() == 0, "should be"); ++ insn = &Assembler::st_h; ++ break; ++ case T_BOOLEAN: ++ case T_BYTE: ++ assert(c->as_jint() == 0, "should be"); ++ insn = &Assembler::st_b; ++ break; ++ default: ++ ShouldNotReachHere(); ++ insn = &Assembler::st_d; // unreachable ++ } ++ ++ if (info) add_debug_info_for_null_check_here(info); ++ (_masm->*insn)(R0, as_Address(to_addr)); ++} ++ ++void LIR_Assembler::reg2reg(LIR_Opr src, LIR_Opr dest) { ++ assert(src->is_register(), "should not call otherwise"); ++ assert(dest->is_register(), "should not call otherwise"); ++ ++ // move between cpu-registers ++ if (dest->is_single_cpu()) { ++ if (src->type() == T_LONG) { ++ // Can do LONG -> OBJECT ++ move_regs(src->as_register_lo(), dest->as_register()); ++ return; ++ } ++ assert(src->is_single_cpu(), "must match"); ++ if (src->type() == T_OBJECT) { ++ __ verify_oop(src->as_register()); ++ } ++ move_regs(src->as_register(), dest->as_register()); ++ } else if (dest->is_double_cpu()) { ++ if (is_reference_type(src->type())) { ++ // Surprising to me but we can see move of a long to t_object ++ __ verify_oop(src->as_register()); ++ move_regs(src->as_register(), dest->as_register_lo()); ++ return; ++ } ++ assert(src->is_double_cpu(), "must match"); ++ Register f_lo = src->as_register_lo(); ++ Register f_hi = src->as_register_hi(); ++ Register t_lo = dest->as_register_lo(); ++ Register t_hi = dest->as_register_hi(); ++ assert(f_hi == f_lo, "must be same"); ++ assert(t_hi == t_lo, "must be same"); ++ move_regs(f_lo, t_lo); ++ } else if (dest->is_single_fpu()) { ++ __ fmov_s(dest->as_float_reg(), src->as_float_reg()); ++ } else if (dest->is_double_fpu()) { ++ __ fmov_d(dest->as_double_reg(), src->as_double_reg()); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++void LIR_Assembler::reg2stack(LIR_Opr src, LIR_Opr dest, BasicType type, bool pop_fpu_stack) { ++ precond(src->is_register() && dest->is_stack()); ++ ++ uint const c_sz32 = sizeof(uint32_t); ++ uint const c_sz64 = sizeof(uint64_t); ++ ++ if (src->is_single_cpu()) { ++ int index = dest->single_stack_ix(); ++ if (is_reference_type(type)) { ++ __ st_d(src->as_register(), stack_slot_address(index, c_sz64)); ++ __ verify_oop(src->as_register()); ++ } else if (type == T_METADATA || type == T_DOUBLE || type == T_ADDRESS) { ++ __ st_d(src->as_register(), stack_slot_address(index, c_sz64)); ++ } else { ++ __ st_w(src->as_register(), stack_slot_address(index, c_sz32)); ++ } ++ } else if (src->is_double_cpu()) { ++ int index = dest->double_stack_ix(); ++ Address dest_addr_LO = stack_slot_address(index, c_sz64, lo_word_offset_in_bytes); ++ __ st_d(src->as_register_lo(), dest_addr_LO); ++ } else if (src->is_single_fpu()) { ++ int index = dest->single_stack_ix(); ++ __ fst_s(src->as_float_reg(), stack_slot_address(index, c_sz32)); ++ } else if (src->is_double_fpu()) { ++ int index = dest->double_stack_ix(); ++ __ fst_d(src->as_double_reg(), stack_slot_address(index, c_sz64)); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++void LIR_Assembler::reg2mem(LIR_Opr src, LIR_Opr dest, BasicType type, LIR_PatchCode patch_code, ++ CodeEmitInfo* info, bool 
pop_fpu_stack, bool wide) { ++ LIR_Address* to_addr = dest->as_address_ptr(); ++ PatchingStub* patch = nullptr; ++ Register compressed_src = SCR2; ++ ++ if (patch_code != lir_patch_none) { ++ deoptimize_trap(info); ++ return; ++ } ++ ++ if (is_reference_type(type)) { ++ __ verify_oop(src->as_register()); ++ ++ if (UseCompressedOops && !wide) { ++ __ encode_heap_oop(compressed_src, src->as_register()); ++ } else { ++ compressed_src = src->as_register(); ++ } ++ } ++ ++ int null_check_here = code_offset(); ++ switch (type) { ++ case T_FLOAT: ++ __ fst_s(src->as_float_reg(), as_Address(to_addr)); ++ break; ++ case T_DOUBLE: ++ __ fst_d(src->as_double_reg(), as_Address(to_addr)); ++ break; ++ case T_ARRAY: // fall through ++ case T_OBJECT: // fall through ++ if (UseCompressedOops && !wide) { ++ __ st_w(compressed_src, as_Address(to_addr)); ++ } else { ++ __ st_d(compressed_src, as_Address(to_addr)); ++ } ++ break; ++ case T_METADATA: ++ // We get here to store a method pointer to the stack to pass to ++ // a dtrace runtime call. This can't work on 64 bit with ++ // compressed klass ptrs: T_METADATA can be a compressed klass ++ // ptr or a 64 bit method pointer. ++ ShouldNotReachHere(); ++ __ st_d(src->as_register(), as_Address(to_addr)); ++ break; ++ case T_ADDRESS: ++ __ st_d(src->as_register(), as_Address(to_addr)); ++ break; ++ case T_INT: ++ __ st_w(src->as_register(), as_Address(to_addr)); ++ break; ++ case T_LONG: ++ __ st_d(src->as_register_lo(), as_Address_lo(to_addr)); ++ break; ++ case T_BYTE: // fall through ++ case T_BOOLEAN: ++ __ st_b(src->as_register(), as_Address(to_addr)); ++ break; ++ case T_CHAR: // fall through ++ case T_SHORT: ++ __ st_h(src->as_register(), as_Address(to_addr)); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ if (info != nullptr) { ++ add_debug_info_for_null_check(null_check_here, info); ++ } ++} ++ ++void LIR_Assembler::stack2reg(LIR_Opr src, LIR_Opr dest, BasicType type) { ++ precond(src->is_stack() && dest->is_register()); ++ ++ uint const c_sz32 = sizeof(uint32_t); ++ uint const c_sz64 = sizeof(uint64_t); ++ ++ if (dest->is_single_cpu()) { ++ int index = src->single_stack_ix(); ++ if (is_reference_type(type)) { ++ __ ld_d(dest->as_register(), stack_slot_address(index, c_sz64)); ++ __ verify_oop(dest->as_register()); ++ } else if (type == T_METADATA || type == T_ADDRESS) { ++ __ ld_d(dest->as_register(), stack_slot_address(index, c_sz64)); ++ } else { ++ __ ld_w(dest->as_register(), stack_slot_address(index, c_sz32)); ++ } ++ } else if (dest->is_double_cpu()) { ++ int index = src->double_stack_ix(); ++ Address src_addr_LO = stack_slot_address(index, c_sz64, lo_word_offset_in_bytes); ++ __ ld_d(dest->as_register_lo(), src_addr_LO); ++ } else if (dest->is_single_fpu()) { ++ int index = src->single_stack_ix(); ++ __ fld_s(dest->as_float_reg(), stack_slot_address(index, c_sz32)); ++ } else if (dest->is_double_fpu()) { ++ int index = src->double_stack_ix(); ++ __ fld_d(dest->as_double_reg(), stack_slot_address(index, c_sz64)); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++void LIR_Assembler::klass2reg_with_patching(Register reg, CodeEmitInfo* info) { ++ address target = nullptr; ++ ++ switch (patching_id(info)) { ++ case PatchingStub::access_field_id: ++ target = Runtime1::entry_for(Runtime1::access_field_patching_id); ++ break; ++ case PatchingStub::load_klass_id: ++ target = Runtime1::entry_for(Runtime1::load_klass_patching_id); ++ break; ++ case PatchingStub::load_mirror_id: ++ target = Runtime1::entry_for(Runtime1::load_mirror_patching_id); 
++ break; ++ case PatchingStub::load_appendix_id: ++ target = Runtime1::entry_for(Runtime1::load_appendix_patching_id); ++ break; ++ default: ShouldNotReachHere(); ++ } ++ ++ __ call(target, relocInfo::runtime_call_type); ++ add_call_info_here(info); ++} ++ ++void LIR_Assembler::stack2stack(LIR_Opr src, LIR_Opr dest, BasicType type) { ++ LIR_Opr temp; ++ ++ if (type == T_LONG || type == T_DOUBLE) ++ temp = FrameMap::scr1_long_opr; ++ else ++ temp = FrameMap::scr1_opr; ++ ++ stack2reg(src, temp, src->type()); ++ reg2stack(temp, dest, dest->type(), false); ++} ++ ++void LIR_Assembler::mem2reg(LIR_Opr src, LIR_Opr dest, BasicType type, LIR_PatchCode patch_code, ++ CodeEmitInfo* info, bool wide) { ++ LIR_Address* addr = src->as_address_ptr(); ++ LIR_Address* from_addr = src->as_address_ptr(); ++ ++ if (addr->base()->type() == T_OBJECT) { ++ __ verify_oop(addr->base()->as_pointer_register()); ++ } ++ ++ if (patch_code != lir_patch_none) { ++ deoptimize_trap(info); ++ return; ++ } ++ ++ if (info != nullptr) { ++ add_debug_info_for_null_check_here(info); ++ } ++ int null_check_here = code_offset(); ++ switch (type) { ++ case T_FLOAT: ++ __ fld_s(dest->as_float_reg(), as_Address(from_addr)); ++ break; ++ case T_DOUBLE: ++ __ fld_d(dest->as_double_reg(), as_Address(from_addr)); ++ break; ++ case T_ARRAY: // fall through ++ case T_OBJECT: // fall through ++ if (UseCompressedOops && !wide) { ++ __ ld_wu(dest->as_register(), as_Address(from_addr)); ++ } else { ++ __ ld_d(dest->as_register(), as_Address(from_addr)); ++ } ++ break; ++ case T_METADATA: ++ // We get here to store a method pointer to the stack to pass to ++ // a dtrace runtime call. This can't work on 64 bit with ++ // compressed klass ptrs: T_METADATA can be a compressed klass ++ // ptr or a 64 bit method pointer. 
++ ShouldNotReachHere(); ++ __ ld_d(dest->as_register(), as_Address(from_addr)); ++ break; ++ case T_ADDRESS: ++ __ ld_d(dest->as_register(), as_Address(from_addr)); ++ break; ++ case T_INT: ++ __ ld_w(dest->as_register(), as_Address(from_addr)); ++ break; ++ case T_LONG: ++ __ ld_d(dest->as_register_lo(), as_Address_lo(from_addr)); ++ break; ++ case T_BYTE: ++ __ ld_b(dest->as_register(), as_Address(from_addr)); ++ break; ++ case T_BOOLEAN: ++ __ ld_bu(dest->as_register(), as_Address(from_addr)); ++ break; ++ case T_CHAR: ++ __ ld_hu(dest->as_register(), as_Address(from_addr)); ++ break; ++ case T_SHORT: ++ __ ld_h(dest->as_register(), as_Address(from_addr)); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ ++ if (is_reference_type(type)) { ++ if (UseCompressedOops && !wide) { ++ __ decode_heap_oop(dest->as_register()); ++ } ++ ++ if (!(UseZGC && !ZGenerational)) { ++ // Load barrier has not yet been applied, so ZGC can't verify the oop here ++ __ verify_oop(dest->as_register()); ++ } ++ } ++} ++ ++int LIR_Assembler::array_element_size(BasicType type) const { ++ int elem_size = type2aelembytes(type); ++ return exact_log2(elem_size); ++} ++ ++void LIR_Assembler::emit_op3(LIR_Op3* op) { ++ switch (op->code()) { ++ case lir_idiv: ++ case lir_irem: ++ arithmetic_idiv(op->code(), op->in_opr1(), op->in_opr2(), op->in_opr3(), ++ op->result_opr(), op->info()); ++ break; ++ case lir_fmad: ++ __ fmadd_d(op->result_opr()->as_double_reg(), op->in_opr1()->as_double_reg(), ++ op->in_opr2()->as_double_reg(), op->in_opr3()->as_double_reg()); ++ break; ++ case lir_fmaf: ++ __ fmadd_s(op->result_opr()->as_float_reg(), op->in_opr1()->as_float_reg(), ++ op->in_opr2()->as_float_reg(), op->in_opr3()->as_float_reg()); ++ break; ++ default: ++ ShouldNotReachHere(); ++ break; ++ } ++} ++ ++void LIR_Assembler::emit_opBranch(LIR_OpBranch* op) { ++#ifdef ASSERT ++ assert(op->block() == nullptr || op->block()->label() == op->label(), "wrong label"); ++ if (op->block() != nullptr) _branch_target_blocks.append(op->block()); ++#endif ++ ++ if (op->cond() == lir_cond_always) { ++ if (op->info() != nullptr) ++ add_debug_info_for_branch(op->info()); ++ ++ __ b_far(*(op->label())); ++ } else { ++ emit_cmp_branch(op); ++ } ++} ++ ++void LIR_Assembler::emit_cmp_branch(LIR_OpBranch* op) { ++#ifdef ASSERT ++ if (op->ublock() != nullptr) _branch_target_blocks.append(op->ublock()); ++#endif ++ ++ if (op->info() != nullptr) { ++ assert(op->in_opr1()->is_address() || op->in_opr2()->is_address(), ++ "shouldn't be codeemitinfo for non-address operands"); ++ add_debug_info_for_null_check_here(op->info()); // exception possible ++ } ++ ++ Label& L = *(op->label()); ++ Assembler::Condition acond; ++ LIR_Opr opr1 = op->in_opr1(); ++ LIR_Opr opr2 = op->in_opr2(); ++ assert(op->condition() != lir_cond_always, "must be"); ++ ++ if (op->code() == lir_cond_float_branch) { ++ bool is_unordered = (op->ublock() == op->block()); ++ if (opr1->is_single_fpu()) { ++ FloatRegister reg1 = opr1->as_float_reg(); ++ assert(opr2->is_single_fpu(), "expect single float register"); ++ FloatRegister reg2 = opr2->as_float_reg(); ++ switch(op->condition()) { ++ case lir_cond_equal: ++ if (is_unordered) ++ __ fcmp_cueq_s(FCC0, reg1, reg2); ++ else ++ __ fcmp_ceq_s(FCC0, reg1, reg2); ++ break; ++ case lir_cond_notEqual: ++ if (is_unordered) ++ __ fcmp_cune_s(FCC0, reg1, reg2); ++ else ++ __ fcmp_cne_s(FCC0, reg1, reg2); ++ break; ++ case lir_cond_less: ++ if (is_unordered) ++ __ fcmp_cult_s(FCC0, reg1, reg2); ++ else ++ __ fcmp_clt_s(FCC0, reg1, reg2); ++ 
break; ++ case lir_cond_lessEqual: ++ if (is_unordered) ++ __ fcmp_cule_s(FCC0, reg1, reg2); ++ else ++ __ fcmp_cle_s(FCC0, reg1, reg2); ++ break; ++ case lir_cond_greaterEqual: ++ if (is_unordered) ++ __ fcmp_cule_s(FCC0, reg2, reg1); ++ else ++ __ fcmp_cle_s(FCC0, reg2, reg1); ++ break; ++ case lir_cond_greater: ++ if (is_unordered) ++ __ fcmp_cult_s(FCC0, reg2, reg1); ++ else ++ __ fcmp_clt_s(FCC0, reg2, reg1); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (opr1->is_double_fpu()) { ++ FloatRegister reg1 = opr1->as_double_reg(); ++ assert(opr2->is_double_fpu(), "expect double float register"); ++ FloatRegister reg2 = opr2->as_double_reg(); ++ switch(op->condition()) { ++ case lir_cond_equal: ++ if (is_unordered) ++ __ fcmp_cueq_d(FCC0, reg1, reg2); ++ else ++ __ fcmp_ceq_d(FCC0, reg1, reg2); ++ break; ++ case lir_cond_notEqual: ++ if (is_unordered) ++ __ fcmp_cune_d(FCC0, reg1, reg2); ++ else ++ __ fcmp_cne_d(FCC0, reg1, reg2); ++ break; ++ case lir_cond_less: ++ if (is_unordered) ++ __ fcmp_cult_d(FCC0, reg1, reg2); ++ else ++ __ fcmp_clt_d(FCC0, reg1, reg2); ++ break; ++ case lir_cond_lessEqual: ++ if (is_unordered) ++ __ fcmp_cule_d(FCC0, reg1, reg2); ++ else ++ __ fcmp_cle_d(FCC0, reg1, reg2); ++ break; ++ case lir_cond_greaterEqual: ++ if (is_unordered) ++ __ fcmp_cule_d(FCC0, reg2, reg1); ++ else ++ __ fcmp_cle_d(FCC0, reg2, reg1); ++ break; ++ case lir_cond_greater: ++ if (is_unordered) ++ __ fcmp_cult_d(FCC0, reg2, reg1); ++ else ++ __ fcmp_clt_d(FCC0, reg2, reg1); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ __ bcnez(FCC0, L); ++ } else { ++ if (opr1->is_constant() && opr2->is_single_cpu()) { ++ // tableswitch ++ Unimplemented(); ++ } else if (opr1->is_single_cpu() || opr1->is_double_cpu()) { ++ Register reg1 = as_reg(opr1); ++ Register reg2 = noreg; ++ jlong imm2 = 0; ++ if (opr2->is_single_cpu()) { ++ // cpu register - cpu register ++ reg2 = opr2->as_register(); ++ } else if (opr2->is_double_cpu()) { ++ // cpu register - cpu register ++ reg2 = opr2->as_register_lo(); ++ } else if (opr2->is_constant()) { ++ switch(opr2->type()) { ++ case T_INT: ++ case T_ADDRESS: ++ imm2 = opr2->as_constant_ptr()->as_jint(); ++ break; ++ case T_LONG: ++ imm2 = opr2->as_constant_ptr()->as_jlong(); ++ break; ++ case T_METADATA: ++ imm2 = (intptr_t)opr2->as_constant_ptr()->as_metadata(); ++ break; ++ case T_OBJECT: ++ case T_ARRAY: ++ if (opr2->as_constant_ptr()->as_jobject() != nullptr) { ++ reg2 = SCR1; ++ jobject2reg(opr2->as_constant_ptr()->as_jobject(), reg2); ++ } else { ++ reg2 = R0; ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ break; ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ if (reg2 == noreg) { ++ if (imm2 == 0) { ++ reg2 = R0; ++ } else { ++ reg2 = SCR1; ++ __ li(reg2, imm2); ++ } ++ } ++ switch (op->condition()) { ++ case lir_cond_equal: ++ __ beq_far(reg1, reg2, L); break; ++ case lir_cond_notEqual: ++ __ bne_far(reg1, reg2, L); break; ++ case lir_cond_less: ++ __ blt_far(reg1, reg2, L, true); break; ++ case lir_cond_lessEqual: ++ __ bge_far(reg2, reg1, L, true); break; ++ case lir_cond_greaterEqual: ++ __ bge_far(reg1, reg2, L, true); break; ++ case lir_cond_greater: ++ __ blt_far(reg2, reg1, L, true); break; ++ case lir_cond_belowEqual: ++ __ bge_far(reg2, reg1, L, false); break; ++ case lir_cond_aboveEqual: ++ __ bge_far(reg1, reg2, L, false); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ } ++} ++ ++void LIR_Assembler::emit_opConvert(LIR_OpConvert* op) { ++ LIR_Opr src = op->in_opr(); 
++ LIR_Opr dest = op->result_opr(); ++ LIR_Opr tmp = op->tmp(); ++ ++ switch (op->bytecode()) { ++ case Bytecodes::_i2f: ++ __ movgr2fr_w(dest->as_float_reg(), src->as_register()); ++ __ ffint_s_w(dest->as_float_reg(), dest->as_float_reg()); ++ break; ++ case Bytecodes::_i2d: ++ __ movgr2fr_w(dest->as_double_reg(), src->as_register()); ++ __ ffint_d_w(dest->as_double_reg(), dest->as_double_reg()); ++ break; ++ case Bytecodes::_l2d: ++ __ movgr2fr_d(dest->as_double_reg(), src->as_register_lo()); ++ __ ffint_d_l(dest->as_double_reg(), dest->as_double_reg()); ++ break; ++ case Bytecodes::_l2f: ++ __ movgr2fr_d(dest->as_float_reg(), src->as_register_lo()); ++ __ ffint_s_l(dest->as_float_reg(), dest->as_float_reg()); ++ break; ++ case Bytecodes::_f2d: ++ __ fcvt_d_s(dest->as_double_reg(), src->as_float_reg()); ++ break; ++ case Bytecodes::_d2f: ++ __ fcvt_s_d(dest->as_float_reg(), src->as_double_reg()); ++ break; ++ case Bytecodes::_i2c: ++ __ bstrpick_w(dest->as_register(), src->as_register(), 15, 0); ++ break; ++ case Bytecodes::_i2l: ++ _masm->block_comment("FIXME: This could be a no-op"); ++ __ slli_w(dest->as_register_lo(), src->as_register(), 0); ++ break; ++ case Bytecodes::_i2s: ++ __ ext_w_h(dest->as_register(), src->as_register()); ++ break; ++ case Bytecodes::_i2b: ++ __ ext_w_b(dest->as_register(), src->as_register()); ++ break; ++ case Bytecodes::_l2i: ++ __ slli_w(dest->as_register(), src->as_register_lo(), 0); ++ break; ++ case Bytecodes::_d2l: ++ __ ftintrz_l_d(tmp->as_double_reg(), src->as_double_reg()); ++ __ movfr2gr_d(dest->as_register_lo(), tmp->as_double_reg()); ++ break; ++ case Bytecodes::_f2i: ++ __ ftintrz_w_s(tmp->as_float_reg(), src->as_float_reg()); ++ __ movfr2gr_s(dest->as_register(), tmp->as_float_reg()); ++ break; ++ case Bytecodes::_f2l: ++ __ ftintrz_l_s(tmp->as_float_reg(), src->as_float_reg()); ++ __ movfr2gr_d(dest->as_register_lo(), tmp->as_float_reg()); ++ break; ++ case Bytecodes::_d2i: ++ __ ftintrz_w_d(tmp->as_double_reg(), src->as_double_reg()); ++ __ movfr2gr_s(dest->as_register(), tmp->as_double_reg()); ++ break; ++ default: ShouldNotReachHere(); ++ } ++} ++ ++void LIR_Assembler::emit_alloc_obj(LIR_OpAllocObj* op) { ++ if (op->init_check()) { ++ __ ld_bu(SCR1, Address(op->klass()->as_register(), InstanceKlass::init_state_offset())); ++ __ li(SCR2, InstanceKlass::fully_initialized); ++ add_debug_info_for_null_check_here(op->stub()->info()); ++ __ bne_far(SCR1, SCR2, *op->stub()->entry()); ++ } ++ __ allocate_object(op->obj()->as_register(), op->tmp1()->as_register(), ++ op->tmp2()->as_register(), op->header_size(), ++ op->object_size(), op->klass()->as_register(), ++ *op->stub()->entry()); ++ __ bind(*op->stub()->continuation()); ++} ++ ++void LIR_Assembler::emit_alloc_array(LIR_OpAllocArray* op) { ++ Register len = op->len()->as_register(); ++ if (UseSlowPath || ++ (!UseFastNewObjectArray && is_reference_type(op->type())) || ++ (!UseFastNewTypeArray && !is_reference_type(op->type()))) { ++ __ b(*op->stub()->entry()); ++ } else { ++ Register tmp1 = op->tmp1()->as_register(); ++ Register tmp2 = op->tmp2()->as_register(); ++ Register tmp3 = op->tmp3()->as_register(); ++ if (len == tmp1) { ++ tmp1 = tmp3; ++ } else if (len == tmp2) { ++ tmp2 = tmp3; ++ } else if (len == tmp3) { ++ // everything is ok ++ } else { ++ __ move(tmp3, len); ++ } ++ __ allocate_array(op->obj()->as_register(), len, tmp1, tmp2, ++ arrayOopDesc::header_size(op->type()), ++ array_element_size(op->type()), ++ op->klass()->as_register(), ++ *op->stub()->entry()); ++ } ++ __ 
bind(*op->stub()->continuation()); ++} ++ ++void LIR_Assembler::type_profile_helper(Register mdo, ciMethodData *md, ciProfileData *data, ++ Register recv, Label* update_done) { ++ for (uint i = 0; i < ReceiverTypeData::row_limit(); i++) { ++ Label next_test; ++ // See if the receiver is receiver[n]. ++ __ lea(SCR2, Address(mdo, md->byte_offset_of_slot(data, ReceiverTypeData::receiver_offset(i)))); ++ __ ld_d(SCR1, Address(SCR2)); ++ __ bne(recv, SCR1, next_test); ++ Address data_addr(mdo, md->byte_offset_of_slot(data, ReceiverTypeData::receiver_count_offset(i))); ++ __ ld_d(SCR2, data_addr); ++ __ addi_d(SCR2, SCR2, DataLayout::counter_increment); ++ __ st_d(SCR2, data_addr); ++ __ b(*update_done); ++ __ bind(next_test); ++ } ++ ++ // Didn't find receiver; find next empty slot and fill it in ++ for (uint i = 0; i < ReceiverTypeData::row_limit(); i++) { ++ Label next_test; ++ __ lea(SCR2, Address(mdo, md->byte_offset_of_slot(data, ReceiverTypeData::receiver_offset(i)))); ++ Address recv_addr(SCR2); ++ __ ld_d(SCR1, recv_addr); ++ __ bnez(SCR1, next_test); ++ __ st_d(recv, recv_addr); ++ __ li(SCR1, DataLayout::counter_increment); ++ __ lea(SCR2, Address(mdo, md->byte_offset_of_slot(data, ReceiverTypeData::receiver_count_offset(i)))); ++ __ st_d(SCR1, Address(SCR2)); ++ __ b(*update_done); ++ __ bind(next_test); ++ } ++} ++ ++void LIR_Assembler::emit_typecheck_helper(LIR_OpTypeCheck *op, Label* success, ++ Label* failure, Label* obj_is_null) { ++ // we always need a stub for the failure case. ++ CodeStub* stub = op->stub(); ++ Register obj = op->object()->as_register(); ++ Register k_RInfo = op->tmp1()->as_register(); ++ Register klass_RInfo = op->tmp2()->as_register(); ++ Register dst = op->result_opr()->as_register(); ++ ciKlass* k = op->klass(); ++ Register Rtmp1 = noreg; ++ ++ // check if it needs to be profiled ++ ciMethodData* md; ++ ciProfileData* data; ++ ++ const bool should_profile = op->should_profile(); ++ ++ if (should_profile) { ++ ciMethod* method = op->profiled_method(); ++ assert(method != nullptr, "Should have method"); ++ int bci = op->profiled_bci(); ++ md = method->method_data_or_null(); ++ assert(md != nullptr, "Sanity"); ++ data = md->bci_to_data(bci); ++ assert(data != nullptr, "need data for type check"); ++ assert(data->is_ReceiverTypeData(), "need ReceiverTypeData for type check"); ++ } ++ ++ Label profile_cast_success, profile_cast_failure; ++ Label *success_target = should_profile ? &profile_cast_success : success; ++ Label *failure_target = should_profile ? 
&profile_cast_failure : failure; ++ ++ if (obj == k_RInfo) { ++ k_RInfo = dst; ++ } else if (obj == klass_RInfo) { ++ klass_RInfo = dst; ++ } ++ if (k->is_loaded() && !UseCompressedClassPointers) { ++ select_different_registers(obj, dst, k_RInfo, klass_RInfo); ++ } else { ++ Rtmp1 = op->tmp3()->as_register(); ++ select_different_registers(obj, dst, k_RInfo, klass_RInfo, Rtmp1); ++ } ++ ++ assert_different_registers(obj, k_RInfo, klass_RInfo); ++ ++ if (should_profile) { ++ Label not_null; ++ __ bnez(obj, not_null); ++ // Object is null; update MDO and exit ++ Register mdo = klass_RInfo; ++ __ mov_metadata(mdo, md->constant_encoding()); ++ Address data_addr = Address(mdo, md->byte_offset_of_slot(data, DataLayout::flags_offset())); ++ __ ld_bu(SCR2, data_addr); ++ __ ori(SCR2, SCR2, BitData::null_seen_byte_constant()); ++ __ st_b(SCR2, data_addr); ++ __ b(*obj_is_null); ++ __ bind(not_null); ++ } else { ++ __ beqz(obj, *obj_is_null); ++ } ++ ++ if (!k->is_loaded()) { ++ klass2reg_with_patching(k_RInfo, op->info_for_patch()); ++ } else { ++ __ mov_metadata(k_RInfo, k->constant_encoding()); ++ } ++ __ verify_oop(obj); ++ ++ if (op->fast_check()) { ++ // get object class ++ // not a safepoint as obj null check happens earlier ++ __ load_klass(SCR2, obj); ++ __ bne_far(SCR2, k_RInfo, *failure_target); ++ // successful cast, fall through to profile or jump ++ } else { ++ // get object class ++ // not a safepoint as obj null check happens earlier ++ __ load_klass(klass_RInfo, obj); ++ if (k->is_loaded()) { ++ // See if we get an immediate positive hit ++ __ ld_d(SCR1, Address(klass_RInfo, int64_t(k->super_check_offset()))); ++ if ((juint)in_bytes(Klass::secondary_super_cache_offset()) != k->super_check_offset()) { ++ __ bne_far(k_RInfo, SCR1, *failure_target); ++ // successful cast, fall through to profile or jump ++ } else { ++ // See if we get an immediate positive hit ++ __ beq_far(k_RInfo, SCR1, *success_target); ++ // check for self ++ __ beq_far(klass_RInfo, k_RInfo, *success_target); ++ ++ __ addi_d(SP, SP, -2 * wordSize); ++ __ st_d(k_RInfo, Address(SP, 0 * wordSize)); ++ __ st_d(klass_RInfo, Address(SP, 1 * wordSize)); ++ __ call(Runtime1::entry_for(Runtime1::slow_subtype_check_id), relocInfo::runtime_call_type); ++ __ ld_d(klass_RInfo, Address(SP, 0 * wordSize)); ++ __ addi_d(SP, SP, 2 * wordSize); ++ // result is a boolean ++ __ beqz(klass_RInfo, *failure_target); ++ // successful cast, fall through to profile or jump ++ } ++ } else { ++ // perform the fast part of the checking logic ++ __ check_klass_subtype_fast_path(klass_RInfo, k_RInfo, Rtmp1, success_target, failure_target, nullptr); ++ // call out-of-line instance of __ check_klass_subtype_slow_path(...): ++ __ addi_d(SP, SP, -2 * wordSize); ++ __ st_d(k_RInfo, Address(SP, 0 * wordSize)); ++ __ st_d(klass_RInfo, Address(SP, 1 * wordSize)); ++ __ call(Runtime1::entry_for(Runtime1::slow_subtype_check_id), relocInfo::runtime_call_type); ++ __ ld_d(k_RInfo, Address(SP, 0 * wordSize)); ++ __ ld_d(klass_RInfo, Address(SP, 1 * wordSize)); ++ __ addi_d(SP, SP, 2 * wordSize); ++ // result is a boolean ++ __ beqz(k_RInfo, *failure_target); ++ // successful cast, fall through to profile or jump ++ } ++ } ++ if (should_profile) { ++ Register mdo = klass_RInfo, recv = k_RInfo; ++ __ bind(profile_cast_success); ++ __ mov_metadata(mdo, md->constant_encoding()); ++ __ load_klass(recv, obj); ++ Label update_done; ++ type_profile_helper(mdo, md, data, recv, success); ++ __ b(*success); ++ ++ __ bind(profile_cast_failure); ++ __ mov_metadata(mdo, 
md->constant_encoding()); ++ Address counter_addr = Address(mdo, md->byte_offset_of_slot(data, CounterData::count_offset())); ++ __ ld_d(SCR2, counter_addr); ++ __ addi_d(SCR2, SCR2, -DataLayout::counter_increment); ++ __ st_d(SCR2, counter_addr); ++ __ b(*failure); ++ } ++ __ b(*success); ++} ++ ++void LIR_Assembler::emit_opTypeCheck(LIR_OpTypeCheck* op) { ++ const bool should_profile = op->should_profile(); ++ ++ LIR_Code code = op->code(); ++ if (code == lir_store_check) { ++ Register value = op->object()->as_register(); ++ Register array = op->array()->as_register(); ++ Register k_RInfo = op->tmp1()->as_register(); ++ Register klass_RInfo = op->tmp2()->as_register(); ++ Register Rtmp1 = op->tmp3()->as_register(); ++ CodeStub* stub = op->stub(); ++ ++ // check if it needs to be profiled ++ ciMethodData* md; ++ ciProfileData* data; ++ ++ if (should_profile) { ++ ciMethod* method = op->profiled_method(); ++ assert(method != nullptr, "Should have method"); ++ int bci = op->profiled_bci(); ++ md = method->method_data_or_null(); ++ assert(md != nullptr, "Sanity"); ++ data = md->bci_to_data(bci); ++ assert(data != nullptr, "need data for type check"); ++ assert(data->is_ReceiverTypeData(), "need ReceiverTypeData for type check"); ++ } ++ Label profile_cast_success, profile_cast_failure, done; ++ Label *success_target = should_profile ? &profile_cast_success : &done; ++ Label *failure_target = should_profile ? &profile_cast_failure : stub->entry(); ++ ++ if (should_profile) { ++ Label not_null; ++ __ bnez(value, not_null); ++ // Object is null; update MDO and exit ++ Register mdo = klass_RInfo; ++ __ mov_metadata(mdo, md->constant_encoding()); ++ Address data_addr = Address(mdo, md->byte_offset_of_slot(data, DataLayout::flags_offset())); ++ __ ld_bu(SCR2, data_addr); ++ __ ori(SCR2, SCR2, BitData::null_seen_byte_constant()); ++ __ st_b(SCR2, data_addr); ++ __ b(done); ++ __ bind(not_null); ++ } else { ++ __ beqz(value, done); ++ } ++ ++ add_debug_info_for_null_check_here(op->info_for_exception()); ++ __ load_klass(k_RInfo, array); ++ __ load_klass(klass_RInfo, value); ++ ++ // get instance klass (it's already uncompressed) ++ __ ld_d(k_RInfo, Address(k_RInfo, ObjArrayKlass::element_klass_offset())); ++ // perform the fast part of the checking logic ++ __ check_klass_subtype_fast_path(klass_RInfo, k_RInfo, Rtmp1, success_target, failure_target, nullptr); ++ // call out-of-line instance of __ check_klass_subtype_slow_path(...): ++ __ addi_d(SP, SP, -2 * wordSize); ++ __ st_d(k_RInfo, Address(SP, 0 * wordSize)); ++ __ st_d(klass_RInfo, Address(SP, 1 * wordSize)); ++ __ call(Runtime1::entry_for(Runtime1::slow_subtype_check_id), relocInfo::runtime_call_type); ++ __ ld_d(k_RInfo, Address(SP, 0 * wordSize)); ++ __ ld_d(klass_RInfo, Address(SP, 1 * wordSize)); ++ __ addi_d(SP, SP, 2 * wordSize); ++ // result is a boolean ++ __ beqz(k_RInfo, *failure_target); ++ // fall through to the success case ++ ++ if (should_profile) { ++ Register mdo = klass_RInfo, recv = k_RInfo; ++ __ bind(profile_cast_success); ++ __ mov_metadata(mdo, md->constant_encoding()); ++ __ load_klass(recv, value); ++ Label update_done; ++ type_profile_helper(mdo, md, data, recv, &done); ++ __ b(done); ++ ++ __ bind(profile_cast_failure); ++ __ mov_metadata(mdo, md->constant_encoding()); ++ Address counter_addr(mdo, md->byte_offset_of_slot(data, CounterData::count_offset())); ++ __ lea(SCR2, counter_addr); ++ __ ld_d(SCR1, Address(SCR2)); ++ __ addi_d(SCR1, SCR1, -DataLayout::counter_increment); ++ __ st_d(SCR1, Address(SCR2)); ++ __ 
b(*stub->entry()); ++ } ++ ++ __ bind(done); ++ } else if (code == lir_checkcast) { ++ Register obj = op->object()->as_register(); ++ Register dst = op->result_opr()->as_register(); ++ Label success; ++ emit_typecheck_helper(op, &success, op->stub()->entry(), &success); ++ __ bind(success); ++ if (dst != obj) { ++ __ move(dst, obj); ++ } ++ } else if (code == lir_instanceof) { ++ Register obj = op->object()->as_register(); ++ Register dst = op->result_opr()->as_register(); ++ Label success, failure, done; ++ emit_typecheck_helper(op, &success, &failure, &failure); ++ __ bind(failure); ++ __ move(dst, R0); ++ __ b(done); ++ __ bind(success); ++ __ li(dst, 1); ++ __ bind(done); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++void LIR_Assembler::casw(Register addr, Register newval, Register cmpval, Register result, bool sign) { ++ __ cmpxchg32(Address(addr, 0), cmpval, newval, result, sign, ++ /* retold */ false, /* acquire */ true, /* weak */ false, /* exchange */ false); ++ // LA SC equals store-conditional dbar, so no need AnyAny after CAS. ++ //__ membar(__ AnyAny); ++} ++ ++void LIR_Assembler::casl(Register addr, Register newval, Register cmpval, Register result) { ++ __ cmpxchg(Address(addr, 0), cmpval, newval, result, ++ /* retold */ false, /* acquire */ true, /* weak */ false, /* exchange */ false); ++ // LA SC equals store-conditional dbar, so no need AnyAny after CAS. ++ //__ membar(__ AnyAny); ++} ++ ++void LIR_Assembler::emit_compare_and_swap(LIR_OpCompareAndSwap* op) { ++ assert(VM_Version::supports_cx8(), "wrong machine"); ++ Register addr; ++ if (op->addr()->is_register()) { ++ addr = as_reg(op->addr()); ++ } else { ++ assert(op->addr()->is_address(), "what else?"); ++ LIR_Address* addr_ptr = op->addr()->as_address_ptr(); ++ assert(addr_ptr->disp() == 0, "need 0 disp"); ++ assert(addr_ptr->index() == LIR_Opr::illegalOpr(), "need 0 index"); ++ addr = as_reg(addr_ptr->base()); ++ } ++ Register newval = as_reg(op->new_value()); ++ Register cmpval = as_reg(op->cmp_value()); ++ Register result = as_reg(op->result_opr()); ++ ++ if (op->code() == lir_cas_obj) { ++ if (UseCompressedOops) { ++ Register t1 = op->tmp1()->as_register(); ++ assert(op->tmp1()->is_valid(), "must be"); ++ Register t2 = op->tmp2()->as_register(); ++ assert(op->tmp2()->is_valid(), "must be"); ++ ++ __ encode_heap_oop(t1, cmpval); ++ cmpval = t1; ++ __ encode_heap_oop(t2, newval); ++ newval = t2; ++ casw(addr, newval, cmpval, result, false); ++ } else { ++ casl(addr, newval, cmpval, result); ++ } ++ } else if (op->code() == lir_cas_int) { ++ casw(addr, newval, cmpval, result, true); ++ } else { ++ casl(addr, newval, cmpval, result); ++ } ++} ++ ++void LIR_Assembler::cmove(LIR_Condition condition, LIR_Opr src1, LIR_Opr src2, LIR_Opr result, BasicType type, ++ LIR_Opr left, LIR_Opr right) { ++ assert(result->is_single_cpu() || result->is_double_cpu(), "expect single register for result"); ++ assert(left->is_single_cpu() || left->is_double_cpu(), "must be"); ++ Register regd = (result->type() == T_LONG) ? 
result->as_register_lo() : result->as_register(); ++ Register regl = as_reg(left); ++ Register regr = noreg; ++ Register reg1 = noreg; ++ Register reg2 = noreg; ++ jlong immr = 0; ++ ++ // comparison operands ++ if (right->is_single_cpu()) { ++ // cpu register - cpu register ++ regr = right->as_register(); ++ } else if (right->is_double_cpu()) { ++ // cpu register - cpu register ++ regr = right->as_register_lo(); ++ } else if (right->is_constant()) { ++ switch(right->type()) { ++ case T_INT: ++ case T_ADDRESS: ++ immr = right->as_constant_ptr()->as_jint(); ++ break; ++ case T_LONG: ++ immr = right->as_constant_ptr()->as_jlong(); ++ break; ++ case T_METADATA: ++ immr = (intptr_t)right->as_constant_ptr()->as_metadata(); ++ break; ++ case T_OBJECT: ++ case T_ARRAY: ++ if (right->as_constant_ptr()->as_jobject() != nullptr) { ++ regr = SCR1; ++ jobject2reg(right->as_constant_ptr()->as_jobject(), regr); ++ } else { ++ immr = 0; ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ break; ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ ++ if (regr == noreg) { ++ switch (condition) { ++ case lir_cond_equal: ++ case lir_cond_notEqual: ++ if (!Assembler::is_simm(-immr, 12)) { ++ regr = SCR1; ++ __ li(regr, immr); ++ } ++ break; ++ default: ++ if (!Assembler::is_simm(immr, 12)) { ++ regr = SCR1; ++ __ li(regr, immr); ++ } ++ } ++ } ++ ++ // special cases ++ if (src1->is_constant() && src2->is_constant()) { ++ jlong val1 = 0, val2 = 0; ++ if (src1->type() == T_INT && src2->type() == T_INT) { ++ val1 = src1->as_jint(); ++ val2 = src2->as_jint(); ++ } else if (src1->type() == T_LONG && src2->type() == T_LONG) { ++ val1 = src1->as_jlong(); ++ val2 = src2->as_jlong(); ++ } ++ if (val1 == 0 && val2 == 1) { ++ if (regr == noreg) { ++ switch (condition) { ++ case lir_cond_equal: ++ if (immr == 0) { ++ __ sltu(regd, R0, regl); ++ } else { ++ __ addi_d(SCR1, regl, -immr); ++ __ li(regd, 1); ++ __ maskeqz(regd, regd, SCR1); ++ } ++ break; ++ case lir_cond_notEqual: ++ if (immr == 0) { ++ __ sltu(regd, R0, regl); ++ __ xori(regd, regd, 1); ++ } else { ++ __ addi_d(SCR1, regl, -immr); ++ __ li(regd, 1); ++ __ masknez(regd, regd, SCR1); ++ } ++ break; ++ case lir_cond_less: ++ __ slti(regd, regl, immr); ++ __ xori(regd, regd, 1); ++ break; ++ case lir_cond_lessEqual: ++ if (immr == 0) { ++ __ slt(regd, R0, regl); ++ } else { ++ __ li(SCR1, immr); ++ __ slt(regd, SCR1, regl); ++ } ++ break; ++ case lir_cond_greater: ++ if (immr == 0) { ++ __ slt(regd, R0, regl); ++ } else { ++ __ li(SCR1, immr); ++ __ slt(regd, SCR1, regl); ++ } ++ __ xori(regd, regd, 1); ++ break; ++ case lir_cond_greaterEqual: ++ __ slti(regd, regl, immr); ++ break; ++ case lir_cond_belowEqual: ++ if (immr == 0) { ++ __ sltu(regd, R0, regl); ++ } else { ++ __ li(SCR1, immr); ++ __ sltu(regd, SCR1, regl); ++ } ++ break; ++ case lir_cond_aboveEqual: ++ __ sltui(regd, regl, immr); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (condition) { ++ case lir_cond_equal: ++ __ sub_d(SCR1, regl, regr); ++ __ li(regd, 1); ++ __ maskeqz(regd, regd, SCR1); ++ break; ++ case lir_cond_notEqual: ++ __ sub_d(SCR1, regl, regr); ++ __ li(regd, 1); ++ __ masknez(regd, regd, SCR1); ++ break; ++ case lir_cond_less: ++ __ slt(regd, regl, regr); ++ __ xori(regd, regd, 1); ++ break; ++ case lir_cond_lessEqual: ++ __ slt(regd, regr, regl); ++ break; ++ case lir_cond_greater: ++ __ slt(regd, regr, regl); ++ __ xori(regd, regd, 1); ++ break; ++ case lir_cond_greaterEqual: ++ __ slt(regd, regl, regr); ++ break; ++ case lir_cond_belowEqual: ++ __ 
sltu(regd, regr, regl); ++ break; ++ case lir_cond_aboveEqual: ++ __ sltu(regd, regl, regr); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ return; ++ } else if (val1 == 1 && val2 == 0) { ++ if (regr == noreg) { ++ switch (condition) { ++ case lir_cond_equal: ++ if (immr == 0) { ++ __ sltu(regd, R0, regl); ++ __ xori(regd, regd, 1); ++ } else { ++ __ addi_d(SCR1, regl, -immr); ++ __ li(regd, 1); ++ __ masknez(regd, regd, SCR1); ++ } ++ break; ++ case lir_cond_notEqual: ++ if (immr == 0) { ++ __ sltu(regd, R0, regl); ++ } else { ++ __ addi_d(SCR1, regl, -immr); ++ __ li(regd, 1); ++ __ maskeqz(regd, regd, SCR1); ++ } ++ break; ++ case lir_cond_less: ++ __ slti(regd, regl, immr); ++ break; ++ case lir_cond_lessEqual: ++ if (immr == 0) { ++ __ slt(regd, R0, regl); ++ } else { ++ __ li(SCR1, immr); ++ __ slt(regd, SCR1, regl); ++ } ++ __ xori(regd, regd, 1); ++ break; ++ case lir_cond_greater: ++ if (immr == 0) { ++ __ slt(regd, R0, regl); ++ } else { ++ __ li(SCR1, immr); ++ __ slt(regd, SCR1, regl); ++ } ++ break; ++ case lir_cond_greaterEqual: ++ __ slti(regd, regl, immr); ++ __ xori(regd, regd, 1); ++ break; ++ case lir_cond_belowEqual: ++ if (immr == 0) { ++ __ sltu(regd, R0, regl); ++ } else { ++ __ li(SCR1, immr); ++ __ sltu(regd, SCR1, regl); ++ } ++ __ xori(regd, regd, 1); ++ break; ++ case lir_cond_aboveEqual: ++ __ sltui(regd, regl, immr); ++ __ xori(regd, regd, 1); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (condition) { ++ case lir_cond_equal: ++ __ sub_d(SCR1, regl, regr); ++ __ li(regd, 1); ++ __ masknez(regd, regd, SCR1); ++ break; ++ case lir_cond_notEqual: ++ __ sub_d(SCR1, regl, regr); ++ __ li(regd, 1); ++ __ maskeqz(regd, regd, SCR1); ++ break; ++ case lir_cond_less: ++ __ slt(regd, regl, regr); ++ break; ++ case lir_cond_lessEqual: ++ __ slt(regd, regr, regl); ++ __ xori(regd, regd, 1); ++ break; ++ case lir_cond_greater: ++ __ slt(regd, regr, regl); ++ break; ++ case lir_cond_greaterEqual: ++ __ slt(regd, regl, regr); ++ __ xori(regd, regd, 1); ++ break; ++ case lir_cond_belowEqual: ++ __ sltu(regd, regr, regl); ++ __ xori(regd, regd, 1); ++ break; ++ case lir_cond_aboveEqual: ++ __ sltu(regd, regl, regr); ++ __ xori(regd, regd, 1); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ return; ++ } ++ } ++ ++ // cmp ++ if (regr == noreg) { ++ switch (condition) { ++ case lir_cond_equal: ++ __ addi_d(SCR2, regl, -immr); ++ break; ++ case lir_cond_notEqual: ++ __ addi_d(SCR2, regl, -immr); ++ break; ++ case lir_cond_less: ++ __ slti(SCR2, regl, immr); ++ break; ++ case lir_cond_lessEqual: ++ __ li(SCR1, immr); ++ __ slt(SCR2, SCR1, regl); ++ break; ++ case lir_cond_greater: ++ __ li(SCR1, immr); ++ __ slt(SCR2, SCR1, regl); ++ break; ++ case lir_cond_greaterEqual: ++ __ slti(SCR2, regl, immr); ++ break; ++ case lir_cond_belowEqual: ++ __ li(SCR1, immr); ++ __ sltu(SCR2, SCR1, regl); ++ break; ++ case lir_cond_aboveEqual: ++ __ sltui(SCR2, regl, immr); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (condition) { ++ case lir_cond_equal: ++ __ sub_d(SCR2, regl, regr); ++ break; ++ case lir_cond_notEqual: ++ __ sub_d(SCR2, regl, regr); ++ break; ++ case lir_cond_less: ++ __ slt(SCR2, regl, regr); ++ break; ++ case lir_cond_lessEqual: ++ __ slt(SCR2, regr, regl); ++ break; ++ case lir_cond_greater: ++ __ slt(SCR2, regr, regl); ++ break; ++ case lir_cond_greaterEqual: ++ __ slt(SCR2, regl, regr); ++ break; ++ case lir_cond_belowEqual: ++ __ sltu(SCR2, regr, regl); ++ break; ++ case lir_cond_aboveEqual: 
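++      // aboveEqual, register form: SCR2 = (regl <u regr), so SCR2 == 0 exactly when
++      // the condition holds; the masknez/maskeqz pair in the cmove section below
++      // then selects reg1 (src1) for a zero SCR2 and reg2 (src2) otherwise.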
++ __ sltu(SCR2, regl, regr); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ ++ // value operands ++ if (src1->is_stack()) { ++ stack2reg(src1, result, result->type()); ++ reg1 = regd; ++ } else if (src1->is_constant()) { ++ const2reg(src1, result, lir_patch_none, nullptr); ++ reg1 = regd; ++ } else { ++ reg1 = (src1->type() == T_LONG) ? src1->as_register_lo() : src1->as_register(); ++ } ++ ++ if (src2->is_stack()) { ++ stack2reg(src2, FrameMap::scr1_opr, result->type()); ++ reg2 = SCR1; ++ } else if (src2->is_constant()) { ++ LIR_Opr tmp = src2->type() == T_LONG ? FrameMap::scr1_long_opr : FrameMap::scr1_opr; ++ const2reg(src2, tmp, lir_patch_none, nullptr); ++ reg2 = SCR1; ++ } else { ++ reg2 = (src2->type() == T_LONG) ? src2->as_register_lo() : src2->as_register(); ++ } ++ ++ // cmove ++ switch (condition) { ++ case lir_cond_equal: ++ __ masknez(regd, reg1, SCR2); ++ __ maskeqz(SCR2, reg2, SCR2); ++ break; ++ case lir_cond_notEqual: ++ __ maskeqz(regd, reg1, SCR2); ++ __ masknez(SCR2, reg2, SCR2); ++ break; ++ case lir_cond_less: ++ __ maskeqz(regd, reg1, SCR2); ++ __ masknez(SCR2, reg2, SCR2); ++ break; ++ case lir_cond_lessEqual: ++ __ masknez(regd, reg1, SCR2); ++ __ maskeqz(SCR2, reg2, SCR2); ++ break; ++ case lir_cond_greater: ++ __ maskeqz(regd, reg1, SCR2); ++ __ masknez(SCR2, reg2, SCR2); ++ break; ++ case lir_cond_greaterEqual: ++ __ masknez(regd, reg1, SCR2); ++ __ maskeqz(SCR2, reg2, SCR2); ++ break; ++ case lir_cond_belowEqual: ++ __ masknez(regd, reg1, SCR2); ++ __ maskeqz(SCR2, reg2, SCR2); ++ break; ++ case lir_cond_aboveEqual: ++ __ masknez(regd, reg1, SCR2); ++ __ maskeqz(SCR2, reg2, SCR2); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ ++ __ OR(regd, regd, SCR2); ++} ++ ++void LIR_Assembler::arith_op(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Opr dest, ++ CodeEmitInfo* info, bool pop_fpu_stack) { ++ assert(info == nullptr, "should never be used, idiv/irem and ldiv/lrem not handled by this method"); ++ ++ if (left->is_single_cpu()) { ++ Register lreg = left->as_register(); ++ Register dreg = as_reg(dest); ++ ++ if (right->is_single_cpu()) { ++ // cpu register - cpu register ++ assert(left->type() == T_INT && right->type() == T_INT && dest->type() == T_INT, "should be"); ++ Register rreg = right->as_register(); ++ switch (code) { ++ case lir_add: __ add_w (dest->as_register(), lreg, rreg); break; ++ case lir_sub: __ sub_w (dest->as_register(), lreg, rreg); break; ++ case lir_mul: __ mul_w (dest->as_register(), lreg, rreg); break; ++ default: ShouldNotReachHere(); ++ } ++ } else if (right->is_double_cpu()) { ++ Register rreg = right->as_register_lo(); ++ // single_cpu + double_cpu: can happen with obj+long ++ assert(code == lir_add || code == lir_sub, "mismatched arithmetic op"); ++ switch (code) { ++ case lir_add: __ add_d(dreg, lreg, rreg); break; ++ case lir_sub: __ sub_d(dreg, lreg, rreg); break; ++ default: ShouldNotReachHere(); ++ } ++ } else if (right->is_constant()) { ++ // cpu register - constant ++ jlong c; ++ ++ // FIXME: This is fugly: we really need to factor all this logic. 
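++      // The switch below only normalizes the constant operand to a jlong; the width
++      // of the actual immediate add/sub (addi_w vs. addi_d) is chosen afterwards by
++      // the switch on left->type().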
++ switch(right->type()) { ++ case T_LONG: ++ c = right->as_constant_ptr()->as_jlong(); ++ break; ++ case T_INT: ++ case T_ADDRESS: ++ c = right->as_constant_ptr()->as_jint(); ++ break; ++ default: ++ ShouldNotReachHere(); ++ c = 0; // unreachable ++ break; ++ } ++ ++ assert(code == lir_add || code == lir_sub, "mismatched arithmetic op"); ++ if (c == 0 && dreg == lreg) { ++ COMMENT("effective nop elided"); ++ return; ++ } ++ ++ switch(left->type()) { ++ case T_INT: ++ switch (code) { ++ case lir_add: __ addi_w(dreg, lreg, c); break; ++ case lir_sub: __ addi_w(dreg, lreg, -c); break; ++ default: ShouldNotReachHere(); ++ } ++ break; ++ case T_OBJECT: ++ case T_ADDRESS: ++ switch (code) { ++ case lir_add: __ addi_d(dreg, lreg, c); break; ++ case lir_sub: __ addi_d(dreg, lreg, -c); break; ++ default: ShouldNotReachHere(); ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ } else if (left->is_double_cpu()) { ++ Register lreg_lo = left->as_register_lo(); ++ ++ if (right->is_double_cpu()) { ++ // cpu register - cpu register ++ Register rreg_lo = right->as_register_lo(); ++ switch (code) { ++ case lir_add: __ add_d(dest->as_register_lo(), lreg_lo, rreg_lo); break; ++ case lir_sub: __ sub_d(dest->as_register_lo(), lreg_lo, rreg_lo); break; ++ case lir_mul: __ mul_d(dest->as_register_lo(), lreg_lo, rreg_lo); break; ++ case lir_div: __ div_d(dest->as_register_lo(), lreg_lo, rreg_lo); break; ++ case lir_rem: __ mod_d(dest->as_register_lo(), lreg_lo, rreg_lo); break; ++ default: ShouldNotReachHere(); ++ } ++ ++ } else if (right->is_constant()) { ++ jlong c = right->as_constant_ptr()->as_jlong(); ++ Register dreg = as_reg(dest); ++ switch (code) { ++ case lir_add: ++ case lir_sub: ++ if (c == 0 && dreg == lreg_lo) { ++ COMMENT("effective nop elided"); ++ return; ++ } ++ code == lir_add ? 
__ addi_d(dreg, lreg_lo, c) : __ addi_d(dreg, lreg_lo, -c); ++ break; ++ case lir_div: ++ assert(c > 0 && is_power_of_2(c), "divisor must be power-of-2 constant"); ++ if (c == 1) { ++ // move lreg_lo to dreg if divisor is 1 ++ __ move(dreg, lreg_lo); ++ } else { ++ unsigned int shift = log2i_exact(c); ++ // use scr1 as intermediate result register ++ __ srai_d(SCR1, lreg_lo, 63); ++ __ srli_d(SCR1, SCR1, 64 - shift); ++ __ add_d(SCR1, lreg_lo, SCR1); ++ __ srai_d(dreg, SCR1, shift); ++ } ++ break; ++ case lir_rem: ++ assert(c > 0 && is_power_of_2(c), "divisor must be power-of-2 constant"); ++ if (c == 1) { ++ // move 0 to dreg if divisor is 1 ++ __ move(dreg, R0); ++ } else { ++ // use scr1/2 as intermediate result register ++ __ sub_d(SCR1, R0, lreg_lo); ++ __ slt(SCR2, SCR1, R0); ++ __ andi(dreg, lreg_lo, c - 1); ++ __ andi(SCR1, SCR1, c - 1); ++ __ sub_d(SCR1, R0, SCR1); ++ __ maskeqz(dreg, dreg, SCR2); ++ __ masknez(SCR1, SCR1, SCR2); ++ __ OR(dreg, dreg, SCR1); ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ } else if (left->is_single_fpu()) { ++ assert(right->is_single_fpu(), "right hand side of float arithmetics needs to be float register"); ++ switch (code) { ++ case lir_add: __ fadd_s (dest->as_float_reg(), left->as_float_reg(), right->as_float_reg()); break; ++ case lir_sub: __ fsub_s (dest->as_float_reg(), left->as_float_reg(), right->as_float_reg()); break; ++ case lir_mul: __ fmul_s (dest->as_float_reg(), left->as_float_reg(), right->as_float_reg()); break; ++ case lir_div: __ fdiv_s (dest->as_float_reg(), left->as_float_reg(), right->as_float_reg()); break; ++ default: ShouldNotReachHere(); ++ } ++ } else if (left->is_double_fpu()) { ++ if (right->is_double_fpu()) { ++ // fpu register - fpu register ++ switch (code) { ++ case lir_add: __ fadd_d (dest->as_double_reg(), left->as_double_reg(), right->as_double_reg()); break; ++ case lir_sub: __ fsub_d (dest->as_double_reg(), left->as_double_reg(), right->as_double_reg()); break; ++ case lir_mul: __ fmul_d (dest->as_double_reg(), left->as_double_reg(), right->as_double_reg()); break; ++ case lir_div: __ fdiv_d (dest->as_double_reg(), left->as_double_reg(), right->as_double_reg()); break; ++ default: ShouldNotReachHere(); ++ } ++ } else { ++ if (right->is_constant()) { ++ ShouldNotReachHere(); ++ } ++ ShouldNotReachHere(); ++ } ++ } else if (left->is_single_stack() || left->is_address()) { ++ assert(left == dest, "left and dest must be equal"); ++ ShouldNotReachHere(); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++void LIR_Assembler::arith_fpu_implementation(LIR_Code code, int left_index, int right_index, ++ int dest_index, bool pop_fpu_stack) { ++ Unimplemented(); ++} ++ ++void LIR_Assembler::intrinsic_op(LIR_Code code, LIR_Opr value, LIR_Opr tmp, LIR_Opr dest, LIR_Op* op) { ++ switch(code) { ++ case lir_abs : __ fabs_d(dest->as_double_reg(), value->as_double_reg()); break; ++ case lir_sqrt: __ fsqrt_d(dest->as_double_reg(), value->as_double_reg()); break; ++ case lir_f2hf: __ flt_to_flt16(dest->as_register(), value->as_float_reg(), tmp->as_float_reg()); break; ++ case lir_hf2f: __ flt16_to_flt(dest->as_float_reg(), value->as_register(), tmp->as_float_reg()); break; ++ default : ShouldNotReachHere(); ++ } ++} ++ ++void LIR_Assembler::logic_op(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Opr dst) { ++ assert(left->is_single_cpu() || left->is_double_cpu(), "expect single or double register"); ++ Register Rleft = left->is_single_cpu() ? 
left->as_register() : left->as_register_lo(); ++ ++ if (dst->is_single_cpu()) { ++ Register Rdst = dst->as_register(); ++ if (right->is_constant()) { ++ switch (code) { ++ case lir_logic_and: ++ if (Assembler::is_uimm(right->as_jint(), 12)) { ++ __ andi(Rdst, Rleft, right->as_jint()); ++ } else { ++ __ li(AT, right->as_jint()); ++ __ AND(Rdst, Rleft, AT); ++ } ++ break; ++ case lir_logic_or: __ ori(Rdst, Rleft, right->as_jint()); break; ++ case lir_logic_xor: __ xori(Rdst, Rleft, right->as_jint()); break; ++ default: ShouldNotReachHere(); break; ++ } ++ } else { ++ Register Rright = right->is_single_cpu() ? right->as_register() : right->as_register_lo(); ++ switch (code) { ++ case lir_logic_and: __ AND(Rdst, Rleft, Rright); break; ++ case lir_logic_or: __ OR(Rdst, Rleft, Rright); break; ++ case lir_logic_xor: __ XOR(Rdst, Rleft, Rright); break; ++ default: ShouldNotReachHere(); break; ++ } ++ } ++ } else { ++ Register Rdst = dst->as_register_lo(); ++ if (right->is_constant()) { ++ switch (code) { ++ case lir_logic_and: ++ if (Assembler::is_uimm(right->as_jlong(), 12)) { ++ __ andi(Rdst, Rleft, right->as_jlong()); ++ } else { ++ // We can guarantee that transform from HIR LogicOp is in range of ++ // uimm(12), but the common code directly generates LIR LogicAnd, ++ // and the right-operand is mask with all ones in the high bits. ++ __ li(AT, right->as_jlong()); ++ __ AND(Rdst, Rleft, AT); ++ } ++ break; ++ case lir_logic_or: __ ori(Rdst, Rleft, right->as_jlong()); break; ++ case lir_logic_xor: __ xori(Rdst, Rleft, right->as_jlong()); break; ++ default: ShouldNotReachHere(); break; ++ } ++ } else { ++ Register Rright = right->is_single_cpu() ? right->as_register() : right->as_register_lo(); ++ switch (code) { ++ case lir_logic_and: __ AND(Rdst, Rleft, Rright); break; ++ case lir_logic_or: __ OR(Rdst, Rleft, Rright); break; ++ case lir_logic_xor: __ XOR(Rdst, Rleft, Rright); break; ++ default: ShouldNotReachHere(); break; ++ } ++ } ++ } ++} ++ ++void LIR_Assembler::arithmetic_idiv(LIR_Code code, LIR_Opr left, LIR_Opr right, ++ LIR_Opr illegal, LIR_Opr result, CodeEmitInfo* info) { ++ // opcode check ++ assert((code == lir_idiv) || (code == lir_irem), "opcode must be idiv or irem"); ++ bool is_irem = (code == lir_irem); ++ ++ // operand check ++ assert(left->is_single_cpu(), "left must be register"); ++ assert(right->is_single_cpu() || right->is_constant(), "right must be register or constant"); ++ assert(result->is_single_cpu(), "result must be register"); ++ Register lreg = left->as_register(); ++ Register dreg = result->as_register(); ++ ++ // power-of-2 constant check and codegen ++ if (right->is_constant()) { ++ int c = right->as_constant_ptr()->as_jint(); ++ assert(c > 0 && is_power_of_2(c), "divisor must be power-of-2 constant"); ++ if (is_irem) { ++ if (c == 1) { ++ // move 0 to dreg if divisor is 1 ++ __ move(dreg, R0); ++ } else { ++ // use scr1/2 as intermediate result register ++ __ sub_w(SCR1, R0, lreg); ++ __ slt(SCR2, SCR1, R0); ++ __ andi(dreg, lreg, c - 1); ++ __ andi(SCR1, SCR1, c - 1); ++ __ sub_w(SCR1, R0, SCR1); ++ __ maskeqz(dreg, dreg, SCR2); ++ __ masknez(SCR1, SCR1, SCR2); ++ __ OR(dreg, dreg, SCR1); ++ } ++ } else { ++ if (c == 1) { ++ // move lreg to dreg if divisor is 1 ++ __ move(dreg, lreg); ++ } else { ++ unsigned int shift = exact_log2(c); ++ // use scr1 as intermediate result register ++ __ srai_w(SCR1, lreg, 31); ++ __ srli_w(SCR1, SCR1, 32 - shift); ++ __ add_w(SCR1, lreg, SCR1); ++ __ srai_w(dreg, SCR1, shift); ++ } ++ } ++ } else { ++ Register rreg = 
right->as_register(); ++ if (is_irem) ++ __ mod_w(dreg, lreg, rreg); ++ else ++ __ div_w(dreg, lreg, rreg); ++ } ++} ++ ++void LIR_Assembler::comp_op(LIR_Condition condition, LIR_Opr opr1, LIR_Opr opr2, LIR_Op2* op) { ++ Unimplemented(); ++} ++ ++void LIR_Assembler::comp_fl2i(LIR_Code code, LIR_Opr left, LIR_Opr right, LIR_Opr dst, LIR_Op2* op){ ++ if (code == lir_cmp_fd2i || code == lir_ucmp_fd2i) { ++ bool is_unordered_less = (code == lir_ucmp_fd2i); ++ if (left->is_single_fpu()) { ++ if (is_unordered_less) { ++ __ fcmp_clt_s(FCC0, right->as_float_reg(), left->as_float_reg()); ++ __ fcmp_cult_s(FCC1, left->as_float_reg(), right->as_float_reg()); ++ } else { ++ __ fcmp_cult_s(FCC0, right->as_float_reg(), left->as_float_reg()); ++ __ fcmp_clt_s(FCC1, left->as_float_reg(), right->as_float_reg()); ++ } ++ } else if (left->is_double_fpu()) { ++ if (is_unordered_less) { ++ __ fcmp_clt_d(FCC0, right->as_double_reg(), left->as_double_reg()); ++ __ fcmp_cult_d(FCC1, left->as_double_reg(), right->as_double_reg()); ++ } else { ++ __ fcmp_cult_d(FCC0, right->as_double_reg(), left->as_double_reg()); ++ __ fcmp_clt_d(FCC1, left->as_double_reg(), right->as_double_reg()); ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ if (UseCF2GR) { ++ __ movcf2gr(dst->as_register(), FCC0); ++ __ movcf2gr(SCR1, FCC1); ++ } else { ++ LIR_Opr tmp = op->tmp1_opr(); ++ __ movcf2fr(tmp->as_float_reg(), FCC0); ++ __ movfr2gr_s(dst->as_register(), tmp->as_float_reg()); ++ __ movcf2fr(tmp->as_float_reg(), FCC1); ++ __ movfr2gr_s(SCR1, tmp->as_float_reg()); ++ } ++ __ sub_d(dst->as_register(), dst->as_register(), SCR1); ++ } else if (code == lir_cmp_l2i) { ++ __ slt(SCR1, left->as_register_lo(), right->as_register_lo()); ++ __ slt(dst->as_register(), right->as_register_lo(), left->as_register_lo()); ++ __ sub_d(dst->as_register(), dst->as_register(), SCR1); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++void LIR_Assembler::align_call(LIR_Code code) {} ++ ++void LIR_Assembler::call(LIR_OpJavaCall* op, relocInfo::relocType rtype) { ++ address call = __ trampoline_call(AddressLiteral(op->addr(), rtype)); ++ if (call == nullptr) { ++ bailout("trampoline stub overflow"); ++ return; ++ } ++ add_call_info(code_offset(), op->info()); ++ __ post_call_nop(); ++} ++ ++void LIR_Assembler::ic_call(LIR_OpJavaCall* op) { ++ address call = __ ic_call(op->addr()); ++ if (call == nullptr) { ++ bailout("trampoline stub overflow"); ++ return; ++ } ++ add_call_info(code_offset(), op->info()); ++ __ post_call_nop(); ++} ++ ++void LIR_Assembler::emit_static_call_stub() { ++ address call_pc = __ pc(); ++ address stub = __ start_a_stub(call_stub_size()); ++ if (stub == nullptr) { ++ bailout("static call stub overflow"); ++ return; ++ } ++ ++ int start = __ offset(); ++ ++ __ relocate(static_stub_Relocation::spec(call_pc)); ++ ++ // Code stream for loading method may be changed. ++ __ ibar(0); ++ ++ // Rmethod contains Method*, it should be relocated for GC ++ // static stub relocation also tags the Method* in the code-stream. 
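++  // Both the Method* loaded below and the patchable jump target are placeholders;
++  // they are patched with the real callee and its entry point once the static call
++  // is resolved.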
++ __ mov_metadata(Rmethod, nullptr); ++ // This is recognized as unresolved by relocs/nativeInst/ic code ++ __ patchable_jump(__ pc()); ++ ++ assert(__ offset() - start + CompiledStaticCall::to_trampoline_stub_size() <= call_stub_size(), ++ "stub too big"); ++ __ end_a_stub(); ++} ++ ++void LIR_Assembler::throw_op(LIR_Opr exceptionPC, LIR_Opr exceptionOop, CodeEmitInfo* info) { ++ assert(exceptionOop->as_register() == A0, "must match"); ++ assert(exceptionPC->as_register() == A1, "must match"); ++ ++ // exception object is not added to oop map by LinearScan ++ // (LinearScan assumes that no oops are in fixed registers) ++ info->add_register_oop(exceptionOop); ++ Runtime1::StubID unwind_id; ++ ++ // get current pc information ++ // pc is only needed if the method has an exception handler, the unwind code does not need it. ++ if (compilation()->debug_info_recorder()->last_pc_offset() == __ offset()) { ++ // As no instructions have been generated yet for this LIR node it's ++ // possible that an oop map already exists for the current offset. ++ // In that case insert an dummy NOP here to ensure all oop map PCs ++ // are unique. See JDK-8237483. ++ __ nop(); ++ } ++ Label L; ++ int pc_for_athrow_offset = __ offset(); ++ __ bind(L); ++ __ lipc(exceptionPC->as_register(), L); ++ add_call_info(pc_for_athrow_offset, info); // for exception handler ++ ++ __ verify_not_null_oop(A0); ++ // search an exception handler (A0: exception oop, A1: throwing pc) ++ if (compilation()->has_fpu_code()) { ++ unwind_id = Runtime1::handle_exception_id; ++ } else { ++ unwind_id = Runtime1::handle_exception_nofpu_id; ++ } ++ __ call(Runtime1::entry_for(unwind_id), relocInfo::runtime_call_type); ++ ++ // FIXME: enough room for two byte trap ???? ++ __ nop(); ++} ++ ++void LIR_Assembler::unwind_op(LIR_Opr exceptionOop) { ++ assert(exceptionOop->as_register() == A0, "must match"); ++ __ b(_unwind_handler_entry); ++} ++ ++void LIR_Assembler::shift_op(LIR_Code code, LIR_Opr left, LIR_Opr count, LIR_Opr dest, LIR_Opr tmp) { ++ Register lreg = left->is_single_cpu() ? left->as_register() : left->as_register_lo(); ++ Register dreg = dest->is_single_cpu() ? dest->as_register() : dest->as_register_lo(); ++ ++ switch (left->type()) { ++ case T_INT: { ++ switch (code) { ++ case lir_shl: __ sll_w(dreg, lreg, count->as_register()); break; ++ case lir_shr: __ sra_w(dreg, lreg, count->as_register()); break; ++ case lir_ushr: __ srl_w(dreg, lreg, count->as_register()); break; ++ default: ShouldNotReachHere(); break; ++ } ++ break; ++ case T_LONG: ++ case T_ADDRESS: ++ case T_OBJECT: ++ switch (code) { ++ case lir_shl: __ sll_d(dreg, lreg, count->as_register()); break; ++ case lir_shr: __ sra_d(dreg, lreg, count->as_register()); break; ++ case lir_ushr: __ srl_d(dreg, lreg, count->as_register()); break; ++ default: ShouldNotReachHere(); break; ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ break; ++ } ++ } ++} ++ ++void LIR_Assembler::shift_op(LIR_Code code, LIR_Opr left, jint count, LIR_Opr dest) { ++ Register dreg = dest->is_single_cpu() ? dest->as_register() : dest->as_register_lo(); ++ Register lreg = left->is_single_cpu() ? 
left->as_register() : left->as_register_lo(); ++ ++ switch (left->type()) { ++ case T_INT: { ++ switch (code) { ++ case lir_shl: __ slli_w(dreg, lreg, count); break; ++ case lir_shr: __ srai_w(dreg, lreg, count); break; ++ case lir_ushr: __ srli_w(dreg, lreg, count); break; ++ default: ShouldNotReachHere(); break; ++ } ++ break; ++ case T_LONG: ++ case T_ADDRESS: ++ case T_OBJECT: ++ switch (code) { ++ case lir_shl: __ slli_d(dreg, lreg, count); break; ++ case lir_shr: __ srai_d(dreg, lreg, count); break; ++ case lir_ushr: __ srli_d(dreg, lreg, count); break; ++ default: ShouldNotReachHere(); break; ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ break; ++ } ++ } ++} ++ ++void LIR_Assembler::store_parameter(Register r, int offset_from_sp_in_words) { ++ assert(offset_from_sp_in_words >= 0, "invalid offset from sp"); ++ int offset_from_sp_in_bytes = offset_from_sp_in_words * BytesPerWord; ++ assert(offset_from_sp_in_bytes < frame_map()->reserved_argument_area_size(), "invalid offset"); ++ __ st_d(r, Address(SP, offset_from_sp_in_bytes)); ++} ++ ++void LIR_Assembler::store_parameter(jint c, int offset_from_sp_in_words) { ++ assert(offset_from_sp_in_words >= 0, "invalid offset from sp"); ++ int offset_from_sp_in_bytes = offset_from_sp_in_words * BytesPerWord; ++ assert(offset_from_sp_in_bytes < frame_map()->reserved_argument_area_size(), "invalid offset"); ++ __ li(SCR2, c); ++ __ st_d(SCR2, Address(SP, offset_from_sp_in_bytes)); ++} ++ ++void LIR_Assembler::store_parameter(jobject o, int offset_from_sp_in_words) { ++ ShouldNotReachHere(); ++} ++ ++// This code replaces a call to arraycopy; no exception may ++// be thrown in this code, they must be thrown in the System.arraycopy ++// activation frame; we could save some checks if this would not be the case ++void LIR_Assembler::emit_arraycopy(LIR_OpArrayCopy* op) { ++ ++ ciArrayKlass* default_type = op->expected_type(); ++ Register src = op->src()->as_register(); ++ Register dst = op->dst()->as_register(); ++ Register src_pos = op->src_pos()->as_register(); ++ Register dst_pos = op->dst_pos()->as_register(); ++ Register length = op->length()->as_register(); ++ Register tmp = op->tmp()->as_register(); ++ ++ CodeStub* stub = op->stub(); ++ int flags = op->flags(); ++ BasicType basic_type = default_type != nullptr ? 
default_type->element_type()->basic_type() : T_ILLEGAL; ++ if (is_reference_type(basic_type)) ++ basic_type = T_OBJECT; ++ ++ // if we don't know anything, just go through the generic arraycopy ++ if (default_type == nullptr) { ++ Label done; ++ assert(src == j_rarg0 && src_pos == j_rarg1, "mismatch in calling convention"); ++ ++ // Save the arguments in case the generic arraycopy fails and we ++ // have to fall back to the JNI stub ++ __ st_d(dst, Address(SP, 0 * BytesPerWord)); ++ __ st_d(dst_pos, Address(SP, 1 * BytesPerWord)); ++ __ st_d(length, Address(SP, 2 * BytesPerWord)); ++ __ st_d(src_pos, Address(SP, 3 * BytesPerWord)); ++ __ st_d(src, Address(SP, 4 * BytesPerWord)); ++ ++ address copyfunc_addr = StubRoutines::generic_arraycopy(); ++ assert(copyfunc_addr != nullptr, "generic arraycopy stub required"); ++ ++ // The arguments are in java calling convention so we shift them ++ // to C convention ++ assert_different_registers(A4, j_rarg0, j_rarg1, j_rarg2, j_rarg3); ++ __ move(A4, j_rarg4); ++ assert_different_registers(A3, j_rarg0, j_rarg1, j_rarg2); ++ __ move(A3, j_rarg3); ++ assert_different_registers(A2, j_rarg0, j_rarg1); ++ __ move(A2, j_rarg2); ++ assert_different_registers(A1, j_rarg0); ++ __ move(A1, j_rarg1); ++ __ move(A0, j_rarg0); ++#ifndef PRODUCT ++ if (PrintC1Statistics) { ++ __ li(SCR2, (address)&Runtime1::_generic_arraycopystub_cnt); ++ __ increment(SCR2, 1); ++ } ++#endif ++ __ call(copyfunc_addr, relocInfo::runtime_call_type); ++ ++ __ beqz(A0, *stub->continuation()); ++ __ move(tmp, A0); ++ ++ // Reload values from the stack so they are where the stub ++ // expects them. ++ __ ld_d(dst, Address(SP, 0 * BytesPerWord)); ++ __ ld_d(dst_pos, Address(SP, 1 * BytesPerWord)); ++ __ ld_d(length, Address(SP, 2 * BytesPerWord)); ++ __ ld_d(src_pos, Address(SP, 3 * BytesPerWord)); ++ __ ld_d(src, Address(SP, 4 * BytesPerWord)); ++ ++ // tmp is -1^K where K == partial copied count ++ __ nor(SCR1, tmp, R0); ++ // adjust length down and src/end pos up by partial copied count ++ __ sub_w(length, length, SCR1); ++ __ add_w(src_pos, src_pos, SCR1); ++ __ add_w(dst_pos, dst_pos, SCR1); ++ __ b(*stub->entry()); ++ ++ __ bind(*stub->continuation()); ++ return; ++ } ++ ++ assert(default_type != nullptr && default_type->is_array_klass() && default_type->is_loaded(), ++ "must be true at this point"); ++ ++ int elem_size = type2aelembytes(basic_type); ++ Address::ScaleFactor scale = Address::times(elem_size); ++ ++ Address src_length_addr = Address(src, arrayOopDesc::length_offset_in_bytes()); ++ Address dst_length_addr = Address(dst, arrayOopDesc::length_offset_in_bytes()); ++ Address src_klass_addr = Address(src, oopDesc::klass_offset_in_bytes()); ++ Address dst_klass_addr = Address(dst, oopDesc::klass_offset_in_bytes()); ++ ++ // test for null ++ if (flags & LIR_OpArrayCopy::src_null_check) { ++ __ beqz(src, *stub->entry()); ++ } ++ if (flags & LIR_OpArrayCopy::dst_null_check) { ++ __ beqz(dst, *stub->entry()); ++ } ++ ++ // If the compiler was not able to prove that exact type of the source or the destination ++ // of the arraycopy is an array type, check at runtime if the source or the destination is ++ // an instance type. 
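++  // Array klasses have a negative layout helper, so a value >= _lh_neutral_value
++  // (an instance or neutral klass) means "not an array" and is sent to the slow stub.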
++ if (flags & LIR_OpArrayCopy::type_check) { ++ if (!(flags & LIR_OpArrayCopy::LIR_OpArrayCopy::dst_objarray)) { ++ __ load_klass(tmp, dst); ++ __ ld_w(SCR1, Address(tmp, in_bytes(Klass::layout_helper_offset()))); ++ __ li(SCR2, Klass::_lh_neutral_value); ++ __ bge_far(SCR1, SCR2, *stub->entry(), true); ++ } ++ ++ if (!(flags & LIR_OpArrayCopy::LIR_OpArrayCopy::src_objarray)) { ++ __ load_klass(tmp, src); ++ __ ld_w(SCR1, Address(tmp, in_bytes(Klass::layout_helper_offset()))); ++ __ li(SCR2, Klass::_lh_neutral_value); ++ __ bge_far(SCR1, SCR2, *stub->entry(), true); ++ } ++ } ++ ++ // check if negative ++ if (flags & LIR_OpArrayCopy::src_pos_positive_check) { ++ __ blt_far(src_pos, R0, *stub->entry(), true); ++ } ++ if (flags & LIR_OpArrayCopy::dst_pos_positive_check) { ++ __ blt_far(dst_pos, R0, *stub->entry(), true); ++ } ++ ++ if (flags & LIR_OpArrayCopy::length_positive_check) { ++ __ blt_far(length, R0, *stub->entry(), true); ++ } ++ ++ if (flags & LIR_OpArrayCopy::src_range_check) { ++ __ add_w(tmp, src_pos, length); ++ __ ld_wu(SCR1, src_length_addr); ++ __ blt_far(SCR1, tmp, *stub->entry(), false); ++ } ++ if (flags & LIR_OpArrayCopy::dst_range_check) { ++ __ add_w(tmp, dst_pos, length); ++ __ ld_wu(SCR1, dst_length_addr); ++ __ blt_far(SCR1, tmp, *stub->entry(), false); ++ } ++ ++ if (flags & LIR_OpArrayCopy::type_check) { ++ // We don't know the array types are compatible ++ if (basic_type != T_OBJECT) { ++ // Simple test for basic type arrays ++ if (UseCompressedClassPointers) { ++ __ ld_wu(tmp, src_klass_addr); ++ __ ld_wu(SCR1, dst_klass_addr); ++ } else { ++ __ ld_d(tmp, src_klass_addr); ++ __ ld_d(SCR1, dst_klass_addr); ++ } ++ __ bne_far(tmp, SCR1, *stub->entry()); ++ } else { ++ // For object arrays, if src is a sub class of dst then we can ++ // safely do the copy. ++ Label cont, slow; ++ ++ __ addi_d(SP, SP, -2 * wordSize); ++ __ st_d(dst, Address(SP, 0 * wordSize)); ++ __ st_d(src, Address(SP, 1 * wordSize)); ++ ++ __ load_klass(src, src); ++ __ load_klass(dst, dst); ++ ++ __ check_klass_subtype_fast_path(src, dst, tmp, &cont, &slow, nullptr); ++ ++ __ addi_d(SP, SP, -2 * wordSize); ++ __ st_d(dst, Address(SP, 0 * wordSize)); ++ __ st_d(src, Address(SP, 1 * wordSize)); ++ __ call(Runtime1::entry_for(Runtime1::slow_subtype_check_id), relocInfo::runtime_call_type); ++ __ ld_d(dst, Address(SP, 0 * wordSize)); ++ __ ld_d(src, Address(SP, 1 * wordSize)); ++ __ addi_d(SP, SP, 2 * wordSize); ++ ++ __ bnez(dst, cont); ++ ++ __ bind(slow); ++ __ ld_d(dst, Address(SP, 0 * wordSize)); ++ __ ld_d(src, Address(SP, 1 * wordSize)); ++ __ addi_d(SP, SP, 2 * wordSize); ++ ++ address copyfunc_addr = StubRoutines::checkcast_arraycopy(); ++ if (copyfunc_addr != nullptr) { // use stub if available ++ // src is not a sub class of dst so we have to do a ++ // per-element check. ++ ++ int mask = LIR_OpArrayCopy::src_objarray|LIR_OpArrayCopy::dst_objarray; ++ if ((flags & mask) != mask) { ++ // Check that at least both of them object arrays. 
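++        // (That is: at least one of src/dst is statically known to be an object
++        // array; whichever one is not known is compared against the objArray
++        // layout helper below.)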
++ assert(flags & mask, "one of the two should be known to be an object array"); ++ ++ if (!(flags & LIR_OpArrayCopy::src_objarray)) { ++ __ load_klass(tmp, src); ++ } else if (!(flags & LIR_OpArrayCopy::dst_objarray)) { ++ __ load_klass(tmp, dst); ++ } ++ int lh_offset = in_bytes(Klass::layout_helper_offset()); ++ Address klass_lh_addr(tmp, lh_offset); ++ jint objArray_lh = Klass::array_layout_helper(T_OBJECT); ++ __ ld_w(SCR1, klass_lh_addr); ++ __ li(SCR2, objArray_lh); ++ __ XOR(SCR1, SCR1, SCR2); ++ __ bnez(SCR1, *stub->entry()); ++ } ++ ++ // Spill because stubs can use any register they like and it's ++ // easier to restore just those that we care about. ++ __ st_d(dst, Address(SP, 0 * BytesPerWord)); ++ __ st_d(dst_pos, Address(SP, 1 * BytesPerWord)); ++ __ st_d(length, Address(SP, 2 * BytesPerWord)); ++ __ st_d(src_pos, Address(SP, 3 * BytesPerWord)); ++ __ st_d(src, Address(SP, 4 * BytesPerWord)); ++ ++ __ lea(A0, Address(src, src_pos, scale)); ++ __ addi_d(A0, A0, arrayOopDesc::base_offset_in_bytes(basic_type)); ++ assert_different_registers(A0, dst, dst_pos, length); ++ __ load_klass(A4, dst); ++ assert_different_registers(A4, dst, dst_pos, length); ++ __ lea(A1, Address(dst, dst_pos, scale)); ++ __ addi_d(A1, A1, arrayOopDesc::base_offset_in_bytes(basic_type)); ++ assert_different_registers(A1, length); ++ __ bstrpick_d(A2, length, 31, 0); ++ __ ld_d(A4, Address(A4, ObjArrayKlass::element_klass_offset())); ++ __ ld_w(A3, Address(A4, Klass::super_check_offset_offset())); ++ __ call(copyfunc_addr, relocInfo::runtime_call_type); ++ ++#ifndef PRODUCT ++ if (PrintC1Statistics) { ++ Label failed; ++ __ bnez(A0, failed); ++ __ li(SCR2, (address)&Runtime1::_arraycopy_checkcast_cnt); ++ __ increment(SCR2, 1); ++ __ bind(failed); ++ } ++#endif ++ ++ __ beqz(A0, *stub->continuation()); ++ ++#ifndef PRODUCT ++ if (PrintC1Statistics) { ++ __ li(SCR2, (address)&Runtime1::_arraycopy_checkcast_attempt_cnt); ++ __ increment(SCR2, 1); ++ } ++#endif ++ assert_different_registers(dst, dst_pos, length, src_pos, src, tmp, SCR1); ++ __ move(tmp, A0); ++ ++ // Restore previously spilled arguments ++ __ ld_d(dst, Address(SP, 0 * BytesPerWord)); ++ __ ld_d(dst_pos, Address(SP, 1 * BytesPerWord)); ++ __ ld_d(length, Address(SP, 2 * BytesPerWord)); ++ __ ld_d(src_pos, Address(SP, 3 * BytesPerWord)); ++ __ ld_d(src, Address(SP, 4 * BytesPerWord)); ++ ++ // return value is -1^K where K is partial copied count ++ __ nor(SCR1, tmp, R0); ++ // adjust length down and src/end pos up by partial copied count ++ __ sub_w(length, length, SCR1); ++ __ add_w(src_pos, src_pos, SCR1); ++ __ add_w(dst_pos, dst_pos, SCR1); ++ } ++ ++ __ b(*stub->entry()); ++ ++ __ bind(cont); ++ __ ld_d(dst, Address(SP, 0 * wordSize)); ++ __ ld_d(src, Address(SP, 1 * wordSize)); ++ __ addi_d(SP, SP, 2 * wordSize); ++ } ++ } ++ ++#ifdef ASSERT ++ if (basic_type != T_OBJECT || !(flags & LIR_OpArrayCopy::type_check)) { ++ // Sanity check the known type with the incoming class. For the ++ // primitive case the types must match exactly with src.klass and ++ // dst.klass each exactly matching the default type. For the ++ // object array case, if no type check is needed then either the ++ // dst type is exactly the expected type and the src type is a ++ // subtype which we can't check or src is the same array as dst ++ // but not necessarily exactly of type default_type. 
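++    // known_ok is reached when the observed klass word matches the expected type
++    // (or when src == dst in the object-array case); otherwise control falls into
++    // halt and __ stop() reports the bad type information.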
++ Label known_ok, halt; ++ __ mov_metadata(tmp, default_type->constant_encoding()); ++ if (UseCompressedClassPointers) { ++ __ encode_klass_not_null(tmp); ++ } ++ ++ if (basic_type != T_OBJECT) { ++ ++ if (UseCompressedClassPointers) { ++ __ ld_wu(SCR1, dst_klass_addr); ++ } else { ++ __ ld_d(SCR1, dst_klass_addr); ++ } ++ __ bne(tmp, SCR1, halt); ++ if (UseCompressedClassPointers) { ++ __ ld_wu(SCR1, src_klass_addr); ++ } else { ++ __ ld_d(SCR1, src_klass_addr); ++ } ++ __ beq(tmp, SCR1, known_ok); ++ } else { ++ if (UseCompressedClassPointers) { ++ __ ld_wu(SCR1, dst_klass_addr); ++ } else { ++ __ ld_d(SCR1, dst_klass_addr); ++ } ++ __ beq(tmp, SCR1, known_ok); ++ __ beq(src, dst, known_ok); ++ } ++ __ bind(halt); ++ __ stop("incorrect type information in arraycopy"); ++ __ bind(known_ok); ++ } ++#endif ++ ++#ifndef PRODUCT ++ if (PrintC1Statistics) { ++ __ li(SCR2, Runtime1::arraycopy_count_address(basic_type)); ++ __ increment(SCR2, 1); ++ } ++#endif ++ ++ __ lea(A0, Address(src, src_pos, scale)); ++ __ addi_d(A0, A0, arrayOopDesc::base_offset_in_bytes(basic_type)); ++ assert_different_registers(A0, dst, dst_pos, length); ++ __ lea(A1, Address(dst, dst_pos, scale)); ++ __ addi_d(A1, A1, arrayOopDesc::base_offset_in_bytes(basic_type)); ++ assert_different_registers(A1, length); ++ __ bstrpick_d(A2, length, 31, 0); ++ ++ bool disjoint = (flags & LIR_OpArrayCopy::overlapping) == 0; ++ bool aligned = (flags & LIR_OpArrayCopy::unaligned) == 0; ++ const char *name; ++ address entry = StubRoutines::select_arraycopy_function(basic_type, aligned, disjoint, name, false); ++ ++ CodeBlob *cb = CodeCache::find_blob(entry); ++ if (cb) { ++ __ call(entry, relocInfo::runtime_call_type); ++ } else { ++ __ call_VM_leaf(entry, 3); ++ } ++ ++ __ bind(*stub->continuation()); ++} ++ ++void LIR_Assembler::emit_lock(LIR_OpLock* op) { ++ Register obj = op->obj_opr()->as_register(); // may not be an oop ++ Register hdr = op->hdr_opr()->as_register(); ++ Register lock = op->lock_opr()->as_register(); ++ if (LockingMode == LM_MONITOR) { ++ if (op->info() != nullptr) { ++ add_debug_info_for_null_check_here(op->info()); ++ __ null_check(obj, -1); ++ } ++ __ b(*op->stub()->entry()); ++ } else if (op->code() == lir_lock) { ++ assert(BasicLock::displaced_header_offset_in_bytes() == 0, ++ "lock_reg must point to the displaced header"); ++ // add debug info for NullPointerException only if one is possible ++ int null_check_offset = __ lock_object(hdr, obj, lock, *op->stub()->entry()); ++ if (op->info() != nullptr) { ++ add_debug_info_for_null_check(null_check_offset, op->info()); ++ } ++ // done ++ } else if (op->code() == lir_unlock) { ++ assert(BasicLock::displaced_header_offset_in_bytes() == 0, ++ "lock_reg must point to the displaced header"); ++ __ unlock_object(hdr, obj, lock, *op->stub()->entry()); ++ } else { ++ Unimplemented(); ++ } ++ __ bind(*op->stub()->continuation()); ++} ++ ++void LIR_Assembler::emit_load_klass(LIR_OpLoadKlass* op) { ++ Register obj = op->obj()->as_pointer_register(); ++ Register result = op->result_opr()->as_pointer_register(); ++ ++ CodeEmitInfo* info = op->info(); ++ if (info != nullptr) { ++ add_debug_info_for_null_check_here(info); ++ } ++ ++ if (UseCompressedClassPointers) { ++ __ ld_wu(result, obj, oopDesc::klass_offset_in_bytes()); ++ __ decode_klass_not_null(result); ++ } else { ++ __ ld_d(result, obj, oopDesc::klass_offset_in_bytes()); ++ } ++} ++ ++void LIR_Assembler::emit_profile_call(LIR_OpProfileCall* op) { ++ ciMethod* method = op->profiled_method(); ++ ciMethod* callee = 
op->profiled_callee();
++  int bci = op->profiled_bci();
++
++  // Update counter for all call types
++  ciMethodData* md = method->method_data_or_null();
++  assert(md != nullptr, "Sanity");
++  ciProfileData* data = md->bci_to_data(bci);
++  assert(data != nullptr && data->is_CounterData(), "need CounterData for calls");
++  assert(op->mdo()->is_single_cpu(), "mdo must be allocated");
++  Register mdo = op->mdo()->as_register();
++  __ mov_metadata(mdo, md->constant_encoding());
++  Address counter_addr(mdo, md->byte_offset_of_slot(data, CounterData::count_offset()));
++  // Perform additional virtual call profiling for invokevirtual and
++  // invokeinterface bytecodes
++  if (op->should_profile_receiver_type()) {
++    assert(op->recv()->is_single_cpu(), "recv must be allocated");
++    Register recv = op->recv()->as_register();
++    assert_different_registers(mdo, recv);
++    assert(data->is_VirtualCallData(), "need VirtualCallData for virtual calls");
++    ciKlass* known_klass = op->known_holder();
++    if (C1OptimizeVirtualCallProfiling && known_klass != nullptr) {
++      // We know the type that will be seen at this call site; we can
++      // statically update the MethodData* rather than needing to do
++      // dynamic tests on the receiver type
++
++      // NOTE: we should probably put a lock around this search to
++      // avoid collisions by concurrent compilations
++      ciVirtualCallData* vc_data = (ciVirtualCallData*) data;
++      uint i;
++      for (i = 0; i < VirtualCallData::row_limit(); i++) {
++        ciKlass* receiver = vc_data->receiver(i);
++        if (known_klass->equals(receiver)) {
++          Address data_addr(mdo, md->byte_offset_of_slot(data, VirtualCallData::receiver_count_offset(i)));
++          __ ld_d(SCR2, data_addr);
++          __ addi_d(SCR2, SCR2, DataLayout::counter_increment);
++          __ st_d(SCR2, data_addr);
++          return;
++        }
++      }
++
++      // Receiver type not found in profile data; select an empty slot
++
++      // Note that this is less efficient than it should be because it
++      // always does a write to the receiver part of the
++      // VirtualCallData rather than just the first time
++      for (i = 0; i < VirtualCallData::row_limit(); i++) {
++        ciKlass* receiver = vc_data->receiver(i);
++        if (receiver == nullptr) {
++          Address recv_addr(mdo, md->byte_offset_of_slot(data, VirtualCallData::receiver_offset(i)));
++          __ mov_metadata(SCR2, known_klass->constant_encoding());
++          __ lea(SCR1, recv_addr);
++          __ st_d(SCR2, SCR1, 0);
++          Address data_addr(mdo, md->byte_offset_of_slot(data, VirtualCallData::receiver_count_offset(i)));
++          // Increment the count of the newly installed receiver row
++          __ ld_d(SCR2, data_addr);
++          __ addi_d(SCR2, SCR2, DataLayout::counter_increment);
++          __ st_d(SCR2, data_addr);
++          return;
++        }
++      }
++    } else {
++      __ load_klass(recv, recv);
++      Label update_done;
++      type_profile_helper(mdo, md, data, recv, &update_done);
++      // Receiver did not match any saved receiver and there is no empty row for it.
++      // Increment total counter to indicate polymorphic case.
++ __ ld_d(SCR2, counter_addr); ++ __ addi_d(SCR2, SCR2, DataLayout::counter_increment); ++ __ st_d(SCR2, counter_addr); ++ ++ __ bind(update_done); ++ } ++ } else { ++ // Static call ++ __ ld_d(SCR2, counter_addr); ++ __ addi_d(SCR2, SCR2, DataLayout::counter_increment); ++ __ st_d(SCR2, counter_addr); ++ } ++} ++ ++void LIR_Assembler::emit_delay(LIR_OpDelay*) { ++ Unimplemented(); ++} ++ ++void LIR_Assembler::monitor_address(int monitor_no, LIR_Opr dst) { ++ __ lea(dst->as_register(), frame_map()->address_for_monitor_lock(monitor_no)); ++} ++ ++void LIR_Assembler::emit_updatecrc32(LIR_OpUpdateCRC32* op) { ++ assert(op->crc()->is_single_cpu(), "crc must be register"); ++ assert(op->val()->is_single_cpu(), "byte value must be register"); ++ assert(op->result_opr()->is_single_cpu(), "result must be register"); ++ Register crc = op->crc()->as_register(); ++ Register val = op->val()->as_register(); ++ Register res = op->result_opr()->as_register(); ++ ++ assert_different_registers(val, crc, res); ++ __ li(res, StubRoutines::crc_table_addr()); ++ __ nor(crc, crc, R0); // ~crc ++ __ update_byte_crc32(crc, val, res); ++ __ nor(res, crc, R0); // ~crc ++} ++ ++void LIR_Assembler::emit_profile_type(LIR_OpProfileType* op) { ++ COMMENT("emit_profile_type {"); ++ Register obj = op->obj()->as_register(); ++ Register tmp = op->tmp()->as_pointer_register(); ++ Address mdo_addr = as_Address(op->mdp()->as_address_ptr()); ++ ciKlass* exact_klass = op->exact_klass(); ++ intptr_t current_klass = op->current_klass(); ++ bool not_null = op->not_null(); ++ bool no_conflict = op->no_conflict(); ++ ++ Label update, next, none; ++ ++ bool do_null = !not_null; ++ bool exact_klass_set = exact_klass != nullptr && ciTypeEntries::valid_ciklass(current_klass) == exact_klass; ++ bool do_update = !TypeEntries::is_type_unknown(current_klass) && !exact_klass_set; ++ ++ assert(do_null || do_update, "why are we here?"); ++ assert(!TypeEntries::was_null_seen(current_klass) || do_update, "why are we here?"); ++ assert(mdo_addr.base() != SCR1, "wrong register"); ++ ++ __ verify_oop(obj); ++ ++ if (tmp != obj) { ++ __ move(tmp, obj); ++ } ++ if (do_null) { ++ __ bnez(tmp, update); ++ if (!TypeEntries::was_null_seen(current_klass)) { ++ __ ld_d(SCR2, mdo_addr); ++ __ ori(SCR2, SCR2, TypeEntries::null_seen); ++ __ st_d(SCR2, mdo_addr); ++ } ++ if (do_update) { ++#ifndef ASSERT ++ __ b(next); ++ } ++#else ++ __ b(next); ++ } ++ } else { ++ __ bnez(tmp, update); ++ __ stop("unexpected null obj"); ++#endif ++ } ++ ++ __ bind(update); ++ ++ if (do_update) { ++#ifdef ASSERT ++ if (exact_klass != nullptr) { ++ Label ok; ++ __ load_klass(tmp, tmp); ++ __ mov_metadata(SCR1, exact_klass->constant_encoding()); ++ __ XOR(SCR1, tmp, SCR1); ++ __ beqz(SCR1, ok); ++ __ stop("exact klass and actual klass differ"); ++ __ bind(ok); ++ } ++#endif ++ if (!no_conflict) { ++ if (exact_klass == nullptr || TypeEntries::is_type_none(current_klass)) { ++ if (exact_klass != nullptr) { ++ __ mov_metadata(tmp, exact_klass->constant_encoding()); ++ } else { ++ __ load_klass(tmp, tmp); ++ } ++ ++ __ ld_d(SCR2, mdo_addr); ++ __ XOR(tmp, tmp, SCR2); ++ assert(TypeEntries::type_klass_mask == -4, "must be"); ++ __ bstrpick_d(SCR1, tmp, 63, 2); ++ // klass seen before, nothing to do. The unknown bit may have been ++ // set already but no need to check. ++ __ beqz(SCR1, next); ++ ++ __ andi(SCR1, tmp, TypeEntries::type_unknown); ++ __ bnez(SCR1, next); // already unknown. Nothing to do anymore. 
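++        // The profiled type cell packs a klass pointer with two low-order flag
++        // bits (null_seen and type_unknown); the bstrpick above strips those bits
++        // so only the klass part takes part in the comparison.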
++ ++ if (TypeEntries::is_type_none(current_klass)) { ++ __ beqz(SCR2, none); ++ __ li(SCR1, (u1)TypeEntries::null_seen); ++ __ beq(SCR2, SCR1, none); ++ // There is a chance that the checks above (re-reading profiling ++ // data from memory) fail if another thread has just set the ++ // profiling to this obj's klass ++ membar_acquire(); ++ __ ld_d(SCR2, mdo_addr); ++ __ XOR(tmp, tmp, SCR2); ++ assert(TypeEntries::type_klass_mask == -4, "must be"); ++ __ bstrpick_d(SCR1, tmp, 63, 2); ++ __ beqz(SCR1, next); ++ } ++ } else { ++ assert(ciTypeEntries::valid_ciklass(current_klass) != nullptr && ++ ciTypeEntries::valid_ciklass(current_klass) != exact_klass, "conflict only"); ++ ++ __ ld_d(tmp, mdo_addr); ++ __ andi(SCR2, tmp, TypeEntries::type_unknown); ++ __ bnez(SCR2, next); // already unknown. Nothing to do anymore. ++ } ++ ++ // different than before. Cannot keep accurate profile. ++ __ ld_d(SCR2, mdo_addr); ++ __ ori(SCR2, SCR2, TypeEntries::type_unknown); ++ __ st_d(SCR2, mdo_addr); ++ ++ if (TypeEntries::is_type_none(current_klass)) { ++ __ b(next); ++ ++ __ bind(none); ++ // first time here. Set profile type. ++ __ st_d(tmp, mdo_addr); ++ } ++ } else { ++ // There's a single possible klass at this profile point ++ assert(exact_klass != nullptr, "should be"); ++ if (TypeEntries::is_type_none(current_klass)) { ++ __ mov_metadata(tmp, exact_klass->constant_encoding()); ++ __ ld_d(SCR2, mdo_addr); ++ __ XOR(tmp, tmp, SCR2); ++ assert(TypeEntries::type_klass_mask == -4, "must be"); ++ __ bstrpick_d(SCR1, tmp, 63, 2); ++ __ beqz(SCR1, next); ++#ifdef ASSERT ++ { ++ Label ok; ++ __ ld_d(SCR1, mdo_addr); ++ __ beqz(SCR1, ok); ++ __ li(SCR2, (u1)TypeEntries::null_seen); ++ __ beq(SCR1, SCR2, ok); ++ // may have been set by another thread ++ membar_acquire(); ++ __ mov_metadata(SCR1, exact_klass->constant_encoding()); ++ __ ld_d(SCR2, mdo_addr); ++ __ XOR(SCR2, SCR1, SCR2); ++ assert(TypeEntries::type_mask == -2, "must be"); ++ __ bstrpick_d(SCR2, SCR2, 63, 1); ++ __ beqz(SCR2, ok); ++ ++ __ stop("unexpected profiling mismatch"); ++ __ bind(ok); ++ } ++#endif ++ // first time here. Set profile type. ++ __ st_d(tmp, mdo_addr); ++ } else { ++ assert(ciTypeEntries::valid_ciklass(current_klass) != nullptr && ++ ciTypeEntries::valid_ciklass(current_klass) != exact_klass, "inconsistent"); ++ ++ __ ld_d(tmp, mdo_addr); ++ __ andi(SCR1, tmp, TypeEntries::type_unknown); ++ __ bnez(SCR1, next); // already unknown. Nothing to do anymore. ++ ++ __ ori(tmp, tmp, TypeEntries::type_unknown); ++ __ st_d(tmp, mdo_addr); ++ // FIXME: Write barrier needed here? 
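++      // (Presumably not: the cell lives in metaspace-backed MethodData rather than
++      //  in the Java heap, so no GC write barrier should apply.)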
++ } ++ } ++ ++ __ bind(next); ++ } ++ COMMENT("} emit_profile_type"); ++} ++ ++void LIR_Assembler::align_backward_branch_target() {} ++ ++void LIR_Assembler::negate(LIR_Opr left, LIR_Opr dest, LIR_Opr tmp) { ++ // tmp must be unused ++ assert(tmp->is_illegal(), "wasting a register if tmp is allocated"); ++ ++ if (left->is_single_cpu()) { ++ assert(dest->is_single_cpu(), "expect single result reg"); ++ __ sub_w(dest->as_register(), R0, left->as_register()); ++ } else if (left->is_double_cpu()) { ++ assert(dest->is_double_cpu(), "expect double result reg"); ++ __ sub_d(dest->as_register_lo(), R0, left->as_register_lo()); ++ } else if (left->is_single_fpu()) { ++ assert(dest->is_single_fpu(), "expect single float result reg"); ++ __ fneg_s(dest->as_float_reg(), left->as_float_reg()); ++ } else { ++ assert(left->is_double_fpu(), "expect double float operand reg"); ++ assert(dest->is_double_fpu(), "expect double float result reg"); ++ __ fneg_d(dest->as_double_reg(), left->as_double_reg()); ++ } ++} ++ ++void LIR_Assembler::leal(LIR_Opr addr, LIR_Opr dest, LIR_PatchCode patch_code, ++ CodeEmitInfo* info) { ++ if (patch_code != lir_patch_none) { ++ deoptimize_trap(info); ++ return; ++ } ++ ++ __ lea(dest->as_register_lo(), as_Address(addr->as_address_ptr())); ++} ++ ++void LIR_Assembler::rt_call(LIR_Opr result, address dest, const LIR_OprList* args, ++ LIR_Opr tmp, CodeEmitInfo* info) { ++ assert(!tmp->is_valid(), "don't need temporary"); ++ __ call(dest, relocInfo::runtime_call_type); ++ if (info != nullptr) { ++ add_call_info_here(info); ++ } ++ __ post_call_nop(); ++} ++ ++void LIR_Assembler::volatile_move_op(LIR_Opr src, LIR_Opr dest, BasicType type, ++ CodeEmitInfo* info) { ++ if (dest->is_address() || src->is_address()) { ++ move_op(src, dest, type, lir_patch_none, info, /*pop_fpu_stack*/ false, /*wide*/ false); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++#ifdef ASSERT ++// emit run-time assertion ++void LIR_Assembler::emit_assert(LIR_OpAssert* op) { ++ assert(op->code() == lir_assert, "must be"); ++ Label ok; ++ ++ if (op->in_opr1()->is_valid()) { ++ assert(op->in_opr2()->is_valid(), "both operands must be valid"); ++ assert(op->in_opr1()->is_cpu_register() || op->in_opr2()->is_cpu_register(), "must be"); ++ Register reg1 = as_reg(op->in_opr1()); ++ Register reg2 = as_reg(op->in_opr2()); ++ switch (op->condition()) { ++ case lir_cond_equal: __ beq(reg1, reg2, ok); break; ++ case lir_cond_notEqual: __ bne(reg1, reg2, ok); break; ++ case lir_cond_less: __ blt(reg1, reg2, ok); break; ++ case lir_cond_lessEqual: __ bge(reg2, reg1, ok); break; ++ case lir_cond_greaterEqual: __ bge(reg1, reg2, ok); break; ++ case lir_cond_greater: __ blt(reg2, reg1, ok); break; ++ case lir_cond_belowEqual: __ bgeu(reg2, reg1, ok); break; ++ case lir_cond_aboveEqual: __ bgeu(reg1, reg2, ok); break; ++ default: ShouldNotReachHere(); ++ } ++ } else { ++ assert(op->in_opr2()->is_illegal(), "both operands must be illegal"); ++ assert(op->condition() == lir_cond_always, "no other conditions allowed"); ++ } ++ if (op->halt()) { ++ const char* str = __ code_string(op->msg()); ++ __ stop(str); ++ } else { ++ breakpoint(); ++ } ++ __ bind(ok); ++} ++#endif ++ ++#ifndef PRODUCT ++#define COMMENT(x) do { __ block_comment(x); } while (0) ++#else ++#define COMMENT(x) ++#endif ++ ++void LIR_Assembler::membar() { ++ COMMENT("membar"); ++ __ membar(Assembler::AnyAny); ++} ++ ++void LIR_Assembler::membar_acquire() { ++ __ membar(Assembler::Membar_mask_bits(Assembler::LoadLoad | Assembler::LoadStore)); ++} ++ ++void 
LIR_Assembler::membar_release() { ++ __ membar(Assembler::Membar_mask_bits(Assembler::LoadStore|Assembler::StoreStore)); ++} ++ ++void LIR_Assembler::membar_loadload() { ++ __ membar(Assembler::LoadLoad); ++} ++ ++void LIR_Assembler::membar_storestore() { ++ __ membar(MacroAssembler::StoreStore); ++} ++ ++void LIR_Assembler::membar_loadstore() { ++ __ membar(MacroAssembler::LoadStore); ++} ++ ++void LIR_Assembler::membar_storeload() { ++ __ membar(MacroAssembler::StoreLoad); ++} ++ ++void LIR_Assembler::on_spin_wait() { ++ Unimplemented(); ++} ++ ++void LIR_Assembler::get_thread(LIR_Opr result_reg) { ++ __ move(result_reg->as_register(), TREG); ++} ++ ++void LIR_Assembler::peephole(LIR_List *lir) { ++} ++ ++void LIR_Assembler::atomic_op(LIR_Code code, LIR_Opr src, LIR_Opr data, ++ LIR_Opr dest, LIR_Opr tmp_op) { ++ Address addr = as_Address(src->as_address_ptr()); ++ BasicType type = src->type(); ++ Register dst = as_reg(dest); ++ Register tmp = as_reg(tmp_op); ++ bool is_oop = is_reference_type(type); ++ ++ if (Assembler::is_simm(addr.disp(), 12)) { ++ __ addi_d(tmp, addr.base(), addr.disp()); ++ } else { ++ __ li(tmp, addr.disp()); ++ __ add_d(tmp, addr.base(), tmp); ++ } ++ if (addr.index() != noreg) { ++ if (addr.scale() != Address::no_scale) ++ __ alsl_d(tmp, addr.index(), tmp, addr.scale() - 1); ++ else ++ __ add_d(tmp, tmp, addr.index()); ++ } ++ ++ switch(type) { ++ case T_INT: ++ break; ++ case T_LONG: ++ break; ++ case T_OBJECT: ++ case T_ARRAY: ++ if (UseCompressedOops) { ++ // unsigned int ++ } else { ++ // long ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ ++ if (code == lir_xadd) { ++ Register inc = noreg; ++ if (data->is_constant()) { ++ inc = SCR1; ++ __ li(inc, as_long(data)); ++ } else { ++ inc = as_reg(data); ++ } ++ switch(type) { ++ case T_INT: ++ __ amadd_db_w(dst, inc, tmp); ++ break; ++ case T_LONG: ++ __ amadd_db_d(dst, inc, tmp); ++ break; ++ case T_OBJECT: ++ case T_ARRAY: ++ if (UseCompressedOops) { ++ __ amadd_db_w(dst, inc, tmp); ++ __ lu32i_d(dst, 0); ++ } else { ++ __ amadd_db_d(dst, inc, tmp); ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (code == lir_xchg) { ++ Register obj = as_reg(data); ++ if (is_oop && UseCompressedOops) { ++ __ encode_heap_oop(SCR2, obj); ++ obj = SCR2; ++ } ++ switch(type) { ++ case T_INT: ++ __ amswap_db_w(dst, obj, tmp); ++ break; ++ case T_LONG: ++ __ amswap_db_d(dst, obj, tmp); ++ break; ++ case T_OBJECT: ++ case T_ARRAY: ++ if (UseCompressedOops) { ++ __ amswap_db_w(dst, obj, tmp); ++ __ lu32i_d(dst, 0); ++ } else { ++ __ amswap_db_d(dst, obj, tmp); ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ if (is_oop && UseCompressedOops) { ++ __ decode_heap_oop(dst); ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++#undef __ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_LIRAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/c1_LIRAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/c1_LIRAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_LIRAssembler_loongarch.hpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,84 @@ ++/* ++ * Copyright (c) 2000, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_C1_LIRASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_C1_LIRASSEMBLER_LOONGARCH_HPP ++ ++// ArrayCopyStub needs access to bailout ++friend class ArrayCopyStub; ++ ++ private: ++ int array_element_size(BasicType type) const; ++ ++ void arith_fpu_implementation(LIR_Code code, int left_index, int right_index, ++ int dest_index, bool pop_fpu_stack); ++ ++ // helper functions which checks for overflow and sets bailout if it ++ // occurs. Always returns a valid embeddable pointer but in the ++ // bailout case the pointer won't be to unique storage. ++ address float_constant(float f); ++ address double_constant(double d); ++ ++ address int_constant(jlong n); ++ ++ bool is_literal_address(LIR_Address* addr); ++ ++ // Ensure we have a valid Address (base+offset) to a stack-slot. ++ Address stack_slot_address(int index, uint shift, int adjust = 0); ++ ++ // Record the type of the receiver in ReceiverTypeData ++ void type_profile_helper(Register mdo, ciMethodData *md, ciProfileData *data, ++ Register recv, Label* update_done); ++ void add_debug_info_for_branch(address adr, CodeEmitInfo* info); ++ ++ void casw(Register addr, Register newval, Register cmpval, Register result, bool sign); ++ void casl(Register addr, Register newval, Register cmpval, Register result); ++ ++ void poll_for_safepoint(relocInfo::relocType rtype, CodeEmitInfo* info = nullptr); ++ ++ static const int max_tableswitches = 20; ++ struct tableswitch switches[max_tableswitches]; ++ int tableswitch_count; ++ ++ void init() { tableswitch_count = 0; } ++ ++ void deoptimize_trap(CodeEmitInfo *info); ++ ++ void emit_cmp_branch(LIR_OpBranch* op); ++ ++ enum { ++ // call stub: CompiledStaticCall::to_interp_stub_size() + ++ // CompiledStaticCall::to_trampoline_stub_size() ++ _call_stub_size = 13 * NativeInstruction::nop_instruction_size, ++ _exception_handler_size = DEBUG_ONLY(1*K) NOT_DEBUG(175), ++ _deopt_handler_size = 7 * NativeInstruction::nop_instruction_size ++ }; ++ ++public: ++ void store_parameter(Register r, int offset_from_sp_in_words); ++ void store_parameter(jint c, int offset_from_sp_in_words); ++ void store_parameter(jobject c, int offset_from_sp_in_words); ++ ++#endif // CPU_LOONGARCH_C1_LIRASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_LIRGenerator_loongarch_64.cpp b/src/hotspot/cpu/loongarch/c1_LIRGenerator_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/c1_LIRGenerator_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ 
b/src/hotspot/cpu/loongarch/c1_LIRGenerator_loongarch_64.cpp 2024-02-20 10:42:36.152196787 +0800 +@@ -0,0 +1,1391 @@ ++/* ++ * Copyright (c) 2005, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "c1/c1_Compilation.hpp" ++#include "c1/c1_FrameMap.hpp" ++#include "c1/c1_Instruction.hpp" ++#include "c1/c1_LIRAssembler.hpp" ++#include "c1/c1_LIRGenerator.hpp" ++#include "c1/c1_Runtime1.hpp" ++#include "c1/c1_ValueStack.hpp" ++#include "ci/ciArray.hpp" ++#include "ci/ciObjArrayKlass.hpp" ++#include "ci/ciTypeArrayKlass.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "utilities/powerOfTwo.hpp" ++#include "vmreg_loongarch.inline.hpp" ++ ++#ifdef ASSERT ++#define __ gen()->lir(__FILE__, __LINE__)-> ++#else ++#define __ gen()->lir()-> ++#endif ++ ++// Item will be loaded into a byte register; Intel only ++void LIRItem::load_byte_item() { ++ load_item(); ++} ++ ++void LIRItem::load_nonconstant() { ++ LIR_Opr r = value()->operand(); ++ if (r->is_constant()) { ++ _result = r; ++ } else { ++ load_item(); ++ } ++} ++ ++//-------------------------------------------------------------- ++// LIRGenerator ++//-------------------------------------------------------------- ++ ++LIR_Opr LIRGenerator::exceptionOopOpr() { return FrameMap::a0_oop_opr; } ++LIR_Opr LIRGenerator::exceptionPcOpr() { return FrameMap::a1_opr; } ++LIR_Opr LIRGenerator::divInOpr() { Unimplemented(); return LIR_OprFact::illegalOpr; } ++LIR_Opr LIRGenerator::divOutOpr() { Unimplemented(); return LIR_OprFact::illegalOpr; } ++LIR_Opr LIRGenerator::remOutOpr() { Unimplemented(); return LIR_OprFact::illegalOpr; } ++LIR_Opr LIRGenerator::shiftCountOpr() { Unimplemented(); return LIR_OprFact::illegalOpr; } ++LIR_Opr LIRGenerator::syncLockOpr() { return new_register(T_INT); } ++LIR_Opr LIRGenerator::syncTempOpr() { return FrameMap::a0_opr; } ++LIR_Opr LIRGenerator::getThreadTemp() { return LIR_OprFact::illegalOpr; } ++ ++LIR_Opr LIRGenerator::result_register_for(ValueType* type, bool callee) { ++ LIR_Opr opr; ++ switch (type->tag()) { ++ case intTag: opr = FrameMap::a0_opr; break; ++ case objectTag: opr = FrameMap::a0_oop_opr; break; ++ case longTag: opr = FrameMap::long0_opr; break; ++ case floatTag: opr = FrameMap::fpu0_float_opr; break; ++ case doubleTag: opr = FrameMap::fpu0_double_opr; break; ++ case addressTag: ++ default: 
ShouldNotReachHere(); return LIR_OprFact::illegalOpr; ++ } ++ ++ assert(opr->type_field() == as_OprType(as_BasicType(type)), "type mismatch"); ++ return opr; ++} ++ ++LIR_Opr LIRGenerator::rlock_byte(BasicType type) { ++ LIR_Opr reg = new_register(T_INT); ++ set_vreg_flag(reg, LIRGenerator::byte_reg); ++ return reg; ++} ++ ++//--------- loading items into registers -------------------------------- ++ ++bool LIRGenerator::can_store_as_constant(Value v, BasicType type) const { ++ if (v->type()->as_IntConstant() != nullptr) { ++ return v->type()->as_IntConstant()->value() == 0L; ++ } else if (v->type()->as_LongConstant() != nullptr) { ++ return v->type()->as_LongConstant()->value() == 0L; ++ } else if (v->type()->as_ObjectConstant() != nullptr) { ++ return v->type()->as_ObjectConstant()->value()->is_null_object(); ++ } else { ++ return false; ++ } ++} ++ ++bool LIRGenerator::can_inline_as_constant(Value v) const { ++ // FIXME: Just a guess ++ if (v->type()->as_IntConstant() != nullptr) { ++ return Assembler::is_simm(v->type()->as_IntConstant()->value(), 12); ++ } else if (v->type()->as_LongConstant() != nullptr) { ++ return v->type()->as_LongConstant()->value() == 0L; ++ } else if (v->type()->as_ObjectConstant() != nullptr) { ++ return v->type()->as_ObjectConstant()->value()->is_null_object(); ++ } else { ++ return false; ++ } ++} ++ ++bool LIRGenerator::can_inline_as_constant(LIR_Const* c) const { return false; } ++ ++LIR_Opr LIRGenerator::safepoint_poll_register() { ++ return LIR_OprFact::illegalOpr; ++} ++ ++LIR_Address* LIRGenerator::generate_address(LIR_Opr base, LIR_Opr index, ++ int shift, int disp, BasicType type) { ++ assert(base->is_register(), "must be"); ++ intx large_disp = disp; ++ ++ // accumulate fixed displacements ++ if (index->is_constant()) { ++ LIR_Const *constant = index->as_constant_ptr(); ++ if (constant->type() == T_INT) { ++ large_disp += index->as_jint() << shift; ++ } else { ++ assert(constant->type() == T_LONG, "should be"); ++ jlong c = index->as_jlong() << shift; ++ if ((jlong)((jint)c) == c) { ++ large_disp += c; ++ index = LIR_OprFact::illegalOpr; ++ } else { ++ LIR_Opr tmp = new_register(T_LONG); ++ __ move(index, tmp); ++ index = tmp; ++ // apply shift and displacement below ++ } ++ } ++ } ++ ++ if (index->is_register()) { ++ // apply the shift and accumulate the displacement ++ if (shift > 0) { ++ LIR_Opr tmp = new_pointer_register(); ++ __ shift_left(index, shift, tmp); ++ index = tmp; ++ } ++ if (large_disp != 0) { ++ LIR_Opr tmp = new_pointer_register(); ++ if (Assembler::is_simm(large_disp, 12)) { ++ __ add(index, LIR_OprFact::intptrConst(large_disp), tmp); ++ index = tmp; ++ } else { ++ __ move(LIR_OprFact::intptrConst(large_disp), tmp); ++ __ add(tmp, index, tmp); ++ index = tmp; ++ } ++ large_disp = 0; ++ } ++ } else if (large_disp != 0 && !Assembler::is_simm(large_disp, 12)) { ++ // index is illegal so replace it with the displacement loaded into a register ++ index = new_pointer_register(); ++ __ move(LIR_OprFact::intptrConst(large_disp), index); ++ large_disp = 0; ++ } ++ ++ // at this point we either have base + index or base + displacement ++ if (large_disp == 0 && index->is_register()) { ++ return new LIR_Address(base, index, type); ++ } else { ++ assert(Assembler::is_simm(large_disp, 12), "must be"); ++ return new LIR_Address(base, large_disp, type); ++ } ++} ++ ++LIR_Address* LIRGenerator::emit_array_address(LIR_Opr array_opr, LIR_Opr index_opr, BasicType type) { ++ int offset_in_bytes = arrayOopDesc::base_offset_in_bytes(type); ++ int 
elem_size = type2aelembytes(type);
++  int shift = exact_log2(elem_size);
++
++  LIR_Address* addr;
++  if (index_opr->is_constant()) {
++    addr = new LIR_Address(array_opr, offset_in_bytes + (intx)(index_opr->as_jint()) * elem_size, type);
++  } else {
++    if (offset_in_bytes) {
++      LIR_Opr tmp = new_pointer_register();
++      __ add(array_opr, LIR_OprFact::intConst(offset_in_bytes), tmp);
++      array_opr = tmp;
++      offset_in_bytes = 0;
++    }
++    addr = new LIR_Address(array_opr, index_opr, LIR_Address::scale(type), offset_in_bytes, type);
++  }
++  return addr;
++}
++
++LIR_Opr LIRGenerator::load_immediate(jlong x, BasicType type) {
++  LIR_Opr r;
++  if (type == T_LONG) {
++    r = LIR_OprFact::longConst(x);
++    if (!Assembler::is_simm(x, 12)) {
++      LIR_Opr tmp = new_register(type);
++      __ move(r, tmp);
++      return tmp;
++    }
++  } else if (type == T_INT) {
++    r = LIR_OprFact::intConst(checked_cast<jint>(x));
++    if (!Assembler::is_simm(x, 12)) {
++      // This is all rather nasty. We don't know whether our constant
++      // is required for a logical or an arithmetic operation, so we
++      // don't know what the range of valid values is!!
++      LIR_Opr tmp = new_register(type);
++      __ move(r, tmp);
++      return tmp;
++    }
++  } else {
++    ShouldNotReachHere();
++  }
++  return r;
++}
++
++void LIRGenerator::increment_counter(address counter, BasicType type, int step) {
++  LIR_Opr pointer = new_pointer_register();
++  __ move(LIR_OprFact::intptrConst(counter), pointer);
++  LIR_Address* addr = new LIR_Address(pointer, type);
++  increment_counter(addr, step);
++}
++
++void LIRGenerator::increment_counter(LIR_Address* addr, int step) {
++  LIR_Opr imm;
++  switch(addr->type()) {
++  case T_INT:
++    imm = LIR_OprFact::intConst(step);
++    break;
++  case T_LONG:
++    imm = LIR_OprFact::longConst(step);
++    break;
++  default:
++    ShouldNotReachHere();
++  }
++  LIR_Opr reg = new_register(addr->type());
++  __ load(addr, reg);
++  __ add(reg, imm, reg);
++  __ store(reg, addr);
++}
++
++void LIRGenerator::cmp_mem_int(LIR_Condition condition, LIR_Opr base, int disp, int c, CodeEmitInfo* info) {
++  LIR_Opr reg = new_register(T_INT);
++  __ load(generate_address(base, disp, T_INT), reg, info);
++  __ cmp(condition, reg, LIR_OprFact::intConst(c));
++}
++
++void LIRGenerator::cmp_reg_mem(LIR_Condition condition, LIR_Opr reg, LIR_Opr base, int disp, BasicType type, CodeEmitInfo* info) {
++  LIR_Opr reg1 = new_register(T_INT);
++  __ load(generate_address(base, disp, type), reg1, info);
++  __ cmp(condition, reg, reg1);
++}
++
++bool LIRGenerator::strength_reduce_multiply(LIR_Opr left, jint c, LIR_Opr result, LIR_Opr tmp) {
++  if (is_power_of_2(c - 1)) {
++    __ shift_left(left, exact_log2(c - 1), tmp);
++    __ add(tmp, left, result);
++    return true;
++  } else if (is_power_of_2(c + 1)) {
++    __ shift_left(left, exact_log2(c + 1), tmp);
++    __ sub(tmp, left, result);
++    return true;
++  } else {
++    return false;
++  }
++}
++
++void LIRGenerator::store_stack_parameter(LIR_Opr item, ByteSize offset_from_sp) {
++  BasicType type = item->type();
++  __ store(item, new LIR_Address(FrameMap::sp_opr, in_bytes(offset_from_sp), type));
++}
++
++void LIRGenerator::array_store_check(LIR_Opr value, LIR_Opr array, CodeEmitInfo* store_check_info,
++                                     ciMethod* profiled_method, int profiled_bci) {
++  LIR_Opr tmp1 = new_register(objectType);
++  LIR_Opr tmp2 = new_register(objectType);
++  LIR_Opr tmp3 = new_register(objectType);
++  __ store_check(value, array, tmp1, tmp2, tmp3, store_check_info, profiled_method, profiled_bci);
++}
++
++//----------------------------------------------------------------------
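// [Editorial sketch, not part of the patch] strength_reduce_multiply() above
// rewrites x * c as one shift plus one add or subtract whenever c - 1 or
// c + 1 is a power of two. A minimal host-side check of those identities;
// the helper names below are hypothetical illustrations, not HotSpot API:
constexpr long long mul_shift_add(long long x, unsigned s) { return (x << s) + x; } // c == 2^s + 1
constexpr long long mul_shift_sub(long long x, unsigned s) { return (x << s) - x; } // c == 2^s - 1
static_assert(mul_shift_add(11, 3) == 11 * 9, "x * 9 lowers to (x << 3) + x");
static_assert(mul_shift_sub(11, 3) == 11 * 7, "x * 7 lowers to (x << 3) - x");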
++// visitor functions ++//---------------------------------------------------------------------- ++ ++void LIRGenerator::do_MonitorEnter(MonitorEnter* x) { ++ assert(x->is_pinned(),""); ++ LIRItem obj(x->obj(), this); ++ obj.load_item(); ++ ++ set_no_result(x); ++ ++ // "lock" stores the address of the monitor stack slot, so this is not an oop ++ LIR_Opr lock = new_register(T_INT); ++ ++ CodeEmitInfo* info_for_exception = nullptr; ++ if (x->needs_null_check()) { ++ info_for_exception = state_for(x); ++ } ++ // this CodeEmitInfo must not have the xhandlers because here the ++ // object is already locked (xhandlers expect object to be unlocked) ++ CodeEmitInfo* info = state_for(x, x->state(), true); ++ monitor_enter(obj.result(), lock, syncTempOpr(), LIR_OprFact::illegalOpr, ++ x->monitor_no(), info_for_exception, info); ++} ++ ++void LIRGenerator::do_MonitorExit(MonitorExit* x) { ++ assert(x->is_pinned(),""); ++ ++ LIRItem obj(x->obj(), this); ++ obj.dont_load_item(); ++ ++ LIR_Opr lock = new_register(T_INT); ++ LIR_Opr obj_temp = new_register(T_INT); ++ set_no_result(x); ++ monitor_exit(obj_temp, lock, syncTempOpr(), LIR_OprFact::illegalOpr, x->monitor_no()); ++} ++ ++void LIRGenerator::do_NegateOp(NegateOp* x) { ++ LIRItem from(x->x(), this); ++ from.load_item(); ++ LIR_Opr result = rlock_result(x); ++ __ negate (from.result(), result); ++} ++ ++// for _fadd, _fmul, _fsub, _fdiv, _frem ++// _dadd, _dmul, _dsub, _ddiv, _drem ++void LIRGenerator::do_ArithmeticOp_FPU(ArithmeticOp* x) { ++ if (x->op() == Bytecodes::_frem || x->op() == Bytecodes::_drem) { ++ // float remainder is implemented as a direct call into the runtime ++ LIRItem right(x->x(), this); ++ LIRItem left(x->y(), this); ++ ++ BasicTypeList signature(2); ++ if (x->op() == Bytecodes::_frem) { ++ signature.append(T_FLOAT); ++ signature.append(T_FLOAT); ++ } else { ++ signature.append(T_DOUBLE); ++ signature.append(T_DOUBLE); ++ } ++ CallingConvention* cc = frame_map()->c_calling_convention(&signature); ++ ++ const LIR_Opr result_reg = result_register_for(x->type()); ++ left.load_item_force(cc->at(1)); ++ right.load_item(); ++ ++ __ move(right.result(), cc->at(0)); ++ ++ address entry; ++ if (x->op() == Bytecodes::_frem) { ++ entry = CAST_FROM_FN_PTR(address, SharedRuntime::frem); ++ } else { ++ entry = CAST_FROM_FN_PTR(address, SharedRuntime::drem); ++ } ++ ++ LIR_Opr result = rlock_result(x); ++ __ call_runtime_leaf(entry, getThreadTemp(), result_reg, cc->args()); ++ __ move(result_reg, result); ++ return; ++ } ++ ++ LIRItem left(x->x(), this); ++ LIRItem right(x->y(), this); ++ LIRItem* left_arg = &left; ++ LIRItem* right_arg = &right; ++ ++ // Always load right hand side. 
++ right.load_item(); ++ ++ if (!left.is_register()) ++ left.load_item(); ++ ++ LIR_Opr reg = rlock(x); ++ ++ arithmetic_op_fpu(x->op(), reg, left.result(), right.result()); ++ ++ set_result(x, round_item(reg)); ++} ++ ++// for _ladd, _lmul, _lsub, _ldiv, _lrem ++void LIRGenerator::do_ArithmeticOp_Long(ArithmeticOp* x) { ++ // missing test if instr is commutative and if we should swap ++ LIRItem left(x->x(), this); ++ LIRItem right(x->y(), this); ++ ++ if (x->op() == Bytecodes::_ldiv || x->op() == Bytecodes::_lrem) { ++ left.load_item(); ++ bool need_zero_check = true; ++ if (right.is_constant()) { ++ jlong c = right.get_jlong_constant(); ++ // no need to do div-by-zero check if the divisor is a non-zero constant ++ if (c != 0) need_zero_check = false; ++ // do not load right if the divisor is a power-of-2 constant ++ if (c > 0 && is_power_of_2(c) && Assembler::is_uimm(c - 1, 12)) { ++ right.dont_load_item(); ++ } else { ++ right.load_item(); ++ } ++ } else { ++ right.load_item(); ++ } ++ if (need_zero_check) { ++ CodeEmitInfo* info = state_for(x); ++ __ cmp(lir_cond_equal, right.result(), LIR_OprFact::longConst(0)); ++ __ branch(lir_cond_equal, new DivByZeroStub(info)); ++ } ++ ++ rlock_result(x); ++ switch (x->op()) { ++ case Bytecodes::_lrem: ++ __ rem (left.result(), right.result(), x->operand()); ++ break; ++ case Bytecodes::_ldiv: ++ __ div (left.result(), right.result(), x->operand()); ++ break; ++ default: ++ ShouldNotReachHere(); ++ break; ++ } ++ } else { ++ assert(x->op() == Bytecodes::_lmul || x->op() == Bytecodes::_ladd || x->op() == Bytecodes::_lsub, ++ "expect lmul, ladd or lsub"); ++ // add, sub, mul ++ left.load_item(); ++ if (!right.is_register()) { ++ if (x->op() == Bytecodes::_lmul || !right.is_constant() || ++ (x->op() == Bytecodes::_ladd && !Assembler::is_simm(right.get_jlong_constant(), 12)) || ++ (x->op() == Bytecodes::_lsub && !Assembler::is_simm(-right.get_jlong_constant(), 12))) { ++ right.load_item(); ++ } else { // add, sub ++ assert(x->op() == Bytecodes::_ladd || x->op() == Bytecodes::_lsub, "expect ladd or lsub"); ++ // don't load constants to save register ++ right.load_nonconstant(); ++ } ++ } ++ rlock_result(x); ++ arithmetic_op_long(x->op(), x->operand(), left.result(), right.result(), nullptr); ++ } ++} ++ ++// for: _iadd, _imul, _isub, _idiv, _irem ++void LIRGenerator::do_ArithmeticOp_Int(ArithmeticOp* x) { ++ // Test if instr is commutative and if we should swap ++ LIRItem left(x->x(), this); ++ LIRItem right(x->y(), this); ++ LIRItem* left_arg = &left; ++ LIRItem* right_arg = &right; ++ if (x->is_commutative() && left.is_stack() && right.is_register()) { ++ // swap them if left is real stack (or cached) and right is real register(not cached) ++ left_arg = &right; ++ right_arg = &left; ++ } ++ ++ left_arg->load_item(); ++ ++ // do not need to load right, as we can handle stack and constants ++ if (x->op() == Bytecodes::_idiv || x->op() == Bytecodes::_irem) { ++ rlock_result(x); ++ bool need_zero_check = true; ++ if (right.is_constant()) { ++ jint c = right.get_jint_constant(); ++ // no need to do div-by-zero check if the divisor is a non-zero constant ++ if (c != 0) need_zero_check = false; ++ // do not load right if the divisor is a power-of-2 constant ++ if (c > 0 && is_power_of_2(c) && Assembler::is_uimm(c - 1, 12)) { ++ right_arg->dont_load_item(); ++ } else { ++ right_arg->load_item(); ++ } ++ } else { ++ right_arg->load_item(); ++ } ++ if (need_zero_check) { ++ CodeEmitInfo* info = state_for(x); ++ __ cmp(lir_cond_equal, right_arg->result(), 
LIR_OprFact::longConst(0)); ++ __ branch(lir_cond_equal, new DivByZeroStub(info)); ++ } ++ ++ LIR_Opr ill = LIR_OprFact::illegalOpr; ++ if (x->op() == Bytecodes::_irem) { ++ __ irem(left_arg->result(), right_arg->result(), x->operand(), ill, nullptr); ++ } else if (x->op() == Bytecodes::_idiv) { ++ __ idiv(left_arg->result(), right_arg->result(), x->operand(), ill, nullptr); ++ } ++ } else if (x->op() == Bytecodes::_iadd || x->op() == Bytecodes::_isub) { ++ if (right.is_constant() && ++ ((x->op() == Bytecodes::_iadd && Assembler::is_simm(right.get_jint_constant(), 12)) || ++ (x->op() == Bytecodes::_isub && Assembler::is_simm(-right.get_jint_constant(), 12)))) { ++ right.load_nonconstant(); ++ } else { ++ right.load_item(); ++ } ++ rlock_result(x); ++ arithmetic_op_int(x->op(), x->operand(), left_arg->result(), right_arg->result(), LIR_OprFact::illegalOpr); ++ } else { ++ assert (x->op() == Bytecodes::_imul, "expect imul"); ++ if (right.is_constant()) { ++ jint c = right.get_jint_constant(); ++ if (c > 0 && c < max_jint && (is_power_of_2(c) || is_power_of_2(c - 1) || is_power_of_2(c + 1))) { ++ right_arg->dont_load_item(); ++ } else { ++ // Cannot use constant op. ++ right_arg->load_item(); ++ } ++ } else { ++ right.load_item(); ++ } ++ rlock_result(x); ++ arithmetic_op_int(x->op(), x->operand(), left_arg->result(), right_arg->result(), new_register(T_INT)); ++ } ++} ++ ++void LIRGenerator::do_ArithmeticOp(ArithmeticOp* x) { ++ // when an operand with use count 1 is the left operand, then it is ++ // likely that no move for 2-operand-LIR-form is necessary ++ if (x->is_commutative() && x->y()->as_Constant() == nullptr && x->x()->use_count() > x->y()->use_count()) { ++ x->swap_operands(); ++ } ++ ++ ValueTag tag = x->type()->tag(); ++ assert(x->x()->type()->tag() == tag && x->y()->type()->tag() == tag, "wrong parameters"); ++ switch (tag) { ++ case floatTag: ++ case doubleTag: do_ArithmeticOp_FPU(x); return; ++ case longTag: do_ArithmeticOp_Long(x); return; ++ case intTag: do_ArithmeticOp_Int(x); return; ++ default: ShouldNotReachHere(); return; ++ } ++} ++ ++// _ishl, _lshl, _ishr, _lshr, _iushr, _lushr ++void LIRGenerator::do_ShiftOp(ShiftOp* x) { ++ LIRItem left(x->x(), this); ++ LIRItem right(x->y(), this); ++ ++ left.load_item(); ++ ++ rlock_result(x); ++ if (right.is_constant()) { ++ right.dont_load_item(); ++ int c; ++ switch (x->op()) { ++ case Bytecodes::_ishl: ++ c = right.get_jint_constant() & 0x1f; ++ __ shift_left(left.result(), c, x->operand()); ++ break; ++ case Bytecodes::_ishr: ++ c = right.get_jint_constant() & 0x1f; ++ __ shift_right(left.result(), c, x->operand()); ++ break; ++ case Bytecodes::_iushr: ++ c = right.get_jint_constant() & 0x1f; ++ __ unsigned_shift_right(left.result(), c, x->operand()); ++ break; ++ case Bytecodes::_lshl: ++ c = right.get_jint_constant() & 0x3f; ++ __ shift_left(left.result(), c, x->operand()); ++ break; ++ case Bytecodes::_lshr: ++ c = right.get_jint_constant() & 0x3f; ++ __ shift_right(left.result(), c, x->operand()); ++ break; ++ case Bytecodes::_lushr: ++ c = right.get_jint_constant() & 0x3f; ++ __ unsigned_shift_right(left.result(), c, x->operand()); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ right.load_item(); ++ LIR_Opr tmp = new_register(T_INT); ++ switch (x->op()) { ++ case Bytecodes::_ishl: ++ __ logical_and(right.result(), LIR_OprFact::intConst(0x1f), tmp); ++ __ shift_left(left.result(), tmp, x->operand(), tmp); ++ break; ++ case Bytecodes::_ishr: ++ __ logical_and(right.result(), 
LIR_OprFact::intConst(0x1f), tmp); ++ __ shift_right(left.result(), tmp, x->operand(), tmp); ++ break; ++ case Bytecodes::_iushr: ++ __ logical_and(right.result(), LIR_OprFact::intConst(0x1f), tmp); ++ __ unsigned_shift_right(left.result(), tmp, x->operand(), tmp); ++ break; ++ case Bytecodes::_lshl: ++ __ logical_and(right.result(), LIR_OprFact::intConst(0x3f), tmp); ++ __ shift_left(left.result(), tmp, x->operand(), tmp); ++ break; ++ case Bytecodes::_lshr: ++ __ logical_and(right.result(), LIR_OprFact::intConst(0x3f), tmp); ++ __ shift_right(left.result(), tmp, x->operand(), tmp); ++ break; ++ case Bytecodes::_lushr: ++ __ logical_and(right.result(), LIR_OprFact::intConst(0x3f), tmp); ++ __ unsigned_shift_right(left.result(), tmp, x->operand(), tmp); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++} ++ ++// _iand, _land, _ior, _lor, _ixor, _lxor ++void LIRGenerator::do_LogicOp(LogicOp* x) { ++ LIRItem left(x->x(), this); ++ LIRItem right(x->y(), this); ++ ++ left.load_item(); ++ ++ rlock_result(x); ++ if (right.is_constant() ++ && ((right.type()->tag() == intTag ++ && Assembler::is_uimm(right.get_jint_constant(), 12)) ++ || (right.type()->tag() == longTag ++ && Assembler::is_uimm(right.get_jlong_constant(), 12)))) { ++ right.dont_load_item(); ++ } else { ++ right.load_item(); ++ } ++ switch (x->op()) { ++ case Bytecodes::_iand: ++ case Bytecodes::_land: ++ __ logical_and(left.result(), right.result(), x->operand()); break; ++ case Bytecodes::_ior: ++ case Bytecodes::_lor: ++ __ logical_or (left.result(), right.result(), x->operand()); break; ++ case Bytecodes::_ixor: ++ case Bytecodes::_lxor: ++ __ logical_xor(left.result(), right.result(), x->operand()); break; ++ default: Unimplemented(); ++ } ++} ++ ++// _lcmp, _fcmpl, _fcmpg, _dcmpl, _dcmpg ++void LIRGenerator::do_CompareOp(CompareOp* x) { ++ LIRItem left(x->x(), this); ++ LIRItem right(x->y(), this); ++ ValueTag tag = x->x()->type()->tag(); ++ if (tag == longTag) { ++ left.set_destroys_register(); ++ } ++ left.load_item(); ++ right.load_item(); ++ LIR_Opr reg = rlock_result(x); ++ ++ if (x->x()->type()->is_float_kind()) { ++ Bytecodes::Code code = x->op(); ++ __ fcmp2int(left.result(), right.result(), reg, ++ (code == Bytecodes::_fcmpl || code == Bytecodes::_dcmpl), ++ new_register(T_FLOAT)); ++ } else if (x->x()->type()->tag() == longTag) { ++ __ lcmp2int(left.result(), right.result(), reg); ++ } else { ++ Unimplemented(); ++ } ++} ++ ++LIR_Opr LIRGenerator::atomic_cmpxchg(BasicType type, LIR_Opr addr, ++ LIRItem& cmp_value, LIRItem& new_value) { ++ LIR_Opr ill = LIR_OprFact::illegalOpr; // for convenience ++ new_value.load_item(); ++ cmp_value.load_item(); ++ LIR_Opr result = new_register(T_INT); ++ if (is_reference_type(type)) { ++ __ cas_obj(addr, cmp_value.result(), new_value.result(), ++ new_register(T_INT), new_register(T_INT), result); ++ } else if (type == T_INT) { ++ __ cas_int(addr->as_address_ptr()->base(), cmp_value.result(), ++ new_value.result(), ill, ill, result); ++ } else if (type == T_LONG) { ++ __ cas_long(addr->as_address_ptr()->base(), cmp_value.result(), ++ new_value.result(), ill, ill, result); ++ } else { ++ ShouldNotReachHere(); ++ Unimplemented(); ++ } ++ return result; ++} ++ ++LIR_Opr LIRGenerator::atomic_xchg(BasicType type, LIR_Opr addr, LIRItem& value) { ++ bool is_oop = is_reference_type(type); ++ LIR_Opr result = new_register(type); ++ value.load_item(); ++ assert(type == T_INT || is_oop LP64_ONLY( || type == T_LONG ), "unexpected type"); ++ LIR_Opr tmp = new_register(T_INT); ++ __ 
xchg(addr, value.result(), result, tmp); ++ return result; ++} ++ ++LIR_Opr LIRGenerator::atomic_add(BasicType type, LIR_Opr addr, LIRItem& value) { ++ LIR_Opr result = new_register(type); ++ value.load_item(); ++ assert(type == T_INT LP64_ONLY( || type == T_LONG ), "unexpected type"); ++ LIR_Opr tmp = new_register(T_INT); ++ __ xadd(addr, value.result(), result, tmp); ++ return result; ++} ++ ++void LIRGenerator::do_MathIntrinsic(Intrinsic* x) { ++ assert(x->number_of_arguments() == 1 || (x->number_of_arguments() == 2 && x->id() == vmIntrinsics::_dpow), ++ "wrong type"); ++ if (x->id() == vmIntrinsics::_dexp || x->id() == vmIntrinsics::_dlog || ++ x->id() == vmIntrinsics::_dpow || x->id() == vmIntrinsics::_dcos || ++ x->id() == vmIntrinsics::_dsin || x->id() == vmIntrinsics::_dtan || ++ x->id() == vmIntrinsics::_dlog10) { ++ do_LibmIntrinsic(x); ++ return; ++ } ++ switch (x->id()) { ++ case vmIntrinsics::_dabs: ++ case vmIntrinsics::_dsqrt: ++ case vmIntrinsics::_dsqrt_strict: ++ case vmIntrinsics::_floatToFloat16: ++ case vmIntrinsics::_float16ToFloat: { ++ assert(x->number_of_arguments() == 1, "wrong type"); ++ LIRItem value(x->argument_at(0), this); ++ value.load_item(); ++ LIR_Opr src = value.result(); ++ LIR_Opr dst = rlock_result(x); ++ ++ switch (x->id()) { ++ case vmIntrinsics::_dsqrt: ++ case vmIntrinsics::_dsqrt_strict: { ++ __ sqrt(src, dst, LIR_OprFact::illegalOpr); ++ break; ++ } ++ case vmIntrinsics::_dabs: { ++ __ abs(src, dst, LIR_OprFact::illegalOpr); ++ break; ++ } ++ case vmIntrinsics::_floatToFloat16: { ++ LIR_Opr tmp = new_register(T_FLOAT); ++ __ move(LIR_OprFact::floatConst(-0.0), tmp); ++ __ f2hf(src, dst, tmp); ++ break; ++ } ++ case vmIntrinsics::_float16ToFloat: { ++ LIR_Opr tmp = new_register(T_FLOAT); ++ __ move(LIR_OprFact::floatConst(-0.0), tmp); ++ __ hf2f(src, dst, tmp); ++ break; ++ } ++ default: ++ ShouldNotReachHere(); ++ } ++ break; ++ } ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void LIRGenerator::do_LibmIntrinsic(Intrinsic* x) { ++ LIRItem value(x->argument_at(0), this); ++ value.set_destroys_register(); ++ ++ LIR_Opr calc_result = rlock_result(x); ++ LIR_Opr result_reg = result_register_for(x->type()); ++ ++ CallingConvention* cc = nullptr; ++ ++ if (x->id() == vmIntrinsics::_dpow) { ++ LIRItem value1(x->argument_at(1), this); ++ ++ value1.set_destroys_register(); ++ ++ BasicTypeList signature(2); ++ signature.append(T_DOUBLE); ++ signature.append(T_DOUBLE); ++ cc = frame_map()->c_calling_convention(&signature); ++ value.load_item_force(cc->at(0)); ++ value1.load_item_force(cc->at(1)); ++ } else { ++ BasicTypeList signature(1); ++ signature.append(T_DOUBLE); ++ cc = frame_map()->c_calling_convention(&signature); ++ value.load_item_force(cc->at(0)); ++ } ++ ++ switch (x->id()) { ++ case vmIntrinsics::_dexp: ++ if (StubRoutines::dexp() != nullptr) { ++ __ call_runtime_leaf(StubRoutines::dexp(), getThreadTemp(), result_reg, cc->args()); ++ } else { ++ __ call_runtime_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::dexp), getThreadTemp(), result_reg, cc->args()); ++ } ++ break; ++ case vmIntrinsics::_dlog: ++ if (StubRoutines::dlog() != nullptr) { ++ __ call_runtime_leaf(StubRoutines::dlog(), getThreadTemp(), result_reg, cc->args()); ++ } else { ++ __ call_runtime_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::dlog), getThreadTemp(), result_reg, cc->args()); ++ } ++ break; ++ case vmIntrinsics::_dlog10: ++ if (StubRoutines::dlog10() != nullptr) { ++ __ call_runtime_leaf(StubRoutines::dlog10(), getThreadTemp(), result_reg, cc->args()); ++ } else { 
++ __ call_runtime_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::dlog10), getThreadTemp(), result_reg, cc->args()); ++ } ++ break; ++ case vmIntrinsics::_dpow: ++ if (StubRoutines::dpow() != nullptr) { ++ __ call_runtime_leaf(StubRoutines::dpow(), getThreadTemp(), result_reg, cc->args()); ++ } else { ++ __ call_runtime_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::dpow), getThreadTemp(), result_reg, cc->args()); ++ } ++ break; ++ case vmIntrinsics::_dsin: ++ if (StubRoutines::dsin() != nullptr) { ++ __ call_runtime_leaf(StubRoutines::dsin(), getThreadTemp(), result_reg, cc->args()); ++ } else { ++ __ call_runtime_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::dsin), getThreadTemp(), result_reg, cc->args()); ++ } ++ break; ++ case vmIntrinsics::_dcos: ++ if (StubRoutines::dcos() != nullptr) { ++ __ call_runtime_leaf(StubRoutines::dcos(), getThreadTemp(), result_reg, cc->args()); ++ } else { ++ __ call_runtime_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::dcos), getThreadTemp(), result_reg, cc->args()); ++ } ++ break; ++ case vmIntrinsics::_dtan: ++ if (StubRoutines::dtan() != nullptr) { ++ __ call_runtime_leaf(StubRoutines::dtan(), getThreadTemp(), result_reg, cc->args()); ++ } else { ++ __ call_runtime_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::dtan), getThreadTemp(), result_reg, cc->args()); ++ } ++ break; ++ default: ShouldNotReachHere(); ++ } ++ __ move(result_reg, calc_result); ++} ++ ++void LIRGenerator::do_ArrayCopy(Intrinsic* x) { ++ ++ assert(x->number_of_arguments() == 5, "wrong type"); ++ ++ // Make all state_for calls early since they can emit code ++ CodeEmitInfo* info = state_for(x, x->state()); ++ ++ LIRItem src(x->argument_at(0), this); ++ LIRItem src_pos(x->argument_at(1), this); ++ LIRItem dst(x->argument_at(2), this); ++ LIRItem dst_pos(x->argument_at(3), this); ++ LIRItem length(x->argument_at(4), this); ++ ++ // operands for arraycopy must use fixed registers, otherwise ++ // LinearScan will fail allocation (because arraycopy always needs a ++ // call) ++ ++ // The java calling convention will give us enough registers ++ // so that on the stub side the args will be perfect already. ++ // On the other slow/special case side we call C and the arg ++ // positions are not similar enough to pick one as the best. 
++ // Also because the java calling convention is a "shifted" version ++ // of the C convention we can process the java args trivially into C ++ // args without worry of overwriting during the xfer ++ ++ src.load_item_force (FrameMap::as_oop_opr(j_rarg0)); ++ src_pos.load_item_force (FrameMap::as_opr(j_rarg1)); ++ dst.load_item_force (FrameMap::as_oop_opr(j_rarg2)); ++ dst_pos.load_item_force (FrameMap::as_opr(j_rarg3)); ++ length.load_item_force (FrameMap::as_opr(j_rarg4)); ++ ++ LIR_Opr tmp = FrameMap::as_opr(j_rarg5); ++ ++ set_no_result(x); ++ ++ int flags; ++ ciArrayKlass* expected_type; ++ arraycopy_helper(x, &flags, &expected_type); ++ ++ __ arraycopy(src.result(), src_pos.result(), dst.result(), dst_pos.result(), ++ length.result(), tmp, expected_type, flags, info); // does add_safepoint ++} ++ ++void LIRGenerator::do_update_CRC32(Intrinsic* x) { ++ assert(UseCRC32Intrinsics, "why are we here?"); ++ // Make all state_for calls early since they can emit code ++ LIR_Opr result = rlock_result(x); ++ int flags = 0; ++ switch (x->id()) { ++ case vmIntrinsics::_updateCRC32: { ++ LIRItem crc(x->argument_at(0), this); ++ LIRItem val(x->argument_at(1), this); ++ // val is destroyed by update_crc32 ++ val.set_destroys_register(); ++ crc.load_item(); ++ val.load_item(); ++ __ update_crc32(crc.result(), val.result(), result); ++ break; ++ } ++ case vmIntrinsics::_updateBytesCRC32: ++ case vmIntrinsics::_updateByteBufferCRC32: { ++ bool is_updateBytes = (x->id() == vmIntrinsics::_updateBytesCRC32); ++ ++ LIRItem crc(x->argument_at(0), this); ++ LIRItem buf(x->argument_at(1), this); ++ LIRItem off(x->argument_at(2), this); ++ LIRItem len(x->argument_at(3), this); ++ buf.load_item(); ++ off.load_nonconstant(); ++ ++ LIR_Opr index = off.result(); ++ int offset = is_updateBytes ? arrayOopDesc::base_offset_in_bytes(T_BYTE) : 0; ++ if(off.result()->is_constant()) { ++ index = LIR_OprFact::illegalOpr; ++ offset += off.result()->as_jint(); ++ } ++ LIR_Opr base_op = buf.result(); ++ ++ if (index->is_valid()) { ++ LIR_Opr tmp = new_register(T_LONG); ++ __ convert(Bytecodes::_i2l, index, tmp); ++ index = tmp; ++ } ++ ++ if (offset) { ++ LIR_Opr tmp = new_pointer_register(); ++ __ add(base_op, LIR_OprFact::intConst(offset), tmp); ++ base_op = tmp; ++ offset = 0; ++ } ++ ++ LIR_Address* a = new LIR_Address(base_op, index, offset, T_BYTE); ++ BasicTypeList signature(3); ++ signature.append(T_INT); ++ signature.append(T_ADDRESS); ++ signature.append(T_INT); ++ CallingConvention* cc = frame_map()->c_calling_convention(&signature); ++ const LIR_Opr result_reg = result_register_for(x->type()); ++ ++ LIR_Opr addr = new_pointer_register(); ++ __ leal(LIR_OprFact::address(a), addr); ++ ++ crc.load_item_force(cc->at(0)); ++ __ move(addr, cc->at(1)); ++ len.load_item_force(cc->at(2)); ++ ++ __ call_runtime_leaf(StubRoutines::updateBytesCRC32(), getThreadTemp(), result_reg, cc->args()); ++ __ move(result_reg, result); ++ ++ break; ++ } ++ default: { ++ ShouldNotReachHere(); ++ } ++ } ++} ++ ++void LIRGenerator::do_update_CRC32C(Intrinsic* x) { ++ assert(UseCRC32CIntrinsics, "why are we here?"); ++ // Make all state_for calls early since they can emit code ++ LIR_Opr result = rlock_result(x); ++ int flags = 0; ++ switch (x->id()) { ++ case vmIntrinsics::_updateBytesCRC32C: ++ case vmIntrinsics::_updateDirectByteBufferCRC32C: { ++ bool is_updateBytes = (x->id() == vmIntrinsics::_updateBytesCRC32C); ++ int offset = is_updateBytes ? 
arrayOopDesc::base_offset_in_bytes(T_BYTE) : 0; ++ ++ LIRItem crc(x->argument_at(0), this); ++ LIRItem buf(x->argument_at(1), this); ++ LIRItem off(x->argument_at(2), this); ++ LIRItem end(x->argument_at(3), this); ++ ++ buf.load_item(); ++ off.load_nonconstant(); ++ end.load_nonconstant(); ++ ++ // len = end - off ++ LIR_Opr len = end.result(); ++ LIR_Opr tmpA = new_register(T_INT); ++ LIR_Opr tmpB = new_register(T_INT); ++ __ move(end.result(), tmpA); ++ __ move(off.result(), tmpB); ++ __ sub(tmpA, tmpB, tmpA); ++ len = tmpA; ++ ++ LIR_Opr index = off.result(); ++ if(off.result()->is_constant()) { ++ index = LIR_OprFact::illegalOpr; ++ offset += off.result()->as_jint(); ++ } ++ LIR_Opr base_op = buf.result(); ++ ++ if (index->is_valid()) { ++ LIR_Opr tmp = new_register(T_LONG); ++ __ convert(Bytecodes::_i2l, index, tmp); ++ index = tmp; ++ } ++ ++ if (offset) { ++ LIR_Opr tmp = new_pointer_register(); ++ __ add(base_op, LIR_OprFact::intConst(offset), tmp); ++ base_op = tmp; ++ offset = 0; ++ } ++ ++ LIR_Address* a = new LIR_Address(base_op, index, offset, T_BYTE); ++ BasicTypeList signature(3); ++ signature.append(T_INT); ++ signature.append(T_ADDRESS); ++ signature.append(T_INT); ++ CallingConvention* cc = frame_map()->c_calling_convention(&signature); ++ const LIR_Opr result_reg = result_register_for(x->type()); ++ ++ LIR_Opr addr = new_pointer_register(); ++ __ leal(LIR_OprFact::address(a), addr); ++ ++ crc.load_item_force(cc->at(0)); ++ __ move(addr, cc->at(1)); ++ __ move(len, cc->at(2)); ++ ++ __ call_runtime_leaf(StubRoutines::updateBytesCRC32C(), getThreadTemp(), result_reg, cc->args()); ++ __ move(result_reg, result); ++ ++ break; ++ } ++ default: { ++ ShouldNotReachHere(); ++ } ++ } ++} ++ ++void LIRGenerator::do_FmaIntrinsic(Intrinsic* x) { ++ assert(x->number_of_arguments() == 3, "wrong type"); ++ assert(UseFMA, "Needs FMA instructions support."); ++ LIRItem value(x->argument_at(0), this); ++ LIRItem value1(x->argument_at(1), this); ++ LIRItem value2(x->argument_at(2), this); ++ ++ value.load_item(); ++ value1.load_item(); ++ value2.load_item(); ++ ++ LIR_Opr calc_input = value.result(); ++ LIR_Opr calc_input1 = value1.result(); ++ LIR_Opr calc_input2 = value2.result(); ++ LIR_Opr calc_result = rlock_result(x); ++ ++ switch (x->id()) { ++ case vmIntrinsics::_fmaD: ++ __ fmad(calc_input, calc_input1, calc_input2, calc_result); ++ break; ++ case vmIntrinsics::_fmaF: ++ __ fmaf(calc_input, calc_input1, calc_input2, calc_result); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void LIRGenerator::do_vectorizedMismatch(Intrinsic* x) { ++ fatal("vectorizedMismatch intrinsic is not implemented on this platform"); ++} ++ ++// _i2l, _i2f, _i2d, _l2i, _l2f, _l2d, _f2i, _f2l, _f2d, _d2i, _d2l, _d2f ++// _i2b, _i2c, _i2s ++void LIRGenerator::do_Convert(Convert* x) { ++ LIRItem value(x->value(), this); ++ value.load_item(); ++ LIR_Opr input = value.result(); ++ LIR_Opr result = rlock(x); ++ ++ // arguments of lir_convert ++ LIR_Opr conv_input = input; ++ LIR_Opr conv_result = result; ++ ++ switch (x->op()) { ++ case Bytecodes::_f2i: ++ case Bytecodes::_f2l: ++ __ convert(x->op(), conv_input, conv_result, nullptr, new_register(T_FLOAT)); ++ break; ++ case Bytecodes::_d2i: ++ case Bytecodes::_d2l: ++ __ convert(x->op(), conv_input, conv_result, nullptr, new_register(T_DOUBLE)); ++ break; ++ default: ++ __ convert(x->op(), conv_input, conv_result); ++ break; ++ } ++ ++ assert(result->is_virtual(), "result must be virtual register"); ++ set_result(x, result); ++} ++ ++void 
LIRGenerator::do_NewInstance(NewInstance* x) { ++#ifndef PRODUCT ++ if (PrintNotLoaded && !x->klass()->is_loaded()) { ++ tty->print_cr(" ###class not loaded at new bci %d", x->printable_bci()); ++ } ++#endif ++ CodeEmitInfo* info = state_for(x, x->state()); ++ LIR_Opr reg = result_register_for(x->type()); ++ new_instance(reg, x->klass(), x->is_unresolved(), ++ FrameMap::t0_oop_opr, ++ FrameMap::t1_oop_opr, ++ FrameMap::a4_oop_opr, ++ LIR_OprFact::illegalOpr, ++ FrameMap::a3_metadata_opr, info); ++ LIR_Opr result = rlock_result(x); ++ __ move(reg, result); ++} ++ ++void LIRGenerator::do_NewTypeArray(NewTypeArray* x) { ++ CodeEmitInfo* info = state_for(x, x->state()); ++ ++ LIRItem length(x->length(), this); ++ length.load_item_force(FrameMap::s0_opr); ++ ++ LIR_Opr reg = result_register_for(x->type()); ++ LIR_Opr tmp1 = FrameMap::t0_oop_opr; ++ LIR_Opr tmp2 = FrameMap::t1_oop_opr; ++ LIR_Opr tmp3 = FrameMap::a5_oop_opr; ++ LIR_Opr tmp4 = reg; ++ LIR_Opr klass_reg = FrameMap::a3_metadata_opr; ++ LIR_Opr len = length.result(); ++ BasicType elem_type = x->elt_type(); ++ ++ __ metadata2reg(ciTypeArrayKlass::make(elem_type)->constant_encoding(), klass_reg); ++ ++ CodeStub* slow_path = new NewTypeArrayStub(klass_reg, len, reg, info); ++ __ allocate_array(reg, len, tmp1, tmp2, tmp3, tmp4, elem_type, klass_reg, slow_path); ++ ++ LIR_Opr result = rlock_result(x); ++ __ move(reg, result); ++} ++ ++void LIRGenerator::do_NewObjectArray(NewObjectArray* x) { ++ LIRItem length(x->length(), this); ++ // in case of patching (i.e., object class is not yet loaded), we need to reexecute the instruction ++ // and therefore provide the state before the parameters have been consumed ++ CodeEmitInfo* patching_info = nullptr; ++ if (!x->klass()->is_loaded() || PatchALot) { ++ patching_info = state_for(x, x->state_before()); ++ } ++ ++ CodeEmitInfo* info = state_for(x, x->state()); ++ ++ LIR_Opr reg = result_register_for(x->type()); ++ LIR_Opr tmp1 = FrameMap::t0_oop_opr; ++ LIR_Opr tmp2 = FrameMap::t1_oop_opr; ++ LIR_Opr tmp3 = FrameMap::a5_oop_opr; ++ LIR_Opr tmp4 = reg; ++ LIR_Opr klass_reg = FrameMap::a3_metadata_opr; ++ ++ length.load_item_force(FrameMap::s0_opr); ++ LIR_Opr len = length.result(); ++ ++ CodeStub* slow_path = new NewObjectArrayStub(klass_reg, len, reg, info); ++ ciKlass* obj = (ciKlass*) ciObjArrayKlass::make(x->klass()); ++ if (obj == ciEnv::unloaded_ciobjarrayklass()) { ++ BAILOUT("encountered unloaded_ciobjarrayklass due to out of memory error"); ++ } ++ klass2reg_with_patching(klass_reg, obj, patching_info); ++ __ allocate_array(reg, len, tmp1, tmp2, tmp3, tmp4, T_OBJECT, klass_reg, slow_path); ++ ++ LIR_Opr result = rlock_result(x); ++ __ move(reg, result); ++} ++ ++void LIRGenerator::do_NewMultiArray(NewMultiArray* x) { ++ Values* dims = x->dims(); ++ int i = dims->length(); ++ LIRItemList* items = new LIRItemList(i, i, nullptr); ++ while (i-- > 0) { ++ LIRItem* size = new LIRItem(dims->at(i), this); ++ items->at_put(i, size); ++ } ++ ++ // Evaluate state_for early since it may emit code. ++ CodeEmitInfo* patching_info = nullptr; ++ if (!x->klass()->is_loaded() || PatchALot) { ++ patching_info = state_for(x, x->state_before()); ++ ++ // Cannot re-use same xhandlers for multiple CodeEmitInfos, so ++ // clone all handlers (NOTE: Usually this is handled transparently ++ // by the CodeEmitInfo cloning logic in CodeStub constructors but ++ // is done explicitly here because a stub isn't being used). 
++ x->set_exception_handlers(new XHandlers(x->exception_handlers())); ++ } ++ CodeEmitInfo* info = state_for(x, x->state()); ++ ++ i = dims->length(); ++ while (i-- > 0) { ++ LIRItem* size = items->at(i); ++ size->load_item(); ++ ++ store_stack_parameter(size->result(), in_ByteSize(i*4)); ++ } ++ ++ LIR_Opr klass_reg = FrameMap::a0_metadata_opr; ++ klass2reg_with_patching(klass_reg, x->klass(), patching_info); ++ ++ LIR_Opr rank = FrameMap::s0_opr; ++ __ move(LIR_OprFact::intConst(x->rank()), rank); ++ LIR_Opr varargs = FrameMap::a2_opr; ++ __ move(FrameMap::sp_opr, varargs); ++ LIR_OprList* args = new LIR_OprList(3); ++ args->append(klass_reg); ++ args->append(rank); ++ args->append(varargs); ++ LIR_Opr reg = result_register_for(x->type()); ++ __ call_runtime(Runtime1::entry_for(Runtime1::new_multi_array_id), ++ LIR_OprFact::illegalOpr, ++ reg, args, info); ++ ++ LIR_Opr result = rlock_result(x); ++ __ move(reg, result); ++} ++ ++void LIRGenerator::do_BlockBegin(BlockBegin* x) { ++ // nothing to do for now ++} ++ ++void LIRGenerator::do_CheckCast(CheckCast* x) { ++ LIRItem obj(x->obj(), this); ++ ++ CodeEmitInfo* patching_info = nullptr; ++ if (!x->klass()->is_loaded() || ++ (PatchALot && !x->is_incompatible_class_change_check() && ++ !x->is_invokespecial_receiver_check())) { ++ // must do this before locking the destination register as an oop register, ++ // and before the obj is loaded (the latter is for deoptimization) ++ patching_info = state_for(x, x->state_before()); ++ } ++ obj.load_item(); ++ ++ // info for exceptions ++ CodeEmitInfo* info_for_exception = ++ (x->needs_exception_state() ? state_for(x) : ++ state_for(x, x->state_before(), true /*ignore_xhandler*/)); ++ ++ CodeStub* stub; ++ if (x->is_incompatible_class_change_check()) { ++ assert(patching_info == nullptr, "can't patch this"); ++ stub = new SimpleExceptionStub(Runtime1::throw_incompatible_class_change_error_id, ++ LIR_OprFact::illegalOpr, info_for_exception); ++ } else if (x->is_invokespecial_receiver_check()) { ++ assert(patching_info == nullptr, "can't patch this"); ++ stub = new DeoptimizeStub(info_for_exception, ++ Deoptimization::Reason_class_check, ++ Deoptimization::Action_none); ++ } else { ++ stub = new SimpleExceptionStub(Runtime1::throw_class_cast_exception_id, ++ obj.result(), info_for_exception); ++ } ++ LIR_Opr reg = rlock_result(x); ++ LIR_Opr tmp3 = LIR_OprFact::illegalOpr; ++ if (!x->klass()->is_loaded() || UseCompressedClassPointers) { ++ tmp3 = new_register(objectType); ++ } ++ __ checkcast(reg, obj.result(), x->klass(), ++ new_register(objectType), new_register(objectType), tmp3, ++ x->direct_compare(), info_for_exception, patching_info, stub, ++ x->profiled_method(), x->profiled_bci()); ++} ++ ++void LIRGenerator::do_InstanceOf(InstanceOf* x) { ++ LIRItem obj(x->obj(), this); ++ ++ // result and test object may not be in same register ++ LIR_Opr reg = rlock_result(x); ++ CodeEmitInfo* patching_info = nullptr; ++ if ((!x->klass()->is_loaded() || PatchALot)) { ++ // must do this before locking the destination register as an oop register ++ patching_info = state_for(x, x->state_before()); ++ } ++ obj.load_item(); ++ LIR_Opr tmp3 = LIR_OprFact::illegalOpr; ++ if (!x->klass()->is_loaded() || UseCompressedClassPointers) { ++ tmp3 = new_register(objectType); ++ } ++ __ instanceof(reg, obj.result(), x->klass(), ++ new_register(objectType), new_register(objectType), tmp3, ++ x->direct_compare(), patching_info, x->profiled_method(), x->profiled_bci()); ++} ++ ++void LIRGenerator::do_If(If* x) { ++ 
assert(x->number_of_sux() == 2, "inconsistency"); ++ ValueTag tag = x->x()->type()->tag(); ++ bool is_safepoint = x->is_safepoint(); ++ ++ If::Condition cond = x->cond(); ++ ++ LIRItem xitem(x->x(), this); ++ LIRItem yitem(x->y(), this); ++ LIRItem* xin = &xitem; ++ LIRItem* yin = &yitem; ++ ++ if (tag == longTag) { ++ // for longs, only conditions "eql", "neq", "lss", "geq" are valid; ++ // mirror for other conditions ++ if (cond == If::gtr || cond == If::leq) { ++ cond = Instruction::mirror(cond); ++ xin = &yitem; ++ yin = &xitem; ++ } ++ xin->set_destroys_register(); ++ } ++ xin->load_item(); ++ ++ if (tag == longTag) { ++ if (yin->is_constant() && yin->get_jlong_constant() == 0) { ++ yin->dont_load_item(); ++ } else { ++ yin->load_item(); ++ } ++ } else if (tag == intTag) { ++ if (yin->is_constant() && yin->get_jint_constant() == 0) { ++ yin->dont_load_item(); ++ } else { ++ yin->load_item(); ++ } ++ } else { ++ yin->load_item(); ++ } ++ ++ set_no_result(x); ++ ++ LIR_Opr left = xin->result(); ++ LIR_Opr right = yin->result(); ++ ++ // add safepoint before generating condition code so it can be recomputed ++ if (x->is_safepoint()) { ++ // increment backedge counter if needed ++ increment_backedge_counter_conditionally(lir_cond(cond), left, right, state_for(x, x->state_before()), ++ x->tsux()->bci(), x->fsux()->bci(), x->profiled_bci()); ++ __ safepoint(LIR_OprFact::illegalOpr, state_for(x, x->state_before())); ++ } ++ ++ __ cmp(lir_cond(cond), left, right); ++ // Generate branch profiling. Profiling code doesn't kill flags. ++ profile_branch(x, cond); ++ move_to_phi(x->state()); ++ if (x->x()->type()->is_float_kind()) { ++ __ branch(lir_cond(cond), x->tsux(), x->usux()); ++ } else { ++ __ branch(lir_cond(cond), x->tsux()); ++ } ++ assert(x->default_sux() == x->fsux(), "wrong destination above"); ++ __ jump(x->default_sux()); ++} ++ ++LIR_Opr LIRGenerator::getThreadPointer() { ++ return FrameMap::as_pointer_opr(TREG); ++} ++ ++void LIRGenerator::trace_block_entry(BlockBegin* block) { Unimplemented(); } ++ ++void LIRGenerator::volatile_field_store(LIR_Opr value, LIR_Address* address, ++ CodeEmitInfo* info) { ++ __ volatile_store_mem_reg(value, address, info); ++} ++ ++void LIRGenerator::volatile_field_load(LIR_Address* address, LIR_Opr result, ++ CodeEmitInfo* info) { ++ // 8179954: We need to make sure that the code generated for ++ // volatile accesses forms a sequentially-consistent set of ++ // operations when combined with STLR and LDAR. Without a leading ++ // membar it's possible for a simple Dekker test to fail if loads ++ // use LD;DMB but stores use STLR. This can happen if C2 compiles ++ // the stores in one method and C1 compiles the loads in another. ++ if (!CompilerConfig::is_c1_only_no_jvmci()) { ++ __ membar(); ++ } ++ __ volatile_load_mem_reg(address, result, info); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_LIR_loongarch_64.cpp b/src/hotspot/cpu/loongarch/c1_LIR_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/c1_LIR_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_LIR_loongarch_64.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,57 @@ ++/* ++ * Copyright (c) 2016, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/register.hpp" ++#include "c1/c1_LIR.hpp" ++ ++FloatRegister LIR_Opr::as_float_reg() const { ++ return as_FloatRegister(fpu_regnr()); ++} ++ ++FloatRegister LIR_Opr::as_double_reg() const { ++ return as_FloatRegister(fpu_regnrLo()); ++} ++ ++// Reg2 unused. ++LIR_Opr LIR_OprFact::double_fpu(int reg1, int reg2) { ++ assert(as_FloatRegister(reg2) == fnoreg, "Not used on this platform"); ++ return (LIR_Opr)(intptr_t)((reg1 << LIR_Opr::reg1_shift) | ++ (reg1 << LIR_Opr::reg2_shift) | ++ LIR_Opr::double_type | ++ LIR_Opr::fpu_register | ++ LIR_Opr::double_size); ++} ++ ++#ifndef PRODUCT ++void LIR_Address::verify() const { ++ assert(base()->is_cpu_register(), "wrong base operand"); ++ assert(index()->is_illegal() || index()->is_double_cpu() || ++ index()->is_single_cpu(), "wrong index operand"); ++ assert(base()->type() == T_ADDRESS || base()->type() == T_OBJECT || ++ base()->type() == T_LONG || base()->type() == T_METADATA, ++ "wrong type for addresses"); ++} ++#endif // PRODUCT +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_MacroAssembler_loongarch_64.cpp b/src/hotspot/cpu/loongarch/c1_MacroAssembler_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/c1_MacroAssembler_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_MacroAssembler_loongarch_64.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,350 @@ ++/* ++ * Copyright (c) 1999, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "c1/c1_MacroAssembler.hpp" ++#include "c1/c1_Runtime1.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "gc/shared/collectedHeap.hpp" ++#include "gc/shared/tlab_globals.hpp" ++#include "interpreter/interpreter.hpp" ++#include "oops/arrayOop.hpp" ++#include "oops/markWord.hpp" ++#include "runtime/basicLock.hpp" ++#include "runtime/os.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++ ++int C1_MacroAssembler::lock_object(Register hdr, Register obj, Register disp_hdr, Label& slow_case) { ++ const int aligned_mask = BytesPerWord -1; ++ const int hdr_offset = oopDesc::mark_offset_in_bytes(); ++ assert_different_registers(hdr, obj, disp_hdr); ++ int null_check_offset = -1; ++ ++ verify_oop(obj); ++ ++ // save object being locked into the BasicObjectLock ++ st_d(obj, Address(disp_hdr, BasicObjectLock::obj_offset())); ++ ++ null_check_offset = offset(); ++ ++ if (DiagnoseSyncOnValueBasedClasses != 0) { ++ load_klass(hdr, obj); ++ ld_w(hdr, Address(hdr, Klass::access_flags_offset())); ++ li(SCR1, JVM_ACC_IS_VALUE_BASED_CLASS); ++ andr(SCR1, hdr, SCR1); ++ bnez(SCR1, slow_case); ++ } ++ ++ // Load object header ++ ld_d(hdr, Address(obj, hdr_offset)); ++ if (LockingMode == LM_LIGHTWEIGHT) { ++ lightweight_lock(obj, hdr, SCR1, SCR2, slow_case); ++ } else if (LockingMode == LM_LEGACY) { ++ Label done; ++ // and mark it as unlocked ++ ori(hdr, hdr, markWord::unlocked_value); ++ // save unlocked object header into the displaced header location on the stack ++ st_d(hdr, Address(disp_hdr, 0)); ++ // test if object header is still the same (i.e. 
unlocked), and if so, store the ++ // displaced header address in the object header - if it is not the same, get the ++ // object header instead ++ lea(SCR2, Address(obj, hdr_offset)); ++ cmpxchg(Address(SCR2, 0), hdr, disp_hdr, SCR1, true, true /* acquire */, done); ++ // if the object header was the same, we're done ++ // if the object header was not the same, it is now in the hdr register ++ // => test if it is a stack pointer into the same stack (recursive locking), i.e.: ++ // ++ // 1) (hdr & aligned_mask) == 0 ++ // 2) sp <= hdr ++ // 3) hdr <= sp + page_size ++ // ++ // these 3 tests can be done by evaluating the following expression: ++ // ++ // (hdr - sp) & (aligned_mask - page_size) ++ // ++ // assuming both the stack pointer and page_size have their least ++ // significant 2 bits cleared and page_size is a power of 2 ++ sub_d(hdr, hdr, SP); ++ li(SCR1, aligned_mask - os::vm_page_size()); ++ andr(hdr, hdr, SCR1); ++ // for recursive locking, the result is zero => save it in the displaced header ++ // location (null in the displaced hdr location indicates recursive locking) ++ st_d(hdr, Address(disp_hdr, 0)); ++ // otherwise we don't care about the result and handle locking via runtime call ++ bnez(hdr, slow_case); ++ // done ++ bind(done); ++ } ++ increment(Address(TREG, JavaThread::held_monitor_count_offset()), 1); ++ return null_check_offset; ++} ++ ++void C1_MacroAssembler::unlock_object(Register hdr, Register obj, Register disp_hdr, Label& slow_case) { ++ const int aligned_mask = BytesPerWord -1; ++ const int hdr_offset = oopDesc::mark_offset_in_bytes(); ++ assert(hdr != obj && hdr != disp_hdr && obj != disp_hdr, "registers must be different"); ++ Label done; ++ ++ if (LockingMode != LM_LIGHTWEIGHT) { ++ // load displaced header ++ ld_d(hdr, Address(disp_hdr, 0)); ++ // if the loaded hdr is null we had recursive locking ++ // if we had recursive locking, we are done ++ beqz(hdr, done); ++ } ++ ++ // load object ++ ld_d(obj, Address(disp_hdr, BasicObjectLock::obj_offset())); ++ verify_oop(obj); ++ if (LockingMode == LM_LIGHTWEIGHT) { ++ ld_d(hdr, Address(obj, oopDesc::mark_offset_in_bytes())); ++ // We cannot use tbnz here, the target might be too far away and cannot ++ // be encoded. 
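// [Editorial sketch, not part of the patch] The LM_LEGACY path in lock_object()
// above tests (hdr - sp) & (aligned_mask - page_size): the result is zero exactly
// when hdr is word-aligned and points into [sp, sp + page_size), i.e. recursive
// locking on the current stack. A small illustration with assumed values
// (8-byte words so aligned_mask == 7, and a 16K page so page_size == 0x4000):
static_assert((0x2000 & (7 - 0x4000)) == 0, "aligned offset inside the page yields zero");
static_assert((0x2004 & (7 - 0x4000)) != 0, "misaligned offset yields non-zero");
static_assert((0x8000 & (7 - 0x4000)) != 0, "offset beyond the page yields non-zero");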
++ andi(AT, hdr, markWord::monitor_value); ++ bnez(AT, slow_case); ++ lightweight_unlock(obj, hdr, SCR1, SCR2, slow_case); ++ } else if (LockingMode == LM_LEGACY) { ++ // test if object header is pointing to the displaced header, and if so, restore ++ // the displaced header in the object - if the object header is not pointing to ++ // the displaced header, get the object header instead ++ // if the object header was not pointing to the displaced header, ++ // we do unlocking via runtime call ++ if (hdr_offset) { ++ lea(SCR1, Address(obj, hdr_offset)); ++ cmpxchg(Address(SCR1, 0), disp_hdr, hdr, SCR2, false, true /* acquire */, done, &slow_case); ++ } else { ++ cmpxchg(Address(obj, 0), disp_hdr, hdr, SCR2, false, true /* acquire */, done, &slow_case); ++ } ++ // done ++ bind(done); ++ } ++ decrement(Address(TREG, JavaThread::held_monitor_count_offset()), 1); ++} ++ ++// Defines obj, preserves var_size_in_bytes ++void C1_MacroAssembler::try_allocate(Register obj, Register var_size_in_bytes, ++ int con_size_in_bytes, Register t1, Register t2, ++ Label& slow_case) { ++ if (UseTLAB) { ++ tlab_allocate(obj, var_size_in_bytes, con_size_in_bytes, t1, t2, slow_case); ++ } else { ++ b_far(slow_case); ++ } ++} ++ ++void C1_MacroAssembler::initialize_header(Register obj, Register klass, Register len, ++ Register t1, Register t2) { ++ assert_different_registers(obj, klass, len); ++ // This assumes that all prototype bits fit in an int32_t ++ li(t1, (int32_t)(intptr_t)markWord::prototype().value()); ++ st_d(t1, Address(obj, oopDesc::mark_offset_in_bytes())); ++ ++ if (UseCompressedClassPointers) { // Take care not to kill klass ++ encode_klass_not_null(t1, klass); ++ st_w(t1, Address(obj, oopDesc::klass_offset_in_bytes())); ++ } else { ++ st_d(klass, Address(obj, oopDesc::klass_offset_in_bytes())); ++ } ++ ++ if (len->is_valid()) { ++ st_w(len, Address(obj, arrayOopDesc::length_offset_in_bytes())); ++ } else if (UseCompressedClassPointers) { ++ store_klass_gap(obj, R0); ++ } ++} ++ ++// preserves obj, destroys len_in_bytes ++// ++// Scratch registers: t1 = T0, t2 = T1 ++// ++void C1_MacroAssembler::initialize_body(Register obj, Register len_in_bytes, ++ int hdr_size_in_bytes, Register t1, Register t2) { ++ assert(hdr_size_in_bytes >= 0, "header size must be positive or 0"); ++ assert(t1 == T0 && t2 == T1, "must be"); ++ Label done; ++ ++ // len_in_bytes is positive and ptr sized ++ addi_d(len_in_bytes, len_in_bytes, -hdr_size_in_bytes); ++ beqz(len_in_bytes, done); ++ ++ // zero_words() takes ptr in t1 and count in bytes in t2 ++ lea(t1, Address(obj, hdr_size_in_bytes)); ++ addi_d(t2, len_in_bytes, -BytesPerWord); ++ ++ Label loop; ++ bind(loop); ++ stx_d(R0, t1, t2); ++ addi_d(t2, t2, -BytesPerWord); ++ bge(t2, R0, loop); ++ ++ bind(done); ++} ++ ++void C1_MacroAssembler::allocate_object(Register obj, Register t1, Register t2, int header_size, ++ int object_size, Register klass, Label& slow_case) { ++ assert_different_registers(obj, t1, t2); ++ assert(header_size >= 0 && object_size >= header_size, "illegal sizes"); ++ ++ try_allocate(obj, noreg, object_size * BytesPerWord, t1, t2, slow_case); ++ ++ initialize_object(obj, klass, noreg, object_size * HeapWordSize, t1, t2, UseTLAB); ++} ++ ++// Scratch registers: t1 = T0, t2 = T1 ++void C1_MacroAssembler::initialize_object(Register obj, Register klass, Register var_size_in_bytes, ++ int con_size_in_bytes, Register t1, Register t2, ++ bool is_tlab_allocated) { ++ assert((con_size_in_bytes & MinObjAlignmentInBytesMask) == 0, ++ "con_size_in_bytes is not 
multiple of alignment"); ++ const int hdr_size_in_bytes = instanceOopDesc::header_size() * HeapWordSize; ++ ++ initialize_header(obj, klass, noreg, t1, t2); ++ ++ if (!(UseTLAB && ZeroTLAB && is_tlab_allocated)) { ++ // clear rest of allocated space ++ const Register index = t2; ++ if (var_size_in_bytes != noreg) { ++ move(index, var_size_in_bytes); ++ initialize_body(obj, index, hdr_size_in_bytes, t1, t2); ++ } else if (con_size_in_bytes > hdr_size_in_bytes) { ++ con_size_in_bytes -= hdr_size_in_bytes; ++ lea(t1, Address(obj, hdr_size_in_bytes)); ++ Label loop; ++ li(SCR1, con_size_in_bytes - BytesPerWord); ++ bind(loop); ++ stx_d(R0, t1, SCR1); ++ addi_d(SCR1, SCR1, -BytesPerWord); ++ bge(SCR1, R0, loop); ++ } ++ } ++ ++ membar(StoreStore); ++ ++ if (CURRENT_ENV->dtrace_alloc_probes()) { ++ assert(obj == A0, "must be"); ++ call(Runtime1::entry_for(Runtime1::dtrace_object_alloc_id), relocInfo::runtime_call_type); ++ } ++ ++ verify_oop(obj); ++} ++ ++void C1_MacroAssembler::allocate_array(Register obj, Register len, Register t1, Register t2, ++ int header_size, int f, Register klass, Label& slow_case) { ++ assert_different_registers(obj, len, t1, t2, klass); ++ ++ // determine alignment mask ++ assert(!(BytesPerWord & 1), "must be a multiple of 2 for masking code to work"); ++ ++ // check for negative or excessive length ++ li(SCR1, (int32_t)max_array_allocation_length); ++ bge_far(len, SCR1, slow_case, false); ++ ++ const Register arr_size = t2; // okay to be the same ++ // align object end ++ li(arr_size, (int32_t)header_size * BytesPerWord + MinObjAlignmentInBytesMask); ++ slli_w(SCR1, len, f); ++ add_d(arr_size, arr_size, SCR1); ++ bstrins_d(arr_size, R0, exact_log2(MinObjAlignmentInBytesMask + 1) - 1, 0); ++ ++ try_allocate(obj, arr_size, 0, t1, t2, slow_case); ++ ++ initialize_header(obj, klass, len, t1, t2); ++ ++ // clear rest of allocated space ++ initialize_body(obj, arr_size, header_size * BytesPerWord, t1, t2); ++ ++ membar(StoreStore); ++ ++ if (CURRENT_ENV->dtrace_alloc_probes()) { ++ assert(obj == A0, "must be"); ++ call(Runtime1::entry_for(Runtime1::dtrace_object_alloc_id), relocInfo::runtime_call_type); ++ } ++ ++ verify_oop(obj); ++} ++ ++void C1_MacroAssembler::build_frame(int framesize, int bang_size_in_bytes) { ++ assert(bang_size_in_bytes >= framesize, "stack bang size incorrect"); ++ // Make sure there is enough stack space for this method's activation. ++ // Note that we do this before creating a frame. ++ generate_stack_overflow_check(bang_size_in_bytes); ++ MacroAssembler::build_frame(framesize); ++ ++ // Insert nmethod entry barrier into frame. ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ bs->nmethod_entry_barrier(this, nullptr /* slow_path */, nullptr /* continuation */, nullptr /* guard */); ++} ++ ++void C1_MacroAssembler::remove_frame(int framesize) { ++ MacroAssembler::remove_frame(framesize); ++} ++ ++void C1_MacroAssembler::verified_entry(bool breakAtEntry) { ++ // If we have to make this method not-entrant we'll overwrite its ++ // first instruction with a jump. For this action to be legal we ++ // must ensure that this first instruction is a b, bl, nop, break. ++ // Make it a NOP. ++ nop(); ++} ++ ++void C1_MacroAssembler::load_parameter(int offset_in_words, Register reg) { ++ // FP + -2: link ++ // + -1: return address ++ // + 0: argument with offset 0 ++ // + 1: argument with offset 1 ++ // + 2: ... 
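++ // ++ // For example, load_parameter(1, reg) reads the word that compiled code stored ++ // with LIR_Assembler::store_parameter at FP + 1 * BytesPerWord.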
++ ++ ld_d(reg, Address(FP, offset_in_words * BytesPerWord)); ++} ++ ++#ifndef PRODUCT ++void C1_MacroAssembler::verify_stack_oop(int stack_offset) { ++ if (!VerifyOops) return; ++ verify_oop_addr(Address(SP, stack_offset)); ++} ++ ++void C1_MacroAssembler::verify_not_null_oop(Register r) { ++ if (!VerifyOops) return; ++ Label not_null; ++ bnez(r, not_null); ++ stop("non-null oop required"); ++ bind(not_null); ++ verify_oop(r); ++} ++ ++void C1_MacroAssembler::invalidate_registers(bool inv_a0, bool inv_s0, bool inv_a2, ++ bool inv_a3, bool inv_a4, bool inv_a5) { ++#ifdef ASSERT ++ static int nn; ++ if (inv_a0) li(A0, 0xDEAD); ++ if (inv_s0) li(S0, 0xDEAD); ++ if (inv_a2) li(A2, nn++); ++ if (inv_a3) li(A3, 0xDEAD); ++ if (inv_a4) li(A4, 0xDEAD); ++ if (inv_a5) li(A5, 0xDEAD); ++#endif ++} ++#endif // ifndef PRODUCT +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_MacroAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/c1_MacroAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/c1_MacroAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_MacroAssembler_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,111 @@ ++/* ++ * Copyright (c) 1999, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_C1_MACROASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_C1_MACROASSEMBLER_LOONGARCH_HPP ++ ++using MacroAssembler::build_frame; ++using MacroAssembler::null_check; ++ ++// C1_MacroAssembler contains high-level macros for C1 ++ ++ private: ++ int _rsp_offset; // track rsp changes ++ // initialization ++ void pd_init() { _rsp_offset = 0; } ++ ++ public: ++ void try_allocate( ++ Register obj, // result: pointer to object after successful allocation ++ Register var_size_in_bytes, // object size in bytes if unknown at compile time; invalid otherwise ++ int con_size_in_bytes, // object size in bytes if known at compile time ++ Register t1, // temp register ++ Register t2, // temp register ++ Label& slow_case // continuation point if fast allocation fails ++ ); ++ ++ void initialize_header(Register obj, Register klass, Register len, Register t1, Register t2); ++ void initialize_body(Register obj, Register len_in_bytes, int hdr_size_in_bytes, Register t1, Register t2); ++ ++ // locking ++ // hdr : must be A0, contents destroyed ++ // obj : must point to the object to lock, contents preserved ++ // disp_hdr: must point to the displaced header location, contents preserved ++ // returns code offset at which to add null check debug information ++ int lock_object (Register swap, Register obj, Register disp_hdr, Label& slow_case); ++ ++ // unlocking ++ // hdr : contents destroyed ++ // obj : must point to the object to lock, contents preserved ++ // disp_hdr: must be A0 & must point to the displaced header location, contents destroyed ++ void unlock_object(Register swap, Register obj, Register lock, Label& slow_case); ++ ++ void initialize_object( ++ Register obj, // result: pointer to object after successful allocation ++ Register klass, // object klass ++ Register var_size_in_bytes, // object size in bytes if unknown at compile time; invalid otherwise ++ int con_size_in_bytes, // object size in bytes if known at compile time ++ Register t1, // temp register ++ Register t2, // temp register ++ bool is_tlab_allocated // the object was allocated in a TLAB; relevant for the implementation of ZeroTLAB ++ ); ++ ++ // allocation of fixed-size objects ++ // (can also be used to allocate fixed-size arrays, by setting ++ // hdr_size correctly and storing the array length afterwards) ++ // obj : will contain pointer to allocated object ++ // t1, t2 : scratch registers - contents destroyed ++ // header_size: size of object header in words ++ // object_size: total size of object in words ++ // slow_case : exit to slow case implementation if fast allocation fails ++ void allocate_object(Register obj, Register t1, Register t2, int header_size, ++ int object_size, Register klass, Label& slow_case); ++ ++ enum { ++ max_array_allocation_length = 0x00FFFFFF ++ }; ++ ++ // allocation of arrays ++ // obj : will contain pointer to allocated object ++ // len : array length in number of elements ++ // t : scratch register - contents destroyed ++ // header_size: size of object header in words ++ // f : element scale factor ++ // slow_case : exit to slow case implementation if fast allocation fails ++ void allocate_array(Register obj, Register len, Register t, Register t2, int header_size, ++ int f, Register klass, Label& slow_case); ++ ++ int rsp_offset() const { return _rsp_offset; } ++ void set_rsp_offset(int n) { _rsp_offset = n; } ++ ++ void invalidate_registers(bool inv_a0, bool inv_s0, bool inv_a2, bool inv_a3, ++ bool inv_a4, bool inv_a5) PRODUCT_RETURN; ++ ++ // This platform only 
uses signal-based null checks. The Label is not needed. ++ void null_check(Register r, Label *Lnull = nullptr) { MacroAssembler::null_check(r); } ++ ++ void load_parameter(int offset_in_words, Register reg); ++ ++#endif // CPU_LOONGARCH_C1_MACROASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c1_Runtime1_loongarch_64.cpp b/src/hotspot/cpu/loongarch/c1_Runtime1_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/c1_Runtime1_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c1_Runtime1_loongarch_64.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,1041 @@ ++/* ++ * Copyright (c) 1999, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/assembler.hpp" ++#include "c1/c1_CodeStubs.hpp" ++#include "c1/c1_Defs.hpp" ++#include "c1/c1_MacroAssembler.hpp" ++#include "c1/c1_Runtime1.hpp" ++#include "compiler/disassembler.hpp" ++#include "compiler/oopMap.hpp" ++#include "gc/shared/cardTable.hpp" ++#include "gc/shared/cardTableBarrierSet.hpp" ++#include "gc/shared/collectedHeap.hpp" ++#include "gc/shared/tlab_globals.hpp" ++#include "interpreter/interpreter.hpp" ++#include "memory/universe.hpp" ++#include "nativeInst_loongarch.hpp" ++#include "oops/compiledICHolder.hpp" ++#include "oops/oop.inline.hpp" ++#include "prims/jvmtiExport.hpp" ++#include "register_loongarch.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/signature.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "runtime/vframe.hpp" ++#include "runtime/vframeArray.hpp" ++#include "utilities/powerOfTwo.hpp" ++#include "vmreg_loongarch.inline.hpp" ++ ++// Implementation of StubAssembler ++ ++int StubAssembler::call_RT(Register oop_result1, Register metadata_result, address entry, int args_size) { ++ // setup registers ++ assert(!(oop_result1->is_valid() || metadata_result->is_valid()) || oop_result1 != metadata_result, ++ "registers must be different"); ++ assert(oop_result1 != TREG && metadata_result != TREG, "registers must be different"); ++ assert(args_size >= 0, "illegal args_size"); ++ bool align_stack = false; ++ ++ move(A0, TREG); ++ set_num_rt_args(0); // Nothing on stack ++ ++ Label retaddr; ++ set_last_Java_frame(SP, FP, retaddr); ++ ++ // do the call ++ call(entry, relocInfo::runtime_call_type); ++ bind(retaddr); ++ int call_offset = offset(); ++ // verify callee-saved register ++#ifdef ASSERT ++ { Label L; ++ get_thread(SCR1); ++ beq(TREG, SCR1, L); ++ stop("StubAssembler::call_RT: TREG not callee saved?"); ++ bind(L); ++ } ++#endif ++ reset_last_Java_frame(true); ++ ++ // check for pending exceptions ++ { Label L; ++ // check for pending exceptions (java_thread is set upon return) ++ ld_d(SCR1, Address(TREG, Thread::pending_exception_offset())); ++ beqz(SCR1, L); ++ // exception pending => remove activation and forward to exception handler ++ // make sure that the vm_results are cleared ++ if (oop_result1->is_valid()) { ++ st_d(R0, Address(TREG, JavaThread::vm_result_offset())); ++ } ++ if (metadata_result->is_valid()) { ++ st_d(R0, Address(TREG, JavaThread::vm_result_2_offset())); ++ } ++ if (frame_size() == no_frame_size) { ++ leave(); ++ jmp(StubRoutines::forward_exception_entry(), relocInfo::runtime_call_type); ++ } else if (_stub_id == Runtime1::forward_exception_id) { ++ should_not_reach_here(); ++ } else { ++ jmp(Runtime1::entry_for(Runtime1::forward_exception_id), relocInfo::runtime_call_type); ++ } ++ bind(L); ++ } ++ // get oop results if there are any and reset the values in the thread ++ if (oop_result1->is_valid()) { ++ get_vm_result(oop_result1, TREG); ++ } ++ if (metadata_result->is_valid()) { ++ get_vm_result_2(metadata_result, TREG); ++ } ++ return call_offset; ++} ++ ++int StubAssembler::call_RT(Register oop_result1, Register metadata_result, ++ address entry, Register arg1) { ++ move(A1, arg1); ++ return call_RT(oop_result1, metadata_result, entry, 1); ++} ++ ++int StubAssembler::call_RT(Register oop_result1, Register metadata_result, ++ address entry, Register arg1, Register arg2) { ++ if (A1 == arg2) { ++ if (A2 == arg1) { ++ move(SCR1, arg1); ++ move(arg1, arg2); ++ move(arg2, SCR1); ++ } else { ++ move(A2, arg2); ++ move(A1, arg1); ++ } ++ } else { ++ 
move(A1, arg1); ++ move(A2, arg2); ++ } ++ return call_RT(oop_result1, metadata_result, entry, 2); ++} ++ ++int StubAssembler::call_RT(Register oop_result1, Register metadata_result, ++ address entry, Register arg1, Register arg2, Register arg3) { ++ // if there is any conflict use the stack ++ if (arg1 == A2 || arg1 == A3 || ++ arg2 == A1 || arg2 == A3 || ++ arg3 == A1 || arg3 == A2) { ++ addi_d(SP, SP, -4 * wordSize); ++ st_d(arg1, Address(SP, 0 * wordSize)); ++ st_d(arg2, Address(SP, 1 * wordSize)); ++ st_d(arg3, Address(SP, 2 * wordSize)); ++ ld_d(arg1, Address(SP, 0 * wordSize)); ++ ld_d(arg2, Address(SP, 1 * wordSize)); ++ ld_d(arg3, Address(SP, 2 * wordSize)); ++ addi_d(SP, SP, 4 * wordSize); ++ } else { ++ move(A1, arg1); ++ move(A2, arg2); ++ move(A3, arg3); ++ } ++ return call_RT(oop_result1, metadata_result, entry, 3); ++} ++ ++enum return_state_t { ++ does_not_return, requires_return ++}; ++ ++// Implementation of StubFrame ++ ++class StubFrame: public StackObj { ++ private: ++ StubAssembler* _sasm; ++ bool _return_state; ++ ++ public: ++ StubFrame(StubAssembler* sasm, const char* name, bool must_gc_arguments, ++ return_state_t return_state=requires_return); ++ void load_argument(int offset_in_words, Register reg); ++ ++ ~StubFrame(); ++}; ++ ++void StubAssembler::prologue(const char* name, bool must_gc_arguments) { ++ set_info(name, must_gc_arguments); ++ enter(); ++} ++ ++void StubAssembler::epilogue() { ++ leave(); ++ jr(RA); ++} ++ ++#define __ _sasm-> ++ ++StubFrame::StubFrame(StubAssembler* sasm, const char* name, bool must_gc_arguments, ++ return_state_t return_state) { ++ _sasm = sasm; ++ _return_state = return_state; ++ __ prologue(name, must_gc_arguments); ++} ++ ++// load parameters that were stored with LIR_Assembler::store_parameter ++// Note: offsets for store_parameter and load_argument must match ++void StubFrame::load_argument(int offset_in_words, Register reg) { ++ __ load_parameter(offset_in_words, reg); ++} ++ ++StubFrame::~StubFrame() { ++ if (_return_state == requires_return) { ++ __ epilogue(); ++ } else { ++ __ should_not_reach_here(); ++ } ++} ++ ++#undef __ ++ ++// Implementation of Runtime1 ++ ++#define __ sasm-> ++ ++const int float_regs_as_doubles_size_in_slots = pd_nof_fpu_regs_frame_map * 2; ++ ++// Stack layout for saving/restoring all the registers needed during a runtime ++// call (this includes deoptimization) ++// Note that users of this frame may well have arguments to some runtime ++// while these values are on the stack. These positions neglect those arguments ++// but the code in save_live_registers will take the argument count into ++// account. ++// ++ ++enum reg_save_layout { ++ reg_save_frame_size = 32 /* float */ + 30 /* integer, except zr, tp */ ++}; ++ ++// Save off registers which might be killed by calls into the runtime. ++// Tries to be smart about FP registers. In particular we separate ++// saving and describing the FPU registers for deoptimization since we ++// have to save the FPU registers twice if we describe them. The ++// deopt blob is the only thing which needs to describe FPU registers. ++// In all other cases it should be sufficient to simply save their ++// current value.
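++// With save_live_registers() below, FPU register i is stored at SP + i words ++// (stack slot 2 * i) and integer register i (i >= 4) at SP + (32 + i - 4) words ++// (stack slot 64 + 2 * (i - 4)); initialize_pd() records exactly these slot ++// numbers in fpu_reg_save_offsets[] and cpu_reg_save_offsets[].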
++ ++static int cpu_reg_save_offsets[FrameMap::nof_cpu_regs]; ++static int fpu_reg_save_offsets[FrameMap::nof_fpu_regs]; ++static int reg_save_size_in_words; ++static int frame_size_in_bytes = -1; ++ ++static OopMap* generate_oop_map(StubAssembler* sasm, bool save_fpu_registers) { ++ int frame_size_in_bytes = reg_save_frame_size * BytesPerWord; ++ sasm->set_frame_size(frame_size_in_bytes / BytesPerWord); ++ int frame_size_in_slots = frame_size_in_bytes / VMRegImpl::stack_slot_size; ++ OopMap* oop_map = new OopMap(frame_size_in_slots, 0); ++ ++ for (int i = A0->encoding(); i <= T8->encoding(); i++) { ++ Register r = as_Register(i); ++ if (i != SCR1->encoding() && i != SCR2->encoding()) { ++ int sp_offset = cpu_reg_save_offsets[i]; ++ oop_map->set_callee_saved(VMRegImpl::stack2reg(sp_offset), r->as_VMReg()); ++ } ++ } ++ ++ if (save_fpu_registers) { ++ for (int i = 0; i < FrameMap::nof_fpu_regs; i++) { ++ FloatRegister r = as_FloatRegister(i); ++ int sp_offset = fpu_reg_save_offsets[i]; ++ oop_map->set_callee_saved(VMRegImpl::stack2reg(sp_offset), r->as_VMReg()); ++ } ++ } ++ ++ return oop_map; ++} ++ ++static OopMap* save_live_registers(StubAssembler* sasm, ++ bool save_fpu_registers = true) { ++ __ block_comment("save_live_registers"); ++ ++ // integer registers except zr & ra & tp & sp & rx. 4 is due to alignment. ++ __ addi_d(SP, SP, -(32 - 4 + 32) * wordSize); ++ ++ for (int i = 4; i < 21; i++) ++ __ st_d(as_Register(i), Address(SP, (32 + i - 4) * wordSize)); ++ for (int i = 22; i < 32; i++) ++ __ st_d(as_Register(i), Address(SP, (32 + i - 4) * wordSize)); ++ ++ if (save_fpu_registers) { ++ for (int i = 0; i < 32; i++) ++ __ fst_d(as_FloatRegister(i), Address(SP, i * wordSize)); ++ } ++ ++ return generate_oop_map(sasm, save_fpu_registers); ++} ++ ++static void restore_live_registers(StubAssembler* sasm, bool restore_fpu_registers = true) { ++ if (restore_fpu_registers) { ++ for (int i = 0; i < 32; i ++) ++ __ fld_d(as_FloatRegister(i), Address(SP, i * wordSize)); ++ } ++ ++ for (int i = 4; i < 21; i++) ++ __ ld_d(as_Register(i), Address(SP, (32 + i - 4) * wordSize)); ++ for (int i = 22; i < 32; i++) ++ __ ld_d(as_Register(i), Address(SP, (32 + i - 4) * wordSize)); ++ ++ __ addi_d(SP, SP, (32 - 4 + 32) * wordSize); ++} ++ ++static void restore_live_registers_except_a0(StubAssembler* sasm, bool restore_fpu_registers = true) { ++ if (restore_fpu_registers) { ++ for (int i = 0; i < 32; i ++) ++ __ fld_d(as_FloatRegister(i), Address(SP, i * wordSize)); ++ } ++ ++ for (int i = 5; i < 21; i++) ++ __ ld_d(as_Register(i), Address(SP, (32 + i - 4) * wordSize)); ++ for (int i = 22; i < 32; i++) ++ __ ld_d(as_Register(i), Address(SP, (32 + i - 4) * wordSize)); ++ ++ __ addi_d(SP, SP, (32 - 4 + 32) * wordSize); ++} ++ ++void Runtime1::initialize_pd() { ++ int sp_offset = 0; ++ int i; ++ ++ // all float registers are saved explicitly ++ assert(FrameMap::nof_fpu_regs == 32, "double registers not handled here"); ++ for (i = 0; i < FrameMap::nof_fpu_regs; i++) { ++ fpu_reg_save_offsets[i] = sp_offset; ++ sp_offset += 2; // SP offsets are in halfwords ++ } ++ ++ for (i = 4; i < FrameMap::nof_cpu_regs; i++) { ++ Register r = as_Register(i); ++ cpu_reg_save_offsets[i] = sp_offset; ++ sp_offset += 2; // SP offsets are in halfwords ++ } ++} ++ ++// target: the entry point of the method that creates and posts the exception oop ++// has_argument: true if the exception needs arguments (passed in SCR1 and SCR2) ++ ++OopMapSet* Runtime1::generate_exception_throw(StubAssembler* sasm, address target, ++ bool 
has_argument) { ++ // make a frame and preserve the caller's caller-save registers ++ OopMap* oop_map = save_live_registers(sasm); ++ int call_offset; ++ if (!has_argument) { ++ call_offset = __ call_RT(noreg, noreg, target); ++ } else { ++ __ move(A1, SCR1); ++ __ move(A2, SCR2); ++ call_offset = __ call_RT(noreg, noreg, target); ++ } ++ OopMapSet* oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(call_offset, oop_map); ++ return oop_maps; ++} ++ ++OopMapSet* Runtime1::generate_handle_exception(StubID id, StubAssembler *sasm) { ++ __ block_comment("generate_handle_exception"); ++ ++ // incoming parameters ++ const Register exception_oop = A0; ++ const Register exception_pc = A1; ++ // other registers used in this stub ++ ++ // Save registers, if required. ++ OopMapSet* oop_maps = new OopMapSet(); ++ OopMap* oop_map = nullptr; ++ switch (id) { ++ case forward_exception_id: ++ // We're handling an exception in the context of a compiled frame. ++ // The registers have been saved in the standard places. Perform ++ // an exception lookup in the caller and dispatch to the handler ++ // if found. Otherwise unwind and dispatch to the callers ++ // exception handler. ++ oop_map = generate_oop_map(sasm, 1 /*thread*/); ++ ++ // load and clear pending exception oop into A0 ++ __ ld_d(exception_oop, Address(TREG, Thread::pending_exception_offset())); ++ __ st_d(R0, Address(TREG, Thread::pending_exception_offset())); ++ ++ // load issuing PC (the return address for this stub) into A1 ++ __ ld_d(exception_pc, Address(FP, frame::return_addr_offset * BytesPerWord)); ++ ++ // make sure that the vm_results are cleared (may be unnecessary) ++ __ st_d(R0, Address(TREG, JavaThread::vm_result_offset())); ++ __ st_d(R0, Address(TREG, JavaThread::vm_result_2_offset())); ++ break; ++ case handle_exception_nofpu_id: ++ case handle_exception_id: ++ // At this point all registers MAY be live. ++ oop_map = save_live_registers(sasm, id != handle_exception_nofpu_id); ++ break; ++ case handle_exception_from_callee_id: { ++ // At this point all registers except exception oop (A0) and ++ // exception pc (RA) are dead. ++ const int frame_size = 2 /*fp, return address*/; ++ oop_map = new OopMap(frame_size * VMRegImpl::slots_per_word, 0); ++ sasm->set_frame_size(frame_size); ++ break; ++ } ++ default: ShouldNotReachHere(); ++ } ++ ++ // verify that only A0 and A1 are valid at this time ++ __ invalidate_registers(false, true, true, true, true, true); ++ // verify that A0 contains a valid exception ++ __ verify_not_null_oop(exception_oop); ++ ++#ifdef ASSERT ++ // check that fields in JavaThread for exception oop and issuing pc are ++ // empty before writing to them ++ Label oop_empty; ++ __ ld_d(SCR1, Address(TREG, JavaThread::exception_oop_offset())); ++ __ beqz(SCR1, oop_empty); ++ __ stop("exception oop already set"); ++ __ bind(oop_empty); ++ ++ Label pc_empty; ++ __ ld_d(SCR1, Address(TREG, JavaThread::exception_pc_offset())); ++ __ beqz(SCR1, pc_empty); ++ __ stop("exception pc already set"); ++ __ bind(pc_empty); ++#endif ++ ++ // save exception oop and issuing pc into JavaThread ++ // (exception handler will load it from here) ++ __ st_d(exception_oop, Address(TREG, JavaThread::exception_oop_offset())); ++ __ st_d(exception_pc, Address(TREG, JavaThread::exception_pc_offset())); ++ ++ // patch throwing pc into return address (has bci & oop map) ++ __ st_d(exception_pc, Address(FP, frame::return_addr_offset * BytesPerWord)); ++ ++ // compute the exception handler. 
++ // the exception oop and the throwing pc are read from the fields in JavaThread ++ int call_offset = __ call_RT(noreg, noreg, CAST_FROM_FN_PTR(address, exception_handler_for_pc)); ++ oop_maps->add_gc_map(call_offset, oop_map); ++ ++ // A0: handler address ++ // will be the deopt blob if nmethod was deoptimized while we looked up ++ // handler regardless of whether handler existed in the nmethod. ++ ++ // only A0 is valid at this time, all other registers have been destroyed by the runtime call ++ __ invalidate_registers(false, true, true, true, true, true); ++ ++ // patch the return address, this stub will directly return to the exception handler ++ __ st_d(A0, Address(FP, frame::return_addr_offset * BytesPerWord)); ++ ++ switch (id) { ++ case forward_exception_id: ++ case handle_exception_nofpu_id: ++ case handle_exception_id: ++ // Restore the registers that were saved at the beginning. ++ restore_live_registers(sasm, id != handle_exception_nofpu_id); ++ break; ++ case handle_exception_from_callee_id: ++ break; ++ default: ShouldNotReachHere(); ++ } ++ ++ return oop_maps; ++} ++ ++void Runtime1::generate_unwind_exception(StubAssembler *sasm) { ++ // incoming parameters ++ const Register exception_oop = A0; ++ // callee-saved copy of exception_oop during runtime call ++ const Register exception_oop_callee_saved = S0; ++ // other registers used in this stub ++ const Register exception_pc = A1; ++ const Register handler_addr = A3; ++ ++ // verify that only A0, is valid at this time ++ __ invalidate_registers(false, true, true, true, true, true); ++ ++#ifdef ASSERT ++ // check that fields in JavaThread for exception oop and issuing pc are empty ++ Label oop_empty; ++ __ ld_d(SCR1, Address(TREG, JavaThread::exception_oop_offset())); ++ __ beqz(SCR1, oop_empty); ++ __ stop("exception oop must be empty"); ++ __ bind(oop_empty); ++ ++ Label pc_empty; ++ __ ld_d(SCR1, Address(TREG, JavaThread::exception_pc_offset())); ++ __ beqz(SCR1, pc_empty); ++ __ stop("exception pc must be empty"); ++ __ bind(pc_empty); ++#endif ++ ++ // Save our return address because ++ // exception_handler_for_return_address will destroy it. We also ++ // save exception_oop ++ __ addi_d(SP, SP, -2 * wordSize); ++ __ st_d(RA, Address(SP, 0 * wordSize)); ++ __ st_d(exception_oop, Address(SP, 1 * wordSize)); ++ ++ // search the exception handler address of the caller (using the return address) ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::exception_handler_for_return_address), TREG, RA); ++ // V0: exception handler address of the caller ++ ++ // Only V0 is valid at this time; all other registers have been ++ // destroyed by the call. ++ __ invalidate_registers(false, true, true, true, false, true); ++ ++ // move result of call into correct register ++ __ move(handler_addr, A0); ++ ++ // get throwing pc (= return address). ++ // RA has been destroyed by the call ++ __ ld_d(RA, Address(SP, 0 * wordSize)); ++ __ ld_d(exception_oop, Address(SP, 1 * wordSize)); ++ __ addi_d(SP, SP, 2 * wordSize); ++ __ move(A1, RA); ++ ++ __ verify_not_null_oop(exception_oop); ++ ++ // continue at exception handler (return address removed) ++ // note: do *not* remove arguments when unwinding the ++ // activation since the caller assumes having ++ // all arguments on the stack when entering the ++ // runtime to determine the exception handler ++ // (GC happens at call site with arguments!) 
++ // A0: exception oop ++ // A1: throwing pc ++ // A3: exception handler ++ __ jr(handler_addr); ++} ++ ++OopMapSet* Runtime1::generate_patching(StubAssembler* sasm, address target) { ++ // use the maximum number of runtime-arguments here because it is difficult to ++ // distinguish each RT-Call. ++ // Note: This number affects also the RT-Call in generate_handle_exception because ++ // the oop-map is shared for all calls. ++ DeoptimizationBlob* deopt_blob = SharedRuntime::deopt_blob(); ++ assert(deopt_blob != nullptr, "deoptimization blob must have been created"); ++ ++ OopMap* oop_map = save_live_registers(sasm); ++ ++ __ move(A0, TREG); ++ Label retaddr; ++ __ set_last_Java_frame(SP, FP, retaddr); ++ // do the call ++ __ call(target, relocInfo::runtime_call_type); ++ __ bind(retaddr); ++ OopMapSet* oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(__ offset(), oop_map); ++ // verify callee-saved register ++#ifdef ASSERT ++ { Label L; ++ __ get_thread(SCR1); ++ __ beq(TREG, SCR1, L); ++ __ stop("StubAssembler::call_RT: TREG not callee saved?"); ++ __ bind(L); ++ } ++#endif ++ ++ __ reset_last_Java_frame(true); ++ ++#ifdef ASSERT ++ // check that fields in JavaThread for exception oop and issuing pc are empty ++ Label oop_empty; ++ __ ld_d(SCR1, Address(TREG, Thread::pending_exception_offset())); ++ __ beqz(SCR1, oop_empty); ++ __ stop("exception oop must be empty"); ++ __ bind(oop_empty); ++ ++ Label pc_empty; ++ __ ld_d(SCR1, Address(TREG, JavaThread::exception_pc_offset())); ++ __ beqz(SCR1, pc_empty); ++ __ stop("exception pc must be empty"); ++ __ bind(pc_empty); ++#endif ++ ++ // Runtime will return true if the nmethod has been deoptimized, this is the ++ // expected scenario and anything else is an error. Note that we maintain a ++ // check on the result purely as a defensive measure. ++ Label no_deopt; ++ __ beqz(A0, no_deopt); // Have we deoptimized? ++ ++ // Perform a re-execute. The proper return address is already on the stack, ++ // we just need to restore registers, pop all of our frame but the return ++ // address and jump to the deopt blob. 
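++ // (The deopt blob's unpack_with_reexecution entry unpacks the frame into the ++ // interpreter and re-executes the bytecode that triggered the patching call.)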
++ restore_live_registers(sasm); ++ __ leave(); ++ __ jmp(deopt_blob->unpack_with_reexecution(), relocInfo::runtime_call_type); ++ ++ __ bind(no_deopt); ++ __ stop("deopt not performed"); ++ ++ return oop_maps; ++} ++ ++OopMapSet* Runtime1::generate_code_for(StubID id, StubAssembler* sasm) { ++ // for better readability ++ const bool must_gc_arguments = true; ++ const bool dont_gc_arguments = false; ++ ++ // default value; overwritten for some optimized stubs that are called ++ // from methods that do not use the fpu ++ bool save_fpu_registers = true; ++ ++ // stub code & info for the different stubs ++ OopMapSet* oop_maps = nullptr; ++ OopMap* oop_map = nullptr; ++ switch (id) { ++ { ++ case forward_exception_id: ++ { ++ oop_maps = generate_handle_exception(id, sasm); ++ __ leave(); ++ __ jr(RA); ++ } ++ break; ++ ++ case throw_div0_exception_id: ++ { ++ StubFrame f(sasm, "throw_div0_exception", dont_gc_arguments, does_not_return); ++ oop_maps = generate_exception_throw(sasm, CAST_FROM_FN_PTR(address, throw_div0_exception), false); ++ } ++ break; ++ ++ case throw_null_pointer_exception_id: ++ { ++ StubFrame f(sasm, "throw_null_pointer_exception", dont_gc_arguments, does_not_return); ++ oop_maps = generate_exception_throw(sasm, CAST_FROM_FN_PTR(address, throw_null_pointer_exception), false); ++ } ++ break; ++ ++ case new_instance_id: ++ case fast_new_instance_id: ++ case fast_new_instance_init_check_id: ++ { ++ Register klass = A3; // Incoming ++ Register obj = A0; // Result ++ ++ if (id == new_instance_id) { ++ __ set_info("new_instance", dont_gc_arguments); ++ } else if (id == fast_new_instance_id) { ++ __ set_info("fast new_instance", dont_gc_arguments); ++ } else { ++ assert(id == fast_new_instance_init_check_id, "bad StubID"); ++ __ set_info("fast new_instance init check", dont_gc_arguments); ++ } ++ ++ __ enter(); ++ OopMap* map = save_live_registers(sasm); ++ int call_offset = __ call_RT(obj, noreg, CAST_FROM_FN_PTR(address, new_instance), klass); ++ oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(call_offset, map); ++ restore_live_registers_except_a0(sasm); ++ __ verify_oop(obj); ++ __ leave(); ++ __ jr(RA); ++ ++ // A0,: new instance ++ } ++ ++ break; ++ ++ case counter_overflow_id: ++ { ++ Register bci = A0, method = A1; ++ __ enter(); ++ OopMap* map = save_live_registers(sasm); ++ // Retrieve bci ++ __ ld_w(bci, Address(FP, 0 * BytesPerWord)); ++ // And a pointer to the Method* ++ __ ld_d(method, Address(FP, 1 * BytesPerWord)); ++ int call_offset = __ call_RT(noreg, noreg, CAST_FROM_FN_PTR(address, counter_overflow), bci, method); ++ oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(call_offset, map); ++ restore_live_registers(sasm); ++ __ leave(); ++ __ jr(RA); ++ } ++ break; ++ ++ case new_type_array_id: ++ case new_object_array_id: ++ { ++ Register length = S0; // Incoming ++ Register klass = A3; // Incoming ++ Register obj = A0; // Result ++ ++ if (id == new_type_array_id) { ++ __ set_info("new_type_array", dont_gc_arguments); ++ } else { ++ __ set_info("new_object_array", dont_gc_arguments); ++ } ++ ++#ifdef ASSERT ++ // assert object type is really an array of the proper kind ++ { ++ Label ok; ++ Register t0 = obj; ++ __ ld_w(t0, Address(klass, Klass::layout_helper_offset())); ++ __ srai_w(t0, t0, Klass::_lh_array_tag_shift); ++ int tag = ((id == new_type_array_id) ++ ? 
Klass::_lh_array_tag_type_value ++ : Klass::_lh_array_tag_obj_value); ++ __ li(SCR1, tag); ++ __ beq(t0, SCR1, ok); ++ __ stop("assert(is an array klass)"); ++ __ should_not_reach_here(); ++ __ bind(ok); ++ } ++#endif // ASSERT ++ ++ __ enter(); ++ OopMap* map = save_live_registers(sasm); ++ int call_offset; ++ if (id == new_type_array_id) { ++ call_offset = __ call_RT(obj, noreg, CAST_FROM_FN_PTR(address, new_type_array), klass, length); ++ } else { ++ call_offset = __ call_RT(obj, noreg, CAST_FROM_FN_PTR(address, new_object_array), klass, length); ++ } ++ ++ oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(call_offset, map); ++ restore_live_registers_except_a0(sasm); ++ ++ __ verify_oop(obj); ++ __ leave(); ++ __ jr(RA); ++ ++ // A0: new array ++ } ++ break; ++ ++ case new_multi_array_id: ++ { ++ StubFrame f(sasm, "new_multi_array", dont_gc_arguments); ++ // A0,: klass ++ // S0,: rank ++ // A2: address of 1st dimension ++ OopMap* map = save_live_registers(sasm); ++ __ move(A1, A0); ++ __ move(A3, A2); ++ __ move(A2, S0); ++ int call_offset = __ call_RT(A0, noreg, CAST_FROM_FN_PTR(address, new_multi_array), A1, A2, A3); ++ ++ oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(call_offset, map); ++ restore_live_registers_except_a0(sasm); ++ ++ // A0,: new multi array ++ __ verify_oop(A0); ++ } ++ break; ++ ++ case register_finalizer_id: ++ { ++ __ set_info("register_finalizer", dont_gc_arguments); ++ ++ // This is called via call_runtime so the arguments ++ // will be place in C abi locations ++ ++ __ verify_oop(A0); ++ ++ // load the klass and check the has finalizer flag ++ Label register_finalizer; ++ Register t = A5; ++ __ load_klass(t, A0); ++ __ ld_w(t, Address(t, Klass::access_flags_offset())); ++ __ li(SCR1, JVM_ACC_HAS_FINALIZER); ++ __ andr(SCR1, t, SCR1); ++ __ bnez(SCR1, register_finalizer); ++ __ jr(RA); ++ ++ __ bind(register_finalizer); ++ __ enter(); ++ OopMap* oop_map = save_live_registers(sasm); ++ int call_offset = __ call_RT(noreg, noreg, CAST_FROM_FN_PTR(address, SharedRuntime::register_finalizer), A0); ++ oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(call_offset, oop_map); ++ ++ // Now restore all the live registers ++ restore_live_registers(sasm); ++ ++ __ leave(); ++ __ jr(RA); ++ } ++ break; ++ ++ case throw_class_cast_exception_id: ++ { ++ StubFrame f(sasm, "throw_class_cast_exception", dont_gc_arguments, does_not_return); ++ oop_maps = generate_exception_throw(sasm, CAST_FROM_FN_PTR(address, throw_class_cast_exception), true); ++ } ++ break; ++ ++ case throw_incompatible_class_change_error_id: ++ { ++ StubFrame f(sasm, "throw_incompatible_class_cast_exception", dont_gc_arguments, does_not_return); ++ oop_maps = generate_exception_throw(sasm, CAST_FROM_FN_PTR(address, throw_incompatible_class_change_error), false); ++ } ++ break; ++ ++ case slow_subtype_check_id: ++ { ++ // Typical calling sequence: ++ // __ push(klass_RInfo); // object klass or other subclass ++ // __ push(sup_k_RInfo); // array element klass or other superclass ++ // __ bl(slow_subtype_check); ++ // Note that the subclass is pushed first, and is therefore deepest. 
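++ // The answer is returned in place: the pushed superclass slot (result_off == ++ // sup_k_off) is overwritten with 1 on a successful subtype check and with 0 ++ // on a miss, and the caller pops that word as the result.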
++ enum layout { ++ a0_off, a0_off_hi, ++ a2_off, a2_off_hi, ++ a4_off, a4_off_hi, ++ a5_off, a5_off_hi, ++ sup_k_off, sup_k_off_hi, ++ klass_off, klass_off_hi, ++ framesize, ++ result_off = sup_k_off ++ }; ++ ++ __ set_info("slow_subtype_check", dont_gc_arguments); ++ __ addi_d(SP, SP, -4 * wordSize); ++ __ st_d(A0, Address(SP, a0_off * VMRegImpl::stack_slot_size)); ++ __ st_d(A2, Address(SP, a2_off * VMRegImpl::stack_slot_size)); ++ __ st_d(A4, Address(SP, a4_off * VMRegImpl::stack_slot_size)); ++ __ st_d(A5, Address(SP, a5_off * VMRegImpl::stack_slot_size)); ++ ++ // This is called by pushing args and not with C abi ++ __ ld_d(A4, Address(SP, klass_off * VMRegImpl::stack_slot_size)); // subclass ++ __ ld_d(A0, Address(SP, sup_k_off * VMRegImpl::stack_slot_size)); // superclass ++ ++ Label miss; ++ __ check_klass_subtype_slow_path(A4, A0, A2, A5, nullptr, &miss); ++ ++ // fallthrough on success: ++ __ li(SCR1, 1); ++ __ st_d(SCR1, Address(SP, result_off * VMRegImpl::stack_slot_size)); // result ++ __ ld_d(A0, Address(SP, a0_off * VMRegImpl::stack_slot_size)); ++ __ ld_d(A2, Address(SP, a2_off * VMRegImpl::stack_slot_size)); ++ __ ld_d(A4, Address(SP, a4_off * VMRegImpl::stack_slot_size)); ++ __ ld_d(A5, Address(SP, a5_off * VMRegImpl::stack_slot_size)); ++ __ addi_d(SP, SP, 4 * wordSize); ++ __ jr(RA); ++ ++ __ bind(miss); ++ __ st_d(R0, Address(SP, result_off * VMRegImpl::stack_slot_size)); // result ++ __ ld_d(A0, Address(SP, a0_off * VMRegImpl::stack_slot_size)); ++ __ ld_d(A2, Address(SP, a2_off * VMRegImpl::stack_slot_size)); ++ __ ld_d(A4, Address(SP, a4_off * VMRegImpl::stack_slot_size)); ++ __ ld_d(A5, Address(SP, a5_off * VMRegImpl::stack_slot_size)); ++ __ addi_d(SP, SP, 4 * wordSize); ++ __ jr(RA); ++ } ++ break; ++ ++ case monitorenter_nofpu_id: ++ save_fpu_registers = false; ++ // fall through ++ case monitorenter_id: ++ { ++ StubFrame f(sasm, "monitorenter", dont_gc_arguments); ++ OopMap* map = save_live_registers(sasm, save_fpu_registers); ++ ++ // Called with store_parameter and not C abi ++ ++ f.load_argument(1, A0); // A0,: object ++ f.load_argument(0, A1); // A1,: lock address ++ ++ int call_offset = __ call_RT(noreg, noreg, CAST_FROM_FN_PTR(address, monitorenter), A0, A1); ++ ++ oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(call_offset, map); ++ restore_live_registers(sasm, save_fpu_registers); ++ } ++ break; ++ ++ case monitorexit_nofpu_id: ++ save_fpu_registers = false; ++ // fall through ++ case monitorexit_id: ++ { ++ StubFrame f(sasm, "monitorexit", dont_gc_arguments); ++ OopMap* map = save_live_registers(sasm, save_fpu_registers); ++ ++ // Called with store_parameter and not C abi ++ ++ f.load_argument(0, A0); // A0,: lock address ++ ++ // note: really a leaf routine but must setup last java sp ++ // => use call_RT for now (speed can be improved by ++ // doing last java sp setup manually) ++ int call_offset = __ call_RT(noreg, noreg, CAST_FROM_FN_PTR(address, monitorexit), A0); ++ ++ oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(call_offset, map); ++ restore_live_registers(sasm, save_fpu_registers); ++ } ++ break; ++ ++ case deoptimize_id: ++ { ++ StubFrame f(sasm, "deoptimize", dont_gc_arguments, does_not_return); ++ OopMap* oop_map = save_live_registers(sasm); ++ f.load_argument(0, A1); ++ int call_offset = __ call_RT(noreg, noreg, CAST_FROM_FN_PTR(address, deoptimize), A1); ++ ++ oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(call_offset, oop_map); ++ restore_live_registers(sasm); ++ DeoptimizationBlob* deopt_blob = 
SharedRuntime::deopt_blob(); ++ assert(deopt_blob != nullptr, "deoptimization blob must have been created"); ++ __ leave(); ++ __ jmp(deopt_blob->unpack_with_reexecution(), relocInfo::runtime_call_type); ++ } ++ break; ++ ++ case throw_range_check_failed_id: ++ { ++ StubFrame f(sasm, "range_check_failed", dont_gc_arguments, does_not_return); ++ oop_maps = generate_exception_throw(sasm, CAST_FROM_FN_PTR(address, throw_range_check_exception), true); ++ } ++ break; ++ ++ case unwind_exception_id: ++ { ++ __ set_info("unwind_exception", dont_gc_arguments); ++ // note: no stubframe since we are about to leave the current ++ // activation and we are calling a leaf VM function only. ++ generate_unwind_exception(sasm); ++ } ++ break; ++ ++ case access_field_patching_id: ++ { ++ StubFrame f(sasm, "access_field_patching", dont_gc_arguments, does_not_return); ++ // we should set up register map ++ oop_maps = generate_patching(sasm, CAST_FROM_FN_PTR(address, access_field_patching)); ++ } ++ break; ++ ++ case load_klass_patching_id: ++ { ++ StubFrame f(sasm, "load_klass_patching", dont_gc_arguments, does_not_return); ++ // we should set up register map ++ oop_maps = generate_patching(sasm, CAST_FROM_FN_PTR(address, move_klass_patching)); ++ } ++ break; ++ ++ case load_mirror_patching_id: ++ { ++ StubFrame f(sasm, "load_mirror_patching", dont_gc_arguments, does_not_return); ++ // we should set up register map ++ oop_maps = generate_patching(sasm, CAST_FROM_FN_PTR(address, move_mirror_patching)); ++ } ++ break; ++ ++ case load_appendix_patching_id: ++ { ++ StubFrame f(sasm, "load_appendix_patching", dont_gc_arguments, does_not_return); ++ // we should set up register map ++ oop_maps = generate_patching(sasm, CAST_FROM_FN_PTR(address, move_appendix_patching)); ++ } ++ break; ++ ++ case handle_exception_nofpu_id: ++ case handle_exception_id: ++ { ++ StubFrame f(sasm, "handle_exception", dont_gc_arguments); ++ oop_maps = generate_handle_exception(id, sasm); ++ } ++ break; ++ ++ case handle_exception_from_callee_id: ++ { ++ StubFrame f(sasm, "handle_exception_from_callee", dont_gc_arguments); ++ oop_maps = generate_handle_exception(id, sasm); ++ } ++ break; ++ ++ case throw_index_exception_id: ++ { ++ StubFrame f(sasm, "index_range_check_failed", dont_gc_arguments, does_not_return); ++ oop_maps = generate_exception_throw(sasm, CAST_FROM_FN_PTR(address, throw_index_exception), true); ++ } ++ break; ++ ++ case throw_array_store_exception_id: ++ { ++ StubFrame f(sasm, "throw_array_store_exception", dont_gc_arguments, does_not_return); ++ // tos + 0: link ++ // + 1: return address ++ oop_maps = generate_exception_throw(sasm, CAST_FROM_FN_PTR(address, throw_array_store_exception), true); ++ } ++ break; ++ ++ case predicate_failed_trap_id: ++ { ++ StubFrame f(sasm, "predicate_failed_trap", dont_gc_arguments, does_not_return); ++ ++ OopMap* map = save_live_registers(sasm); ++ ++ int call_offset = __ call_RT(noreg, noreg, CAST_FROM_FN_PTR(address, predicate_failed_trap)); ++ oop_maps = new OopMapSet(); ++ oop_maps->add_gc_map(call_offset, map); ++ restore_live_registers(sasm); ++ __ leave(); ++ DeoptimizationBlob* deopt_blob = SharedRuntime::deopt_blob(); ++ assert(deopt_blob != nullptr, "deoptimization blob must have been created"); ++ ++ __ jmp(deopt_blob->unpack_with_reexecution(), relocInfo::runtime_call_type); ++ } ++ break; ++ ++ case dtrace_object_alloc_id: ++ { ++ // A0: object ++ StubFrame f(sasm, "dtrace_object_alloc", dont_gc_arguments); ++ save_live_registers(sasm); ++ ++ __ 
call_VM_leaf(CAST_FROM_FN_PTR(address, static_cast<int (*)(oopDesc*)>(SharedRuntime::dtrace_object_alloc)), A0); ++ ++ restore_live_registers(sasm); ++ } ++ break; ++ ++ default: ++ { ++ StubFrame f(sasm, "unimplemented entry", dont_gc_arguments, does_not_return); ++ __ li(A0, (int)id); ++ __ call_RT(noreg, noreg, CAST_FROM_FN_PTR(address, unimplemented_entry), A0); ++ } ++ break; ++ } ++ } ++ return oop_maps; ++} ++ ++#undef __ ++ ++const char *Runtime1::pd_name_for_address(address entry) { ++ Unimplemented(); ++ return 0; ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c2_CodeStubs_loongarch.cpp b/src/hotspot/cpu/loongarch/c2_CodeStubs_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/c2_CodeStubs_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c2_CodeStubs_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,91 @@ ++/* ++ * Copyright (c) 2020, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "opto/c2_MacroAssembler.hpp" ++#include "opto/c2_CodeStubs.hpp" ++#include "runtime/objectMonitor.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++ ++#define __ masm. ++ ++int C2SafepointPollStub::max_size() const { ++ return 4 * 4; ++} ++ ++void C2SafepointPollStub::emit(C2_MacroAssembler& masm) { ++ assert(SharedRuntime::polling_page_return_handler_blob() != nullptr, ++ "polling page return stub not created yet"); ++ address stub = SharedRuntime::polling_page_return_handler_blob()->entry_point(); ++ ++ __ bind(entry()); ++ InternalAddress safepoint_pc(masm.pc() - masm.offset() + _safepoint_offset); ++ __ lea(AT, safepoint_pc); ++ __ st_d(AT, Address(TREG, JavaThread::saved_exception_pc_offset())); ++ __ jmp(stub, relocInfo::runtime_call_type); ++} ++ ++int C2EntryBarrierStub::max_size() const { ++ return 5 * 4; ++} ++ ++void C2EntryBarrierStub::emit(C2_MacroAssembler& masm) { ++ __ bind(entry()); ++ __ call_long(StubRoutines::la::method_entry_barrier()); ++ __ b(continuation()); ++ ++ __ bind(guard()); ++ __ relocate(entry_guard_Relocation::spec()); ++ __ emit_int32(0); // nmethod guard value ++} ++ ++int C2HandleAnonOMOwnerStub::max_size() const { ++ // Max size of stub has been determined by testing with 0, in which case ++ // C2CodeStubList::emit() will throw an assertion and report the actual size that ++ // is needed.
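++ // 24 bytes covers the longest sequence emitted below: at most six 4-byte ++ // LoongArch instructions, including the ASSERT-only stx_d.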
++ return 24; ++} ++ ++void C2HandleAnonOMOwnerStub::emit(C2_MacroAssembler& masm) { ++ __ bind(entry()); ++ Register mon = monitor(); ++ Register t = tmp(); ++ assert(t != noreg, "need tmp register"); ++ // Fix owner to be the current thread. ++ __ st_d(TREG, Address(mon, ObjectMonitor::owner_offset())); ++ ++ // Pop owner object from lock-stack. ++ __ ld_wu(t, Address(TREG, JavaThread::lock_stack_top_offset())); ++ __ addi_w(t, t, -oopSize); ++#ifdef ASSERT ++ __ stx_d(R0, TREG, t); ++#endif ++ __ st_w(t, Address(TREG, JavaThread::lock_stack_top_offset())); ++ ++ __ b(continuation()); ++} ++ ++#undef __ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c2_globals_loongarch.hpp b/src/hotspot/cpu/loongarch/c2_globals_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/c2_globals_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c2_globals_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,85 @@ ++/* ++ * Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_C2_GLOBALS_LOONGARCH_HPP ++#define CPU_LOONGARCH_C2_GLOBALS_LOONGARCH_HPP ++ ++#include "utilities/globalDefinitions.hpp" ++#include "utilities/macros.hpp" ++ ++// Sets the default values for platform dependent flags used by the server compiler. ++// (see c2_globals.hpp). Alpha-sorted. 
++define_pd_global(bool, BackgroundCompilation, true); ++define_pd_global(bool, UseTLAB, true); ++define_pd_global(bool, ResizeTLAB, true); ++define_pd_global(bool, CICompileOSR, true); ++define_pd_global(bool, InlineIntrinsics, true); ++define_pd_global(bool, PreferInterpreterNativeStubs, false); ++define_pd_global(bool, ProfileTraps, true); ++define_pd_global(bool, UseOnStackReplacement, true); ++define_pd_global(bool, ProfileInterpreter, true); ++define_pd_global(bool, TieredCompilation, true); ++define_pd_global(intx, CompileThreshold, 10000); ++define_pd_global(intx, BackEdgeThreshold, 100000); ++ ++define_pd_global(intx, OnStackReplacePercentage, 140); ++define_pd_global(intx, ConditionalMoveLimit, 3); ++define_pd_global(intx, FreqInlineSize, 325); ++define_pd_global(intx, MinJumpTableSize, 10); ++define_pd_global(intx, InteriorEntryAlignment, 16); ++define_pd_global(intx, NewSizeThreadIncrease, ScaleForWordSize(4*K)); ++define_pd_global(intx, LoopUnrollLimit, 60); ++define_pd_global(intx, LoopPercentProfileLimit, 10); ++// InitialCodeCacheSize derived from specjbb2000 run. ++define_pd_global(intx, InitialCodeCacheSize, 2496*K); // Integral multiple of CodeCacheExpansionSize ++define_pd_global(intx, CodeCacheExpansionSize, 64*K); ++ ++// Ergonomics related flags ++define_pd_global(uint64_t,MaxRAM, 128ULL*G); ++define_pd_global(intx, RegisterCostAreaRatio, 16000); ++ ++// Peephole and CISC spilling both break the graph, and so makes the ++// scheduler sick. ++define_pd_global(bool, OptoPeephole, false); ++define_pd_global(bool, UseCISCSpill, false); ++define_pd_global(bool, OptoScheduling, false); ++define_pd_global(bool, OptoBundling, false); ++define_pd_global(bool, OptoRegScheduling, false); ++define_pd_global(bool, SuperWordLoopUnrollAnalysis, true); ++define_pd_global(bool, IdealizeClearArrayNode, true); ++ ++define_pd_global(intx, ReservedCodeCacheSize, 48*M); ++define_pd_global(intx, NonProfiledCodeHeapSize, 21*M); ++define_pd_global(intx, ProfiledCodeHeapSize, 22*M); ++define_pd_global(intx, NonNMethodCodeHeapSize, 5*M ); ++define_pd_global(uintx, CodeCacheMinBlockLength, 4); ++define_pd_global(uintx, CodeCacheMinimumUseSpace, 400*K); ++ ++define_pd_global(bool, TrapBasedRangeChecks, false); ++ ++// Ergonomics related flags ++define_pd_global(bool, NeverActAsServerClassMachine, false); ++ ++#endif // CPU_LOONGARCH_C2_GLOBALS_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c2_init_loongarch.cpp b/src/hotspot/cpu/loongarch/c2_init_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/c2_init_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c2_init_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,37 @@ ++/* ++ * Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "opto/compile.hpp" ++#include "opto/node.hpp" ++ ++// processor dependent initialization for LoongArch ++ ++extern void reg_mask_init(); ++ ++void Compile::pd_compiler2_init() { ++ guarantee(CodeEntryAlignment >= InteriorEntryAlignment, "" ); ++ reg_mask_init(); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c2_MacroAssembler_loongarch.cpp b/src/hotspot/cpu/loongarch/c2_MacroAssembler_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/c2_MacroAssembler_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c2_MacroAssembler_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,1903 @@ ++/* ++ * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/assembler.hpp" ++#include "asm/assembler.inline.hpp" ++#include "opto/c2_MacroAssembler.hpp" ++#include "opto/compile.hpp" ++#include "opto/intrinsicnode.hpp" ++#include "opto/output.hpp" ++#include "opto/subnode.hpp" ++#include "runtime/objectMonitor.hpp" ++#include "runtime/stubRoutines.hpp" ++ ++ ++// using the cr register as the bool result: 0 for failed; others success. ++void C2_MacroAssembler::fast_lock_c2(Register oop, Register box, Register flag, ++ Register disp_hdr, Register tmp) { ++ Label cont; ++ Label object_has_monitor; ++ Label count, no_count; ++ ++ assert_different_registers(oop, box, tmp, disp_hdr, flag); ++ ++ // Load markWord from object into displaced_header. 
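++ // Overview of the paths below: LM_LEGACY attempts a stack-lock by CAS-ing the
++ // unlocked markWord to the address of the box (with a recursive-lock check
++ // against SP), LM_LIGHTWEIGHT uses lightweight_lock, and an already-inflated
++ // monitor is entered by CAS-ing its owner field from null to TREG.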
++ assert(oopDesc::mark_offset_in_bytes() == 0, "offset of _mark is not 0"); ++ ld_d(disp_hdr, oop, oopDesc::mark_offset_in_bytes()); ++ ++ if (DiagnoseSyncOnValueBasedClasses != 0) { ++ load_klass(flag, oop); ++ ld_wu(flag, Address(flag, Klass::access_flags_offset())); ++ li(AT, JVM_ACC_IS_VALUE_BASED_CLASS); ++ andr(AT, flag, AT); ++ move(flag, R0); ++ bnez(AT, cont); ++ } ++ ++ // Check for existing monitor ++ andi(AT, disp_hdr, markWord::monitor_value); ++ bnez(AT, object_has_monitor); // inflated vs stack-locked|neutral|bias ++ ++ if (LockingMode == LM_MONITOR) { ++ move(flag, R0); // Set zero flag to indicate 'failure' ++ b(cont); ++ } else if (LockingMode == LM_LEGACY) { ++ // Set tmp to be (markWord of object | UNLOCK_VALUE). ++ ori(tmp, disp_hdr, markWord::unlocked_value); ++ ++ // Initialize the box. (Must happen before we update the object mark!) ++ st_d(tmp, box, BasicLock::displaced_header_offset_in_bytes()); ++ ++ // If cmpxchg is succ, then flag = 1 ++ cmpxchg(Address(oop, 0), tmp, box, flag, true, true /* acquire */); ++ bnez(flag, cont); ++ ++ // If the compare-and-exchange succeeded, then we found an unlocked ++ // object, will have now locked it will continue at label cont ++ // We did not see an unlocked object so try the fast recursive case. ++ ++ // Check if the owner is self by comparing the value in the ++ // markWord of object (disp_hdr) with the stack pointer. ++ sub_d(disp_hdr, tmp, SP); ++ li(tmp, (intptr_t) (~(os::vm_page_size()-1) | (uintptr_t)markWord::lock_mask_in_place)); ++ // If (mark & lock_mask) == 0 and mark - sp < page_size, ++ // we are stack-locking and goto cont, ++ // hence we can store 0 as the displaced header in the box, ++ // which indicates that it is a recursive lock. ++ andr(tmp, disp_hdr, tmp); ++ st_d(tmp, box, BasicLock::displaced_header_offset_in_bytes()); ++ sltui(flag, tmp, 1); // flag = (tmp == 0) ? 1 : 0 ++ b(cont); ++ } else { ++ assert(LockingMode == LM_LIGHTWEIGHT, "must be"); ++ lightweight_lock(oop, disp_hdr, flag, SCR1, no_count); ++ b(count); ++ } ++ ++ // Handle existing monitor. ++ bind(object_has_monitor); ++ ++ // The object's monitor m is unlocked if m->owner is null, ++ // otherwise m->owner may contain a thread or a stack address. ++ // ++ // Try to CAS m->owner from null to current thread. ++ move(AT, R0); ++ addi_d(tmp, disp_hdr, in_bytes(ObjectMonitor::owner_offset()) - markWord::monitor_value); ++ cmpxchg(Address(tmp, 0), AT, TREG, flag, true, true /* acquire */); ++ if (LockingMode != LM_LIGHTWEIGHT) { ++ // Store a non-null value into the box to avoid looking like a re-entrant ++ // lock. The fast-path monitor unlock code checks for ++ // markWord::monitor_value so use markWord::unused_mark which has the ++ // relevant bit set, and also matches ObjectSynchronizer::enter. ++ li(tmp, (address)markWord::unused_mark().value()); ++ st_d(tmp, Address(box, BasicLock::displaced_header_offset_in_bytes())); ++ } ++ bnez(flag, cont); // CAS success means locking succeeded ++ ++ bne(AT, TREG, cont); // Check for recursive locking ++ ++ // Recursive lock case ++ li(flag, 1); ++ increment(Address(disp_hdr, in_bytes(ObjectMonitor::recursions_offset()) - markWord::monitor_value), 1); ++ ++ bind(cont); ++ // flag == 1 indicates success ++ // flag == 0 indicates failure ++ beqz(flag, no_count); ++ ++ bind(count); ++ increment(Address(TREG, JavaThread::held_monitor_count_offset()), 1); ++ ++ bind(no_count); ++} ++ ++// using cr flag to indicate the fast_unlock result: 0 for failed; others success. 
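++// Overview of the paths below: under LM_LEGACY a zero displaced header means a
++// recursive unlock, otherwise the displaced header is CAS-ed back into the
++// object's markWord; LM_LIGHTWEIGHT defers to lightweight_unlock; an inflated
++// monitor is released by decrementing its recursions count or, once EntryList
++// and cxq are both empty, by clearing the owner field.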
++void C2_MacroAssembler::fast_unlock_c2(Register oop, Register box, Register flag, ++ Register disp_hdr, Register tmp) { ++ Label cont; ++ Label object_has_monitor; ++ Label count, no_count; ++ ++ assert_different_registers(oop, box, tmp, disp_hdr, flag); ++ ++ // Find the lock address and load the displaced header from the stack. ++ ld_d(disp_hdr, Address(box, BasicLock::displaced_header_offset_in_bytes())); ++ ++ if (LockingMode == LM_LEGACY) { ++ // If the displaced header is 0, we have a recursive unlock. ++ sltui(flag, disp_hdr, 1); // flag = (disp_hdr == 0) ? 1 : 0 ++ beqz(disp_hdr, cont); ++ } ++ ++ assert(oopDesc::mark_offset_in_bytes() == 0, "offset of _mark is not 0"); ++ ++ // Handle existing monitor. ++ ld_d(tmp, oop, oopDesc::mark_offset_in_bytes()); ++ andi(AT, tmp, markWord::monitor_value); ++ bnez(AT, object_has_monitor); ++ ++ if (LockingMode == LM_MONITOR) { ++ move(flag, R0); // Set zero flag to indicate 'failure' ++ b(cont); ++ } else if (LockingMode == LM_LEGACY) { ++ // Check if it is still a light weight lock, this is true if we ++ // see the stack address of the basicLock in the markWord of the ++ // object. ++ cmpxchg(Address(oop, 0), box, disp_hdr, flag, false, false /* acquire */); ++ b(cont); ++ } else { ++ assert(LockingMode == LM_LIGHTWEIGHT, "must be"); ++ lightweight_unlock(oop, tmp, flag, box, no_count); ++ b(count); ++ } ++ ++ // Handle existing monitor. ++ bind(object_has_monitor); ++ ++ addi_d(tmp, tmp, -(int)markWord::monitor_value); // monitor ++ ++ if (LockingMode == LM_LIGHTWEIGHT) { ++ // If the owner is anonymous, we need to fix it -- in an outline stub. ++ Register tmp2 = disp_hdr; ++ ld_d(tmp2, Address(tmp, ObjectMonitor::owner_offset())); ++ // We cannot use tbnz here, the target might be too far away and cannot ++ // be encoded. 
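++ // The stub (C2HandleAnonOMOwnerStub::emit above) stores the current thread
++ // into the owner field and pops the owning oop from the lock-stack.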
++ assert_different_registers(tmp2, AT);
++ li(AT, (uint64_t)ObjectMonitor::ANONYMOUS_OWNER);
++ andr(AT, tmp2, AT);
++ C2HandleAnonOMOwnerStub* stub = new (Compile::current()->comp_arena()) C2HandleAnonOMOwnerStub(tmp, tmp2);
++ Compile::current()->output()->add_stub(stub);
++ bnez(AT, stub->entry());
++ bind(stub->continuation());
++ }
++
++ ld_d(disp_hdr, Address(tmp, ObjectMonitor::recursions_offset()));
++
++ Label notRecursive;
++ beqz(disp_hdr, notRecursive);
++
++ // Recursive lock
++ addi_d(disp_hdr, disp_hdr, -1);
++ st_d(disp_hdr, Address(tmp, ObjectMonitor::recursions_offset()));
++ li(flag, 1);
++ b(cont);
++
++ bind(notRecursive);
++ ld_d(flag, Address(tmp, ObjectMonitor::EntryList_offset()));
++ ld_d(disp_hdr, Address(tmp, ObjectMonitor::cxq_offset()));
++ orr(AT, flag, disp_hdr);
++
++ move(flag, R0);
++ bnez(AT, cont);
++
++ addi_d(AT, tmp, in_bytes(ObjectMonitor::owner_offset()));
++ amswap_db_d(tmp, R0, AT);
++ li(flag, 1);
++
++ bind(cont);
++ // flag == 1 indicates success
++ // flag == 0 indicates failure
++ beqz(flag, no_count);
++
++ bind(count);
++ decrement(Address(TREG, JavaThread::held_monitor_count_offset()), 1);
++
++ bind(no_count);
++}
++
++typedef void (MacroAssembler::* load_chr_insn)(Register rd, const Address &adr);
++
++void C2_MacroAssembler::string_indexof(Register haystack, Register needle,
++ Register haystack_len, Register needle_len,
++ Register result, int ae)
++{
++ assert(ae != StrIntrinsicNode::LU, "Invalid encoding");
++
++ Label LINEARSEARCH, LINEARSTUB, DONE, NOMATCH;
++
++ bool isLL = ae == StrIntrinsicNode::LL;
++
++ bool needle_isL = ae == StrIntrinsicNode::LL || ae == StrIntrinsicNode::UL;
++ bool haystack_isL = ae == StrIntrinsicNode::LL || ae == StrIntrinsicNode::LU;
++
++ int needle_chr_size = needle_isL ? 1 : 2;
++ int haystack_chr_size = haystack_isL ? 1 : 2;
++
++ Address::ScaleFactor needle_chr_shift = needle_isL ? Address::no_scale
++ : Address::times_2;
++ Address::ScaleFactor haystack_chr_shift = haystack_isL ? Address::no_scale
++ : Address::times_2;
++
++ load_chr_insn needle_load_1chr = needle_isL ? (load_chr_insn)&MacroAssembler::ld_bu
++ : (load_chr_insn)&MacroAssembler::ld_hu;
++ load_chr_insn haystack_load_1chr = haystack_isL ? (load_chr_insn)&MacroAssembler::ld_bu
++ : (load_chr_insn)&MacroAssembler::ld_hu;
++
++ // Note, inline_string_indexOf() generates checks:
++ // if (pattern.count > src.count) return -1;
++ // if (pattern.count == 0) return 0;
++
++ // We have two strings, a source string in haystack, haystack_len and a pattern string
++ // in needle, needle_len. Find the first occurrence of pattern in source or return -1.
++
++ // For larger pattern and source we use a simplified Boyer Moore algorithm.
++ // With a small pattern and source we use linear scan.
++
++ // needle_len >= 8 && needle_len < 256 && needle_len < haystack_len/4, use bmh algorithm.
++
++ // needle_len < 8, use linear scan
++ li(AT, 8);
++ blt(needle_len, AT, LINEARSEARCH);
++
++ // needle_len >= 256, use linear scan
++ li(AT, 256);
++ bge(needle_len, AT, LINEARSTUB);
++
++ // needle_len >= haystack_len/4, use linear scan
++ srli_d(AT, haystack_len, 2);
++ bge(needle_len, AT, LINEARSTUB);
++
++ // Boyer-Moore-Horspool introduction:
++ // The Boyer Moore algorithm is based on the description here:-
++ //
++ // http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm
++ //
++ // This describes an algorithm with 2 shift rules. The 'Bad Character' rule
++ // and the 'Good Suffix' rule.
++ //
++ // These rules are essentially heuristics for how far we can shift the
++ // pattern along the search string.
++ //
++ // The implementation here uses the 'Bad Character' rule only because of the
++ // complexity of initialisation for the 'Good Suffix' rule.
++ //
++ // This is also known as the Boyer-Moore-Horspool algorithm:
++ //
++ // http://en.wikipedia.org/wiki/Boyer-Moore-Horspool_algorithm
++ //
++ // #define ASIZE 256
++ //
++ // int bm(unsigned char *pattern, int m, unsigned char *src, int n) {
++ // int i, j;
++ // unsigned c;
++ // unsigned char bc[ASIZE];
++ //
++ // /* Preprocessing */
++ // for (i = 0; i < ASIZE; ++i)
++ // bc[i] = m;
++ // for (i = 0; i < m - 1; ) {
++ // c = pattern[i];
++ // ++i;
++ // // c < 256 for Latin1 string, so, no need for branch
++ // #ifdef PATTERN_STRING_IS_LATIN1
++ // bc[c] = m - i;
++ // #else
++ // if (c < ASIZE) bc[c] = m - i;
++ // #endif
++ // }
++ //
++ // /* Searching */
++ // j = 0;
++ // while (j <= n - m) {
++ // c = src[i+j];
++ // if (pattern[m-1] == c)
++ // int k;
++ // for (k = m - 2; k >= 0 && pattern[k] == src[k + j]; --k);
++ // if (k < 0) return j;
++ // // c < 256 for Latin1 string, so, no need for branch
++ // #ifdef SOURCE_STRING_IS_LATIN1_AND_PATTERN_STRING_IS_LATIN1
++ // // LL case: (c< 256) always true. Remove branch
++ // j += bc[pattern[j+m-1]];
++ // #endif
++ // #ifdef SOURCE_STRING_IS_UTF_AND_PATTERN_STRING_IS_UTF
++ // // UU case: need the if (c < ASIZE) check; skip m if not.
++ // if (c < ASIZE)
++ // j += bc[pattern[j+m-1]];
++ // else
++ // j += m
++ // #endif
++ // }
++ // return -1;
++ // }
++
++ Label BCLOOP, BCSKIP, BMLOOPSTR2, BMLOOPSTR1, BMSKIP, BMADV, BMMATCH,
++ BMLOOPSTR1_LASTCMP, BMLOOPSTR1_CMP, BMLOOPSTR1_AFTER_LOAD;
++
++ Register haystack_end = haystack_len;
++ Register result_tmp = result;
++
++ Register nlen_tmp = T0; // needle len tmp
++ Register skipch = T1;
++ Register last_byte = T2;
++ Register last_dword = T3;
++ Register orig_haystack = T4;
++ Register ch1 = T5;
++ Register ch2 = T6;
++
++ RegSet spilled_regs = RegSet::range(T0, T6);
++
++ push(spilled_regs);
++
++ // pattern length is >= 8, so we can read at least 1 register for cases when
++ // UTF->Latin1 conversion is not needed (8 LL or 4 UU) and half register for
++ // UL case. We'll re-read last character in inner pre-loop code to have
++ // single outer pre-loop load
++ const int first_step = isLL ?
7 : 3; ++ ++ const int ASIZE = 256; ++ ++ addi_d(SP, SP, -ASIZE); ++ ++ // init BC offset table with default value: needle_len ++ // ++ // for (i = 0; i < ASIZE; ++i) ++ // bc[i] = m; ++ if (UseLASX) { ++ xvreplgr2vr_b(fscratch, needle_len); ++ ++ for (int i = 0; i < ASIZE; i += 32) { ++ xvst(fscratch, SP, i); ++ } ++ } else if (UseLSX) { ++ vreplgr2vr_b(fscratch, needle_len); ++ ++ for (int i = 0; i < ASIZE; i += 16) { ++ vst(fscratch, SP, i); ++ } ++ } else { ++ move(AT, needle_len); ++ bstrins_d(AT, AT, 15, 8); ++ bstrins_d(AT, AT, 31, 16); ++ bstrins_d(AT, AT, 63, 32); ++ ++ for (int i = 0; i < ASIZE; i += 8) { ++ st_d(AT, SP, i); ++ } ++ } ++ ++ sub_d(nlen_tmp, haystack_len, needle_len); ++ lea(haystack_end, Address(haystack, nlen_tmp, haystack_chr_shift, 0)); ++ addi_d(ch2, needle_len, -1); // bc offset init value ++ move(nlen_tmp, needle); ++ ++ // for (i = 0; i < m - 1; ) { ++ // c = pattern[i]; ++ // ++i; ++ // // c < 256 for Latin1 string, so, no need for branch ++ // #ifdef PATTERN_STRING_IS_LATIN1 ++ // bc[c] = m - i; ++ // #else ++ // if (c < ASIZE) bc[c] = m - i; ++ // #endif ++ // } ++ bind(BCLOOP); ++ (this->*needle_load_1chr)(ch1, Address(nlen_tmp)); ++ addi_d(nlen_tmp, nlen_tmp, needle_chr_size); ++ if (!needle_isL) { ++ // ae == StrIntrinsicNode::UU ++ li(AT, 256u); ++ bgeu(ch1, AT, BCSKIP); // GE for UTF ++ } ++ stx_b(ch2, SP, ch1); // store skip offset to BC offset table ++ ++ bind(BCSKIP); ++ addi_d(ch2, ch2, -1); // for next pattern element, skip distance -1 ++ blt(R0, ch2, BCLOOP); ++ ++ if (needle_isL == haystack_isL) { ++ // load last 8 pattern bytes (8LL/4UU symbols) ++ ld_d(last_dword, Address(needle, needle_len, needle_chr_shift, -wordSize)); ++ addi_d(nlen_tmp, needle_len, -1); // m - 1, index of the last element in pattern ++ move(orig_haystack, haystack); ++ bstrpick_d(last_byte, last_dword, 63, 64 - 8 * needle_chr_size); // UU/LL: pattern[m-1] ++ } else { ++ // UL: from UTF-16(source) search Latin1(pattern) ++ // load last 4 bytes(4 symbols) ++ ld_wu(last_byte, Address(needle, needle_len, Address::no_scale, -wordSize / 2)); ++ addi_d(nlen_tmp, needle_len, -1); // m - 1, index of the last element in pattern ++ move(orig_haystack, haystack); ++ // convert Latin1 to UTF. eg: 0x0000abcd -> 0x0a0b0c0d ++ bstrpick_d(last_dword, last_byte, 7, 0); ++ srli_d(last_byte, last_byte, 8); ++ bstrins_d(last_dword, last_byte, 23, 16); ++ srli_d(last_byte, last_byte, 8); ++ bstrins_d(last_dword, last_byte, 39, 32); ++ srli_d(last_byte, last_byte, 8); // last_byte: 0x0000000a ++ bstrins_d(last_dword, last_byte, 55, 48); // last_dword: 0x0a0b0c0d ++ } ++ ++ // i = m - 1; ++ // skipch = j + i; ++ // if (skipch == pattern[m - 1] ++ // for (k = m - 2; k >= 0 && pattern[k] == src[k + j]; --k); ++ // else ++ // move j with bad char offset table ++ bind(BMLOOPSTR2); ++ // compare pattern to source string backward ++ (this->*haystack_load_1chr)(skipch, Address(haystack, nlen_tmp, haystack_chr_shift, 0)); ++ addi_d(nlen_tmp, nlen_tmp, -first_step); // nlen_tmp is positive here, because needle_len >= 8 ++ bne(last_byte, skipch, BMSKIP); // if not equal, skipch is bad char ++ ld_d(ch2, Address(haystack, nlen_tmp, haystack_chr_shift, 0)); // load 8 bytes from source string ++ move(ch1, last_dword); ++ if (isLL) { ++ b(BMLOOPSTR1_AFTER_LOAD); ++ } else { ++ addi_d(nlen_tmp, nlen_tmp, -1); // no need to branch for UU/UL case. 
cnt1 >= 8 ++ b(BMLOOPSTR1_CMP); ++ } ++ ++ bind(BMLOOPSTR1); ++ (this->*needle_load_1chr)(ch1, Address(needle, nlen_tmp, needle_chr_shift, 0)); ++ (this->*haystack_load_1chr)(ch2, Address(haystack, nlen_tmp, haystack_chr_shift, 0)); ++ ++ bind(BMLOOPSTR1_AFTER_LOAD); ++ addi_d(nlen_tmp, nlen_tmp, -1); ++ blt(nlen_tmp, R0, BMLOOPSTR1_LASTCMP); ++ ++ bind(BMLOOPSTR1_CMP); ++ beq(ch1, ch2, BMLOOPSTR1); ++ ++ bind(BMSKIP); ++ if (!isLL) { ++ // if we've met UTF symbol while searching Latin1 pattern, then we can ++ // skip needle_len symbols ++ if (needle_isL != haystack_isL) { ++ move(result_tmp, needle_len); ++ } else { ++ li(result_tmp, 1); ++ } ++ li(AT, 256u); ++ bgeu(skipch, AT, BMADV); // GE for UTF ++ } ++ ldx_bu(result_tmp, SP, skipch); // load skip offset ++ ++ bind(BMADV); ++ addi_d(nlen_tmp, needle_len, -1); ++ // move haystack after bad char skip offset ++ lea(haystack, Address(haystack, result_tmp, haystack_chr_shift, 0)); ++ bge(haystack_end, haystack, BMLOOPSTR2); ++ addi_d(SP, SP, ASIZE); ++ b(NOMATCH); ++ ++ bind(BMLOOPSTR1_LASTCMP); ++ bne(ch1, ch2, BMSKIP); ++ ++ bind(BMMATCH); ++ sub_d(result, haystack, orig_haystack); ++ if (!haystack_isL) { ++ srli_d(result, result, 1); ++ } ++ addi_d(SP, SP, ASIZE); ++ pop(spilled_regs); ++ b(DONE); ++ ++ bind(LINEARSTUB); ++ li(AT, 16); // small patterns still should be handled by simple algorithm ++ blt(needle_len, AT, LINEARSEARCH); ++ move(result, R0); ++ address stub; ++ if (isLL) { ++ stub = StubRoutines::la::string_indexof_linear_ll(); ++ assert(stub != nullptr, "string_indexof_linear_ll stub has not been generated"); ++ } else if (needle_isL) { ++ stub = StubRoutines::la::string_indexof_linear_ul(); ++ assert(stub != nullptr, "string_indexof_linear_ul stub has not been generated"); ++ } else { ++ stub = StubRoutines::la::string_indexof_linear_uu(); ++ assert(stub != nullptr, "string_indexof_linear_uu stub has not been generated"); ++ } ++ trampoline_call(RuntimeAddress(stub)); ++ b(DONE); ++ ++ bind(NOMATCH); ++ li(result, -1); ++ pop(spilled_regs); ++ b(DONE); ++ ++ bind(LINEARSEARCH); ++ string_indexof_linearscan(haystack, needle, haystack_len, needle_len, -1, result, ae); ++ ++ bind(DONE); ++} ++ ++void C2_MacroAssembler::string_indexof_linearscan(Register haystack, Register needle, ++ Register haystack_len, Register needle_len, ++ int needle_con_cnt, Register result, int ae) ++{ ++ // Note: ++ // needle_con_cnt > 0 means needle_len register is invalid, needle length is constant ++ // for UU/LL: needle_con_cnt[1, 4], UL: needle_con_cnt = 1 ++ assert(needle_con_cnt <= 4, "Invalid needle constant count"); ++ assert(ae != StrIntrinsicNode::LU, "Invalid encoding"); ++ ++ Register hlen_neg = haystack_len; ++ Register nlen_neg = needle_len; ++ Register result_tmp = result; ++ ++ Register nlen_tmp = A0, hlen_tmp = A1; ++ Register first = A2, ch1 = A3, ch2 = AT; ++ ++ RegSet spilled_regs = RegSet::range(A0, A3); ++ ++ push(spilled_regs); ++ ++ bool isLL = ae == StrIntrinsicNode::LL; ++ ++ bool needle_isL = ae == StrIntrinsicNode::LL || ae == StrIntrinsicNode::UL; ++ bool haystack_isL = ae == StrIntrinsicNode::LL || ae == StrIntrinsicNode::LU; ++ int needle_chr_shift = needle_isL ? 0 : 1; ++ int haystack_chr_shift = haystack_isL ? 0 : 1; ++ int needle_chr_size = needle_isL ? 1 : 2; ++ int haystack_chr_size = haystack_isL ? 1 : 2; ++ ++ load_chr_insn needle_load_1chr = needle_isL ? (load_chr_insn)&MacroAssembler::ld_bu ++ : (load_chr_insn)&MacroAssembler::ld_hu; ++ load_chr_insn haystack_load_1chr = haystack_isL ? 
(load_chr_insn)&MacroAssembler::ld_bu ++ : (load_chr_insn)&MacroAssembler::ld_hu; ++ load_chr_insn load_2chr = isLL ? (load_chr_insn)&MacroAssembler::ld_hu ++ : (load_chr_insn)&MacroAssembler::ld_wu; ++ load_chr_insn load_4chr = isLL ? (load_chr_insn)&MacroAssembler::ld_wu ++ : (load_chr_insn)&MacroAssembler::ld_d; ++ ++ Label DO1, DO2, DO3, MATCH, NOMATCH, DONE; ++ ++ if (needle_con_cnt == -1) { ++ Label DOSHORT, FIRST_LOOP, STR2_NEXT, STR1_LOOP, STR1_NEXT; ++ ++ li(AT, needle_isL == haystack_isL ? 4 : 2); // UU/LL:4, UL:2 ++ blt(needle_len, AT, DOSHORT); ++ ++ sub_d(result_tmp, haystack_len, needle_len); ++ ++ (this->*needle_load_1chr)(first, Address(needle)); ++ if (!haystack_isL) slli_d(result_tmp, result_tmp, haystack_chr_shift); ++ add_d(haystack, haystack, result_tmp); ++ sub_d(hlen_neg, R0, result_tmp); ++ if (!needle_isL) slli_d(needle_len, needle_len, needle_chr_shift); ++ add_d(needle, needle, needle_len); ++ sub_d(nlen_neg, R0, needle_len); ++ ++ bind(FIRST_LOOP); ++ (this->*haystack_load_1chr)(ch2, Address(haystack, hlen_neg, Address::no_scale, 0)); ++ beq(first, ch2, STR1_LOOP); ++ ++ bind(STR2_NEXT); ++ addi_d(hlen_neg, hlen_neg, haystack_chr_size); ++ bge(R0, hlen_neg, FIRST_LOOP); ++ b(NOMATCH); ++ ++ bind(STR1_LOOP); ++ addi_d(nlen_tmp, nlen_neg, needle_chr_size); ++ addi_d(hlen_tmp, hlen_neg, haystack_chr_size); ++ bge(nlen_tmp, R0, MATCH); ++ ++ bind(STR1_NEXT); ++ (this->*needle_load_1chr)(ch1, Address(needle, nlen_tmp, Address::no_scale, 0)); ++ (this->*haystack_load_1chr)(ch2, Address(haystack, hlen_tmp, Address::no_scale, 0)); ++ bne(ch1, ch2, STR2_NEXT); ++ addi_d(nlen_tmp, nlen_tmp, needle_chr_size); ++ addi_d(hlen_tmp, hlen_tmp, haystack_chr_size); ++ blt(nlen_tmp, R0, STR1_NEXT); ++ b(MATCH); ++ ++ bind(DOSHORT); ++ if (needle_isL == haystack_isL) { ++ li(AT, 2); ++ blt(needle_len, AT, DO1); // needle_len == 1 ++ blt(AT, needle_len, DO3); // needle_len == 3 ++ // if needle_len == 2 then goto DO2 ++ } ++ } ++ ++ if (needle_con_cnt == 4) { ++ Label CH1_LOOP; ++ (this->*load_4chr)(ch1, Address(needle)); ++ addi_d(result_tmp, haystack_len, -4); ++ if (!haystack_isL) slli_d(result_tmp, result_tmp, haystack_chr_shift); ++ add_d(haystack, haystack, result_tmp); ++ sub_d(hlen_neg, R0, result_tmp); ++ ++ bind(CH1_LOOP); ++ (this->*load_4chr)(ch2, Address(haystack, hlen_neg, Address::no_scale, 0)); ++ beq(ch1, ch2, MATCH); ++ addi_d(hlen_neg, hlen_neg, haystack_chr_size); ++ bge(R0, hlen_neg, CH1_LOOP); ++ b(NOMATCH); ++ } ++ ++ if ((needle_con_cnt == -1 && needle_isL == haystack_isL) || needle_con_cnt == 2) { ++ Label CH1_LOOP; ++ bind(DO2); ++ (this->*load_2chr)(ch1, Address(needle)); ++ addi_d(result_tmp, haystack_len, -2); ++ if (!haystack_isL) slli_d(result_tmp, result_tmp, haystack_chr_shift); ++ add_d(haystack, haystack, result_tmp); ++ sub_d(hlen_neg, R0, result_tmp); ++ ++ bind(CH1_LOOP); ++ (this->*load_2chr)(ch2, Address(haystack, hlen_neg, Address::no_scale, 0)); ++ beq(ch1, ch2, MATCH); ++ addi_d(hlen_neg, hlen_neg, haystack_chr_size); ++ bge(R0, hlen_neg, CH1_LOOP); ++ b(NOMATCH); ++ } ++ ++ if ((needle_con_cnt == -1 && needle_isL == haystack_isL) || needle_con_cnt == 3) { ++ Label FIRST_LOOP, STR2_NEXT, STR1_LOOP; ++ ++ bind(DO3); ++ (this->*load_2chr)(first, Address(needle)); ++ (this->*needle_load_1chr)(ch1, Address(needle, 2 * needle_chr_size)); ++ addi_d(result_tmp, haystack_len, -3); ++ if (!haystack_isL) slli_d(result_tmp, result_tmp, haystack_chr_shift); ++ add_d(haystack, haystack, result_tmp); ++ sub_d(hlen_neg, R0, result_tmp); ++ ++ 
bind(FIRST_LOOP); ++ (this->*load_2chr)(ch2, Address(haystack, hlen_neg, Address::no_scale, 0)); ++ beq(first, ch2, STR1_LOOP); ++ ++ bind(STR2_NEXT); ++ addi_d(hlen_neg, hlen_neg, haystack_chr_size); ++ bge(R0, hlen_neg, FIRST_LOOP); ++ b(NOMATCH); ++ ++ bind(STR1_LOOP); ++ (this->*haystack_load_1chr)(ch2, Address(haystack, hlen_neg, Address::no_scale, 2 * haystack_chr_size)); ++ bne(ch1, ch2, STR2_NEXT); ++ b(MATCH); ++ } ++ ++ if (needle_con_cnt == -1 || needle_con_cnt == 1) { ++ Label CH1_LOOP, HAS_ZERO, DO1_SHORT, DO1_LOOP; ++ Register mask01 = nlen_tmp; ++ Register mask7f = hlen_tmp; ++ Register masked = first; ++ ++ bind(DO1); ++ (this->*needle_load_1chr)(ch1, Address(needle)); ++ li(AT, 8); ++ blt(haystack_len, AT, DO1_SHORT); ++ ++ addi_d(result_tmp, haystack_len, -8 / haystack_chr_size); ++ if (!haystack_isL) slli_d(result_tmp, result_tmp, haystack_chr_shift); ++ add_d(haystack, haystack, result_tmp); ++ sub_d(hlen_neg, R0, result_tmp); ++ ++ if (haystack_isL) bstrins_d(ch1, ch1, 15, 8); ++ bstrins_d(ch1, ch1, 31, 16); ++ bstrins_d(ch1, ch1, 63, 32); ++ ++ li(mask01, haystack_isL ? 0x0101010101010101 : 0x0001000100010001); ++ li(mask7f, haystack_isL ? 0x7f7f7f7f7f7f7f7f : 0x7fff7fff7fff7fff); ++ ++ bind(CH1_LOOP); ++ ldx_d(ch2, haystack, hlen_neg); ++ xorr(ch2, ch1, ch2); ++ sub_d(masked, ch2, mask01); ++ orr(ch2, ch2, mask7f); ++ andn(masked, masked, ch2); ++ bnez(masked, HAS_ZERO); ++ addi_d(hlen_neg, hlen_neg, 8); ++ blt(hlen_neg, R0, CH1_LOOP); ++ ++ li(AT, 8); ++ bge(hlen_neg, AT, NOMATCH); ++ move(hlen_neg, R0); ++ b(CH1_LOOP); ++ ++ bind(HAS_ZERO); ++ ctz_d(masked, masked); ++ srli_d(masked, masked, 3); ++ add_d(hlen_neg, hlen_neg, masked); ++ b(MATCH); ++ ++ bind(DO1_SHORT); ++ addi_d(result_tmp, haystack_len, -1); ++ if (!haystack_isL) slli_d(result_tmp, result_tmp, haystack_chr_shift); ++ add_d(haystack, haystack, result_tmp); ++ sub_d(hlen_neg, R0, result_tmp); ++ ++ bind(DO1_LOOP); ++ (this->*haystack_load_1chr)(ch2, Address(haystack, hlen_neg, Address::no_scale, 0)); ++ beq(ch1, ch2, MATCH); ++ addi_d(hlen_neg, hlen_neg, haystack_chr_size); ++ bge(R0, hlen_neg, DO1_LOOP); ++ } ++ ++ bind(NOMATCH); ++ li(result, -1); ++ b(DONE); ++ ++ bind(MATCH); ++ add_d(result, result_tmp, hlen_neg); ++ if (!haystack_isL) srai_d(result, result, haystack_chr_shift); ++ ++ bind(DONE); ++ pop(spilled_regs); ++} ++ ++void C2_MacroAssembler::string_indexof_char(Register str1, Register cnt1, ++ Register ch, Register result, ++ Register tmp1, Register tmp2, ++ Register tmp3) ++{ ++ Label CH1_LOOP, HAS_ZERO, DO1_SHORT, DO1_LOOP, NOMATCH, DONE; ++ ++ beqz(cnt1, NOMATCH); ++ ++ move(result, R0); ++ ori(tmp1, R0, 4); ++ blt(cnt1, tmp1, DO1_LOOP); ++ ++ // UTF-16 char occupies 16 bits ++ // ch -> chchchch ++ bstrins_d(ch, ch, 31, 16); ++ bstrins_d(ch, ch, 63, 32); ++ ++ li(tmp2, 0x0001000100010001); ++ li(tmp3, 0x7fff7fff7fff7fff); ++ ++ bind(CH1_LOOP); ++ ld_d(AT, str1, 0); ++ xorr(AT, ch, AT); ++ sub_d(tmp1, AT, tmp2); ++ orr(AT, AT, tmp3); ++ andn(tmp1, tmp1, AT); ++ bnez(tmp1, HAS_ZERO); ++ addi_d(str1, str1, 8); ++ addi_d(result, result, 4); ++ ++ // meet the end of string ++ beq(cnt1, result, NOMATCH); ++ ++ addi_d(tmp1, result, 4); ++ bge(tmp1, cnt1, DO1_SHORT); ++ b(CH1_LOOP); ++ ++ bind(HAS_ZERO); ++ ctz_d(tmp1, tmp1); ++ srli_d(tmp1, tmp1, 4); ++ add_d(result, result, tmp1); ++ b(DONE); ++ ++ // restore ch ++ bind(DO1_SHORT); ++ bstrpick_d(ch, ch, 15, 0); ++ ++ bind(DO1_LOOP); ++ ld_hu(tmp1, str1, 0); ++ beq(ch, tmp1, DONE); ++ addi_d(str1, str1, 2); ++ addi_d(result, result, 1); ++ 
blt(result, cnt1, DO1_LOOP); ++ ++ bind(NOMATCH); ++ addi_d(result, R0, -1); ++ ++ bind(DONE); ++} ++ ++void C2_MacroAssembler::stringL_indexof_char(Register str1, Register cnt1, ++ Register ch, Register result, ++ Register tmp1, Register tmp2, ++ Register tmp3) ++{ ++ Label CH1_LOOP, HAS_ZERO, DO1_SHORT, DO1_LOOP, NOMATCH, DONE; ++ ++ beqz(cnt1, NOMATCH); ++ ++ move(result, R0); ++ ori(tmp1, R0, 8); ++ blt(cnt1, tmp1, DO1_LOOP); ++ ++ // Latin-1 char occupies 8 bits ++ // ch -> chchchchchchchch ++ bstrins_d(ch, ch, 15, 8); ++ bstrins_d(ch, ch, 31, 16); ++ bstrins_d(ch, ch, 63, 32); ++ ++ li(tmp2, 0x0101010101010101); ++ li(tmp3, 0x7f7f7f7f7f7f7f7f); ++ ++ bind(CH1_LOOP); ++ ld_d(AT, str1, 0); ++ xorr(AT, ch, AT); ++ sub_d(tmp1, AT, tmp2); ++ orr(AT, AT, tmp3); ++ andn(tmp1, tmp1, AT); ++ bnez(tmp1, HAS_ZERO); ++ addi_d(str1, str1, 8); ++ addi_d(result, result, 8); ++ ++ // meet the end of string ++ beq(cnt1, result, NOMATCH); ++ ++ addi_d(tmp1, result, 8); ++ bge(tmp1, cnt1, DO1_SHORT); ++ b(CH1_LOOP); ++ ++ bind(HAS_ZERO); ++ ctz_d(tmp1, tmp1); ++ srli_d(tmp1, tmp1, 3); ++ add_d(result, result, tmp1); ++ b(DONE); ++ ++ // restore ch ++ bind(DO1_SHORT); ++ bstrpick_d(ch, ch, 7, 0); ++ ++ bind(DO1_LOOP); ++ ld_bu(tmp1, str1, 0); ++ beq(ch, tmp1, DONE); ++ addi_d(str1, str1, 1); ++ addi_d(result, result, 1); ++ blt(result, cnt1, DO1_LOOP); ++ ++ bind(NOMATCH); ++ addi_d(result, R0, -1); ++ ++ bind(DONE); ++} ++ ++// Compare strings, used for char[] and byte[]. ++void C2_MacroAssembler::string_compare(Register str1, Register str2, ++ Register cnt1, Register cnt2, Register result, ++ int ae, Register tmp1, Register tmp2, ++ FloatRegister vtmp1, FloatRegister vtmp2) { ++ Label L, Loop, LoopEnd, HaveResult, Done, Loop_Start, ++ V_L, V_Loop, V_Result, V_Start; ++ ++ bool isLL = ae == StrIntrinsicNode::LL; ++ bool isLU = ae == StrIntrinsicNode::LU; ++ bool isUL = ae == StrIntrinsicNode::UL; ++ bool isUU = ae == StrIntrinsicNode::UU; ++ ++ bool str1_isL = isLL || isLU; ++ bool str2_isL = isLL || isUL; ++ ++ int charsInWord = isLL ? wordSize : wordSize/2; ++ int charsInFloatRegister = (UseLASX && (isLL||isUU))?(isLL? 32 : 16):(isLL? 16 : 8); ++ ++ if (!str1_isL) srli_w(cnt1, cnt1, 1); ++ if (!str2_isL) srli_w(cnt2, cnt2, 1); ++ ++ // compute the difference of lengths (in result) ++ sub_d(result, cnt1, cnt2); // result holds the difference of two lengths ++ ++ // compute the shorter length (in cnt1) ++ bge(cnt2, cnt1, V_Start); ++ move(cnt1, cnt2); ++ ++ bind(V_Start); ++ // it is hard to apply the xvilvl to flate 16 bytes into 32 bytes, ++ // so we employ the LASX only for the LL or UU StrIntrinsicNode. 
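++ // With LSX the mixed LU/UL encodings are still handled by widening the
++ // Latin-1 operand with vilvl_b before the 128-bit compare; only the pure
++ // LL/UU cases get the 256-bit LASX compare below.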
++ if (UseLASX && (isLL || isUU)) { ++ ori(AT, R0, charsInFloatRegister); ++ addi_d(tmp1, R0, 16); ++ xvinsgr2vr_d(fscratch, R0, 0); ++ xvinsgr2vr_d(fscratch, tmp1, 2); ++ bind(V_Loop); ++ blt(cnt1, AT, Loop_Start); ++ if (isLL) { ++ xvld(vtmp1, str1, 0); ++ xvld(vtmp2, str2, 0); ++ xvxor_v(vtmp1, vtmp1, vtmp2); ++ xvseteqz_v(FCC0, vtmp1); ++ bceqz(FCC0, V_L); ++ ++ addi_d(str1, str1, 32); ++ addi_d(str2, str2, 32); ++ addi_d(cnt1, cnt1, -charsInFloatRegister); ++ b(V_Loop); ++ ++ bind(V_L); ++ xvxor_v(vtmp2, vtmp2, vtmp2); ++ xvabsd_b(vtmp1, vtmp1, vtmp2); ++ xvneg_b(vtmp1, vtmp1); ++ xvfrstp_b(vtmp2, vtmp1, fscratch); ++ xvpickve2gr_du(tmp1, vtmp2, 0); ++ addi_d(cnt2, R0, 16); ++ bne(tmp1, cnt2, V_Result); ++ ++ xvpickve2gr_du(tmp1, vtmp2, 2); ++ addi_d(tmp1, tmp1, 16); ++ ++ // the index value was stored in tmp1 ++ bind(V_Result); ++ ldx_bu(result, str1, tmp1); ++ ldx_bu(tmp2, str2, tmp1); ++ sub_d(result, result, tmp2); ++ b(Done); ++ } else if (isUU) { ++ xvld(vtmp1, str1, 0); ++ xvld(vtmp2, str2, 0); ++ xvxor_v(vtmp1, vtmp1, vtmp2); ++ xvseteqz_v(FCC0, vtmp1); ++ bceqz(FCC0, V_L); ++ ++ addi_d(str1, str1, 32); ++ addi_d(str2, str2, 32); ++ addi_d(cnt1, cnt1, -charsInFloatRegister); ++ b(V_Loop); ++ ++ bind(V_L); ++ xvxor_v(vtmp2, vtmp2, vtmp2); ++ xvabsd_h(vtmp1, vtmp1, vtmp2); ++ xvneg_h(vtmp1, vtmp1); ++ xvfrstp_h(vtmp2, vtmp1, fscratch); ++ xvpickve2gr_du(tmp1, vtmp2, 0); ++ addi_d(cnt2, R0, 8); ++ bne(tmp1, cnt2, V_Result); ++ ++ xvpickve2gr_du(tmp1, vtmp2, 2); ++ addi_d(tmp1, tmp1, 8); ++ ++ // the index value was stored in tmp1 ++ bind(V_Result); ++ slli_d(tmp1, tmp1, 1); ++ ldx_hu(result, str1, tmp1); ++ ldx_hu(tmp2, str2, tmp1); ++ sub_d(result, result, tmp2); ++ b(Done); ++ } ++ } else if (UseLSX) { ++ ori(AT, R0, charsInFloatRegister); ++ vxor_v(fscratch, fscratch, fscratch); ++ bind(V_Loop); ++ blt(cnt1, AT, Loop_Start); ++ if (isLL) { ++ vld(vtmp1, str1, 0); ++ vld(vtmp2, str2, 0); ++ vxor_v(vtmp1, vtmp1, vtmp2); ++ vseteqz_v(FCC0, vtmp1); ++ bceqz(FCC0, V_L); ++ ++ addi_d(str1, str1, 16); ++ addi_d(str2, str2, 16); ++ addi_d(cnt1, cnt1, -charsInFloatRegister); ++ b(V_Loop); ++ ++ bind(V_L); ++ vxor_v(vtmp2, vtmp2, vtmp2); ++ vabsd_b(vtmp1, vtmp1, vtmp2); ++ vneg_b(vtmp1, vtmp1); ++ vfrstpi_b(vtmp2, vtmp1, 0); ++ vpickve2gr_bu(tmp1, vtmp2, 0); ++ ++ // the index value was stored in tmp1 ++ ldx_bu(result, str1, tmp1); ++ ldx_bu(tmp2, str2, tmp1); ++ sub_d(result, result, tmp2); ++ b(Done); ++ } else if (isLU) { ++ vld(vtmp1, str1, 0); ++ vld(vtmp2, str2, 0); ++ vilvl_b(vtmp1, fscratch, vtmp1); ++ vxor_v(vtmp1, vtmp1, vtmp2); ++ vseteqz_v(FCC0, vtmp1); ++ bceqz(FCC0, V_L); ++ ++ addi_d(str1, str1, 8); ++ addi_d(str2, str2, 16); ++ addi_d(cnt1, cnt1, -charsInFloatRegister); ++ b(V_Loop); ++ ++ bind(V_L); ++ vxor_v(vtmp2, vtmp2, vtmp2); ++ vabsd_h(vtmp1, vtmp1, vtmp2); ++ vneg_h(vtmp1, vtmp1); ++ vfrstpi_h(vtmp2, vtmp1, 0); ++ vpickve2gr_bu(tmp1, vtmp2, 0); ++ ++ // the index value was stored in tmp1 ++ ldx_bu(result, str1, tmp1); ++ slli_d(tmp1, tmp1, 1); ++ ldx_hu(tmp2, str2, tmp1); ++ sub_d(result, result, tmp2); ++ b(Done); ++ } else if (isUL) { ++ vld(vtmp1, str1, 0); ++ vld(vtmp2, str2, 0); ++ vilvl_b(vtmp2, fscratch, vtmp2); ++ vxor_v(vtmp1, vtmp1, vtmp2); ++ vseteqz_v(FCC0, vtmp1); ++ bceqz(FCC0, V_L); ++ ++ addi_d(str1, str1, 16); ++ addi_d(str2, str2, 8); ++ addi_d(cnt1, cnt1, -charsInFloatRegister); ++ b(V_Loop); ++ ++ bind(V_L); ++ vxor_v(vtmp2, vtmp2, vtmp2); ++ vabsd_h(vtmp1, vtmp1, vtmp2); ++ vneg_h(vtmp1, vtmp1); ++ vfrstpi_h(vtmp2, vtmp1, 0); ++ 
vpickve2gr_bu(tmp1, vtmp2, 0); ++ ++ // the index value was stored in tmp1 ++ ldx_bu(tmp2, str2, tmp1); ++ slli_d(tmp1, tmp1, 1); ++ ldx_hu(result, str1, tmp1); ++ sub_d(result, result, tmp2); ++ b(Done); ++ } else if (isUU) { ++ vld(vtmp1, str1, 0); ++ vld(vtmp2, str2, 0); ++ vxor_v(vtmp1, vtmp1, vtmp2); ++ vseteqz_v(FCC0, vtmp1); ++ bceqz(FCC0, V_L); ++ ++ addi_d(str1, str1, 16); ++ addi_d(str2, str2, 16); ++ addi_d(cnt1, cnt1, -charsInFloatRegister); ++ b(V_Loop); ++ ++ bind(V_L); ++ vxor_v(vtmp2, vtmp2, vtmp2); ++ vabsd_h(vtmp1, vtmp1, vtmp2); ++ vneg_h(vtmp1, vtmp1); ++ vfrstpi_h(vtmp2, vtmp1, 0); ++ vpickve2gr_bu(tmp1, vtmp2, 0); ++ ++ // the index value was stored in tmp1 ++ slli_d(tmp1, tmp1, 1); ++ ldx_hu(result, str1, tmp1); ++ ldx_hu(tmp2, str2, tmp1); ++ sub_d(result, result, tmp2); ++ b(Done); ++ } ++ } ++ ++ // Now the shorter length is in cnt1 and cnt2 can be used as a tmp register ++ // ++ // For example: ++ // If isLL == true and cnt1 > 8, we load 8 bytes from str1 and str2. (Suppose A1 and B1 are different) ++ // tmp1: A7 A6 A5 A4 A3 A2 A1 A0 ++ // tmp2: B7 B6 B5 B4 B3 B2 B1 B0 ++ // ++ // Then Use xor to find the difference between tmp1 and tmp2, right shift. ++ // tmp1: 00 A7 A6 A5 A4 A3 A2 A1 ++ // tmp2: 00 B7 B6 B5 B4 B3 B2 B1 ++ // ++ // Fetch 0 to 7 bits of tmp1 and tmp2, subtract to get the result. ++ // Other types are similar to isLL. ++ ++ bind(Loop_Start); ++ ori(AT, R0, charsInWord); ++ bind(Loop); ++ blt(cnt1, AT, LoopEnd); ++ if (isLL) { ++ ld_d(tmp1, str1, 0); ++ ld_d(tmp2, str2, 0); ++ beq(tmp1, tmp2, L); ++ xorr(cnt2, tmp1, tmp2); ++ ctz_d(cnt2, cnt2); ++ andi(cnt2, cnt2, 0x38); ++ srl_d(tmp1, tmp1, cnt2); ++ srl_d(tmp2, tmp2, cnt2); ++ bstrpick_d(tmp1, tmp1, 7, 0); ++ bstrpick_d(tmp2, tmp2, 7, 0); ++ sub_d(result, tmp1, tmp2); ++ b(Done); ++ bind(L); ++ addi_d(str1, str1, 8); ++ addi_d(str2, str2, 8); ++ addi_d(cnt1, cnt1, -charsInWord); ++ b(Loop); ++ } else if (isLU) { ++ ld_wu(cnt2, str1, 0); ++ andr(tmp1, R0, R0); ++ bstrins_d(tmp1, cnt2, 7, 0); ++ srli_d(cnt2, cnt2, 8); ++ bstrins_d(tmp1, cnt2, 23, 16); ++ srli_d(cnt2, cnt2, 8); ++ bstrins_d(tmp1, cnt2, 39, 32); ++ srli_d(cnt2, cnt2, 8); ++ bstrins_d(tmp1, cnt2, 55, 48); ++ ld_d(tmp2, str2, 0); ++ beq(tmp1, tmp2, L); ++ xorr(cnt2, tmp1, tmp2); ++ ctz_d(cnt2, cnt2); ++ andi(cnt2, cnt2, 0x30); ++ srl_d(tmp1, tmp1, cnt2); ++ srl_d(tmp2, tmp2, cnt2); ++ bstrpick_d(tmp1, tmp1, 15, 0); ++ bstrpick_d(tmp2, tmp2, 15, 0); ++ sub_d(result, tmp1, tmp2); ++ b(Done); ++ bind(L); ++ addi_d(str1, str1, 4); ++ addi_d(str2, str2, 8); ++ addi_d(cnt1, cnt1, -charsInWord); ++ b(Loop); ++ } else if (isUL) { ++ ld_wu(cnt2, str2, 0); ++ andr(tmp2, R0, R0); ++ bstrins_d(tmp2, cnt2, 7, 0); ++ srli_d(cnt2, cnt2, 8); ++ bstrins_d(tmp2, cnt2, 23, 16); ++ srli_d(cnt2, cnt2, 8); ++ bstrins_d(tmp2, cnt2, 39, 32); ++ srli_d(cnt2, cnt2, 8); ++ bstrins_d(tmp2, cnt2, 55, 48); ++ ld_d(tmp1, str1, 0); ++ beq(tmp1, tmp2, L); ++ xorr(cnt2, tmp1, tmp2); ++ ctz_d(cnt2, cnt2); ++ andi(cnt2, cnt2, 0x30); ++ srl_d(tmp1, tmp1, cnt2); ++ srl_d(tmp2, tmp2, cnt2); ++ bstrpick_d(tmp1, tmp1, 15, 0); ++ bstrpick_d(tmp2, tmp2, 15, 0); ++ sub_d(result, tmp1, tmp2); ++ b(Done); ++ bind(L); ++ addi_d(str1, str1, 8); ++ addi_d(str2, str2, 4); ++ addi_d(cnt1, cnt1, -charsInWord); ++ b(Loop); ++ } else { // isUU ++ ld_d(tmp1, str1, 0); ++ ld_d(tmp2, str2, 0); ++ beq(tmp1, tmp2, L); ++ xorr(cnt2, tmp1, tmp2); ++ ctz_d(cnt2, cnt2); ++ andi(cnt2, cnt2, 0x30); ++ srl_d(tmp1, tmp1, cnt2); ++ srl_d(tmp2, tmp2, cnt2); ++ bstrpick_d(tmp1, tmp1, 15, 0); ++ 
bstrpick_d(tmp2, tmp2, 15, 0); ++ sub_d(result, tmp1, tmp2); ++ b(Done); ++ bind(L); ++ addi_d(str1, str1, 8); ++ addi_d(str2, str2, 8); ++ addi_d(cnt1, cnt1, -charsInWord); ++ b(Loop); ++ } ++ ++ bind(LoopEnd); ++ beqz(cnt1, Done); ++ if (str1_isL) { ++ ld_bu(tmp1, str1, 0); ++ } else { ++ ld_hu(tmp1, str1, 0); ++ } ++ ++ // compare current character ++ if (str2_isL) { ++ ld_bu(tmp2, str2, 0); ++ } else { ++ ld_hu(tmp2, str2, 0); ++ } ++ bne(tmp1, tmp2, HaveResult); ++ addi_d(str1, str1, str1_isL ? 1 : 2); ++ addi_d(str2, str2, str2_isL ? 1 : 2); ++ addi_d(cnt1, cnt1, -1); ++ b(LoopEnd); ++ ++ bind(HaveResult); ++ sub_d(result, tmp1, tmp2); ++ ++ bind(Done); ++} ++ ++// Compare char[] or byte[] arrays or substrings. ++void C2_MacroAssembler::arrays_equals(Register str1, Register str2, ++ Register cnt, Register tmp1, Register tmp2, Register result, ++ bool is_char, bool is_array) { ++ assert_different_registers(str1, str2, result, cnt, tmp1, tmp2); ++ Label Loop, LoopEnd, ShortLoop, True, False; ++ Label A_IS_NOT_NULL, A_MIGHT_BE_NULL; ++ ++ int length_offset = arrayOopDesc::length_offset_in_bytes(); ++ int base_offset = arrayOopDesc::base_offset_in_bytes(is_char ? T_CHAR : T_BYTE); ++ ++ addi_d(result, R0, 1); ++ // Check the input args ++ beq(str1, str2, True); // May have read barriers for str1 and str2 if is_array is true. ++ if (is_array) { ++ // Need additional checks for arrays_equals. ++ andr(tmp1, str1, str2); ++ beqz(tmp1, A_MIGHT_BE_NULL); ++ bind(A_IS_NOT_NULL); ++ ++ // Check the lengths ++ ld_w(cnt, str1, length_offset); ++ ld_w(tmp1, str2, length_offset); ++ bne(cnt, tmp1, False); ++ } ++ beqz(cnt, True); ++ ++ if (is_array) { ++ addi_d(str1, str1, base_offset); ++ addi_d(str2, str2, base_offset); ++ } ++ ++ if (is_char && is_array) { ++ slli_w(cnt, cnt, 1); ++ } ++ move(AT, R0); ++ addi_w(cnt, cnt, -8); ++ blt(cnt, R0, LoopEnd); ++ bind(Loop); ++ ldx_d(tmp1, str1, AT); ++ ldx_d(tmp2, str2, AT); ++ bne(tmp1, tmp2, False); ++ addi_w(AT, AT, 8); ++ addi_w(cnt, cnt, -8); ++ bge(cnt, R0, Loop); ++ li(tmp1, -8); ++ beq(cnt, tmp1, True); ++ ++ bind(LoopEnd); ++ addi_d(cnt, cnt, 8); ++ ++ bind(ShortLoop); ++ ldx_bu(tmp1, str1, AT); ++ ldx_bu(tmp2, str2, AT); ++ bne(tmp1, tmp2, False); ++ addi_w(AT, AT, 1); ++ addi_w(cnt, cnt, -1); ++ bnez(cnt, ShortLoop); ++ b(True); ++ ++ if (is_array) { ++ bind(A_MIGHT_BE_NULL); ++ beqz(str1, False); ++ beqz(str2, False); ++ b(A_IS_NOT_NULL); ++ } ++ ++ bind(False); ++ move(result, R0); ++ ++ bind(True); ++} ++ ++void C2_MacroAssembler::loadstore(Register reg, Register base, int disp, int type) { ++ switch (type) { ++ case STORE_BYTE: st_b (reg, base, disp); break; ++ case STORE_CHAR: ++ case STORE_SHORT: st_h (reg, base, disp); break; ++ case STORE_INT: st_w (reg, base, disp); break; ++ case STORE_LONG: st_d (reg, base, disp); break; ++ case LOAD_BYTE: ld_b (reg, base, disp); break; ++ case LOAD_U_BYTE: ld_bu(reg, base, disp); break; ++ case LOAD_SHORT: ld_h (reg, base, disp); break; ++ case LOAD_U_SHORT: ld_hu(reg, base, disp); break; ++ case LOAD_INT: ld_w (reg, base, disp); break; ++ case LOAD_U_INT: ld_wu(reg, base, disp); break; ++ case LOAD_LONG: ld_d (reg, base, disp); break; ++ case LOAD_LINKED_LONG: ++ ll_d(reg, base, disp); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void C2_MacroAssembler::loadstore(Register reg, Register base, Register disp, int type) { ++ switch (type) { ++ case STORE_BYTE: stx_b (reg, base, disp); break; ++ case STORE_CHAR: ++ case STORE_SHORT: stx_h (reg, base, disp); break; ++ case STORE_INT: 
stx_w (reg, base, disp); break; ++ case STORE_LONG: stx_d (reg, base, disp); break; ++ case LOAD_BYTE: ldx_b (reg, base, disp); break; ++ case LOAD_U_BYTE: ldx_bu(reg, base, disp); break; ++ case LOAD_SHORT: ldx_h (reg, base, disp); break; ++ case LOAD_U_SHORT: ldx_hu(reg, base, disp); break; ++ case LOAD_INT: ldx_w (reg, base, disp); break; ++ case LOAD_U_INT: ldx_wu(reg, base, disp); break; ++ case LOAD_LONG: ldx_d (reg, base, disp); break; ++ case LOAD_LINKED_LONG: ++ add_d(AT, base, disp); ++ ll_d(reg, AT, 0); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void C2_MacroAssembler::loadstore(FloatRegister reg, Register base, int disp, int type) { ++ switch (type) { ++ case STORE_FLOAT: fst_s(reg, base, disp); break; ++ case STORE_DOUBLE: fst_d(reg, base, disp); break; ++ case STORE_VECTORX: vst (reg, base, disp); break; ++ case STORE_VECTORY: xvst (reg, base, disp); break; ++ case LOAD_FLOAT: fld_s(reg, base, disp); break; ++ case LOAD_DOUBLE: fld_d(reg, base, disp); break; ++ case LOAD_VECTORX: vld (reg, base, disp); break; ++ case LOAD_VECTORY: xvld (reg, base, disp); break; ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void C2_MacroAssembler::loadstore(FloatRegister reg, Register base, Register disp, int type) { ++ switch (type) { ++ case STORE_FLOAT: fstx_s(reg, base, disp); break; ++ case STORE_DOUBLE: fstx_d(reg, base, disp); break; ++ case STORE_VECTORX: vstx (reg, base, disp); break; ++ case STORE_VECTORY: xvstx (reg, base, disp); break; ++ case LOAD_FLOAT: fldx_s(reg, base, disp); break; ++ case LOAD_DOUBLE: fldx_d(reg, base, disp); break; ++ case LOAD_VECTORX: vldx (reg, base, disp); break; ++ case LOAD_VECTORY: xvldx (reg, base, disp); break; ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void C2_MacroAssembler::reduce_ins_v(FloatRegister vec1, FloatRegister vec2, FloatRegister vec3, BasicType type, int opcode) { ++ switch (type) { ++ case T_BYTE: ++ switch (opcode) { ++ case Op_AddReductionVI: vadd_b(vec1, vec2, vec3); break; ++ case Op_MulReductionVI: vmul_b(vec1, vec2, vec3); break; ++ case Op_MaxReductionV: vmax_b(vec1, vec2, vec3); break; ++ case Op_MinReductionV: vmin_b(vec1, vec2, vec3); break; ++ case Op_AndReductionV: vand_v(vec1, vec2, vec3); break; ++ case Op_OrReductionV: vor_v(vec1, vec2, vec3); break; ++ case Op_XorReductionV: vxor_v(vec1, vec2, vec3); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ break; ++ case T_SHORT: ++ switch (opcode) { ++ case Op_AddReductionVI: vadd_h(vec1, vec2, vec3); break; ++ case Op_MulReductionVI: vmul_h(vec1, vec2, vec3); break; ++ case Op_MaxReductionV: vmax_h(vec1, vec2, vec3); break; ++ case Op_MinReductionV: vmin_h(vec1, vec2, vec3); break; ++ case Op_AndReductionV: vand_v(vec1, vec2, vec3); break; ++ case Op_OrReductionV: vor_v(vec1, vec2, vec3); break; ++ case Op_XorReductionV: vxor_v(vec1, vec2, vec3); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ break; ++ case T_INT: ++ switch (opcode) { ++ case Op_AddReductionVI: vadd_w(vec1, vec2, vec3); break; ++ case Op_MulReductionVI: vmul_w(vec1, vec2, vec3); break; ++ case Op_MaxReductionV: vmax_w(vec1, vec2, vec3); break; ++ case Op_MinReductionV: vmin_w(vec1, vec2, vec3); break; ++ case Op_AndReductionV: vand_v(vec1, vec2, vec3); break; ++ case Op_OrReductionV: vor_v(vec1, vec2, vec3); break; ++ case Op_XorReductionV: vxor_v(vec1, vec2, vec3); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ break; ++ case T_LONG: ++ switch (opcode) { ++ case Op_AddReductionVL: vadd_d(vec1, vec2, vec3); break; ++ case Op_MulReductionVL: vmul_d(vec1, vec2, 
vec3); break; ++ case Op_MaxReductionV: vmax_d(vec1, vec2, vec3); break; ++ case Op_MinReductionV: vmin_d(vec1, vec2, vec3); break; ++ case Op_AndReductionV: vand_v(vec1, vec2, vec3); break; ++ case Op_OrReductionV: vor_v(vec1, vec2, vec3); break; ++ case Op_XorReductionV: vxor_v(vec1, vec2, vec3); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void C2_MacroAssembler::reduce_ins_r(Register reg1, Register reg2, Register reg3, BasicType type, int opcode) { ++ switch (type) { ++ case T_BYTE: ++ case T_SHORT: ++ case T_INT: ++ switch (opcode) { ++ case Op_AddReductionVI: add_w(reg1, reg2, reg3); break; ++ case Op_MulReductionVI: mul_w(reg1, reg2, reg3); break; ++ case Op_AndReductionV: andr(reg1, reg2, reg3); break; ++ case Op_OrReductionV: orr(reg1, reg2, reg3); break; ++ case Op_XorReductionV: xorr(reg1, reg2, reg3); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ break; ++ case T_LONG: ++ switch (opcode) { ++ case Op_AddReductionVL: add_d(reg1, reg2, reg3); break; ++ case Op_MulReductionVL: mul_d(reg1, reg2, reg3); break; ++ case Op_AndReductionV: andr(reg1, reg2, reg3); break; ++ case Op_OrReductionV: orr(reg1, reg2, reg3); break; ++ case Op_XorReductionV: xorr(reg1, reg2, reg3); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void C2_MacroAssembler::reduce_ins_f(FloatRegister reg1, FloatRegister reg2, FloatRegister reg3, BasicType type, int opcode) { ++ switch (type) { ++ case T_FLOAT: ++ switch (opcode) { ++ case Op_AddReductionVF: fadd_s(reg1, reg2, reg3); break; ++ case Op_MulReductionVF: fmul_s(reg1, reg2, reg3); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ break; ++ case T_DOUBLE: ++ switch (opcode) { ++ case Op_AddReductionVD: fadd_d(reg1, reg2, reg3); break; ++ case Op_MulReductionVD: fmul_d(reg1, reg2, reg3); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++} ++ ++void C2_MacroAssembler::reduce(Register dst, Register src, FloatRegister vsrc, FloatRegister tmp1, FloatRegister tmp2, BasicType type, int opcode, int vector_size) { ++ if (vector_size == 32) { ++ xvpermi_d(tmp1, vsrc, 0b00001110); ++ reduce_ins_v(tmp1, vsrc, tmp1, type, opcode); ++ vpermi_w(tmp2, tmp1, 0b00001110); ++ reduce_ins_v(tmp1, tmp2, tmp1, type, opcode); ++ } else if (vector_size == 16) { ++ vpermi_w(tmp1, vsrc, 0b00001110); ++ reduce_ins_v(tmp1, vsrc, tmp1, type, opcode); ++ } else if (vector_size == 8) { ++ vshuf4i_w(tmp1, vsrc, 0b00000001); ++ reduce_ins_v(tmp1, vsrc, tmp1, type, opcode); ++ } else if (vector_size == 4) { ++ vshuf4i_h(tmp1, vsrc, 0b00000001); ++ reduce_ins_v(tmp1, vsrc, tmp1, type, opcode); ++ } else { ++ ShouldNotReachHere(); ++ } ++ ++ if (type != T_LONG) { ++ if (vector_size > 8) { ++ vshuf4i_w(tmp2, tmp1, 0b00000001); ++ reduce_ins_v(tmp1, tmp2, tmp1, type, opcode); ++ } ++ if (type != T_INT) { ++ if (vector_size > 4) { ++ vshuf4i_h(tmp2, tmp1, 0b00000001); ++ reduce_ins_v(tmp1, tmp2, tmp1, type, opcode); ++ } ++ if (type != T_SHORT) { ++ vshuf4i_b(tmp2, tmp1, 0b00000001); ++ reduce_ins_v(tmp1, tmp2, tmp1, type, opcode); ++ } ++ } ++ } ++ ++ switch (type) { ++ case T_BYTE: vpickve2gr_b(dst, tmp1, 0); break; ++ case T_SHORT: vpickve2gr_h(dst, tmp1, 0); break; ++ case T_INT: vpickve2gr_w(dst, tmp1, 0); break; ++ case T_LONG: vpickve2gr_d(dst, tmp1, 0); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ if (opcode == Op_MaxReductionV) { ++ slt(AT, dst, src); ++ masknez(dst, dst, AT); ++ maskeqz(AT, src, 
AT); ++ orr(dst, dst, AT); ++ } else if (opcode == Op_MinReductionV) { ++ slt(AT, src, dst); ++ masknez(dst, dst, AT); ++ maskeqz(AT, src, AT); ++ orr(dst, dst, AT); ++ } else { ++ reduce_ins_r(dst, dst, src, type, opcode); ++ } ++ switch (type) { ++ case T_BYTE: ext_w_b(dst, dst); break; ++ case T_SHORT: ext_w_h(dst, dst); break; ++ default: ++ break; ++ } ++} ++ ++void C2_MacroAssembler::reduce(FloatRegister dst, FloatRegister src, FloatRegister vsrc, FloatRegister tmp, BasicType type, int opcode, int vector_size) { ++ if (vector_size == 32) { ++ switch (type) { ++ case T_FLOAT: ++ reduce_ins_f(dst, vsrc, src, type, opcode); ++ xvpickve_w(tmp, vsrc, 1); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ xvpickve_w(tmp, vsrc, 2); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ xvpickve_w(tmp, vsrc, 3); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ xvpickve_w(tmp, vsrc, 4); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ xvpickve_w(tmp, vsrc, 5); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ xvpickve_w(tmp, vsrc, 6); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ xvpickve_w(tmp, vsrc, 7); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ break; ++ case T_DOUBLE: ++ reduce_ins_f(dst, vsrc, src, type, opcode); ++ xvpickve_d(tmp, vsrc, 1); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ xvpickve_d(tmp, vsrc, 2); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ xvpickve_d(tmp, vsrc, 3); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (vector_size == 16) { ++ switch (type) { ++ case T_FLOAT: ++ reduce_ins_f(dst, vsrc, src, type, opcode); ++ vpermi_w(tmp, vsrc, 0b00000001); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ vpermi_w(tmp, vsrc, 0b00000010); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ vpermi_w(tmp, vsrc, 0b00000011); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ break; ++ case T_DOUBLE: ++ reduce_ins_f(dst, vsrc, src, type, opcode); ++ vpermi_w(tmp, vsrc, 0b00001110); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (vector_size == 8) { ++ assert(type == T_FLOAT, "must be"); ++ vpermi_w(tmp, vsrc, 0b00000001); ++ reduce_ins_f(dst, vsrc, src, type, opcode); ++ reduce_ins_f(dst, tmp, dst, type, opcode); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++void C2_MacroAssembler::vector_compare(FloatRegister dst, FloatRegister src1, FloatRegister src2, BasicType bt, int cond, int vector_size) { ++ if (vector_size == 32) { ++ if (bt == T_BYTE) { ++ switch (cond) { ++ case BoolTest::ne: xvseq_b (dst, src1, src2); xvxori_b(dst, dst, 0xff); break; ++ case BoolTest::eq: xvseq_b (dst, src1, src2); break; ++ case BoolTest::ge: xvsle_b (dst, src2, src1); break; ++ case BoolTest::gt: xvslt_b (dst, src2, src1); break; ++ case BoolTest::le: xvsle_b (dst, src1, src2); break; ++ case BoolTest::lt: xvslt_b (dst, src1, src2); break; ++ case BoolTest::uge: xvsle_bu(dst, src2, src1); break; ++ case BoolTest::ugt: xvslt_bu(dst, src2, src1); break; ++ case BoolTest::ule: xvsle_bu(dst, src1, src2); break; ++ case BoolTest::ult: xvslt_bu(dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (bt == T_SHORT) { ++ switch (cond) { ++ case BoolTest::ne: xvseq_h (dst, src1, src2); xvxori_b(dst, dst, 0xff); break; ++ case BoolTest::eq: xvseq_h (dst, src1, src2); break; ++ case BoolTest::ge: xvsle_h (dst, src2, src1); break; ++ case BoolTest::gt: xvslt_h (dst, src2, src1); break; ++ case BoolTest::le: xvsle_h (dst, src1, src2); break; ++ case 
BoolTest::lt: xvslt_h (dst, src1, src2); break; ++ case BoolTest::uge: xvsle_hu(dst, src2, src1); break; ++ case BoolTest::ugt: xvslt_hu(dst, src2, src1); break; ++ case BoolTest::ule: xvsle_hu(dst, src1, src2); break; ++ case BoolTest::ult: xvslt_hu(dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (bt == T_INT) { ++ switch (cond) { ++ case BoolTest::ne: xvseq_w (dst, src1, src2); xvxori_b(dst, dst, 0xff); break; ++ case BoolTest::eq: xvseq_w (dst, src1, src2); break; ++ case BoolTest::ge: xvsle_w (dst, src2, src1); break; ++ case BoolTest::gt: xvslt_w (dst, src2, src1); break; ++ case BoolTest::le: xvsle_w (dst, src1, src2); break; ++ case BoolTest::lt: xvslt_w (dst, src1, src2); break; ++ case BoolTest::uge: xvsle_wu(dst, src2, src1); break; ++ case BoolTest::ugt: xvslt_wu(dst, src2, src1); break; ++ case BoolTest::ule: xvsle_wu(dst, src1, src2); break; ++ case BoolTest::ult: xvslt_wu(dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (bt == T_LONG) { ++ switch (cond) { ++ case BoolTest::ne: xvseq_d (dst, src1, src2); xvxori_b(dst, dst, 0xff); break; ++ case BoolTest::eq: xvseq_d (dst, src1, src2); break; ++ case BoolTest::ge: xvsle_d (dst, src2, src1); break; ++ case BoolTest::gt: xvslt_d (dst, src2, src1); break; ++ case BoolTest::le: xvsle_d (dst, src1, src2); break; ++ case BoolTest::lt: xvslt_d (dst, src1, src2); break; ++ case BoolTest::uge: xvsle_du(dst, src2, src1); break; ++ case BoolTest::ugt: xvslt_du(dst, src2, src1); break; ++ case BoolTest::ule: xvsle_du(dst, src1, src2); break; ++ case BoolTest::ult: xvslt_du(dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (bt == T_FLOAT) { ++ switch (cond) { ++ case BoolTest::ne: xvfcmp_cune_s(dst, src1, src2); break; ++ case BoolTest::eq: xvfcmp_ceq_s (dst, src1, src2); break; ++ case BoolTest::ge: xvfcmp_cle_s (dst, src2, src1); break; ++ case BoolTest::gt: xvfcmp_clt_s (dst, src2, src1); break; ++ case BoolTest::le: xvfcmp_cle_s (dst, src1, src2); break; ++ case BoolTest::lt: xvfcmp_clt_s (dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (bt == T_DOUBLE) { ++ switch (cond) { ++ case BoolTest::ne: xvfcmp_cune_d(dst, src1, src2); break; ++ case BoolTest::eq: xvfcmp_ceq_d (dst, src1, src2); break; ++ case BoolTest::ge: xvfcmp_cle_d (dst, src2, src1); break; ++ case BoolTest::gt: xvfcmp_clt_d (dst, src2, src1); break; ++ case BoolTest::le: xvfcmp_cle_d (dst, src1, src2); break; ++ case BoolTest::lt: xvfcmp_clt_d (dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ } else if (vector_size == 16 || vector_size == 8 || vector_size == 4) { ++ if (bt == T_BYTE) { ++ switch (cond) { ++ case BoolTest::ne: vseq_b (dst, src1, src2); vxori_b(dst, dst, 0xff); break; ++ case BoolTest::eq: vseq_b (dst, src1, src2); break; ++ case BoolTest::ge: vsle_b (dst, src2, src1); break; ++ case BoolTest::gt: vslt_b (dst, src2, src1); break; ++ case BoolTest::le: vsle_b (dst, src1, src2); break; ++ case BoolTest::lt: vslt_b (dst, src1, src2); break; ++ case BoolTest::uge: vsle_bu(dst, src2, src1); break; ++ case BoolTest::ugt: vslt_bu(dst, src2, src1); break; ++ case BoolTest::ule: vsle_bu(dst, src1, src2); break; ++ case BoolTest::ult: vslt_bu(dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (bt == T_SHORT) { ++ switch (cond) { ++ case BoolTest::ne: vseq_h (dst, src1, src2); vxori_b(dst, dst, 0xff); break; ++ case BoolTest::eq: vseq_h (dst, src1, src2); 
break; ++ case BoolTest::ge: vsle_h (dst, src2, src1); break; ++ case BoolTest::gt: vslt_h (dst, src2, src1); break; ++ case BoolTest::le: vsle_h (dst, src1, src2); break; ++ case BoolTest::lt: vslt_h (dst, src1, src2); break; ++ case BoolTest::uge: vsle_hu(dst, src2, src1); break; ++ case BoolTest::ugt: vslt_hu(dst, src2, src1); break; ++ case BoolTest::ule: vsle_hu(dst, src1, src2); break; ++ case BoolTest::ult: vslt_hu(dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (bt == T_INT) { ++ switch (cond) { ++ case BoolTest::ne: vseq_w (dst, src1, src2); vxori_b(dst, dst, 0xff); break; ++ case BoolTest::eq: vseq_w (dst, src1, src2); break; ++ case BoolTest::ge: vsle_w (dst, src2, src1); break; ++ case BoolTest::gt: vslt_w (dst, src2, src1); break; ++ case BoolTest::le: vsle_w (dst, src1, src2); break; ++ case BoolTest::lt: vslt_w (dst, src1, src2); break; ++ case BoolTest::uge: vsle_wu(dst, src2, src1); break; ++ case BoolTest::ugt: vslt_wu(dst, src2, src1); break; ++ case BoolTest::ule: vsle_wu(dst, src1, src2); break; ++ case BoolTest::ult: vslt_wu(dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (bt == T_LONG) { ++ switch (cond) { ++ case BoolTest::ne: vseq_d (dst, src1, src2); vxori_b(dst, dst, 0xff); break; ++ case BoolTest::eq: vseq_d (dst, src1, src2); break; ++ case BoolTest::ge: vsle_d (dst, src2, src1); break; ++ case BoolTest::gt: vslt_d (dst, src2, src1); break; ++ case BoolTest::le: vsle_d (dst, src1, src2); break; ++ case BoolTest::lt: vslt_d (dst, src1, src2); break; ++ case BoolTest::uge: vsle_du(dst, src2, src1); break; ++ case BoolTest::ugt: vslt_du(dst, src2, src1); break; ++ case BoolTest::ule: vsle_du(dst, src1, src2); break; ++ case BoolTest::ult: vslt_du(dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (bt == T_FLOAT) { ++ switch (cond) { ++ case BoolTest::ne: vfcmp_cune_s(dst, src1, src2); break; ++ case BoolTest::eq: vfcmp_ceq_s (dst, src1, src2); break; ++ case BoolTest::ge: vfcmp_cle_s (dst, src2, src1); break; ++ case BoolTest::gt: vfcmp_clt_s (dst, src2, src1); break; ++ case BoolTest::le: vfcmp_cle_s (dst, src1, src2); break; ++ case BoolTest::lt: vfcmp_clt_s (dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (bt == T_DOUBLE) { ++ switch (cond) { ++ case BoolTest::ne: vfcmp_cune_d(dst, src1, src2); break; ++ case BoolTest::eq: vfcmp_ceq_d (dst, src1, src2); break; ++ case BoolTest::ge: vfcmp_cle_d (dst, src2, src1); break; ++ case BoolTest::gt: vfcmp_clt_d (dst, src2, src1); break; ++ case BoolTest::le: vfcmp_cle_d (dst, src1, src2); break; ++ case BoolTest::lt: vfcmp_clt_d (dst, src1, src2); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++void C2_MacroAssembler::cmp_branch_short(int flag, Register op1, Register op2, Label& L, bool is_signed) { ++ ++ switch(flag) { ++ case 0x01: //equal ++ beq(op1, op2, L); ++ break; ++ case 0x02: //not_equal ++ bne(op1, op2, L); ++ break; ++ case 0x03: //above ++ if (is_signed) ++ blt(op2, op1, L); ++ else ++ bltu(op2, op1, L); ++ break; ++ case 0x04: //above_equal ++ if (is_signed) ++ bge(op1, op2, L); ++ else ++ bgeu(op1, op2, L); ++ break; ++ case 0x05: //below ++ if (is_signed) ++ blt(op1, op2, L); ++ else ++ bltu(op1, op2, L); ++ break; ++ case 0x06: //below_equal ++ if (is_signed) ++ bge(op2, op1, L); ++ else ++ bgeu(op2, op1, L); ++ break; ++ default: ++ Unimplemented(); ++ } ++} ++ ++void 
C2_MacroAssembler::cmp_branch_long(int flag, Register op1, Register op2, Label* L, bool is_signed) { ++ Label not_taken; ++ ++ switch(flag) { ++ case 0x01: //equal ++ bne(op1, op2, not_taken); ++ break; ++ case 0x02: //not_equal ++ beq(op1, op2, not_taken); ++ break; ++ case 0x03: //above ++ if (is_signed) ++ bge(op2, op1, not_taken); ++ else ++ bgeu(op2, op1, not_taken); ++ break; ++ case 0x04: //above_equal ++ if (is_signed) ++ blt(op1, op2, not_taken); ++ else ++ bltu(op1, op2, not_taken); ++ break; ++ case 0x05: //below ++ if (is_signed) ++ bge(op1, op2, not_taken); ++ else ++ bgeu(op1, op2, not_taken); ++ break; ++ case 0x06: //below_equal ++ if (is_signed) ++ blt(op2, op1, not_taken); ++ else ++ bltu(op2, op1, not_taken); ++ break; ++ default: ++ Unimplemented(); ++ } ++ ++ jmp_far(*L); ++ bind(not_taken); ++} ++ ++void C2_MacroAssembler::cmp_branchEqNe_off21(int flag, Register op1, Label& L) { ++ switch(flag) { ++ case 0x01: //equal ++ beqz(op1, L); ++ break; ++ case 0x02: //not_equal ++ bnez(op1, L); ++ break; ++ default: ++ Unimplemented(); ++ } ++} ++ ++bool C2_MacroAssembler::in_scratch_emit_size() { ++ if (ciEnv::current()->task() != nullptr) { ++ PhaseOutput* phase_output = Compile::current()->output(); ++ if (phase_output != nullptr && phase_output->in_scratch_emit_size()) { ++ return true; ++ } ++ } ++ return MacroAssembler::in_scratch_emit_size(); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/c2_MacroAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/c2_MacroAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/c2_MacroAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/c2_MacroAssembler_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,141 @@ ++/* ++ * Copyright (c) 2020, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_C2_MACROASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_C2_MACROASSEMBLER_LOONGARCH_HPP ++ ++// C2_MacroAssembler contains high-level macros for C2 ++ ++public: ++ void cmp_branch_short(int flag, Register op1, Register op2, Label& L, bool is_signed); ++ void cmp_branch_long(int flag, Register op1, Register op2, Label* L, bool is_signed); ++ void cmp_branchEqNe_off21(int flag, Register op1, Label& L); ++ ++ void fast_lock_c2(Register oop, Register box, Register flag, ++ Register disp_hdr, Register tmp); ++ void fast_unlock_c2(Register oop, Register box, Register flag, ++ Register disp_hdr, Register tmp); ++ ++ // Compare strings. ++ void string_compare(Register str1, Register str2, ++ Register cnt1, Register cnt2, Register result, ++ int ae, Register tmp1, Register tmp2, ++ FloatRegister vtmp1, FloatRegister vtmp2); ++ ++ // Find index of char in Latin-1 string ++ void stringL_indexof_char(Register str1, Register cnt1, ++ Register ch, Register result, ++ Register tmp1, Register tmp2, ++ Register tmp3); ++ ++ // Find index of char in UTF-16 string ++ void string_indexof_char(Register str1, Register cnt1, ++ Register ch, Register result, ++ Register tmp1, Register tmp2, ++ Register tmp3); ++ ++ void string_indexof(Register haystack, Register needle, ++ Register haystack_len, Register needle_len, ++ Register result, int ae); ++ ++ void string_indexof_linearscan(Register haystack, Register needle, ++ Register haystack_len, Register needle_len, ++ int needle_con_cnt, Register result, int ae); ++ ++ // Compare char[] or byte[] arrays. ++ void arrays_equals(Register str1, Register str2, ++ Register cnt, Register tmp1, Register tmp2, Register result, ++ bool is_char, bool is_array); ++ ++ // Memory Data Type ++ #define INT_TYPE 0x100 ++ #define FLOAT_TYPE 0x200 ++ #define SIGNED_TYPE 0x10 ++ #define UNSIGNED_TYPE 0x20 ++ ++ typedef enum { ++ LOAD_BYTE = INT_TYPE | SIGNED_TYPE | 0x1, ++ LOAD_CHAR = INT_TYPE | SIGNED_TYPE | 0x2, ++ LOAD_SHORT = INT_TYPE | SIGNED_TYPE | 0x3, ++ LOAD_INT = INT_TYPE | SIGNED_TYPE | 0x4, ++ LOAD_LONG = INT_TYPE | SIGNED_TYPE | 0x5, ++ STORE_BYTE = INT_TYPE | SIGNED_TYPE | 0x6, ++ STORE_CHAR = INT_TYPE | SIGNED_TYPE | 0x7, ++ STORE_SHORT = INT_TYPE | SIGNED_TYPE | 0x8, ++ STORE_INT = INT_TYPE | SIGNED_TYPE | 0x9, ++ STORE_LONG = INT_TYPE | SIGNED_TYPE | 0xa, ++ LOAD_LINKED_LONG = INT_TYPE | SIGNED_TYPE | 0xb, ++ ++ LOAD_U_BYTE = INT_TYPE | UNSIGNED_TYPE | 0x1, ++ LOAD_U_SHORT = INT_TYPE | UNSIGNED_TYPE | 0x2, ++ LOAD_U_INT = INT_TYPE | UNSIGNED_TYPE | 0x3, ++ ++ LOAD_FLOAT = FLOAT_TYPE | SIGNED_TYPE | 0x1, ++ LOAD_DOUBLE = FLOAT_TYPE | SIGNED_TYPE | 0x2, ++ LOAD_VECTORX = FLOAT_TYPE | SIGNED_TYPE | 0x3, ++ LOAD_VECTORY = FLOAT_TYPE | SIGNED_TYPE | 0x4, ++ STORE_FLOAT = FLOAT_TYPE | SIGNED_TYPE | 0x5, ++ STORE_DOUBLE = FLOAT_TYPE | SIGNED_TYPE | 0x6, ++ STORE_VECTORX = FLOAT_TYPE | SIGNED_TYPE | 0x7, ++ STORE_VECTORY = FLOAT_TYPE | SIGNED_TYPE | 0x8 ++ } CMLoadStoreDataType; ++ ++ void loadstore_enc(Register reg, int base, int index, int scale, int disp, int type) { ++ assert((type & INT_TYPE), "must be General reg type"); ++ loadstore_t(reg, base, index, scale, disp, type); ++ } ++ ++ void loadstore_enc(FloatRegister reg, int base, int index, int scale, int disp, int type) { ++ assert((type & FLOAT_TYPE), "must be Float reg type"); ++ loadstore_t(reg, base, index, scale, disp, type); ++ } ++ ++ void reduce(Register dst, Register src, FloatRegister vsrc, FloatRegister tmp1, FloatRegister tmp2, BasicType type, int opcode, int 
vector_size); ++ void reduce(FloatRegister dst, FloatRegister src, FloatRegister vsrc, FloatRegister tmp, BasicType type, int opcode, int vector_size); ++ ++ void vector_compare(FloatRegister dst, FloatRegister src1, FloatRegister src2, BasicType type, int cond, int vector_size); ++ ++private: ++ // Return true if the phase output is in the scratch emit size mode. ++ virtual bool in_scratch_emit_size() override; ++ ++ template <typename T> ++ void loadstore_t(T reg, int base, int index, int scale, int disp, int type) { ++ if (index != -1) { ++ assert(((scale==0)&&(disp==0)), "only support base+index"); ++ loadstore(reg, as_Register(base), as_Register(index), type); ++ } else { ++ loadstore(reg, as_Register(base), disp, type); ++ } ++ } ++ void loadstore(Register reg, Register base, int disp, int type); ++ void loadstore(Register reg, Register base, Register disp, int type); ++ void loadstore(FloatRegister reg, Register base, int disp, int type); ++ void loadstore(FloatRegister reg, Register base, Register disp, int type); ++ ++ void reduce_ins_v(FloatRegister vec1, FloatRegister vec2, FloatRegister vec3, BasicType type, int opcode); ++ void reduce_ins_r(Register reg1, Register reg2, Register reg3, BasicType type, int opcode); ++ void reduce_ins_f(FloatRegister reg1, FloatRegister reg2, FloatRegister reg3, BasicType type, int opcode); ++#endif // CPU_LOONGARCH_C2_MACROASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/codeBuffer_loongarch.cpp b/src/hotspot/cpu/loongarch/codeBuffer_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/codeBuffer_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/codeBuffer_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,32 @@ ++/* ++ * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions.
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/codeBuffer.inline.hpp" ++#include "asm/macroAssembler.hpp" ++ ++bool CodeBuffer::pd_finalize_stubs() { ++ return emit_shared_stubs_to_interp(this, _shared_stub_to_interp_requests); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/codeBuffer_loongarch.hpp b/src/hotspot/cpu/loongarch/codeBuffer_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/codeBuffer_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/codeBuffer_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,37 @@ ++/* ++ * Copyright (c) 2002, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_CODEBUFFER_LOONGARCH_HPP ++#define CPU_LOONGARCH_CODEBUFFER_LOONGARCH_HPP ++ ++private: ++ void pd_initialize() {} ++ bool pd_finalize_stubs(); ++ ++public: ++ void flush_bundle(bool start_new_bundle) {} ++ static constexpr bool supports_shared_stubs() { return true; } ++ ++#endif // CPU_LOONGARCH_CODEBUFFER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/compiledIC_loongarch.cpp b/src/hotspot/cpu/loongarch/compiledIC_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/compiledIC_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/compiledIC_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,138 @@ ++/* ++ * Copyright (c) 1997, 2014, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "code/compiledIC.hpp" ++#include "code/icBuffer.hpp" ++#include "code/nmethod.hpp" ++#include "memory/resourceArea.hpp" ++#include "runtime/mutexLocker.hpp" ++#include "runtime/safepoint.hpp" ++ ++// ---------------------------------------------------------------------------- ++ ++#define __ _masm. ++address CompiledStaticCall::emit_to_interp_stub(CodeBuffer &cbuf, address mark) { ++ precond(cbuf.stubs()->start() != badAddress); ++ precond(cbuf.stubs()->end() != badAddress); ++ ++ if (mark == nullptr) { ++ mark = cbuf.insts_mark(); // get mark within main instrs section ++ } ++ ++ // Note that the code buffer's insts_mark is always relative to insts. ++ // That's why we must use the macroassembler to generate a stub. ++ MacroAssembler _masm(&cbuf); ++ ++ address base = __ start_a_stub(CompiledStaticCall::to_interp_stub_size()); ++ if (base == nullptr) return nullptr; // CodeBuffer::expand failed ++ // static stub relocation stores the instruction address of the call ++ ++ __ relocate(static_stub_Relocation::spec(mark), 0); ++ ++ { ++ __ emit_static_call_stub(); ++ } ++ ++ // Update current stubs pointer and restore code_end. ++ __ end_a_stub(); ++ return base; ++} ++#undef __ ++ ++int CompiledStaticCall::to_interp_stub_size() { ++ return NativeInstruction::nop_instruction_size + NativeMovConstReg::instruction_size + NativeGeneralJump::instruction_size; ++} ++ ++int CompiledStaticCall::to_trampoline_stub_size() { ++ return NativeInstruction::nop_instruction_size + NativeCallTrampolineStub::instruction_size; ++} ++ ++// Relocation entries for call stub, compiled java to interpreter. ++int CompiledStaticCall::reloc_to_interp_stub() { ++ return 16; ++} ++ ++void CompiledDirectStaticCall::set_to_interpreted(const methodHandle& callee, address entry) { ++ address stub = find_stub(); ++ guarantee(stub != nullptr, "stub not found"); ++ ++ if (TraceICs) { ++ ResourceMark rm; ++ tty->print_cr("CompiledDirectStaticCall@" INTPTR_FORMAT ": set_to_interpreted %s", ++ p2i(instruction_address()), ++ callee->name_and_sig_as_C_string()); ++ } ++ ++ // Creation also verifies the object. ++ NativeMovConstReg* method_holder = nativeMovConstReg_at(stub + NativeInstruction::nop_instruction_size); ++ NativeGeneralJump* jump = nativeGeneralJump_at(method_holder->next_instruction_address()); ++ verify_mt_safe(callee, entry, method_holder, jump); ++ ++ // Update stub. ++ method_holder->set_data((intptr_t)callee()); ++ jump->set_jump_destination(entry); ++ ++ // Update jump to call. ++ set_destination_mt_safe(stub); ++} ++ ++void CompiledDirectStaticCall::set_stub_to_clean(static_stub_Relocation* static_stub) { ++ assert (CompiledIC_lock->is_locked() || SafepointSynchronize::is_at_safepoint(), "mt unsafe call"); ++ // Reset stub. ++ address stub = static_stub->addr(); ++ assert(stub != nullptr, "stub not found"); ++ // Creation also verifies the object. 
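++  // Stub layout, per to_interp_stub_size() above: a leading nop, then a NativeMovConstReg that holds
++  // the Method*, then a NativeGeneralJump to the interpreter entry; skip the nop to reach the mov.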
++ NativeMovConstReg* method_holder = nativeMovConstReg_at(stub + NativeInstruction::nop_instruction_size); ++ NativeGeneralJump* jump = nativeGeneralJump_at(method_holder->next_instruction_address()); ++ method_holder->set_data(0); ++ jump->set_jump_destination(jump->instruction_address()); ++} ++ ++//----------------------------------------------------------------------------- ++// Non-product mode code ++#ifndef PRODUCT ++ ++void CompiledDirectStaticCall::verify() { ++ // Verify call. ++ _call->verify(); ++ if (os::is_MP()) { ++ _call->verify_alignment(); ++ } ++ ++ // Verify stub. ++ address stub = find_stub(); ++ assert(stub != nullptr, "no stub found for static call"); ++ // Creation also verifies the object. ++ NativeMovConstReg* method_holder = nativeMovConstReg_at(stub + NativeInstruction::nop_instruction_size); ++ NativeGeneralJump* jump = nativeGeneralJump_at(method_holder->next_instruction_address()); ++ ++ ++ // Verify state. ++ assert(is_clean() || is_call_to_compiled() || is_call_to_interpreted(), "sanity check"); ++} ++ ++#endif // !PRODUCT +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/continuationEntry_loongarch.hpp b/src/hotspot/cpu/loongarch/continuationEntry_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/continuationEntry_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/continuationEntry_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,33 @@ ++/* ++ * Copyright (c) 2022 SAP SE. All rights reserved. ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_CONTINUATIONENTRY_LOONGARCH_HPP ++#define CPU_LOONGARCH_CONTINUATIONENTRY_LOONGARCH_HPP ++ ++class ContinuationEntryPD { ++ // empty ++}; ++ ++#endif // CPU_LOONGARCH_CONTINUATIONENTRY_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/continuationEntry_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/continuationEntry_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/continuationEntry_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/continuationEntry_loongarch.inline.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,52 @@ ++/* ++ * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_CONTINUATIONENTRY_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_CONTINUATIONENTRY_LOONGARCH_INLINE_HPP ++ ++#include "runtime/continuationEntry.hpp" ++ ++#include "code/codeCache.hpp" ++#include "oops/method.inline.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/registerMap.hpp" ++ ++inline frame ContinuationEntry::to_frame() const { ++ static CodeBlob* cb = CodeCache::find_blob_fast(entry_pc()); ++ assert(cb != nullptr, ""); ++ assert(cb->as_compiled_method()->method()->is_continuation_enter_intrinsic(), ""); ++ return frame(entry_sp(), entry_sp(), entry_fp(), entry_pc(), cb); ++} ++ ++inline intptr_t* ContinuationEntry::entry_fp() const { ++ return (intptr_t*)((address)this + size()) + 2; ++} ++ ++inline void ContinuationEntry::update_register_map(RegisterMap* map) const { ++ intptr_t** fp = (intptr_t**)(bottom_sender_sp() - 2); ++ frame::update_map_with_saved_link(map, fp); ++} ++ ++#endif // CPU_LOONGARCH_CONTINUATIONENTRY_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/continuationFreezeThaw_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/continuationFreezeThaw_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/continuationFreezeThaw_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/continuationFreezeThaw_loongarch.inline.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,284 @@ ++/* ++ * Copyright (c) 2019, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_CONTINUATIONFREEZETHAW_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_CONTINUATIONFREEZETHAW_LOONGARCH_INLINE_HPP ++ ++#include "code/codeBlob.inline.hpp" ++#include "oops/stackChunkOop.inline.hpp" ++#include "runtime/frame.hpp" ++#include "runtime/frame.inline.hpp" ++ ++ ++inline void patch_callee_link(const frame& f, intptr_t* fp) { ++ DEBUG_ONLY(intptr_t* orig = *ContinuationHelper::Frame::callee_link_address(f)); ++ *ContinuationHelper::Frame::callee_link_address(f) = fp; ++} ++ ++inline void patch_callee_link_relative(const frame& f, intptr_t* fp) { ++ intptr_t* la = (intptr_t*)ContinuationHelper::Frame::callee_link_address(f); ++ intptr_t new_value = fp - la; ++ *la = new_value; ++} ++ ++////// Freeze ++ ++// Fast path ++ ++inline void FreezeBase::patch_stack_pd(intptr_t* frame_sp, intptr_t* heap_sp) { ++ // copy the spilled fp from the heap to the stack ++ *(frame_sp - 2) = *(heap_sp - 2); ++} ++ ++// Slow path ++ ++template <typename FKind> ++inline frame FreezeBase::sender(const frame& f) { ++ assert(FKind::is_instance(f), ""); ++ if (FKind::interpreted) { ++ return frame(f.sender_sp(), f.interpreter_frame_sender_sp(), f.link(), f.sender_pc()); ++ } ++ ++ intptr_t** link_addr = link_address(f); ++ intptr_t* sender_sp = (intptr_t*)(link_addr + 2); // f.unextended_sp() + (fsize/wordSize); // ++ address sender_pc = (address) *(sender_sp - 1); ++ assert(sender_sp != f.sp(), "must have changed"); ++ ++ int slot = 0; ++ CodeBlob* sender_cb = CodeCache::find_blob_and_oopmap(sender_pc, slot); ++ return sender_cb != nullptr ++ ? frame(sender_sp, sender_sp, *link_addr, sender_pc, sender_cb, ++ slot == -1 ? nullptr : sender_cb->oop_map_for_slot(slot, sender_pc), ++ false /* on_heap ? */) ++ : frame(sender_sp, sender_sp, *link_addr, sender_pc); ++} ++ ++template <typename FKind> frame FreezeBase::new_heap_frame(frame& f, frame& caller) { ++ assert(FKind::is_instance(f), ""); ++ assert(!caller.is_interpreted_frame() ++ || caller.unextended_sp() == (intptr_t*)caller.at(frame::interpreter_frame_last_sp_offset), ""); ++ ++ intptr_t *sp, *fp; // sp is really our unextended_sp ++ if (FKind::interpreted) { ++ assert((intptr_t*)f.at(frame::interpreter_frame_last_sp_offset) == nullptr ++ || f.unextended_sp() == (intptr_t*)f.at(frame::interpreter_frame_last_sp_offset), ""); ++ intptr_t locals_offset = *f.addr_at(frame::interpreter_frame_locals_offset); ++ // If the caller.is_empty(), i.e. we're freezing into an empty chunk, then we set ++ // the chunk's argsize in finalize_freeze and make room for it above the unextended_sp ++ bool overlap_caller = caller.is_interpreted_frame() || caller.is_empty(); ++ fp = caller.unextended_sp() - 1 - locals_offset + (overlap_caller ? ContinuationHelper::InterpretedFrame::stack_argsize(f) : 0); ++ sp = fp - (f.fp() - f.unextended_sp()); ++ assert(sp <= fp, ""); ++ assert(fp <= caller.unextended_sp(), ""); ++ caller.set_sp(fp + frame::sender_sp_offset); ++ ++ assert(_cont.tail()->is_in_chunk(sp), ""); ++ ++ frame hf(sp, sp, fp, f.pc(), nullptr, nullptr, true /* on_heap */); ++ *hf.addr_at(frame::interpreter_frame_locals_offset) = locals_offset; ++ return hf; ++ } else { ++ // We need to re-read fp out of the frame because it may be an oop and we might have ++ // had a safepoint in finalize_freeze, after constructing f.
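++    // The fp spill slot sits at sp - 2 with the return pc at sp - 1 on this port
++    // (cf. patch_stack_pd() above and set_top_frame_metadata_pd() below).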
++ fp = *(intptr_t**)(f.sp() - 2); ++ ++ int fsize = FKind::size(f); ++ sp = caller.unextended_sp() - fsize; ++ if (caller.is_interpreted_frame()) { ++ // If the caller is interpreted, our stackargs are not supposed to overlap with it ++ // so we make more room by moving sp down by argsize ++ int argsize = FKind::stack_argsize(f); ++ sp -= argsize; ++ } ++ caller.set_sp(sp + fsize); ++ ++ assert(_cont.tail()->is_in_chunk(sp), ""); ++ ++ return frame(sp, sp, fp, f.pc(), nullptr, nullptr, true /* on_heap */); ++ } ++} ++ ++void FreezeBase::adjust_interpreted_frame_unextended_sp(frame& f) { ++ assert((f.at(frame::interpreter_frame_last_sp_offset) != 0) || (f.unextended_sp() == f.sp()), ""); ++ intptr_t* real_unextended_sp = (intptr_t*)f.at(frame::interpreter_frame_last_sp_offset); ++ if (real_unextended_sp != nullptr) { ++ f.set_unextended_sp(real_unextended_sp); // can be null at a safepoint ++ } ++} ++ ++static inline void relativize_one(intptr_t* const vfp, intptr_t* const hfp, int offset) { ++ assert(*(hfp + offset) == *(vfp + offset), ""); ++ intptr_t* addr = hfp + offset; ++ intptr_t value = *(intptr_t**)addr - vfp; ++ *addr = value; ++} ++ ++inline void FreezeBase::relativize_interpreted_frame_metadata(const frame& f, const frame& hf) { ++ intptr_t* vfp = f.fp(); ++ intptr_t* hfp = hf.fp(); ++ assert(hfp == hf.unextended_sp() + (f.fp() - f.unextended_sp()), ""); ++ assert((f.at(frame::interpreter_frame_last_sp_offset) != 0) ++ || (f.unextended_sp() == f.sp()), ""); ++ assert(f.fp() > (intptr_t*)f.at(frame::interpreter_frame_initial_sp_offset), ""); ++ ++ // at(frame::interpreter_frame_last_sp_offset) can be null at safepoint preempts ++ *hf.addr_at(frame::interpreter_frame_last_sp_offset) = hf.unextended_sp() - hf.fp(); ++ ++ relativize_one(vfp, hfp, frame::interpreter_frame_initial_sp_offset); // == block_top == block_bottom ++ ++ assert((hf.fp() - hf.unextended_sp()) == (f.fp() - f.unextended_sp()), ""); ++ assert(hf.unextended_sp() == (intptr_t*)hf.at(frame::interpreter_frame_last_sp_offset), ""); ++ assert(hf.unextended_sp() <= (intptr_t*)hf.at(frame::interpreter_frame_initial_sp_offset), ""); ++ assert(hf.fp() > (intptr_t*)hf.at(frame::interpreter_frame_initial_sp_offset), ""); ++ // assert(hf.fp() <= (intptr_t*)hf.at(frame::interpreter_frame_locals_offset), ""); ++} ++ ++inline void FreezeBase::set_top_frame_metadata_pd(const frame& hf) { ++ stackChunkOop chunk = _cont.tail(); ++ assert(chunk->is_in_chunk(hf.sp() - 1), ""); ++ assert(chunk->is_in_chunk(hf.sp() - 2), ""); ++ ++ *(hf.sp() - 1) = (intptr_t)hf.pc(); ++ ++ intptr_t* fp_addr = hf.sp() - 2; ++ *fp_addr = hf.is_interpreted_frame() ? (intptr_t)(hf.fp() - fp_addr) ++ : (intptr_t)hf.fp(); ++} ++ ++inline void FreezeBase::patch_pd(frame& hf, const frame& caller) { ++ if (caller.is_interpreted_frame()) { ++ assert(!caller.is_empty(), ""); ++ patch_callee_link_relative(caller, caller.fp()); ++ } else { ++ // If we're the bottom-most frame frozen in this freeze, the caller might have stayed frozen in the chunk, ++ // and its oop-containing fp fixed. We've now just overwritten it, so we must patch it back to its value ++ // as read from the chunk. 
++ patch_callee_link(caller, caller.fp()); ++ } ++} ++ ++//////// Thaw ++ ++// Fast path ++ ++inline void ThawBase::prefetch_chunk_pd(void* start, int size) { ++ size <<= LogBytesPerWord; ++ Prefetch::read(start, size); ++ Prefetch::read(start, size - 64); ++} ++ ++template ++inline void Thaw::patch_caller_links(intptr_t* sp, intptr_t* bottom) { ++ // Fast path depends on !PreserveFramePointer. See can_thaw_fast(). ++ assert(!PreserveFramePointer, "Frame pointers need to be fixed"); ++} ++ ++// Slow path ++ ++inline frame ThawBase::new_entry_frame() { ++ intptr_t* sp = _cont.entrySP(); ++ return frame(sp, sp, _cont.entryFP(), _cont.entryPC()); // TODO PERF: This finds code blob and computes deopt state ++} ++ ++template frame ThawBase::new_stack_frame(const frame& hf, frame& caller, bool bottom) { ++ assert(FKind::is_instance(hf), ""); ++ // The values in the returned frame object will be written into the callee's stack in patch. ++ ++ if (FKind::interpreted) { ++ intptr_t* heap_sp = hf.unextended_sp(); ++ // If caller is interpreted it already made room for the callee arguments ++ int overlap = caller.is_interpreted_frame() ? ContinuationHelper::InterpretedFrame::stack_argsize(hf) : 0; ++ const int fsize = ContinuationHelper::InterpretedFrame::frame_bottom(hf) - hf.unextended_sp() - overlap; ++ const int locals = hf.interpreter_frame_method()->max_locals(); ++ intptr_t* frame_sp = caller.unextended_sp() - fsize; ++ intptr_t* fp = frame_sp + (hf.fp() - heap_sp); ++ DEBUG_ONLY(intptr_t* unextended_sp = fp + *hf.addr_at(frame::interpreter_frame_last_sp_offset);) ++ assert(frame_sp == unextended_sp, ""); ++ caller.set_sp(fp + frame::sender_sp_offset); ++ frame f(frame_sp, frame_sp, fp, hf.pc()); ++ // we need to set the locals so that the caller of new_stack_frame() can call ++ // ContinuationHelper::InterpretedFrame::frame_bottom ++ // copy relativized locals from the heap frame ++ *f.addr_at(frame::interpreter_frame_locals_offset) = *hf.addr_at(frame::interpreter_frame_locals_offset); ++ return f; ++ } else { ++ int fsize = FKind::size(hf); ++ intptr_t* frame_sp = caller.unextended_sp() - fsize; ++ if (bottom || caller.is_interpreted_frame()) { ++ int argsize = hf.compiled_frame_stack_argsize(); ++ ++ fsize += argsize; ++ frame_sp -= argsize; ++ caller.set_sp(caller.sp() - argsize); ++ assert(caller.sp() == frame_sp + (fsize-argsize), ""); ++ ++ frame_sp = align(hf, frame_sp, caller, bottom); ++ } ++ ++ assert(hf.cb() != nullptr, ""); ++ assert(hf.oop_map() != nullptr, ""); ++ intptr_t* fp; ++ if (PreserveFramePointer) { ++ // we need to recreate a "real" frame pointer, pointing into the stack ++ fp = frame_sp + FKind::size(hf) - 2; ++ } else { ++ fp = FKind::stub ++ ? frame_sp + fsize - 2 // this value is used for the safepoint stub ++ : *(intptr_t**)(hf.sp() - 2); // we need to re-read fp because it may be an oop and we might have fixed the frame. ++ } ++ return frame(frame_sp, frame_sp, fp, hf.pc(), hf.cb(), hf.oop_map(), false); // TODO PERF : this computes deopt state; is it necessary? 
++ } ++} ++ ++inline intptr_t* ThawBase::align(const frame& hf, intptr_t* frame_sp, frame& caller, bool bottom) { ++#ifdef _LP64 ++ if (((intptr_t)frame_sp % frame::frame_alignment) != 0) { ++ // assert(caller.is_interpreted_frame() || (bottom && hf.compiled_frame_stack_argsize() % 2 != 0), ""); ++ frame_sp--; ++ caller.set_sp(caller.sp() - 1); ++ } ++ assert(is_aligned(frame_sp, frame::frame_alignment), ""); ++#endif ++ ++ return frame_sp; ++} ++ ++inline void ThawBase::patch_pd(frame& f, const frame& caller) { ++ patch_callee_link(caller, caller.fp()); ++} ++ ++static inline void derelativize_one(intptr_t* const fp, int offset) { ++ intptr_t* addr = fp + offset; ++ *addr = (intptr_t)(fp + *addr); ++} ++ ++inline void ThawBase::derelativize_interpreted_frame_metadata(const frame& hf, const frame& f) { ++ intptr_t* vfp = f.fp(); ++ ++ derelativize_one(vfp, frame::interpreter_frame_last_sp_offset); ++ derelativize_one(vfp, frame::interpreter_frame_initial_sp_offset); ++} ++ ++#endif // CPU_LOONGARCH_CONTINUATIONFREEZETHAW_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/continuationHelper_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/continuationHelper_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/continuationHelper_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/continuationHelper_loongarch.inline.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,145 @@ ++/* ++ * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_CONTINUATIONHELPER_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_CONTINUATIONHELPER_LOONGARCH_INLINE_HPP ++ ++#include "runtime/continuationHelper.hpp" ++ ++#include "runtime/continuationEntry.inline.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/registerMap.hpp" ++#include "utilities/macros.hpp" ++ ++template ++static inline intptr_t** link_address(const frame& f) { ++ assert(FKind::is_instance(f), ""); ++ return FKind::interpreted ++ ? 
(intptr_t**)(f.fp() + frame::link_offset) ++ : (intptr_t**)(f.unextended_sp() + f.cb()->frame_size() - 2); ++} ++ ++inline int ContinuationHelper::frame_align_words(int size) { ++ if (frame::frame_alignment != 1) { ++ return size & 1; ++ } else { ++ return 0; ++ } ++} ++ ++inline intptr_t* ContinuationHelper::frame_align_pointer(intptr_t* sp) { ++#ifdef _LP64 ++ sp = align_down(sp, frame::frame_alignment); ++#endif ++ return sp; ++} ++ ++template ++inline void ContinuationHelper::update_register_map(const frame& f, RegisterMap* map) { ++ frame::update_map_with_saved_link(map, link_address(f)); ++} ++ ++inline void ContinuationHelper::update_register_map_with_callee(const frame& f, RegisterMap* map) { ++ frame::update_map_with_saved_link(map, ContinuationHelper::Frame::callee_link_address(f)); ++} ++ ++inline void ContinuationHelper::push_pd(const frame& f) { ++ *(intptr_t**)(f.sp() - 2) = f.fp(); ++} ++ ++inline void ContinuationHelper::set_anchor_to_entry_pd(JavaFrameAnchor* anchor, ContinuationEntry* entry) { ++ anchor->set_last_Java_fp(entry->entry_fp()); ++} ++ ++#ifdef ASSERT ++inline void ContinuationHelper::set_anchor_pd(JavaFrameAnchor* anchor, intptr_t* sp) { ++ intptr_t* fp = *(intptr_t**)(sp - 2); ++ anchor->set_last_Java_fp(fp); ++} ++ ++inline bool ContinuationHelper::Frame::assert_frame_laid_out(frame f) { ++ intptr_t* sp = f.sp(); ++ address pc = *(address*)(sp - frame::sender_sp_ret_address_offset()); ++ intptr_t* fp = *(intptr_t**)(sp - 2); ++ assert(f.raw_pc() == pc, "f.ra_pc: " INTPTR_FORMAT " actual: " INTPTR_FORMAT, p2i(f.raw_pc()), p2i(pc)); ++ assert(f.fp() == fp, "f.fp: " INTPTR_FORMAT " actual: " INTPTR_FORMAT, p2i(f.fp()), p2i(fp)); ++ return f.raw_pc() == pc && f.fp() == fp; ++} ++#endif ++ ++inline intptr_t** ContinuationHelper::Frame::callee_link_address(const frame& f) { ++ return (intptr_t**)(f.sp() - 2); ++} ++ ++inline address* ContinuationHelper::Frame::return_pc_address(const frame& f) { ++ return (address*)(f.real_fp() - 1); ++} ++ ++inline address* ContinuationHelper::InterpretedFrame::return_pc_address(const frame& f) { ++ return (address*)(f.fp() + frame::return_addr_offset); ++} ++ ++inline void ContinuationHelper::InterpretedFrame::patch_sender_sp(frame& f, const frame& caller) { ++ intptr_t* sp = caller.unextended_sp(); ++ assert(f.is_interpreted_frame(), ""); ++ intptr_t* la = f.addr_at(frame::interpreter_frame_sender_sp_offset); ++ *la = f.is_heap_frame() ? 
(intptr_t)(sp - f.fp()) : (intptr_t)sp; ++} ++ ++inline address ContinuationHelper::Frame::real_pc(const frame& f) { ++ address* pc_addr = &(((address*) f.sp())[-1]); ++ return *pc_addr; ++} ++ ++inline void ContinuationHelper::Frame::patch_pc(const frame& f, address pc) { ++ address* pc_addr = &(((address*) f.sp())[-1]); ++ *pc_addr = pc; ++} ++ ++inline intptr_t* ContinuationHelper::InterpretedFrame::frame_top(const frame& f, InterpreterOopMap* mask) { // inclusive; this will be copied with the frame ++ // interpreter_frame_last_sp_offset, points to unextended_sp includes arguments in the frame ++ // interpreter_frame_initial_sp_offset excludes expression stack slots ++ int expression_stack_sz = expression_stack_size(f, mask); ++ intptr_t* res = *(intptr_t**)f.addr_at(frame::interpreter_frame_initial_sp_offset) - expression_stack_sz; ++ assert(res == (intptr_t*)f.interpreter_frame_monitor_end() - expression_stack_sz, ""); ++ assert(res >= f.unextended_sp(), ++ "res: " INTPTR_FORMAT " initial_sp: " INTPTR_FORMAT " last_sp: " INTPTR_FORMAT " unextended_sp: " INTPTR_FORMAT " expression_stack_size: %d", ++ p2i(res), p2i(f.addr_at(frame::interpreter_frame_initial_sp_offset)), f.at(frame::interpreter_frame_last_sp_offset), p2i(f.unextended_sp()), expression_stack_sz); ++ return res; ++} ++ ++inline intptr_t* ContinuationHelper::InterpretedFrame::frame_bottom(const frame& f) { // exclusive; this will not be copied with the frame ++ return (intptr_t*)f.at_relative(frame::interpreter_frame_locals_offset) + 1; // exclusive, so we add 1 word ++} ++ ++inline intptr_t* ContinuationHelper::InterpretedFrame::frame_top(const frame& f, int callee_argsize, bool callee_interpreted) { ++ return f.unextended_sp() + (callee_interpreted ? callee_argsize : 0); ++} ++ ++inline intptr_t* ContinuationHelper::InterpretedFrame::callers_sp(const frame& f) { ++ return f.fp(); ++} ++ ++#endif // CPU_LOONGARCH_CONTINUATIONFRAMEHELPERS_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/copy_loongarch.cpp b/src/hotspot/cpu/loongarch/copy_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/copy_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/copy_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,147 @@ ++/* ++ * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "utilities/copy.hpp" ++ ++// Template for atomic, element-wise copy. ++template ++static void copy_conjoint_atomic(const T* from, T* to, size_t count) { ++ if (from > to) { ++ while (count-- > 0) { ++ // Copy forwards ++ *to++ = *from++; ++ } ++ } else { ++ from += count - 1; ++ to += count - 1; ++ while (count-- > 0) { ++ // Copy backwards ++ *to-- = *from--; ++ } ++ } ++} ++ ++static void c_conjoint_words(const HeapWord* from, HeapWord* to, size_t count) { ++ (void)memmove(to, from, count * HeapWordSize); ++} ++ ++static void c_disjoint_words(const HeapWord* from, HeapWord* to, size_t count) { ++ (void)memcpy(to, from, count * HeapWordSize); ++} ++ ++static void c_disjoint_words_atomic(const HeapWord* from, HeapWord* to, size_t count) { ++ while (count-- > 0) { ++ *to++ = *from++; ++ } ++} ++ ++static void c_aligned_conjoint_words(const HeapWord* from, HeapWord* to, size_t count) { ++ c_conjoint_words(from, to, count); ++} ++ ++static void c_aligned_disjoint_words(const HeapWord* from, HeapWord* to, size_t count) { ++ c_disjoint_words(from, to, count); ++} ++ ++static void c_conjoint_bytes(const void* from, void* to, size_t count) { ++ (void)memmove(to, from, count); ++} ++ ++static void c_conjoint_bytes_atomic(const void* from, void* to, size_t count) { ++ c_conjoint_bytes(from, to, count); ++} ++ ++static void c_conjoint_jshorts_atomic(const jshort* from, jshort* to, size_t count) { ++ copy_conjoint_atomic(from, to, count); ++} ++ ++static void c_conjoint_jints_atomic(const jint* from, jint* to, size_t count) { ++ copy_conjoint_atomic(from, to, count); ++} ++ ++static void c_conjoint_jlongs_atomic(const jlong* from, jlong* to, size_t count) { ++ copy_conjoint_atomic(from, to, count); ++} ++ ++static void c_conjoint_oops_atomic(const oop* from, oop* to, size_t count) { ++ assert(HeapWordSize == BytesPerOop, "heapwords and oops must be the same size"); ++ copy_conjoint_atomic(from, to, count); ++} ++ ++static void c_arrayof_conjoint_bytes(const HeapWord* from, HeapWord* to, size_t count) { ++ c_conjoint_bytes_atomic(from, to, count); ++} ++ ++static void c_arrayof_conjoint_jshorts(const HeapWord* from, HeapWord* to, size_t count) { ++ c_conjoint_jshorts_atomic((jshort*)from, (jshort*)to, count); ++} ++ ++static void c_arrayof_conjoint_jints(const HeapWord* from, HeapWord* to, size_t count) { ++ c_conjoint_jints_atomic((jint*)from, (jint*)to, count); ++} ++ ++static void c_arrayof_conjoint_jlongs(const HeapWord* from, HeapWord* to, size_t count) { ++ c_conjoint_jlongs_atomic((jlong*)from, (jlong*)to, count); ++} ++ ++static void c_arrayof_conjoint_oops(const HeapWord* from, HeapWord* to, size_t count) { ++ assert(BytesPerLong == BytesPerOop, "jlongs and oops must be the same size"); ++ c_conjoint_oops_atomic((oop*)from, (oop*)to, count); ++} ++ ++static void c_fill_to_words(HeapWord* tohw, julong value, size_t count) { ++ julong* to = (julong*) tohw; ++ while (count-- > 0) { ++ *to++ = value; ++ } ++} ++ ++static void c_fill_to_aligned_words(HeapWord* tohw, julong value, size_t count) { ++ c_fill_to_words(tohw, value, count); ++} ++ ++static void c_fill_to_bytes(void* to, jubyte value, size_t count) { ++ (void)memset(to, value, count); ++} ++ ++Copy::CopyHeapWord Copy::_conjoint_words = c_conjoint_words; ++Copy::CopyHeapWord Copy::_disjoint_words = c_disjoint_words; ++Copy::CopyHeapWord Copy::_disjoint_words_atomic = c_disjoint_words_atomic; ++Copy::CopyHeapWord Copy::_aligned_conjoint_words = c_aligned_conjoint_words; ++Copy::CopyHeapWord 
Copy::_aligned_disjoint_words = c_aligned_disjoint_words; ++Copy::CopyByte Copy::_conjoint_bytes = c_conjoint_bytes; ++Copy::CopyByte Copy::_conjoint_bytes_atomic = c_conjoint_bytes_atomic; ++Copy::CopyShort Copy::_conjoint_jshorts_atomic = c_conjoint_jshorts_atomic; ++Copy::CopyInt Copy::_conjoint_jints_atomic = c_conjoint_jints_atomic; ++Copy::CopyLong Copy::_conjoint_jlongs_atomic = c_conjoint_jlongs_atomic; ++Copy::CopyOop Copy::_conjoint_oops_atomic = c_conjoint_oops_atomic; ++Copy::CopyHeapWord Copy::_arrayof_conjoint_bytes = c_arrayof_conjoint_bytes; ++Copy::CopyHeapWord Copy::_arrayof_conjoint_jshorts = c_arrayof_conjoint_jshorts; ++Copy::CopyHeapWord Copy::_arrayof_conjoint_jints = c_arrayof_conjoint_jints; ++Copy::CopyHeapWord Copy::_arrayof_conjoint_jlongs = c_arrayof_conjoint_jlongs; ++Copy::CopyHeapWord Copy::_arrayof_conjoint_oops = c_arrayof_conjoint_oops; ++Copy::FillHeapWord Copy::_fill_to_words = c_fill_to_words; ++Copy::FillHeapWord Copy::_fill_to_aligned_words = c_fill_to_aligned_words; ++Copy::FillByte Copy::_fill_to_bytes = c_fill_to_bytes; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/copy_loongarch.hpp b/src/hotspot/cpu/loongarch/copy_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/copy_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/copy_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,65 @@ ++/* ++ * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_COPY_LOONGARCH_HPP ++#define CPU_LOONGARCH_COPY_LOONGARCH_HPP ++ ++friend class StubGenerator; ++ ++typedef void (*CopyByte)(const void*, void*, size_t); ++typedef void (*CopyShort)(const jshort*, jshort*, size_t); ++typedef void (*CopyInt)(const jint*, jint*, size_t); ++typedef void (*CopyLong)(const jlong*, jlong*, size_t); ++typedef void (*CopyOop)(const oop*, oop*, size_t); ++typedef void (*CopyHeapWord)(const HeapWord*, HeapWord*, size_t); ++typedef void (*FillByte)(void*, jubyte, size_t); ++typedef void (*FillHeapWord)(HeapWord*, julong, size_t); ++ ++static CopyHeapWord _conjoint_words; ++static CopyHeapWord _disjoint_words; ++static CopyHeapWord _disjoint_words_atomic; ++static CopyHeapWord _aligned_conjoint_words; ++static CopyHeapWord _aligned_disjoint_words; ++static CopyByte _conjoint_bytes; ++static CopyByte _conjoint_bytes_atomic; ++static CopyShort _conjoint_jshorts_atomic; ++static CopyInt _conjoint_jints_atomic; ++static CopyLong _conjoint_jlongs_atomic; ++static CopyOop _conjoint_oops_atomic; ++static CopyHeapWord _arrayof_conjoint_bytes; ++static CopyHeapWord _arrayof_conjoint_jshorts; ++static CopyHeapWord _arrayof_conjoint_jints; ++static CopyHeapWord _arrayof_conjoint_jlongs; ++static CopyHeapWord _arrayof_conjoint_oops; ++static FillHeapWord _fill_to_words; ++static FillHeapWord _fill_to_aligned_words; ++static FillByte _fill_to_bytes; ++ ++// Inline functions for memory copy and fill. ++ ++// Contains inline asm implementations ++#include OS_CPU_HEADER_INLINE(copy) ++ ++#endif //CPU_LOONGARCH_COPY_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/disassembler_loongarch.hpp b/src/hotspot/cpu/loongarch/disassembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/disassembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/disassembler_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,49 @@ ++/* ++ * Copyright (c) 1997, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_DISASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_DISASSEMBLER_LOONGARCH_HPP ++ ++ static int pd_instruction_alignment() { ++ return sizeof(int); ++ } ++ ++ static const char* pd_cpu_opts() { ++ return "gpr-names=64"; ++ } ++ ++ // special-case instruction decoding. 
++ // There may be cases where the binutils disassembler doesn't do ++ // the perfect job. In those cases, decode_instruction0 may kick in ++ // and do it right. ++ // If nothing had to be done, just return "here", otherwise return "here + instr_len(here)" ++ static address decode_instruction0(address here, outputStream* st, address virtual_begin = nullptr) { ++ return here; ++ } ++ ++ // platform-specific instruction annotations (like value of loaded constants) ++ static void annotate(address pc, outputStream* st) { }; ++ ++#endif // CPU_LOONGARCH_DISASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/downcallLinker_loongarch_64.cpp b/src/hotspot/cpu/loongarch/downcallLinker_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/downcallLinker_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/downcallLinker_loongarch_64.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,360 @@ ++/* ++ * Copyright (c) 2020, Red Hat, Inc. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "code/codeBlob.hpp" ++#include "code/codeCache.hpp" ++#include "code/vmreg.inline.hpp" ++#include "compiler/oopMap.hpp" ++#include "logging/logStream.hpp" ++#include "memory/resourceArea.hpp" ++#include "prims/downcallLinker.hpp" ++#include "runtime/globals.hpp" ++#include "runtime/stubCodeGenerator.hpp" ++ ++#define __ _masm-> ++ ++class DowncallStubGenerator : public StubCodeGenerator { ++ BasicType* _signature; ++ int _num_args; ++ BasicType _ret_bt; ++ const ABIDescriptor& _abi; ++ ++ const GrowableArray<VMStorage>& _input_registers; ++ const GrowableArray<VMStorage>& _output_registers; ++ ++ bool _needs_return_buffer; ++ int _captured_state_mask; ++ bool _needs_transition; ++ ++ int _frame_complete; ++ int _frame_size_slots; ++ OopMapSet* _oop_maps; ++public: ++ DowncallStubGenerator(CodeBuffer* buffer, ++ BasicType* signature, ++ int num_args, ++ BasicType ret_bt, ++ const ABIDescriptor& abi, ++ const GrowableArray<VMStorage>& input_registers, ++ const GrowableArray<VMStorage>& output_registers, ++ bool needs_return_buffer, ++ int captured_state_mask, ++ bool needs_transition) ++ : StubCodeGenerator(buffer, PrintMethodHandleStubs), ++ _signature(signature), ++ _num_args(num_args), ++ _ret_bt(ret_bt), ++ _abi(abi), ++ _input_registers(input_registers), ++ _output_registers(output_registers), ++ _needs_return_buffer(needs_return_buffer), ++ _captured_state_mask(captured_state_mask), ++ _needs_transition(needs_transition), ++ _frame_complete(0), ++ _frame_size_slots(0), ++ _oop_maps(nullptr) { ++ } ++ ++ void generate(); ++ ++ int frame_complete() const { ++ return _frame_complete; ++ } ++ ++ int framesize() const { ++ return (_frame_size_slots >> (LogBytesPerWord - LogBytesPerInt)); ++ } ++ ++ OopMapSet* oop_maps() const { ++ return _oop_maps; ++ } ++}; ++ ++static const int native_invoker_code_base_size = 256; ++static const int native_invoker_size_per_arg = 8; ++ ++RuntimeStub* DowncallLinker::make_downcall_stub(BasicType* signature, ++ int num_args, ++ BasicType ret_bt, ++ const ABIDescriptor& abi, ++ const GrowableArray<VMStorage>& input_registers, ++ const GrowableArray<VMStorage>& output_registers, ++ bool needs_return_buffer, ++ int captured_state_mask, ++ bool needs_transition) { ++ int code_size = native_invoker_code_base_size + (num_args * native_invoker_size_per_arg); ++ int locs_size = 1; // must be non-zero ++ CodeBuffer code("nep_invoker_blob", code_size, locs_size); ++ DowncallStubGenerator g(&code, signature, num_args, ret_bt, abi, ++ input_registers, output_registers, ++ needs_return_buffer, captured_state_mask, ++ needs_transition); ++ g.generate(); ++ code.log_section_sizes("nep_invoker_blob"); ++ ++ RuntimeStub* stub = ++ RuntimeStub::new_runtime_stub("nep_invoker_blob", ++ &code, ++ g.frame_complete(), ++ g.framesize(), ++ g.oop_maps(), false); ++ ++#ifndef PRODUCT ++ LogTarget(Trace, foreign, downcall) lt; ++ if (lt.is_enabled()) { ++ ResourceMark rm; ++ LogStream ls(lt); ++ stub->print_on(&ls); ++ } ++#endif ++ ++ return stub; ++} ++ ++void DowncallStubGenerator::generate() { ++ enum layout { ++ fp_off, ++ fp_off2, ++ return_off, ++ return_off2, ++ framesize // inclusive of return address ++ // The following are also computed dynamically: ++ // spill area for return value ++ // out arg area (e.g.
for stack args) ++ }; ++ ++ Register tmp1 = SCR1; ++ Register tmp2 = SCR2; ++ ++ VMStorage shuffle_reg = as_VMStorage(S0); ++ JavaCallingConvention in_conv; ++ NativeCallingConvention out_conv(_input_registers); ++ ArgumentShuffle arg_shuffle(_signature, _num_args, _signature, _num_args, &in_conv, &out_conv, shuffle_reg); ++ ++#ifndef PRODUCT ++ LogTarget(Trace, foreign, downcall) lt; ++ if (lt.is_enabled()) { ++ ResourceMark rm; ++ LogStream ls(lt); ++ arg_shuffle.print_on(&ls); ++ } ++#endif ++ ++ int allocated_frame_size = 0; ++ assert(_abi._shadow_space_bytes == 0, "not expecting shadow space on LoongArch64"); ++ allocated_frame_size += arg_shuffle.out_arg_bytes(); ++ ++ bool should_save_return_value = !_needs_return_buffer; ++ RegSpiller out_reg_spiller(_output_registers); ++ int spill_offset = -1; ++ ++ if (should_save_return_value) { ++ spill_offset = 0; ++ // spill area can be shared with shadow space and out args, ++ // since they are only used before the call, ++ // and spill area is only used after. ++ allocated_frame_size = out_reg_spiller.spill_size_bytes() > allocated_frame_size ++ ? out_reg_spiller.spill_size_bytes() ++ : allocated_frame_size; ++ } ++ ++ StubLocations locs; ++ locs.set(StubLocations::TARGET_ADDRESS, _abi._scratch1); ++ if (_needs_return_buffer) { ++ locs.set_frame_data(StubLocations::RETURN_BUFFER, allocated_frame_size); ++ allocated_frame_size += BytesPerWord; // for address spill ++ } ++ if (_captured_state_mask != 0) { ++ locs.set_frame_data(StubLocations::CAPTURED_STATE_BUFFER, allocated_frame_size); ++ allocated_frame_size += BytesPerWord; ++ } ++ ++ _frame_size_slots = align_up(framesize + (allocated_frame_size >> LogBytesPerInt), 4); ++ assert(is_even(_frame_size_slots/2), "sp not 16-byte aligned"); ++ ++ _oop_maps = _needs_transition ? 
new OopMapSet() : nullptr; ++ address start = __ pc(); ++ ++ __ enter(); ++ ++ // RA and FP are already in place ++ __ addi_d(SP, SP, -((unsigned)_frame_size_slots-4) << LogBytesPerInt); // prolog ++ ++ _frame_complete = __ pc() - start; ++ ++ if (_needs_transition) { ++ Label L; ++ address the_pc = __ pc(); ++ __ bind(L); ++ __ set_last_Java_frame(TREG, SP, FP, L); ++ OopMap* map = new OopMap(_frame_size_slots, 0); ++ _oop_maps->add_gc_map(the_pc - start, map); ++ ++ // State transition ++ __ li(tmp1, _thread_in_native); ++ if (os::is_MP()) { ++ __ addi_d(tmp2, TREG, in_bytes(JavaThread::thread_state_offset())); ++ __ amswap_db_w(R0, tmp1, tmp2); ++ } else { ++ __ st_w(tmp1, TREG, in_bytes(JavaThread::thread_state_offset())); ++ } ++ } ++ ++ __ block_comment("{ argument shuffle"); ++ arg_shuffle.generate(_masm, shuffle_reg, 0, _abi._shadow_space_bytes, locs); ++ __ block_comment("} argument shuffle"); ++ ++ __ jalr(as_Register(locs.get(StubLocations::TARGET_ADDRESS))); ++ // this call is assumed not to have killed rthread ++ ++ if (_needs_return_buffer) { ++ __ ld_d(tmp1, SP, locs.data_offset(StubLocations::RETURN_BUFFER)); ++ int offset = 0; ++ for (int i = 0; i < _output_registers.length(); i++) { ++ VMStorage reg = _output_registers.at(i); ++ if (reg.type() == StorageType::INTEGER) { ++ __ st_d(as_Register(reg), tmp1, offset); ++ offset += 8; ++ } else if (reg.type() == StorageType::FLOAT) { ++ __ fst_d(as_FloatRegister(reg), tmp1, offset); ++ offset += 8; ++ } else { ++ ShouldNotReachHere(); ++ } ++ } ++ } ++ ++ ////////////////////////////////////////////////////////////////////////////// ++ ++ if (_captured_state_mask != 0) { ++ __ block_comment("{ save thread local"); ++ ++ if (should_save_return_value) { ++ out_reg_spiller.generate_spill(_masm, spill_offset); ++ } ++ ++ __ ld_d(c_rarg0, Address(SP, locs.data_offset(StubLocations::CAPTURED_STATE_BUFFER))); ++ __ li(c_rarg1, _captured_state_mask); ++ __ call(CAST_FROM_FN_PTR(address, DowncallLinker::capture_state), relocInfo::runtime_call_type); ++ ++ if (should_save_return_value) { ++ out_reg_spiller.generate_fill(_masm, spill_offset); ++ } ++ ++ __ block_comment("} save thread local"); ++ } ++ ++ ////////////////////////////////////////////////////////////////////////////// ++ ++ Label L_after_safepoint_poll; ++ Label L_safepoint_poll_slow_path; ++ Label L_reguard; ++ Label L_after_reguard; ++ if (_needs_transition) { ++ __ li(tmp1, _thread_in_native_trans); ++ ++ // Force this write out before the read below ++ if (os::is_MP() && !UseSystemMemoryBarrier) { ++ __ addi_d(tmp2, TREG, in_bytes(JavaThread::thread_state_offset())); ++ __ amswap_db_w(R0, tmp1, tmp2); // AnyAny ++ } else { ++ __ st_w(tmp1, TREG, in_bytes(JavaThread::thread_state_offset())); ++ } ++ ++ __ safepoint_poll(L_safepoint_poll_slow_path, TREG, true /* at_return */, true /* acquire */, false /* in_nmethod */); ++ ++ __ ld_w(tmp1, TREG, in_bytes(JavaThread::suspend_flags_offset())); ++ __ bnez(tmp1, L_safepoint_poll_slow_path); ++ ++ __ bind(L_after_safepoint_poll); ++ ++ // change thread state ++ __ li(tmp1, _thread_in_Java); ++ if (os::is_MP()) { ++ __ addi_d(tmp2, TREG, in_bytes(JavaThread::thread_state_offset())); ++ __ amswap_db_w(R0, tmp1, tmp2); ++ } else { ++ __ st_w(tmp1, TREG, in_bytes(JavaThread::thread_state_offset())); ++ } ++ ++ __ block_comment("reguard stack check"); ++ __ ld_w(tmp1, TREG, in_bytes(JavaThread::stack_guard_state_offset())); ++ __ addi_d(tmp1, tmp1, -StackOverflow::stack_guard_yellow_reserved_disabled); ++ __ beqz(tmp1, L_reguard); 
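++    // tmp1 == 0 here means stack_guard_state == stack_guard_yellow_reserved_disabled,
++    // i.e. the yellow/reserved guard pages were consumed and must be re-enabled.
++    // The out-of-line L_reguard block below calls SharedRuntime::reguard_yellow_pages
++    // (spilling the native return registers around the call when needed) and then
++    // branches back to L_after_reguard.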
++ __ bind(L_after_reguard); ++ ++ __ reset_last_Java_frame(true); ++ } ++ ++ __ leave(); // required for proper stackwalking of RuntimeStub frame ++ __ jr(RA); ++ ++ ////////////////////////////////////////////////////////////////////////////// ++ ++ if (_needs_transition) { ++ __ block_comment("{ L_safepoint_poll_slow_path"); ++ __ bind(L_safepoint_poll_slow_path); ++ ++ if (should_save_return_value) { ++ // Need to save the native result registers around any runtime calls. ++ out_reg_spiller.generate_spill(_masm, spill_offset); ++ } ++ ++ __ move(c_rarg0, TREG); ++ assert(frame::arg_reg_save_area_bytes == 0, "not expecting frame reg save area"); ++ __ call(CAST_FROM_FN_PTR(address, JavaThread::check_special_condition_for_native_trans), relocInfo::runtime_call_type); ++ ++ if (should_save_return_value) { ++ out_reg_spiller.generate_fill(_masm, spill_offset); ++ } ++ ++ __ b(L_after_safepoint_poll); ++ __ block_comment("} L_safepoint_poll_slow_path"); ++ ++ ////////////////////////////////////////////////////////////////////////////// ++ ++ __ block_comment("{ L_reguard"); ++ __ bind(L_reguard); ++ ++ if (should_save_return_value) { ++ out_reg_spiller.generate_spill(_masm, spill_offset); ++ } ++ ++ __ call(CAST_FROM_FN_PTR(address, SharedRuntime::reguard_yellow_pages),relocInfo::runtime_call_type); ++ ++ if (should_save_return_value) { ++ out_reg_spiller.generate_fill(_masm, spill_offset); ++ } ++ ++ __ b(L_after_reguard); ++ ++ __ block_comment("} L_reguard"); ++ } ++ ++ ////////////////////////////////////////////////////////////////////////////// ++ ++ __ flush(); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/foreignGlobals_loongarch.cpp b/src/hotspot/cpu/loongarch/foreignGlobals_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/foreignGlobals_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/foreignGlobals_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,195 @@ ++/* ++ * Copyright (c) 2020, Red Hat, Inc. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ */ ++ ++#include "precompiled.hpp" ++#include "code/vmreg.inline.hpp" ++#include "runtime/jniHandles.hpp" ++#include "runtime/jniHandles.inline.hpp" ++#include "oops/typeArrayOop.inline.hpp" ++#include "oops/oopCast.inline.hpp" ++#include "prims/foreignGlobals.hpp" ++#include "prims/foreignGlobals.inline.hpp" ++#include "prims/vmstorage.hpp" ++#include "utilities/formatBuffer.hpp" ++ ++bool ForeignGlobals::is_foreign_linker_supported() { ++ return true; ++} ++ ++bool ABIDescriptor::is_volatile_reg(Register reg) const { ++ return _integer_argument_registers.contains(reg) ++ || _integer_additional_volatile_registers.contains(reg); ++} ++ ++bool ABIDescriptor::is_volatile_reg(FloatRegister reg) const { ++ return _float_argument_registers.contains(reg) ++ || _float_additional_volatile_registers.contains(reg); ++} ++ ++const ABIDescriptor ForeignGlobals::parse_abi_descriptor(jobject jabi) { ++ oop abi_oop = JNIHandles::resolve_non_null(jabi); ++ ABIDescriptor abi; ++ ++ objArrayOop inputStorage = jdk_internal_foreign_abi_ABIDescriptor::inputStorage(abi_oop); ++ parse_register_array(inputStorage, StorageType::INTEGER, abi._integer_argument_registers, as_Register); ++ parse_register_array(inputStorage, StorageType::FLOAT, abi._float_argument_registers, as_FloatRegister); ++ ++ objArrayOop outputStorage = jdk_internal_foreign_abi_ABIDescriptor::outputStorage(abi_oop); ++ parse_register_array(outputStorage, StorageType::INTEGER, abi._integer_return_registers, as_Register); ++ parse_register_array(outputStorage, StorageType::FLOAT, abi._float_return_registers, as_FloatRegister); ++ ++ objArrayOop volatileStorage = jdk_internal_foreign_abi_ABIDescriptor::volatileStorage(abi_oop); ++ parse_register_array(volatileStorage, StorageType::INTEGER, abi._integer_additional_volatile_registers, as_Register); ++ parse_register_array(volatileStorage, StorageType::FLOAT, abi._float_additional_volatile_registers, as_FloatRegister); ++ ++ abi._stack_alignment_bytes = jdk_internal_foreign_abi_ABIDescriptor::stackAlignment(abi_oop); ++ abi._shadow_space_bytes = jdk_internal_foreign_abi_ABIDescriptor::shadowSpace(abi_oop); ++ ++ abi._scratch1 = parse_vmstorage(jdk_internal_foreign_abi_ABIDescriptor::scratch1(abi_oop)); ++ abi._scratch2 = parse_vmstorage(jdk_internal_foreign_abi_ABIDescriptor::scratch2(abi_oop)); ++ ++ return abi; ++} ++ ++int RegSpiller::pd_reg_size(VMStorage reg) { ++ if (reg.type() == StorageType::INTEGER) { ++ return 8; ++ } else if (reg.type() == StorageType::FLOAT) { ++ return 8; ++ } ++ return 0; // stack and BAD ++} ++ ++void RegSpiller::pd_store_reg(MacroAssembler* masm, int offset, VMStorage reg) { ++ if (reg.type() == StorageType::INTEGER) { ++ masm->st_d(as_Register(reg), SP, offset); ++ } else if (reg.type() == StorageType::FLOAT) { ++ masm->fst_d(as_FloatRegister(reg), SP, offset); ++ } else { ++ // stack and BAD ++ } ++} ++ ++void RegSpiller::pd_load_reg(MacroAssembler* masm, int offset, VMStorage reg) { ++ if (reg.type() == StorageType::INTEGER) { ++ masm->ld_d(as_Register(reg), SP, offset); ++ } else if (reg.type() == StorageType::FLOAT) { ++ masm->fld_d(as_FloatRegister(reg), SP, offset); ++ } else { ++ // stack and BAD ++ } ++} ++ ++static void move_reg64(MacroAssembler* masm, int out_stk_bias, ++ Register from_reg, VMStorage to_reg) { ++ int out_bias = 0; ++ switch (to_reg.type()) { ++ case StorageType::INTEGER: ++ assert(to_reg.segment_mask() == REG64_MASK, "only moves to 64-bit registers supported"); ++ masm->move(as_Register(to_reg), from_reg); ++ break; ++ case 
StorageType::STACK: ++ out_bias = out_stk_bias; ++ case StorageType::FRAME_DATA: { ++ Address dest(SP, to_reg.offset() + out_bias); ++ masm->st_d(from_reg, dest); ++ } break; ++ default: ShouldNotReachHere(); ++ } ++} ++ ++static void move_stack(MacroAssembler* masm, Register tmp_reg, int in_stk_bias, int out_stk_bias, ++ VMStorage from_reg, VMStorage to_reg) { ++ Address from_addr(FP, from_reg.offset() + in_stk_bias); ++ int out_bias = 0; ++ switch (to_reg.type()) { ++ case StorageType::INTEGER: ++ assert(to_reg.segment_mask() == REG64_MASK, "only moves to 64-bit registers supported"); ++ masm->ld_d(as_Register(to_reg), from_addr); ++ break; ++ case StorageType::FLOAT: ++ assert(to_reg.segment_mask() == FLOAT64_MASK, "only moves to float registers supported"); ++ masm->fld_d(as_FloatRegister(to_reg), from_addr); ++ break; ++ case StorageType::STACK: ++ out_bias = out_stk_bias; ++ case StorageType::FRAME_DATA: { ++ masm->ld_d(tmp_reg, from_addr); ++ Address dest(SP, to_reg.offset() + out_bias); ++ masm->st_d(tmp_reg, dest); ++ } break; ++ default: ShouldNotReachHere(); ++ } ++} ++ ++static void move_float64(MacroAssembler* masm, int out_stk_bias, ++ FloatRegister from_reg, VMStorage to_reg) { ++ switch (to_reg.type()) { ++ case StorageType::INTEGER: ++ assert(to_reg.segment_mask() == REG64_MASK, "only moves to 64-bit integer registers supported"); ++ masm->movfr2gr_d(as_Register(to_reg), from_reg); ++ break; ++ case StorageType::FLOAT: ++ assert(to_reg.segment_mask() == FLOAT64_MASK, "only moves to float registers supported"); ++ masm->fmov_d(as_FloatRegister(to_reg), from_reg); ++ break; ++ case StorageType::STACK: { ++ Address dest(SP, to_reg.offset() + out_stk_bias); ++ masm->fst_d(from_reg, dest); ++ } break; ++ default: ShouldNotReachHere(); ++ } ++} ++ ++void ArgumentShuffle::pd_generate(MacroAssembler* masm, VMStorage tmp, int in_stk_bias, int out_stk_bias, const StubLocations& locs) const { ++ Register tmp_reg = as_Register(tmp); ++ for (int i = 0; i < _moves.length(); i++) { ++ Move move = _moves.at(i); ++ VMStorage from_reg = move.from; ++ VMStorage to_reg = move.to; ++ ++ // replace any placeholders ++ if (from_reg.type() == StorageType::PLACEHOLDER) { ++ from_reg = locs.get(from_reg); ++ } ++ if (to_reg.type() == StorageType::PLACEHOLDER) { ++ to_reg = locs.get(to_reg); ++ } ++ ++ switch (from_reg.type()) { ++ case StorageType::INTEGER: ++ assert(from_reg.segment_mask() == REG64_MASK, "only 64-bit register supported"); ++ move_reg64(masm, out_stk_bias, as_Register(from_reg), to_reg); ++ break; ++ case StorageType::FLOAT: ++ assert(from_reg.segment_mask() == FLOAT64_MASK, "only float register supported"); ++ move_float64(masm, out_stk_bias, as_FloatRegister(from_reg), to_reg); ++ break; ++ case StorageType::STACK: ++ move_stack(masm, tmp_reg, in_stk_bias, out_stk_bias, from_reg, to_reg); ++ break; ++ default: ShouldNotReachHere(); ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/foreignGlobals_loongarch.hpp b/src/hotspot/cpu/loongarch/foreignGlobals_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/foreignGlobals_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/foreignGlobals_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,50 @@ ++/* ++ * Copyright (c) 2020, Red Hat, Inc. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ *
++ * This code is free software; you can redistribute it and/or modify it
++ * under the terms of the GNU General Public License version 2 only, as
++ * published by the Free Software Foundation.
++ *
++ * This code is distributed in the hope that it will be useful, but WITHOUT
++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
++ * version 2 for more details (a copy is included in the LICENSE file that
++ * accompanied this code).
++ *
++ * You should have received a copy of the GNU General Public License version
++ * 2 along with this work; if not, write to the Free Software Foundation,
++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
++ *
++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
++ * or visit www.oracle.com if you need additional information or have any
++ * questions.
++ */
++
++#ifndef CPU_LOONGARCH_FOREIGN_GLOBALS_LOONGARCH_HPP
++#define CPU_LOONGARCH_FOREIGN_GLOBALS_LOONGARCH_HPP
++
++#include "asm/macroAssembler.hpp"
++#include "utilities/growableArray.hpp"
++
++struct ABIDescriptor {
++  GrowableArray<Register> _integer_argument_registers;
++  GrowableArray<Register> _integer_return_registers;
++  GrowableArray<FloatRegister> _float_argument_registers;
++  GrowableArray<FloatRegister> _float_return_registers;
++
++  GrowableArray<Register> _integer_additional_volatile_registers;
++  GrowableArray<FloatRegister> _float_additional_volatile_registers;
++
++  int32_t _stack_alignment_bytes;
++  int32_t _shadow_space_bytes;
++
++  VMStorage _scratch1;
++  VMStorage _scratch2;
++
++  bool is_volatile_reg(Register reg) const;
++  bool is_volatile_reg(FloatRegister reg) const;
++};
++
++#endif // CPU_LOONGARCH_FOREIGN_GLOBALS_LOONGARCH_HPP
+diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/frame_loongarch.cpp b/src/hotspot/cpu/loongarch/frame_loongarch.cpp
+--- a/src/hotspot/cpu/loongarch/frame_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800
++++ b/src/hotspot/cpu/loongarch/frame_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800
+@@ -0,0 +1,626 @@
++/*
++ * Copyright (c) 1997, 2014, Oracle and/or its affiliates. All rights reserved.
++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved.
++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
++ *
++ * This code is free software; you can redistribute it and/or modify it
++ * under the terms of the GNU General Public License version 2 only, as
++ * published by the Free Software Foundation.
++ *
++ * This code is distributed in the hope that it will be useful, but WITHOUT
++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
++ * version 2 for more details (a copy is included in the LICENSE file that
++ * accompanied this code).
++ *
++ * You should have received a copy of the GNU General Public License version
++ * 2 along with this work; if not, write to the Free Software Foundation,
++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
++ *
++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
++ * or visit www.oracle.com if you need additional information or have any
++ * questions.
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "compiler/oopMap.hpp" ++#include "interpreter/interpreter.hpp" ++#include "gc/shared/collectedHeap.hpp" ++#include "memory/resourceArea.hpp" ++#include "oops/markWord.hpp" ++#include "oops/method.hpp" ++#include "oops/oop.inline.hpp" ++#include "prims/methodHandles.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/handles.inline.hpp" ++#include "runtime/javaCalls.hpp" ++#include "runtime/monitorChunk.hpp" ++#include "runtime/signature.hpp" ++#include "runtime/stackWatermarkSet.hpp" ++#include "runtime/stubCodeGenerator.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "vmreg_loongarch.inline.hpp" ++ ++#ifdef ASSERT ++void RegisterMap::check_location_valid() { ++} ++#endif ++ ++// Profiling/safepoint support ++ ++bool frame::safe_for_sender(JavaThread *thread) { ++ address sp = (address)_sp; ++ address fp = (address)_fp; ++ address unextended_sp = (address)_unextended_sp; ++ ++ // consider stack guards when trying to determine "safe" stack pointers ++ // sp must be within the usable part of the stack (not in guards) ++ if (!thread->is_in_usable_stack(sp)) { ++ return false; ++ } ++ ++ // unextended sp must be within the stack and above or equal sp ++ if (!thread->is_in_stack_range_incl(unextended_sp, sp)) { ++ return false; ++ } ++ ++ // an fp must be within the stack and above (but not equal) sp ++ // second evaluation on fp+ is added to handle situation where fp is -1 ++ bool fp_safe = thread->is_in_stack_range_excl(fp, sp) && ++ thread->is_in_full_stack_checked(fp + (return_addr_offset * sizeof(void*))); ++ ++ // We know sp/unextended_sp are safe only fp is questionable here ++ ++ // If the current frame is known to the code cache then we can attempt to ++ // construct the sender and do some validation of it. This goes a long way ++ // toward eliminating issues when we get in frame construction code ++ ++ if (_cb != nullptr ) { ++ ++ // First check if frame is complete and tester is reliable ++ // Unfortunately we can only check frame complete for runtime stubs and nmethod ++ // other generic buffer blobs are more problematic so we just assume they are ++ // ok. adapter blobs never have a frame complete and are never ok. ++ ++ if (!_cb->is_frame_complete_at(_pc)) { ++ if (_cb->is_nmethod() || _cb->is_adapter_blob() || _cb->is_runtime_stub()) { ++ return false; ++ } ++ } ++ ++ // Could just be some random pointer within the codeBlob ++ if (!_cb->code_contains(_pc)) { ++ return false; ++ } ++ ++ // Entry frame checks ++ if (is_entry_frame()) { ++ // an entry frame must have a valid fp. ++ return fp_safe && is_entry_frame_valid(thread); ++ } else if (is_upcall_stub_frame()) { ++ return fp_safe; ++ } ++ ++ intptr_t* sender_sp = nullptr; ++ intptr_t* sender_unextended_sp = nullptr; ++ address sender_pc = nullptr; ++ intptr_t* saved_fp = nullptr; ++ ++ if (is_interpreted_frame()) { ++ // fp must be safe ++ if (!fp_safe) { ++ return false; ++ } ++ ++ sender_pc = (address) this->fp()[return_addr_offset]; ++ // for interpreted frames, the value below is the sender "raw" sp, ++ // which can be different from the sender unextended sp (the sp seen ++ // by the sender) because of current frame local variables ++ sender_sp = (intptr_t*) addr_at(sender_sp_offset); ++ sender_unextended_sp = (intptr_t*) this->fp()[interpreter_frame_sender_sp_offset]; ++ saved_fp = (intptr_t*) this->fp()[link_offset]; ++ ++ } else { ++ // must be some sort of compiled/runtime frame ++ // fp does not have to be safe (although it could be check for c1?) 
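++      // For compiled and runtime frames the sender is reconstructed from the
++      // CodeBlob's fixed frame size instead of trusting fp:
++      //
++      //   sender_sp = unextended_sp + _cb->frame_size()
++      //   sender_pc = *(sender_sp - 1)   // return address (the word just below sender_sp)
++      //   saved_fp  = *(sender_sp - 2)   // caller's fp saved next to the return address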
++ ++ // check for a valid frame_size, otherwise we are unlikely to get a valid sender_pc ++ if (_cb->frame_size() <= 0) { ++ return false; ++ } ++ ++ sender_sp = _unextended_sp + _cb->frame_size(); ++ // Is sender_sp safe? ++ if (!thread->is_in_full_stack_checked((address)sender_sp)) { ++ return false; ++ } ++ sender_unextended_sp = sender_sp; ++ // On LA the return_address is always the word on the stack ++ sender_pc = (address) *(sender_sp - 1); ++ // Note: frame::sender_sp_offset is only valid for compiled frame ++ saved_fp = (intptr_t*) *(sender_sp - 2); ++ } ++ ++ if (Continuation::is_return_barrier_entry(sender_pc)) { ++ // If our sender_pc is the return barrier, then our "real" sender is the continuation entry ++ frame s = Continuation::continuation_bottom_sender(thread, *this, sender_sp); ++ sender_sp = s.sp(); ++ sender_pc = s.pc(); ++ } ++ ++ // If the potential sender is the interpreter then we can do some more checking ++ if (Interpreter::contains(sender_pc)) { ++ ++ // FP is always saved in a recognizable place in any code we generate. However ++ // only if the sender is interpreted/call_stub (c1 too?) are we certain that the saved FP ++ // is really a frame pointer. ++ ++ if (!thread->is_in_stack_range_excl((address)saved_fp, (address)sender_sp)) { ++ return false; ++ } ++ ++ // construct the potential sender ++ ++ frame sender(sender_sp, sender_unextended_sp, saved_fp, sender_pc); ++ ++ return sender.is_interpreted_frame_valid(thread); ++ ++ } ++ ++ // We must always be able to find a recognizable pc ++ CodeBlob* sender_blob = CodeCache::find_blob(sender_pc); ++ if (sender_pc == nullptr || sender_blob == nullptr) { ++ return false; ++ } ++ ++ // Could just be some random pointer within the codeBlob ++ if (!sender_blob->code_contains(sender_pc)) { ++ return false; ++ } ++ ++ // We should never be able to see an adapter if the current frame is something from code cache ++ if (sender_blob->is_adapter_blob()) { ++ return false; ++ } ++ ++ // Could be the call_stub ++ if (StubRoutines::returns_to_call_stub(sender_pc)) { ++ if (!thread->is_in_stack_range_excl((address)saved_fp, (address)sender_sp)) { ++ return false; ++ } ++ ++ // construct the potential sender ++ ++ frame sender(sender_sp, sender_unextended_sp, saved_fp, sender_pc); ++ ++ // Validate the JavaCallWrapper an entry frame must have ++ address jcw = (address)sender.entry_frame_call_wrapper(); ++ ++ return thread->is_in_stack_range_excl(jcw, (address)sender.fp()); ++ } else if (sender_blob->is_upcall_stub()) { ++ return false; ++ } ++ ++ CompiledMethod* nm = sender_blob->as_compiled_method_or_null(); ++ if (nm != nullptr) { ++ if (nm->is_deopt_mh_entry(sender_pc) || nm->is_deopt_entry(sender_pc) || ++ nm->method()->is_method_handle_intrinsic()) { ++ return false; ++ } ++ } ++ ++ // If the frame size is 0 something (or less) is bad because every nmethod has a non-zero frame size ++ // because the return address counts against the callee's frame. ++ ++ if (sender_blob->frame_size() <= 0) { ++ assert(!sender_blob->is_compiled(), "should count return address at least"); ++ return false; ++ } ++ ++ // We should never be able to see anything here except an nmethod. If something in the ++ // code cache (current frame) is called by an entity within the code cache that entity ++ // should not be anything but the call stub (already covered), the interpreter (already covered) ++ // or an nmethod. 
++ ++ if (!sender_blob->is_compiled()) { ++ return false; ++ } ++ ++ // Could put some more validation for the potential non-interpreted sender ++ // frame we'd create by calling sender if I could think of any. Wait for next crash in forte... ++ ++ // One idea is seeing if the sender_pc we have is one that we'd expect to call to current cb ++ ++ // We've validated the potential sender that would be created ++ return true; ++ } ++ ++ // Must be native-compiled frame. Since sender will try and use fp to find ++ // linkages it must be safe ++ ++ if (!fp_safe) { ++ return false; ++ } ++ ++ // Will the pc we fetch be non-zero (which we'll find at the oldest frame) ++ ++ if ( (address) this->fp()[return_addr_offset] == nullptr) return false; ++ ++ ++ // could try and do some more potential verification of native frame if we could think of some... ++ ++ return true; ++ ++} ++ ++void frame::patch_pc(Thread* thread, address pc) { ++ assert(_cb == CodeCache::find_blob(pc), "unexpected pc"); ++ address* pc_addr = &(((address*) sp())[-1]); ++ address pc_old = *pc_addr; ++ ++ if (TracePcPatching) { ++ tty->print_cr("patch_pc at address " INTPTR_FORMAT " [" INTPTR_FORMAT " -> " INTPTR_FORMAT "]", ++ p2i(pc_addr), p2i(pc_old), p2i(pc)); ++ } ++ ++ assert(!Continuation::is_return_barrier_entry(pc_old), "return barrier"); ++ ++ // Either the return address is the original one or we are going to ++ // patch in the same address that's already there. ++ assert(_pc == pc_old || pc == pc_old || pc_old == 0, "must be"); ++ DEBUG_ONLY(address old_pc = _pc;) ++ *pc_addr = pc; ++ _pc = pc; // must be set before call to get_deopt_original_pc ++ address original_pc = CompiledMethod::get_deopt_original_pc(this); ++ if (original_pc != nullptr) { ++ assert(original_pc == old_pc, "expected original PC to be stored before patching"); ++ _deopt_state = is_deoptimized; ++ _pc = original_pc; ++ } else { ++ _deopt_state = not_deoptimized; ++ } ++} ++ ++intptr_t* frame::entry_frame_argument_at(int offset) const { ++ // convert offset to index to deal with tsi ++ int index = (Interpreter::expr_offset_in_bytes(offset)/wordSize); ++ // Entry frame's arguments are always in relation to unextended_sp() ++ return &unextended_sp()[index]; ++} ++ ++// locals ++void frame::interpreter_frame_set_locals(intptr_t* locs) { ++ assert(is_interpreted_frame(), "interpreted frame expected"); ++ // set relativized locals ++ ptr_at_put(interpreter_frame_locals_offset, (intptr_t) (locs - fp())); ++} ++ ++// sender_sp ++intptr_t* frame::interpreter_frame_sender_sp() const { ++ assert(is_interpreted_frame(), "interpreted frame expected"); ++ return (intptr_t*) at(interpreter_frame_sender_sp_offset); ++} ++ ++void frame::set_interpreter_frame_sender_sp(intptr_t* sender_sp) { ++ assert(is_interpreted_frame(), "interpreted frame expected"); ++ ptr_at_put(interpreter_frame_sender_sp_offset, (intptr_t) sender_sp); ++} ++ ++ ++// monitor elements ++ ++BasicObjectLock* frame::interpreter_frame_monitor_begin() const { ++ return (BasicObjectLock*) addr_at(interpreter_frame_monitor_block_bottom_offset); ++} ++ ++BasicObjectLock* frame::interpreter_frame_monitor_end() const { ++ BasicObjectLock* result = (BasicObjectLock*) at(interpreter_frame_monitor_block_top_offset); ++ // make sure the pointer points inside the frame ++ assert(sp() <= (intptr_t*) result, "monitor end should be above the stack pointer"); ++ assert((intptr_t*) result < fp(), "monitor end should be strictly below the frame pointer"); ++ return result; ++} ++ ++void 
frame::interpreter_frame_set_monitor_end(BasicObjectLock* value) {
++  *((BasicObjectLock**)addr_at(interpreter_frame_monitor_block_top_offset)) = value;
++}
++
++// Used by template based interpreter deoptimization
++void frame::interpreter_frame_set_last_sp(intptr_t* sp) {
++  *((intptr_t**)addr_at(interpreter_frame_last_sp_offset)) = sp;
++}
++
++frame frame::sender_for_entry_frame(RegisterMap* map) const {
++  assert(map != nullptr, "map must be set");
++  // Java frame called from C; skip all C frames and return top C
++  // frame of that chunk as the sender
++  JavaFrameAnchor* jfa = entry_frame_call_wrapper()->anchor();
++  assert(!entry_frame_is_first(), "next Java fp must be non zero");
++  assert(jfa->last_Java_sp() > sp(), "must be above this frame on stack");
++  // Since we are walking the stack now this nested anchor is obviously walkable
++  // even if it wasn't when it was stacked.
++  jfa->make_walkable();
++  map->clear();
++  assert(map->include_argument_oops(), "should be set by clear");
++  frame fr(jfa->last_Java_sp(), jfa->last_Java_fp(), jfa->last_Java_pc());
++  return fr;
++}
++
++UpcallStub::FrameData* UpcallStub::frame_data_for_frame(const frame& frame) const {
++  assert(frame.is_upcall_stub_frame(), "wrong frame");
++  // need unextended_sp here, since normal sp is wrong for interpreter callees
++  return reinterpret_cast<UpcallStub::FrameData*>(
++    reinterpret_cast<address>
(frame.unextended_sp()) + in_bytes(_frame_data_offset)); ++} ++ ++bool frame::upcall_stub_frame_is_first() const { ++ assert(is_upcall_stub_frame(), "must be optimzed entry frame"); ++ UpcallStub* blob = _cb->as_upcall_stub(); ++ JavaFrameAnchor* jfa = blob->jfa_for_frame(*this); ++ return jfa->last_Java_sp() == nullptr; ++} ++ ++frame frame::sender_for_upcall_stub_frame(RegisterMap* map) const { ++ assert(map != nullptr, "map must be set"); ++ UpcallStub* blob = _cb->as_upcall_stub(); ++ // Java frame called from C; skip all C frames and return top C ++ // frame of that chunk as the sender ++ JavaFrameAnchor* jfa = blob->jfa_for_frame(*this); ++ assert(!upcall_stub_frame_is_first(), "must have a frame anchor to go back to"); ++ assert(jfa->last_Java_sp() > sp(), "must be above this frame on stack"); ++ // Since we are walking the stack now this nested anchor is obviously walkable ++ // even if it wasn't when it was stacked. ++ jfa->make_walkable(); ++ map->clear(); ++ assert(map->include_argument_oops(), "should be set by clear"); ++ frame fr(jfa->last_Java_sp(), jfa->last_Java_fp(), jfa->last_Java_pc()); ++ ++ return fr; ++} ++ ++//------------------------------------------------------------------------------ ++// frame::verify_deopt_original_pc ++// ++// Verifies the calculated original PC of a deoptimization PC for the ++// given unextended SP. The unextended SP might also be the saved SP ++// for MethodHandle call sites. ++#ifdef ASSERT ++void frame::verify_deopt_original_pc(CompiledMethod* nm, intptr_t* unextended_sp) { ++ frame fr; ++ ++ // This is ugly but it's better than to change {get,set}_original_pc ++ // to take an SP value as argument. And it's only a debugging ++ // method anyway. ++ fr._unextended_sp = unextended_sp; ++ ++ address original_pc = nm->get_original_pc(&fr); ++ assert(nm->insts_contains(original_pc), ++ "original PC must be in the main code section of the compiled method (or must be immediately following it)"); ++} ++#endif ++ ++//------------------------------------------------------------------------------ ++// frame::adjust_unextended_sp ++#ifdef ASSERT ++void frame::adjust_unextended_sp() { ++ // On LoongArch, sites calling method handle intrinsics and lambda forms are treated ++ // as any other call site. Therefore, no special action is needed when we are ++ // returning to any of these call sites. ++ ++ if (_cb != nullptr) { ++ CompiledMethod* sender_cm = _cb->as_compiled_method_or_null(); ++ if (sender_cm != nullptr) { ++ // If the sender PC is a deoptimization point, get the original PC. ++ if (sender_cm->is_deopt_entry(_pc) || ++ sender_cm->is_deopt_mh_entry(_pc)) { ++ verify_deopt_original_pc(sender_cm, _unextended_sp); ++ } ++ } ++ } ++} ++#endif ++ ++//------------------------------------------------------------------------------ ++// frame::sender_for_interpreter_frame ++frame frame::sender_for_interpreter_frame(RegisterMap* map) const { ++ // SP is the raw SP from the sender after adapter or interpreter ++ // extension. ++ intptr_t* sender_sp = this->sender_sp(); ++ ++ // This is the sp before any possible extension (adapter/locals). ++ intptr_t* unextended_sp = interpreter_frame_sender_sp(); ++ ++ // The interpreter and compiler(s) always save FP in a known ++ // location on entry. We must record where that location is ++ // so this if FP was live on callout from c2 we can find ++ // the saved copy no matter what it called. 
++ ++ // Since the interpreter always saves FP if we record where it is then ++ // we don't have to always save FP on entry and exit to c2 compiled ++ // code, on entry will be enough. ++ ++#ifdef COMPILER2_OR_JVMCI ++ if (map->update_map()) { ++ update_map_with_saved_link(map, (intptr_t**) addr_at(link_offset)); ++ } ++#endif // COMPILER2_OR_JVMCI ++ ++ if (Continuation::is_return_barrier_entry(sender_pc())) { ++ if (map->walk_cont()) { // about to walk into an h-stack ++ return Continuation::top_frame(*this, map); ++ } else { ++ return Continuation::continuation_bottom_sender(map->thread(), *this, sender_sp); ++ } ++ } ++ ++ return frame(sender_sp, unextended_sp, link(), sender_pc()); ++} ++ ++bool frame::is_interpreted_frame_valid(JavaThread* thread) const { ++ assert(is_interpreted_frame(), "Not an interpreted frame"); ++ // These are reasonable sanity checks ++ if (fp() == 0 || (intptr_t(fp()) & (wordSize-1)) != 0) { ++ return false; ++ } ++ if (sp() == 0 || (intptr_t(sp()) & (wordSize-1)) != 0) { ++ return false; ++ } ++ if (fp() + interpreter_frame_initial_sp_offset < sp()) { ++ return false; ++ } ++ // These are hacks to keep us out of trouble. ++ // The problem with these is that they mask other problems ++ if (fp() <= sp()) { // this attempts to deal with unsigned comparison above ++ return false; ++ } ++ ++ // do some validation of frame elements ++ ++ // first the method ++ ++ Method* m = safe_interpreter_frame_method(); ++ ++ // validate the method we'd find in this potential sender ++ if (!Method::is_valid_method(m)) return false; ++ ++ // stack frames shouldn't be much larger than max_stack elements ++ ++ //if (fp() - sp() > 1024 + m->max_stack()*Interpreter::stackElementSize()) { ++ if (fp() - sp() > 4096) { // stack frames shouldn't be large. ++ return false; ++ } ++ ++ // validate bci/bcp ++ ++ address bcp = interpreter_frame_bcp(); ++ if (m->validate_bci_from_bcp(bcp) < 0) { ++ return false; ++ } ++ ++ // validate constantPoolCache* ++ ConstantPoolCache* cp = *interpreter_frame_cache_addr(); ++ if (MetaspaceObj::is_valid(cp) == false) { ++ return false; ++ } ++ ++ // validate locals ++ address locals = (address)interpreter_frame_locals(); ++ if (locals > thread->stack_base() /*|| locals < (address) fp() */) { ++ return false; ++ } ++ ++ // We'd have to be pretty unlucky to be mislead at this point ++ return true; ++} ++ ++BasicType frame::interpreter_frame_result(oop* oop_result, jvalue* value_result) { ++ assert(is_interpreted_frame(), "interpreted frame expected"); ++ Method* method = interpreter_frame_method(); ++ BasicType type = method->result_type(); ++ ++ intptr_t* tos_addr; ++ if (method->is_native()) { ++ tos_addr = (intptr_t*)sp(); ++ if (type == T_FLOAT || type == T_DOUBLE) { ++ // This is because we do a push(ltos) after push(dtos) in generate_native_entry. ++ tos_addr += 2 * Interpreter::stackElementWords; ++ } ++ } else { ++ tos_addr = (intptr_t*)interpreter_frame_tos_address(); ++ } ++ ++ switch (type) { ++ case T_OBJECT : ++ case T_ARRAY : { ++ oop obj; ++ if (method->is_native()) { ++ obj = cast_to_oop(at(interpreter_frame_oop_temp_offset)); ++ } else { ++ oop* obj_p = (oop*)tos_addr; ++ obj = (obj_p == nullptr) ? 
(oop)nullptr : *obj_p; ++ } ++ assert(Universe::is_in_heap_or_null(obj), "sanity check"); ++ *oop_result = obj; ++ break; ++ } ++ case T_BOOLEAN : value_result->z = *(jboolean*)tos_addr; break; ++ case T_BYTE : value_result->b = *(jbyte*)tos_addr; break; ++ case T_CHAR : value_result->c = *(jchar*)tos_addr; break; ++ case T_SHORT : value_result->s = *(jshort*)tos_addr; break; ++ case T_INT : value_result->i = *(jint*)tos_addr; break; ++ case T_LONG : value_result->j = *(jlong*)tos_addr; break; ++ case T_FLOAT : value_result->f = *(jfloat*)tos_addr; break; ++ case T_DOUBLE : value_result->d = *(jdouble*)tos_addr; break; ++ case T_VOID : /* Nothing to do */ break; ++ default : ShouldNotReachHere(); ++ } ++ ++ return type; ++} ++ ++ ++intptr_t* frame::interpreter_frame_tos_at(jint offset) const { ++ int index = (Interpreter::expr_offset_in_bytes(offset)/wordSize); ++ return &interpreter_frame_tos_address()[index]; ++} ++ ++#ifndef PRODUCT ++ ++#define DESCRIBE_FP_OFFSET(name) \ ++ values.describe(frame_no, fp() + frame::name##_offset, #name) ++ ++void frame::describe_pd(FrameValues& values, int frame_no) { ++ if (is_interpreted_frame()) { ++ DESCRIBE_FP_OFFSET(interpreter_frame_sender_sp); ++ DESCRIBE_FP_OFFSET(interpreter_frame_last_sp); ++ DESCRIBE_FP_OFFSET(interpreter_frame_method); ++ DESCRIBE_FP_OFFSET(interpreter_frame_mirror); ++ DESCRIBE_FP_OFFSET(interpreter_frame_mdp); ++ DESCRIBE_FP_OFFSET(interpreter_frame_cache); ++ DESCRIBE_FP_OFFSET(interpreter_frame_locals); ++ DESCRIBE_FP_OFFSET(interpreter_frame_bcp); ++ DESCRIBE_FP_OFFSET(interpreter_frame_initial_sp); ++ } ++ ++ if (is_java_frame() || Continuation::is_continuation_enterSpecial(*this)) { ++ intptr_t* ret_pc_loc; ++ intptr_t* fp_loc; ++ if (is_interpreted_frame()) { ++ ret_pc_loc = fp() + return_addr_offset; ++ fp_loc = fp(); ++ } else { ++ ret_pc_loc = real_fp() - 1; ++ fp_loc = real_fp() - 2; ++ } ++ address ret_pc = *(address*)ret_pc_loc; ++ values.describe(frame_no, ret_pc_loc, ++ Continuation::is_return_barrier_entry(ret_pc) ? "return address (return barrier)" : "return address"); ++ values.describe(-1, fp_loc, "saved fp", 0); // "unowned" as value belongs to sender ++ } ++} ++#endif ++ ++intptr_t *frame::initial_deoptimization_info() { ++ // used to reset the saved FP ++ return fp(); ++} ++ ++#ifndef PRODUCT ++// This is a generic constructor which is only used by pns() in debug.cpp. ++frame::frame(void* ptr_sp, void* ptr_fp, void* pc) : _on_heap(false) { ++ init((intptr_t*)ptr_sp, (intptr_t*)ptr_fp, (address)pc); ++} ++ ++#endif ++ ++void JavaFrameAnchor::make_walkable() { ++ // last frame set? ++ if (last_Java_sp() == nullptr) { return; } ++ // already walkable? ++ if (walkable()) { return; } ++ vmassert(last_Java_sp() != nullptr, "not called from Java code?"); ++ vmassert(last_Java_pc() == nullptr, "already walkable"); ++ _last_Java_pc = (address)_last_Java_sp[-1]; ++ vmassert(walkable(), "something went wrong"); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/frame_loongarch.hpp b/src/hotspot/cpu/loongarch/frame_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/frame_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/frame_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,200 @@ ++/* ++ * Copyright (c) 1997, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_FRAME_LOONGARCH_HPP ++#define CPU_LOONGARCH_FRAME_LOONGARCH_HPP ++ ++// A frame represents a physical stack frame (an activation). Frames can be ++// C or Java frames, and the Java frames can be interpreted or compiled. ++// In contrast, vframes represent source-level activations, so that one physical frame ++// can correspond to multiple source level frames because of inlining. ++// A frame is comprised of {pc, fp, sp} ++// ------------------------------ Asm interpreter ---------------------------------------- ++// Layout of asm interpreter frame: ++// Low ++// [expression stack ] * <- sp ++// [monitors ] \ ++// ... | monitor block size ++// [monitors ] / ++// [monitor block size ] ++// [byte code index/pointr] = bcx() bcx_offset ++// [pointer to locals ] = locals() locals_offset ++// [constant pool cache ] = cache() cache_offset ++// [methodData ] = mdp() mdx_offset ++// [Method ] = method() method_offset ++// [last sp ] = last_sp() last_sp_offset ++// [old stack pointer ] (sender_sp) sender_sp_offset ++// [old frame pointer ] = link() ++// [return pc ] ++// [oop temp ] <- fp (only for native calls) ++// [locals and parameters ] ++// High <- sender sp ++// ------------------------------ Asm interpreter ---------------------------------------- ++// ++// ------------------------------ Native (C frame) --------------------------------------- ++// Layout of C frame: ++// High ++// | ++// - <----- fp <- sender sp ++// fp -8 | [ra] = sender_pc() ++// fp-16 | [fp (sender)] = link() ++// | [...] 
++// | ++// - <----- sp ++// | ++// v ++// Low ++// ------------------------------ Native (C frame) --------------------------------------- ++ ++ public: ++ enum { ++ pc_return_offset = 0, ++ ++ link_offset = -2, ++ return_addr_offset = -1, ++ sender_sp_offset = 0, ++ ++ // Interpreter frames ++ interpreter_frame_result_handler_offset = 1, // for native calls only ++ interpreter_frame_oop_temp_offset = 0, // for native calls only ++ ++ interpreter_frame_sender_sp_offset = -3, ++ // outgoing sp before a call to an invoked method ++ interpreter_frame_last_sp_offset = interpreter_frame_sender_sp_offset - 1, ++ interpreter_frame_locals_offset = interpreter_frame_last_sp_offset - 1, ++ interpreter_frame_method_offset = interpreter_frame_locals_offset - 1, ++ interpreter_frame_mirror_offset = interpreter_frame_method_offset - 1, ++ interpreter_frame_mdp_offset = interpreter_frame_mirror_offset - 1, ++ interpreter_frame_cache_offset = interpreter_frame_mdp_offset - 1, ++ interpreter_frame_bcp_offset = interpreter_frame_cache_offset - 1, ++ interpreter_frame_initial_sp_offset = interpreter_frame_bcp_offset - 1, ++ ++ interpreter_frame_monitor_block_top_offset = interpreter_frame_initial_sp_offset, ++ interpreter_frame_monitor_block_bottom_offset = interpreter_frame_initial_sp_offset, ++ ++ // Entry frames ++ // n.b. these values are determined by the layout defined in ++ // stubGenerator for the Java call stub ++ entry_frame_after_call_words = 23, ++ entry_frame_call_wrapper_offset = -3, ++ ++ // we don't need a save area ++ arg_reg_save_area_bytes = 0, ++ ++ // size, in words, of frame metadata (e.g. pc and link) ++ metadata_words = 2, ++ // size, in words, of metadata at frame bottom, i.e. it is not part of the ++ // caller/callee overlap ++ metadata_words_at_bottom = metadata_words, ++ // size, in words, of frame metadata at the frame top, i.e. it is located ++ // between a callee frame and its stack arguments, where it is part ++ // of the caller/callee overlap ++ metadata_words_at_top = 0, ++ // in bytes ++ frame_alignment = 16, ++ // size, in words, of maximum shift in frame position due to alignment ++ align_wiggle = 1 ++ }; ++ ++ intptr_t ptr_at(int offset) const { ++ return *ptr_at_addr(offset); ++ } ++ ++ void ptr_at_put(int offset, intptr_t value) { ++ *ptr_at_addr(offset) = value; ++ } ++ ++ private: ++ // an additional field beyond _sp and _pc: ++ union { ++ intptr_t* _fp; // frame pointer ++ int _offset_fp; // relative frame pointer for use in stack-chunk frames ++ }; ++ // The interpreter and adapters will extend the frame of the caller. ++ // Since oopMaps are based on the sp of the caller before extension ++ // we need to know that value. However in order to compute the address ++ // of the return address we need the real "raw" sp. Since sparc already ++ // uses sp() to mean "raw" sp and unextended_sp() to mean the caller's ++ // original sp we use that convention. 
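++  // Like _fp above, the unextended sp is kept either as an absolute pointer
++  // (frames on a thread stack) or as a relative offset (frames copied into a
++  // heap-allocated stack chunk); the assert_absolute()/assert_offset() checks
++  // in the accessors guard which member of the union is currently valid.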
++ ++ union { ++ intptr_t* _unextended_sp; ++ int _offset_unextended_sp; // for use in stack-chunk frames ++ }; ++ ++ void adjust_unextended_sp() NOT_DEBUG_RETURN; ++ ++ intptr_t* ptr_at_addr(int offset) const { ++ return (intptr_t*) addr_at(offset); ++ } ++ ++#ifdef ASSERT ++ // Used in frame::sender_for_{interpreter,compiled}_frame ++ static void verify_deopt_original_pc(CompiledMethod* nm, intptr_t* unextended_sp); ++#endif ++ ++ const ImmutableOopMap* get_oop_map() const; ++ ++ public: ++ // Constructors ++ ++ frame(intptr_t* ptr_sp, intptr_t* ptr_fp, address pc); ++ ++ frame(intptr_t* ptr_sp, intptr_t* unextended_sp, intptr_t* ptr_fp, address pc); ++ ++ frame(intptr_t* sp, intptr_t* unextended_sp, intptr_t* fp, address pc, CodeBlob* cb); ++ // used for fast frame construction by continuations ++ frame(intptr_t* sp, intptr_t* unextended_sp, intptr_t* fp, address pc, CodeBlob* cb, const ImmutableOopMap* oop_map, bool on_heap); ++ ++ frame(intptr_t* ptr_sp, intptr_t* ptr_fp); ++ ++ void init(intptr_t* ptr_sp, intptr_t* ptr_fp, address pc); ++ void setup(address pc); ++ ++ // accessors for the instance variables ++ // Note: not necessarily the real 'frame pointer' (see real_fp) ++ ++ intptr_t* fp() const { assert_absolute(); return _fp; } ++ void set_fp(intptr_t* newfp) { _fp = newfp; } ++ int offset_fp() const { assert_offset(); return _offset_fp; } ++ void set_offset_fp(int value) { assert_on_heap(); _offset_fp = value; } ++ ++ inline address* sender_pc_addr() const; ++ ++ // expression stack tos if we are nested in a java call ++ intptr_t* interpreter_frame_last_sp() const; ++ ++ template ++ static void update_map_with_saved_link(RegisterMapT* map, intptr_t** link_addr); ++ ++ // deoptimization support ++ void interpreter_frame_set_last_sp(intptr_t* last_sp); ++ ++ static jint interpreter_frame_expression_stack_direction() { return -1; } ++ ++ // returns the sending frame, without applying any barriers ++ inline frame sender_raw(RegisterMap* map) const; ++ ++#endif // CPU_LOONGARCH_FRAME_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/frame_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/frame_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/frame_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/frame_loongarch.inline.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,472 @@ ++/* ++ * Copyright (c) 1997, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_FRAME_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_FRAME_LOONGARCH_INLINE_HPP ++ ++#include "code/codeBlob.inline.hpp" ++#include "code/codeCache.inline.hpp" ++#include "code/vmreg.inline.hpp" ++#include "interpreter/interpreter.hpp" ++#include "interpreter/oopMapCache.hpp" ++#include "runtime/sharedRuntime.hpp" ++ ++// Inline functions for Loongson frames: ++ ++// Constructors: ++ ++inline frame::frame() { ++ _pc = nullptr; ++ _sp = nullptr; ++ _unextended_sp = nullptr; ++ _fp = nullptr; ++ _cb = nullptr; ++ _deopt_state = unknown; ++ _on_heap = false; ++ DEBUG_ONLY(_frame_index = -1;) ++} ++ ++inline void frame::init(intptr_t* ptr_sp, intptr_t* ptr_fp, address pc) { ++ intptr_t a = intptr_t(ptr_sp); ++ intptr_t b = intptr_t(ptr_fp); ++ _sp = ptr_sp; ++ _unextended_sp = ptr_sp; ++ _fp = ptr_fp; ++ _pc = pc; ++ _oop_map = nullptr; ++ _on_heap = false; ++ DEBUG_ONLY(_frame_index = -1;) ++ ++ assert(pc != nullptr, "no pc?"); ++ _cb = CodeCache::find_blob(pc); ++ setup(pc); ++} ++ ++inline void frame::setup(address pc) { ++ adjust_unextended_sp(); ++ ++ address original_pc = CompiledMethod::get_deopt_original_pc(this); ++ if (original_pc != nullptr) { ++ _pc = original_pc; ++ _deopt_state = is_deoptimized; ++ assert(_cb == nullptr || _cb->as_compiled_method()->insts_contains_inclusive(_pc), ++ "original PC must be in the main code section of the compiled method (or must be immediately following it)"); ++ } else { ++ if (_cb == SharedRuntime::deopt_blob()) { ++ _deopt_state = is_deoptimized; ++ } else { ++ _deopt_state = not_deoptimized; ++ } ++ } ++} ++ ++inline frame::frame(intptr_t* ptr_sp, intptr_t* ptr_fp, address pc) { ++ init(ptr_sp, ptr_fp, pc); ++} ++ ++inline frame::frame(intptr_t* ptr_sp, intptr_t* unextended_sp, intptr_t* ptr_fp, address pc, CodeBlob* cb) { ++ intptr_t a = intptr_t(ptr_sp); ++ intptr_t b = intptr_t(ptr_fp); ++ _sp = ptr_sp; ++ _unextended_sp = unextended_sp; ++ _fp = ptr_fp; ++ _pc = pc; ++ assert(pc != nullptr, "no pc?"); ++ _cb = cb; ++ _oop_map = nullptr; ++ assert(_cb != nullptr, "pc: " INTPTR_FORMAT, p2i(pc)); ++ _on_heap = false; ++ DEBUG_ONLY(_frame_index = -1;) ++ setup(pc); ++} ++ ++inline frame::frame(intptr_t* ptr_sp, intptr_t* unextended_sp, intptr_t* ptr_fp, address pc, CodeBlob* cb, ++ const ImmutableOopMap* oop_map, bool on_heap) { ++ _sp = ptr_sp; ++ _unextended_sp = unextended_sp; ++ _fp = ptr_fp; ++ _pc = pc; ++ _cb = cb; ++ _oop_map = oop_map; ++ _deopt_state = not_deoptimized; ++ _on_heap = on_heap; ++ DEBUG_ONLY(_frame_index = -1;) ++ ++ // In thaw, non-heap frames use this constructor to pass oop_map. I don't know why. 
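++  // Either way, when a CodeBlob is supplied the call to setup(pc) below re-checks
++  // for a deoptimized PC and keeps _pc/_deopt_state consistent with the other
++  // constructors.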
++ assert(_on_heap || _cb != nullptr, "these frames are always heap frames"); ++ if (cb != nullptr) { ++ setup(pc); ++ } ++#ifdef ASSERT ++ // The following assertion has been disabled because it would sometime trap for Continuation.run, ++ // which is not *in* a continuation and therefore does not clear the _cont_fastpath flag, but this ++ // is benign even in fast mode (see Freeze::setup_jump) ++ // We might freeze deoptimized frame in slow mode ++ // assert(_pc == pc && _deopt_state == not_deoptimized, ""); ++#endif ++} ++ ++inline frame::frame(intptr_t* ptr_sp, intptr_t* unextended_sp, intptr_t* ptr_fp, address pc) { ++ intptr_t a = intptr_t(ptr_sp); ++ intptr_t b = intptr_t(ptr_fp); ++ _sp = ptr_sp; ++ _unextended_sp = unextended_sp; ++ _fp = ptr_fp; ++ _pc = pc; ++ assert(pc != nullptr, "no pc?"); ++ _cb = CodeCache::find_blob_fast(pc); ++ assert(_cb != nullptr, "pc: " INTPTR_FORMAT " sp: " INTPTR_FORMAT " unextended_sp: " INTPTR_FORMAT " fp: " INTPTR_FORMAT, p2i(pc), p2i(ptr_sp), p2i(unextended_sp), p2i(ptr_fp)); ++ _oop_map = nullptr; ++ _on_heap = false; ++ DEBUG_ONLY(_frame_index = -1;) ++ ++ setup(pc); ++} ++ ++inline frame::frame(intptr_t* ptr_sp) : frame(ptr_sp, ptr_sp, *(intptr_t**)(ptr_sp - 2), *(address*)(ptr_sp - 1)) {} ++ ++inline frame::frame(intptr_t* ptr_sp, intptr_t* ptr_fp) { ++ intptr_t a = intptr_t(ptr_sp); ++ intptr_t b = intptr_t(ptr_fp); ++ _sp = ptr_sp; ++ _unextended_sp = ptr_sp; ++ _fp = ptr_fp; ++ _pc = (address)(ptr_sp[-1]); ++ _on_heap = false; ++ DEBUG_ONLY(_frame_index = -1;) ++ ++ // Here's a sticky one. This constructor can be called via AsyncGetCallTrace ++ // when last_Java_sp is non-null but the pc fetched is junk. ++ // AsyncGetCallTrace -> pd_get_top_frame_for_signal_handler ++ // -> pd_last_frame should use a specialized version of pd_last_frame which could ++ // call a specilaized frame constructor instead of this one. ++ // Then we could use the assert below. However this assert is of somewhat dubious ++ // value. ++ ++ _cb = CodeCache::find_blob(_pc); ++ adjust_unextended_sp(); ++ ++ address original_pc = CompiledMethod::get_deopt_original_pc(this); ++ if (original_pc != nullptr) { ++ _pc = original_pc; ++ _deopt_state = is_deoptimized; ++ } else { ++ _deopt_state = not_deoptimized; ++ } ++} ++ ++// Accessors ++ ++inline bool frame::equal(frame other) const { ++ bool ret = sp() == other.sp() && ++ unextended_sp() == other.unextended_sp() && ++ fp() == other.fp() && ++ pc() == other.pc(); ++ assert(!ret || ret && cb() == other.cb() && _deopt_state == other._deopt_state, "inconsistent construction"); ++ return ret; ++} ++ ++// Return unique id for this frame. The id must have a value where we can distinguish ++// identity and younger/older relationship. null represents an invalid (incomparable) ++// frame. ++inline intptr_t* frame::id(void) const { return unextended_sp(); } ++ ++// Return true if the frame is older (less recent activation) than the frame represented by id ++inline bool frame::is_older(intptr_t* id) const { assert(this->id() != nullptr && id != nullptr, "nullptr frame id"); ++ return this->id() > id ; } ++ ++inline intptr_t* frame::link() const { return (intptr_t*) *(intptr_t **)addr_at(link_offset); } ++ ++inline intptr_t* frame::link_or_null() const { ++ intptr_t** ptr = (intptr_t **)addr_at(link_offset); ++ return os::is_readable_pointer(ptr) ? 
*ptr : nullptr; ++} ++ ++inline intptr_t* frame::unextended_sp() const { assert_absolute(); return _unextended_sp; } ++inline void frame::set_unextended_sp(intptr_t* value) { _unextended_sp = value; } ++inline int frame::offset_unextended_sp() const { assert_offset(); return _offset_unextended_sp; } ++inline void frame::set_offset_unextended_sp(int value) { assert_on_heap(); _offset_unextended_sp = value; } ++ ++inline intptr_t* frame::real_fp() const { ++ if (_cb != nullptr) { ++ // use the frame size if valid ++ int size = _cb->frame_size(); ++ if (size > 0) { ++ return unextended_sp() + size; ++ } ++ } ++ // else rely on fp() ++ assert(!is_compiled_frame(), "unknown compiled frame size"); ++ return fp(); ++} ++ ++inline int frame::frame_size() const { ++ return is_interpreted_frame() ++ ? sender_sp() - sp() ++ : cb()->frame_size(); ++} ++ ++inline int frame::compiled_frame_stack_argsize() const { ++ assert(cb()->is_compiled(), ""); ++ return (cb()->as_compiled_method()->method()->num_stack_arg_slots() * VMRegImpl::stack_slot_size) >> LogBytesPerWord; ++} ++ ++inline void frame::interpreted_frame_oop_map(InterpreterOopMap* mask) const { ++ assert(mask != nullptr, ""); ++ Method* m = interpreter_frame_method(); ++ int bci = interpreter_frame_bci(); ++ m->mask_for(bci, mask); // OopMapCache::compute_one_oop_map(m, bci, mask); ++} ++ ++// Return address ++inline address* frame::sender_pc_addr() const { return (address*) addr_at(return_addr_offset); } ++inline address frame::sender_pc() const { return *sender_pc_addr(); } ++inline intptr_t* frame::sender_sp() const { return addr_at(sender_sp_offset); } ++ ++inline intptr_t* frame::interpreter_frame_locals() const { ++ intptr_t n = *addr_at(interpreter_frame_locals_offset); ++ return &fp()[n]; // return relativized locals ++} ++ ++inline intptr_t* frame::interpreter_frame_last_sp() const { ++ return (intptr_t*)at(interpreter_frame_last_sp_offset); ++} ++ ++inline intptr_t* frame::interpreter_frame_bcp_addr() const { ++ return (intptr_t*)addr_at(interpreter_frame_bcp_offset); ++} ++ ++inline intptr_t* frame::interpreter_frame_mdp_addr() const { ++ return (intptr_t*)addr_at(interpreter_frame_mdp_offset); ++} ++ ++ ++// Constant pool cache ++ ++inline ConstantPoolCache** frame::interpreter_frame_cache_addr() const { ++ return (ConstantPoolCache**)addr_at(interpreter_frame_cache_offset); ++} ++ ++// Method ++ ++inline Method** frame::interpreter_frame_method_addr() const { ++ return (Method**)addr_at(interpreter_frame_method_offset); ++} ++ ++// Mirror ++ ++inline oop* frame::interpreter_frame_mirror_addr() const { ++ return (oop*)addr_at(interpreter_frame_mirror_offset); ++} ++ ++// top of expression stack ++inline intptr_t* frame::interpreter_frame_tos_address() const { ++ intptr_t* last_sp = interpreter_frame_last_sp(); ++ if (last_sp == nullptr) { ++ return sp(); ++ } else { ++ // sp() may have been extended or shrunk by an adapter. At least ++ // check that we don't fall behind the legal region. ++ // For top deoptimized frame last_sp == interpreter_frame_monitor_end. 
++ assert(last_sp <= (intptr_t*) interpreter_frame_monitor_end(), "bad tos"); ++ return last_sp; ++ } ++} ++ ++inline oop* frame::interpreter_frame_temp_oop_addr() const { ++ return (oop *)(fp() + interpreter_frame_oop_temp_offset); ++} ++ ++inline int frame::interpreter_frame_monitor_size() { ++ return BasicObjectLock::size(); ++} ++ ++ ++// expression stack ++// (the max_stack arguments are used by the GC; see class FrameClosure) ++ ++inline intptr_t* frame::interpreter_frame_expression_stack() const { ++ intptr_t* monitor_end = (intptr_t*) interpreter_frame_monitor_end(); ++ return monitor_end - 1; ++} ++ ++ ++// Entry frames ++ ++inline JavaCallWrapper** frame::entry_frame_call_wrapper_addr() const { ++ return (JavaCallWrapper**)addr_at(entry_frame_call_wrapper_offset); ++} ++ ++ ++// Compiled frames ++inline oop frame::saved_oop_result(RegisterMap* map) const { ++ oop* result_adr = (oop *)map->location(A0->as_VMReg(), nullptr); ++ guarantee(result_adr != nullptr, "bad register save location"); ++ return (*result_adr); ++} ++ ++inline void frame::set_saved_oop_result(RegisterMap* map, oop obj) { ++ oop* result_adr = (oop *)map->location(A0->as_VMReg(), nullptr); ++ guarantee(result_adr != nullptr, "bad register save location"); ++ *result_adr = obj; ++} ++ ++inline bool frame::is_interpreted_frame() const { ++ return Interpreter::contains(pc()); ++} ++ ++inline int frame::sender_sp_ret_address_offset() { ++ return frame::sender_sp_offset - frame::return_addr_offset; ++} ++ ++inline const ImmutableOopMap* frame::get_oop_map() const { ++ if (_cb == nullptr) return nullptr; ++ if (_cb->oop_maps() != nullptr) { ++ NativePostCallNop* nop = nativePostCallNop_at(_pc); ++ if (nop != nullptr && nop->displacement() != 0) { ++ int slot = ((nop->displacement() >> 24) & 0xff); ++ return _cb->oop_map_for_slot(slot, _pc); ++ } ++ const ImmutableOopMap* oop_map = OopMapSet::find_map(this); ++ return oop_map; ++ } ++ return nullptr; ++} ++ ++//------------------------------------------------------------------------------ ++// frame::sender ++frame frame::sender(RegisterMap* map) const { ++ frame result = sender_raw(map); ++ ++ if (map->process_frames() && !map->in_cont()) { ++ StackWatermarkSet::on_iteration(map->thread(), result); ++ } ++ ++ return result; ++} ++ ++//------------------------------------------------------------------------------ ++// frame::sender_raw ++frame frame::sender_raw(RegisterMap* map) const { ++ // Default is we done have to follow them. The sender_for_xxx will ++ // update it accordingly ++ assert(map != nullptr, "map must be set"); ++ map->set_include_argument_oops(false); ++ ++ if (map->in_cont()) { // already in an h-stack ++ return map->stack_chunk()->sender(*this, map); ++ } ++ ++ if (is_entry_frame()) { ++ return sender_for_entry_frame(map); ++ } ++ if (is_upcall_stub_frame()) { ++ return sender_for_upcall_stub_frame(map); ++ } ++ if (is_interpreted_frame()) { ++ return sender_for_interpreter_frame(map); ++ } ++ ++ assert(_cb == CodeCache::find_blob(pc()),"Must be the same"); ++ if (_cb != nullptr) { ++ return sender_for_compiled_frame(map); ++ } ++ ++ // Must be native-compiled frame, i.e. the marshaling code for native ++ // methods that exists in the core system. ++ ++ // Native code may or may not have signed the return address, we have no way to be sure ++ // or what signing methods they used. Instead, just ensure the stripped value is used. 
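For the native/stub case the sender is rebuilt in the return statement just below purely from the fixed frame layout: the return address and the saved fp sit in the two stack words immediately below the sender's sp (frame::return_addr_offset and frame::link_offset; the -1/-2 values are inferred from the delegating frame(intptr_t*) constructor earlier in this file and are an assumption of this sketch). Roughly:

#include <cstdint>
#include <cstdio>

struct MiniFrame {
  intptr_t* sp;
  intptr_t* fp;
  intptr_t  pc;    // a real frame stores an address; an integer will do here
};

// Recover the caller's frame given its sp, as frame(sender_sp(), link(), sender_pc()) does.
MiniFrame sender_of(intptr_t* sender_sp) {
  MiniFrame s;
  s.sp = sender_sp;
  s.fp = (intptr_t*) sender_sp[-2];   // saved fp  (link_offset)
  s.pc = sender_sp[-1];               // return pc (return_addr_offset)
  return s;
}

int main() {
  intptr_t stack[8] = {};
  stack[2] = (intptr_t) &stack[6];    // pretend saved fp
  stack[3] = 0x1234;                  // pretend return address
  MiniFrame s = sender_of(&stack[4]);
  std::printf("fp=%p pc=%#lx\n", (void*) s.fp, (long) s.pc);
  return 0;
}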
++ ++ return frame(sender_sp(), link(), sender_pc()); ++} ++ ++//------------------------------------------------------------------------------ ++// frame::sender_for_compiled_frame ++frame frame::sender_for_compiled_frame(RegisterMap* map) const { ++ // we cannot rely upon the last fp having been saved to the thread ++ // in C2 code but it will have been pushed onto the stack. so we ++ // have to find it relative to the unextended sp ++ ++ assert(_cb->frame_size() > 0, "must have non-zero frame size"); ++ intptr_t* l_sender_sp = unextended_sp() + _cb->frame_size(); ++ ++ // the return_address is always the word on the stack ++ address sender_pc = (address) *(l_sender_sp + frame::return_addr_offset); ++ ++ intptr_t** saved_fp_addr = (intptr_t**) (l_sender_sp + frame::link_offset); ++ ++ assert(map != nullptr, "map must be set"); ++ if (map->update_map()) { ++ // Tell GC to use argument oopmaps for some runtime stubs that need it. ++ // For C1, the runtime stub might not have oop maps, so set this flag ++ // outside of update_register_map. ++ if (!_cb->is_compiled()) { // compiled frames do not use callee-saved registers ++ map->set_include_argument_oops(_cb->caller_must_gc_arguments(map->thread())); ++ if (oop_map() != nullptr) { ++ _oop_map->update_register_map(this, map); ++ } ++ } else { ++ assert(!_cb->caller_must_gc_arguments(map->thread()), ""); ++ assert(!map->include_argument_oops(), ""); ++ assert(oop_map() == nullptr || !oop_map()->has_any(OopMapValue::callee_saved_value), "callee-saved value in compiled frame"); ++ } ++ ++ // Since the prolog does the save and restore of FP there is no ++ // oopmap for it so we must fill in its location as if there was ++ // an oopmap entry since if our caller was compiled code there ++ // could be live jvm state in it. ++ update_map_with_saved_link(map, saved_fp_addr); ++ } ++ ++ if (Continuation::is_return_barrier_entry(sender_pc)) { ++ if (map->walk_cont()) { // about to walk into an h-stack ++ return Continuation::top_frame(*this, map); ++ } else { ++ return Continuation::continuation_bottom_sender(map->thread(), *this, l_sender_sp); ++ } ++ } ++ ++ intptr_t* unextended_sp = l_sender_sp; ++ return frame(l_sender_sp, unextended_sp, *saved_fp_addr, sender_pc); ++} ++ ++//------------------------------------------------------------------------------ ++// frame::update_map_with_saved_link ++template ++void frame::update_map_with_saved_link(RegisterMapT* map, intptr_t** link_addr) { ++ assert(map != nullptr, "map must be set"); ++ // The interpreter and compiler(s) always save FP in a known ++ // location on entry. C2-compiled code uses FP as an allocatable ++ // callee-saved register. We must record where that location is so ++ // that if FP was live on callout from C2 we can find the saved copy. ++ map->set_location(FP->as_VMReg(), (address) link_addr); ++ // this is weird "H" ought to be at a higher address however the ++ // oopMaps seems to have the "H" regs at the same address and the ++ // vanilla register. 
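What update_map_with_saved_link() records here, conceptually, is a mapping from "register FP" to "the stack slot where this frame saved the caller's FP", so that a later oop-map visitor can find (and update) an oop the caller happened to keep in FP. A toy model, with invented names and a std::map standing in for RegisterMap:

#include <cstdint>
#include <iostream>
#include <map>
#include <string>

int main() {
  std::map<std::string, intptr_t*> reg_to_save_slot;   // toy RegisterMap

  intptr_t caller_fp_value = 0xdead;          // pretend the caller kept an oop in FP
  intptr_t saved_fp_slot   = caller_fp_value; // the callee prologue saved it here

  // What the map update amounts to:
  reg_to_save_slot["FP"] = &saved_fp_slot;

  // A visitor that sees "value lives in FP" dereferences the recorded slot
  // instead of a machine register that has long since been reused.
  std::cout << std::hex << *reg_to_save_slot["FP"] << "\n";
  return 0;
}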
++ map->set_location(FP->as_VMReg()->next(), (address) link_addr); ++} ++ ++#endif // CPU_LOONGARCH_FRAME_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/g1/g1BarrierSetAssembler_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/g1/g1BarrierSetAssembler_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/g1/g1BarrierSetAssembler_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/g1/g1BarrierSetAssembler_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,491 @@ ++/* ++ * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "gc/g1/g1BarrierSet.hpp" ++#include "gc/g1/g1BarrierSetAssembler.hpp" ++#include "gc/g1/g1BarrierSetRuntime.hpp" ++#include "gc/g1/g1CardTable.hpp" ++#include "gc/g1/g1ThreadLocalData.hpp" ++#include "gc/g1/heapRegion.hpp" ++#include "interpreter/interp_masm.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "utilities/macros.hpp" ++#ifdef COMPILER1 ++#include "c1/c1_LIRAssembler.hpp" ++#include "c1/c1_MacroAssembler.hpp" ++#include "gc/g1/c1/g1BarrierSetC1.hpp" ++#endif ++ ++#define __ masm-> ++ ++void G1BarrierSetAssembler::gen_write_ref_array_pre_barrier(MacroAssembler* masm, DecoratorSet decorators, ++ Register addr, Register count, RegSet saved_regs) { ++ bool dest_uninitialized = (decorators & IS_DEST_UNINITIALIZED) != 0; ++ ++ if (!dest_uninitialized) { ++ Label filtered; ++ Address in_progress(TREG, in_bytes(G1ThreadLocalData::satb_mark_queue_active_offset())); ++ // Is marking active? 
++ if (in_bytes(SATBMarkQueue::byte_width_of_active()) == 4) { ++ __ ld_w(AT, in_progress); ++ } else { ++ assert(in_bytes(SATBMarkQueue::byte_width_of_active()) == 1, "Assumption"); ++ __ ld_b(AT, in_progress); ++ } ++ ++ __ beqz(AT, filtered); ++ ++ __ push(saved_regs); ++ if (count == A0) { ++ if (addr == A1) { ++ __ move(AT, A0); ++ __ move(A0, A1); ++ __ move(A1, AT); ++ } else { ++ __ move(A1, count); ++ __ move(A0, addr); ++ } ++ } else { ++ __ move(A0, addr); ++ __ move(A1, count); ++ } ++ if (UseCompressedOops) { ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_array_pre_narrow_oop_entry), 2); ++ } else { ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_array_pre_oop_entry), 2); ++ } ++ __ pop(saved_regs); ++ ++ __ bind(filtered); ++ } ++} ++ ++void G1BarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators, ++ Register addr, Register count, Register tmp, RegSet saved_regs) { ++ __ push(saved_regs); ++ if (count == A0) { ++ assert_different_registers(A1, addr); ++ __ move(A1, count); ++ __ move(A0, addr); ++ } else { ++ assert_different_registers(A0, count); ++ __ move(A0, addr); ++ __ move(A1, count); ++ } ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_array_post_entry), 2); ++ __ pop(saved_regs); ++} ++ ++void G1BarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Register dst, Address src, Register tmp1, Register tmp2) { ++ bool on_oop = is_reference_type(type); ++ bool on_weak = (decorators & ON_WEAK_OOP_REF) != 0; ++ bool on_phantom = (decorators & ON_PHANTOM_OOP_REF) != 0; ++ bool on_reference = on_weak || on_phantom; ++ ModRefBarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp2); ++ if (on_oop && on_reference) { ++ // RA is live. It must be saved around calls. ++ __ enter(); // barrier may call runtime ++ // Generate the G1 pre-barrier code to log the value of ++ // the referent field in an SATB buffer. ++ g1_write_barrier_pre(masm /* masm */, ++ noreg /* obj */, ++ dst /* pre_val */, ++ TREG /* thread */, ++ tmp1 /* tmp1 */, ++ tmp2 /* tmp2 */, ++ true /* tosca_live */, ++ true /* expand_call */); ++ __ leave(); ++ } ++} ++ ++void G1BarrierSetAssembler::g1_write_barrier_pre(MacroAssembler* masm, ++ Register obj, ++ Register pre_val, ++ Register thread, ++ Register tmp1, ++ Register tmp2, ++ bool tosca_live, ++ bool expand_call) { ++ // If expand_call is true then we expand the call_VM_leaf macro ++ // directly to skip generating the check by ++ // InterpreterMacroAssembler::call_VM_leaf_base that checks _last_sp. ++ ++ assert(thread == TREG, "must be"); ++ ++ Label done; ++ Label runtime; ++ ++ assert_different_registers(obj, pre_val, tmp1, tmp2); ++ assert(pre_val != noreg && tmp1 != noreg && tmp2 != noreg, "expecting a register"); ++ ++ Address in_progress(thread, in_bytes(G1ThreadLocalData::satb_mark_queue_active_offset())); ++ Address index(thread, in_bytes(G1ThreadLocalData::satb_mark_queue_index_offset())); ++ Address buffer(thread, in_bytes(G1ThreadLocalData::satb_mark_queue_buffer_offset())); ++ ++ // Is marking active? ++ if (in_bytes(SATBMarkQueue::byte_width_of_active()) == 4) { ++ __ ld_w(tmp1, in_progress); ++ } else { ++ assert(in_bytes(SATBMarkQueue::byte_width_of_active()) == 1, "Assumption"); ++ __ ld_b(tmp1, in_progress); ++ } ++ __ beqz(tmp1, done); ++ ++ // Do we need to load the previous value? 
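Taken end to end, the pre-barrier this method emits (its body continues below) implements the SATB logging protocol: if marking is active and the field's previous value is non-null, push that value into the thread-local SATB buffer, whose index counts down in bytes, and fall back to the runtime when the buffer is full. A plain C++ sketch with invented names:

#include <cstddef>
#include <cstdint>
#include <cstdio>

struct SatbQueue {
  bool   active = true;                 // satb_mark_queue_active
  size_t index  = 4 * sizeof(void*);    // bytes left; counts down to 0
  void*  buffer[4] = {};                // satb_mark_queue_buffer
};

// Stand-in for the write_ref_field_pre_entry runtime call: the real runtime
// logs the value and installs a fresh buffer.
static void runtime_pre_entry(SatbQueue& q, void* pre_val) {
  std::printf("slow path logs %p\n", pre_val);
  q.index = sizeof(q.buffer);
}

static void pre_barrier(SatbQueue& q, void* pre_val) {
  if (!q.active) return;                // marking not active: nothing to do
  if (pre_val == nullptr) return;       // previous value null: nothing to log
  if (q.index == 0) {                   // buffer full: take the slow path
    runtime_pre_entry(q, pre_val);
    return;
  }
  q.index -= sizeof(void*);             // bump the byte index down ...
  q.buffer[q.index / sizeof(void*)] = pre_val;   // ... and record the old value
}

int main() {
  SatbQueue q;
  int dummy;
  for (int i = 0; i < 6; i++) pre_barrier(q, &dummy);
  return 0;
}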
++ if (obj != noreg) { ++ __ load_heap_oop(pre_val, Address(obj, 0), noreg, noreg, AS_RAW); ++ } ++ ++ // Is the previous value null? ++ __ beqz(pre_val, done); ++ ++ // Can we store original value in the thread's buffer? ++ // Is index == 0? ++ // (The index field is typed as size_t.) ++ ++ __ ld_d(tmp1, index); ++ __ beqz(tmp1, runtime); ++ ++ __ addi_d(tmp1, tmp1, -1 * wordSize); ++ __ st_d(tmp1, index); ++ __ ld_d(tmp2, buffer); ++ ++ // Record the previous value ++ __ stx_d(pre_val, tmp1, tmp2); ++ __ b(done); ++ ++ __ bind(runtime); ++ ++ __ push_call_clobbered_registers(); ++ ++ // Calling the runtime using the regular call_VM_leaf mechanism generates ++ // code (generated by InterpreterMacroAssember::call_VM_leaf_base) ++ // that checks that the *(ebp+frame::interpreter_frame_last_sp) == nullptr. ++ // ++ // If we care generating the pre-barrier without a frame (e.g. in the ++ // intrinsified Reference.get() routine) then ebp might be pointing to ++ // the caller frame and so this check will most likely fail at runtime. ++ // ++ // Expanding the call directly bypasses the generation of the check. ++ // So when we do not have have a full interpreter frame on the stack ++ // expand_call should be passed true. ++ ++ if (expand_call) { ++ assert(pre_val != A1, "smashed arg"); ++ __ super_call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_field_pre_entry), pre_val, thread); ++ } else { ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_field_pre_entry), pre_val, thread); ++ } ++ ++ __ pop_call_clobbered_registers(); ++ ++ __ bind(done); ++} ++ ++void G1BarrierSetAssembler::g1_write_barrier_post(MacroAssembler* masm, ++ Register store_addr, ++ Register new_val, ++ Register thread, ++ Register tmp1, ++ Register tmp2) { ++ assert(thread == TREG, "must be"); ++ assert_different_registers(store_addr, thread, tmp1, tmp2, SCR1); ++ assert(store_addr != noreg && new_val != noreg && tmp1 != noreg ++ && tmp2 != noreg, "expecting a register"); ++ ++ Address queue_index(thread, in_bytes(G1ThreadLocalData::dirty_card_queue_index_offset())); ++ Address buffer(thread, in_bytes(G1ThreadLocalData::dirty_card_queue_buffer_offset())); ++ ++ CardTableBarrierSet* ct = barrier_set_cast(BarrierSet::barrier_set()); ++ assert(sizeof(*ct->card_table()->byte_map_base()) == sizeof(jbyte), "adjust this code"); ++ ++ Label done; ++ Label runtime; ++ ++ // Does store cross heap regions? ++ __ xorr(tmp1, store_addr, new_val); ++ __ srli_d(tmp1, tmp1, HeapRegion::LogOfHRGrainBytes); ++ __ beqz(tmp1, done); ++ ++ // crosses regions, storing null? ++ __ beqz(new_val, done); ++ ++ // storing region crossing non-null, is card already dirty? ++ const Register card_addr = tmp1; ++ ++ __ srli_d(card_addr, store_addr, CardTable::card_shift()); ++ // Do not use ExternalAddress to load 'byte_map_base', since 'byte_map_base' is NOT ++ // a valid address and therefore is not properly handled by the relocation code. ++ __ li(tmp2, (intptr_t)ct->card_table()->byte_map_base()); ++ __ add_d(card_addr, card_addr, tmp2); ++ ++ __ ld_bu(tmp2, card_addr, 0); ++ __ addi_d(tmp2, tmp2, -1 * (int)G1CardTable::g1_young_card_val()); ++ __ beqz(tmp2, done); ++ ++ assert((int)CardTable::dirty_card_val() == 0, "must be 0"); ++ ++ __ membar(__ StoreLoad); ++ __ ld_bu(tmp2, card_addr, 0); ++ __ beqz(tmp2, done); ++ ++ // storing a region crossing, non-null oop, card is clean. ++ // dirty card and log. 
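Before the card is dirtied and enqueued below, the post-barrier above filters out the common cases in which no remembered-set work is needed: the store stays within one region, the stored value is null, the card is still young, or another thread already dirtied it (re-checked after a StoreLoad fence). A sketch of that filtering, with stand-in sizes and card values:

#include <cstdint>
#include <cstdio>

const int     kLogRegionSize = 21;     // stand-in for HeapRegion::LogOfHRGrainBytes
const int     kCardShift     = 9;      // stand-in for CardTable::card_shift()
const uint8_t kYoungCard     = 2;      // stand-in for g1_young_card_val()
const uint8_t kDirtyCard     = 0;      // dirty_card_val() is asserted to be 0

uint8_t card_table[1 << 16];           // fake byte map indexed by addr >> kCardShift

bool needs_enqueue(uintptr_t store_addr, uintptr_t new_val) {
  if (((store_addr ^ new_val) >> kLogRegionSize) == 0) return false;  // same region
  if (new_val == 0)                                    return false;  // storing null
  uint8_t* card = &card_table[store_addr >> kCardShift];
  if (*card == kYoungCard)                             return false;  // young card
  // (StoreLoad fence here in the real code, then re-read the card)
  if (*card == kDirtyCard)                             return false;  // already dirty
  *card = kDirtyCard;                  // dirty the card; the caller enqueues it
  return true;
}

int main() {
  card_table[0x10000 >> kCardShift] = 0xff;                 // pretend the card is clean
  std::printf("%d\n", needs_enqueue(0x10000, 0x400000));    // crosses regions: 1
  std::printf("%d\n", needs_enqueue(0x10000, 0x10008));     // same region: 0
  return 0;
}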
++ __ st_b(R0, card_addr, 0); ++ ++ __ ld_d(SCR1, queue_index); ++ __ beqz(SCR1, runtime); ++ __ addi_d(SCR1, SCR1, -1 * wordSize); ++ __ st_d(SCR1, queue_index); ++ __ ld_d(tmp2, buffer); ++ __ ld_d(SCR1, queue_index); ++ __ stx_d(card_addr, tmp2, SCR1); ++ __ b(done); ++ ++ __ bind(runtime); ++ // save the live input values ++ __ push(store_addr); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_field_post_entry), card_addr, TREG); ++ __ pop(store_addr); ++ ++ __ bind(done); ++} ++ ++void G1BarrierSetAssembler::oop_store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3) { ++ bool in_heap = (decorators & IN_HEAP) != 0; ++ bool as_normal = (decorators & AS_NORMAL) != 0; ++ assert((decorators & IS_DEST_UNINITIALIZED) == 0, "unsupported"); ++ ++ bool needs_pre_barrier = as_normal; ++ bool needs_post_barrier = val != noreg && in_heap; ++ ++ // flatten object address if needed ++ // We do it regardless of precise because we need the registers ++ if (dst.index() == noreg && dst.disp() == 0) { ++ if (dst.base() != tmp3) { ++ __ move(tmp3, dst.base()); ++ } ++ } else { ++ __ lea(tmp3, dst); ++ } ++ ++ if (needs_pre_barrier) { ++ g1_write_barrier_pre(masm /*masm*/, ++ tmp3 /* obj */, ++ tmp2 /* pre_val */, ++ TREG /* thread */, ++ tmp1 /* tmp1 */, ++ SCR1 /* tmp2 */, ++ val != noreg /* tosca_live */, ++ false /* expand_call */); ++ } ++ if (val == noreg) { ++ BarrierSetAssembler::store_at(masm, decorators, type, Address(tmp3, 0), val, noreg, noreg, noreg); ++ } else { ++ Register new_val = val; ++ if (needs_post_barrier) { ++ // G1 barrier needs uncompressed oop for region cross check. ++ if (UseCompressedOops) { ++ new_val = tmp2; ++ __ move(new_val, val); ++ } ++ } ++ BarrierSetAssembler::store_at(masm, decorators, type, Address(tmp3, 0), val, noreg, noreg, noreg); ++ if (needs_post_barrier) { ++ g1_write_barrier_post(masm /*masm*/, ++ tmp3 /* store_adr */, ++ new_val /* new_val */, ++ TREG /* thread */, ++ tmp1 /* tmp1 */, ++ tmp2 /* tmp2 */); ++ } ++ } ++} ++ ++#ifdef COMPILER1 ++ ++#undef __ ++#define __ ce->masm()-> ++ ++void G1BarrierSetAssembler::gen_pre_barrier_stub(LIR_Assembler* ce, G1PreBarrierStub* stub) { ++ G1BarrierSetC1* bs = (G1BarrierSetC1*)BarrierSet::barrier_set()->barrier_set_c1(); ++ // At this point we know that marking is in progress. ++ // If do_load() is true then we have to emit the ++ // load of the previous value; otherwise it has already ++ // been loaded into _pre_val. 
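The stub generated below (and its post-barrier counterpart) follows the usual C1 slow-path shape: compiled fast-path code branches to the stub's entry label only in the uncommon case, the stub null-checks the value, hands it to the runtime blob via store_parameter, and jumps back to the continuation label. A rough model in plain C++, with invented names:

#include <cstdio>

static void runtime_pre_barrier(void* pre_val) {     // stand-in for the C1 runtime blob
  std::printf("logged %p\n", pre_val);
}

static void pre_barrier_stub(void* pre_val) {        // the out-of-line stub
  if (pre_val == nullptr) return;                    // beqz pre_val, continuation
  runtime_pre_barrier(pre_val);                      // store_parameter + call
}                                                    // b continuation

static void fast_path_store(void** field, void* new_val, bool marking_active) {
  if (marking_active) {                              // uncommon case only
    pre_barrier_stub(*field);                        // branch to the stub
  }
  *field = new_val;                                  // the store itself
}

int main() {
  int dummy;
  void* slot = &dummy;
  fast_path_store(&slot, nullptr, true);
  return 0;
}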
++ ++ __ bind(*stub->entry()); ++ ++ assert(stub->pre_val()->is_register(), "Precondition."); ++ ++ Register pre_val_reg = stub->pre_val()->as_register(); ++ ++ if (stub->do_load()) { ++ ce->mem2reg(stub->addr(), stub->pre_val(), T_OBJECT, stub->patch_code(), stub->info(), false /*wide*/); ++ } ++ __ beqz(pre_val_reg, *stub->continuation()); ++ ce->store_parameter(stub->pre_val()->as_register(), 0); ++ __ call(bs->pre_barrier_c1_runtime_code_blob()->code_begin(), relocInfo::runtime_call_type); ++ __ b(*stub->continuation()); ++} ++ ++void G1BarrierSetAssembler::gen_post_barrier_stub(LIR_Assembler* ce, G1PostBarrierStub* stub) { ++ G1BarrierSetC1* bs = (G1BarrierSetC1*)BarrierSet::barrier_set()->barrier_set_c1(); ++ __ bind(*stub->entry()); ++ assert(stub->addr()->is_register(), "Precondition."); ++ assert(stub->new_val()->is_register(), "Precondition."); ++ Register new_val_reg = stub->new_val()->as_register(); ++ __ beqz(new_val_reg, *stub->continuation()); ++ ce->store_parameter(stub->addr()->as_pointer_register(), 0); ++ __ call(bs->post_barrier_c1_runtime_code_blob()->code_begin(), relocInfo::runtime_call_type); ++ __ b(*stub->continuation()); ++} ++ ++#undef __ ++ ++#define __ sasm-> ++ ++void G1BarrierSetAssembler::generate_c1_pre_barrier_runtime_stub(StubAssembler* sasm) { ++ __ prologue("g1_pre_barrier", false); ++ ++ // arg0 : previous value of memory ++ ++ BarrierSet* bs = BarrierSet::barrier_set(); ++ ++ const Register pre_val = A0; ++ const Register tmp = SCR2; ++ ++ Address in_progress(TREG, in_bytes(G1ThreadLocalData::satb_mark_queue_active_offset())); ++ Address queue_index(TREG, in_bytes(G1ThreadLocalData::satb_mark_queue_index_offset())); ++ Address buffer(TREG, in_bytes(G1ThreadLocalData::satb_mark_queue_buffer_offset())); ++ ++ Label done; ++ Label runtime; ++ ++ // Is marking still active? ++ if (in_bytes(SATBMarkQueue::byte_width_of_active()) == 4) { // 4-byte width ++ __ ld_w(tmp, in_progress); ++ } else { ++ assert(in_bytes(SATBMarkQueue::byte_width_of_active()) == 1, "Assumption"); ++ __ ld_b(tmp, in_progress); ++ } ++ __ beqz(tmp, done); ++ ++ // Can we store original value in the thread's buffer? ++ __ ld_d(tmp, queue_index); ++ __ beqz(tmp, runtime); ++ ++ __ addi_d(tmp, tmp, -wordSize); ++ __ st_d(tmp, queue_index); ++ __ ld_d(SCR1, buffer); ++ __ add_d(tmp, tmp, SCR1); ++ __ load_parameter(0, SCR1); ++ __ st_d(SCR1, Address(tmp, 0)); ++ __ b(done); ++ ++ __ bind(runtime); ++ __ push_call_clobbered_registers(); ++ __ load_parameter(0, pre_val); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_field_pre_entry), pre_val, TREG); ++ __ pop_call_clobbered_registers(); ++ __ bind(done); ++ ++ __ epilogue(); ++} ++ ++void G1BarrierSetAssembler::generate_c1_post_barrier_runtime_stub(StubAssembler* sasm) { ++ __ prologue("g1_post_barrier", false); ++ ++ // arg0: store_address, not use? ++ Address store_addr(FP, 2 * BytesPerWord); ++ ++ BarrierSet* bs = BarrierSet::barrier_set(); ++ CardTableBarrierSet* ctbs = barrier_set_cast(bs); ++ CardTable* ct = ctbs->card_table(); ++ ++ Label done; ++ Label runtime; ++ ++ // At this point we know new_value is non-null and the new_value crosses regions. ++ // Must check to see if card is already dirty ++ ++ Address queue_index(TREG, in_bytes(G1ThreadLocalData::dirty_card_queue_index_offset())); ++ Address buffer(TREG, in_bytes(G1ThreadLocalData::dirty_card_queue_buffer_offset())); ++ ++ const Register card_offset = SCR2; ++ // RA is free here, so we can use it to hold the byte_map_base. 
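byte_map_base, loaded just below, is pre-biased: it is chosen so that base + (addr >> card_shift) lands inside the real card array for any heap address, which also explains the earlier remark that it is not itself a valid address and must be materialized with li() rather than as a relocatable ExternalAddress. A small sketch with invented values:

#include <cstdint>
#include <cstdio>

int main() {
  const int card_shift = 9;                 // stand-in for CardTable::card_shift()
  static uint8_t cards[1024] = {};          // the real card byte map
  uintptr_t heap_base = 0x40000000;         // pretend heap start

  // Bias the base so that heap_base maps onto &cards[0].
  uintptr_t byte_map_base = (uintptr_t)cards - (heap_base >> card_shift);

  uintptr_t some_addr = heap_base + 5 * 512 + 40;
  uint8_t* card = (uint8_t*)(byte_map_base + (some_addr >> card_shift));
  std::printf("card index = %ld\n", (long)(card - cards));   // prints 5
  return 0;
}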
++ const Register byte_map_base = RA; ++ ++ assert_different_registers(card_offset, byte_map_base, SCR1); ++ ++ __ load_parameter(0, card_offset); ++ __ srli_d(card_offset, card_offset, CardTable::card_shift()); ++ __ load_byte_map_base(byte_map_base); ++ __ ldx_bu(SCR1, byte_map_base, card_offset); ++ __ addi_d(SCR1, SCR1, -(int)G1CardTable::g1_young_card_val()); ++ __ beqz(SCR1, done); ++ ++ assert((int)CardTable::dirty_card_val() == 0, "must be 0"); ++ ++ __ membar(Assembler::StoreLoad); ++ __ ldx_bu(SCR1, byte_map_base, card_offset); ++ __ beqz(SCR1, done); ++ ++ // storing region crossing non-null, card is clean. ++ // dirty card and log. ++ __ stx_b(R0, byte_map_base, card_offset); ++ ++ // Convert card offset into an address in card_addr ++ Register card_addr = card_offset; ++ __ add_d(card_addr, byte_map_base, card_addr); ++ ++ __ ld_d(SCR1, queue_index); ++ __ beqz(SCR1, runtime); ++ __ addi_d(SCR1, SCR1, -wordSize); ++ __ st_d(SCR1, queue_index); ++ ++ // Reuse RA to hold buffer_addr ++ const Register buffer_addr = RA; ++ ++ __ ld_d(buffer_addr, buffer); ++ __ stx_d(card_addr, buffer_addr, SCR1); ++ __ b(done); ++ ++ __ bind(runtime); ++ __ push_call_clobbered_registers(); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_field_post_entry), card_addr, TREG); ++ __ pop_call_clobbered_registers(); ++ __ bind(done); ++ __ epilogue(); ++} ++ ++#undef __ ++ ++#endif // COMPILER1 +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/g1/g1BarrierSetAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/g1/g1BarrierSetAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/g1/g1BarrierSetAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/g1/g1BarrierSetAssembler_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,72 @@ ++/* ++ * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_GC_G1_G1BARRIERSETASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_GC_G1_G1BARRIERSETASSEMBLER_LOONGARCH_HPP ++ ++#include "asm/macroAssembler.hpp" ++#include "gc/shared/modRefBarrierSetAssembler.hpp" ++ ++class LIR_Assembler; ++class StubAssembler; ++class G1PreBarrierStub; ++class G1PostBarrierStub; ++ ++class G1BarrierSetAssembler: public ModRefBarrierSetAssembler { ++ protected: ++ virtual void gen_write_ref_array_pre_barrier(MacroAssembler* masm, DecoratorSet decorators, Register addr, Register count, RegSet saved_regs); ++ virtual void gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators, Register addr, Register count, Register tmp, RegSet saved_regs); ++ ++ void g1_write_barrier_pre(MacroAssembler* masm, ++ Register obj, ++ Register pre_val, ++ Register thread, ++ Register tmp1, ++ Register tmp2, ++ bool tosca_live, ++ bool expand_call); ++ ++ void g1_write_barrier_post(MacroAssembler* masm, ++ Register store_addr, ++ Register new_val, ++ Register thread, ++ Register tmp1, ++ Register tmp2); ++ ++ virtual void oop_store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3); ++ ++ public: ++ void gen_pre_barrier_stub(LIR_Assembler* ce, G1PreBarrierStub* stub); ++ void gen_post_barrier_stub(LIR_Assembler* ce, G1PostBarrierStub* stub); ++ ++ void generate_c1_pre_barrier_runtime_stub(StubAssembler* sasm); ++ void generate_c1_post_barrier_runtime_stub(StubAssembler* sasm); ++ ++ virtual void load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Register dst, Address src, Register tmp1, Register tmp2); ++}; ++ ++#endif // CPU_LOONGARCH_GC_G1_G1BARRIERSETASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/g1/g1Globals_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/g1/g1Globals_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/g1/g1Globals_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/g1/g1Globals_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,30 @@ ++/* ++ * Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ */ ++ ++#ifndef CPU_LOONGARCH_GC_G1_G1GLOBALS_LOONGARCH_HPP ++#define CPU_LOONGARCH_GC_G1_G1GLOBALS_LOONGARCH_HPP ++ ++const size_t G1MergeHeapRootsPrefetchCacheSize = 8; ++ ++#endif // CPU_LOONGARCH_GC_G1_G1GLOBALS_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shared/barrierSetAssembler_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/shared/barrierSetAssembler_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/shared/barrierSetAssembler_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shared/barrierSetAssembler_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,453 @@ ++/* ++ * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "classfile/classLoaderData.hpp" ++#include "gc/shared/barrierSet.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "gc/shared/barrierSetNMethod.hpp" ++#include "gc/shared/collectedHeap.hpp" ++#include "interpreter/interp_masm.hpp" ++#include "runtime/javaThread.hpp" ++#include "runtime/jniHandles.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++ ++#define __ masm-> ++ ++void BarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Register dst, Address src, Register tmp1, Register tmp2) { ++ // RA is live. It must be saved around calls. 
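When the T_OBJECT/T_ARRAY case below loads a 32-bit compressed oop with ld_wu, decode_heap_oop() / decode_heap_oop_not_null() expand it back to a full pointer, roughly as follows (the base and shift are decided at runtime and may be zero; the values here are invented):

#include <cstdint>
#include <cstdio>

int main() {
  uintptr_t narrow_base = 0x800000000;   // pretend compressed-oop base
  int       shift       = 3;             // pretend compressed-oop shift

  uint32_t narrow = 0x12345;             // the 32-bit value loaded from the field

  // decode_heap_oop: null stays null, everything else is scaled and rebased.
  uintptr_t oop = (narrow == 0) ? 0 : narrow_base + ((uintptr_t)narrow << shift);

  // decode_heap_oop_not_null may skip the null check (IS_NOT_NULL decorator).
  uintptr_t oop_nn = narrow_base + ((uintptr_t)narrow << shift);

  std::printf("%#lx %#lx\n", (unsigned long)oop, (unsigned long)oop_nn);
  return 0;
}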
++ ++ bool in_heap = (decorators & IN_HEAP) != 0; ++ bool in_native = (decorators & IN_NATIVE) != 0; ++ bool is_not_null = (decorators & IS_NOT_NULL) != 0; ++ ++ switch (type) { ++ case T_OBJECT: ++ case T_ARRAY: { ++ if (in_heap) { ++ if (UseCompressedOops) { ++ __ ld_wu(dst, src); ++ if (is_not_null) { ++ __ decode_heap_oop_not_null(dst); ++ } else { ++ __ decode_heap_oop(dst); ++ } ++ } else { ++ __ ld_d(dst, src); ++ } ++ } else { ++ assert(in_native, "why else?"); ++ __ ld_d(dst, src); ++ } ++ break; ++ } ++ case T_BOOLEAN: __ ld_bu (dst, src); break; ++ case T_BYTE: __ ld_b (dst, src); break; ++ case T_CHAR: __ ld_hu (dst, src); break; ++ case T_SHORT: __ ld_h (dst, src); break; ++ case T_INT: __ ld_w (dst, src); break; ++ case T_LONG: __ ld_d (dst, src); break; ++ case T_ADDRESS: __ ld_d (dst, src); break; ++ case T_FLOAT: ++ assert(dst == noreg, "only to ftos"); ++ __ fld_s(FSF, src); ++ break; ++ case T_DOUBLE: ++ assert(dst == noreg, "only to dtos"); ++ __ fld_d(FSF, src); ++ break; ++ default: Unimplemented(); ++ } ++} ++ ++void BarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3) { ++ bool in_heap = (decorators & IN_HEAP) != 0; ++ bool in_native = (decorators & IN_NATIVE) != 0; ++ bool is_not_null = (decorators & IS_NOT_NULL) != 0; ++ ++ switch (type) { ++ case T_OBJECT: ++ case T_ARRAY: { ++ if (in_heap) { ++ if (val == noreg) { ++ assert(!is_not_null, "inconsistent access"); ++ if (UseCompressedOops) { ++ __ st_w(R0, dst); ++ } else { ++ __ st_d(R0, dst); ++ } ++ } else { ++ if (UseCompressedOops) { ++ assert(!dst.uses(val), "not enough registers"); ++ if (is_not_null) { ++ __ encode_heap_oop_not_null(val); ++ } else { ++ __ encode_heap_oop(val); ++ } ++ __ st_w(val, dst); ++ } else { ++ __ st_d(val, dst); ++ } ++ } ++ } else { ++ assert(in_native, "why else?"); ++ assert(val != noreg, "not supported"); ++ __ st_d(val, dst); ++ } ++ break; ++ } ++ case T_BOOLEAN: ++ __ andi(val, val, 0x1); // boolean is true if LSB is 1 ++ __ st_b(val, dst); ++ break; ++ case T_BYTE: ++ __ st_b(val, dst); ++ break; ++ case T_SHORT: ++ __ st_h(val, dst); ++ break; ++ case T_CHAR: ++ __ st_h(val, dst); ++ break; ++ case T_INT: ++ __ st_w(val, dst); ++ break; ++ case T_LONG: ++ __ st_d(val, dst); ++ break; ++ case T_FLOAT: ++ assert(val == noreg, "only tos"); ++ __ fst_s(FSF, dst); ++ break; ++ case T_DOUBLE: ++ assert(val == noreg, "only tos"); ++ __ fst_d(FSF, dst); ++ break; ++ case T_ADDRESS: ++ __ st_d(val, dst); ++ break; ++ default: Unimplemented(); ++ } ++} ++ ++void BarrierSetAssembler::copy_load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Register dst, ++ Address src, ++ Register tmp) { ++ if (bytes == 1) { ++ __ ld_bu(dst, src); ++ } else if (bytes == 2) { ++ __ ld_hu(dst, src); ++ } else if (bytes == 4) { ++ __ ld_wu(dst, src); ++ } else if (bytes == 8) { ++ __ ld_d(dst, src); ++ } else { ++ // Not the right size ++ ShouldNotReachHere(); ++ } ++ if ((decorators & ARRAYCOPY_CHECKCAST) != 0 && UseCompressedOops) { ++ __ decode_heap_oop(dst); ++ } ++} ++ ++void BarrierSetAssembler::copy_store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Address dst, ++ Register src, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3) { ++ if ((decorators & ARRAYCOPY_CHECKCAST) != 0 && UseCompressedOops) { ++ __ encode_heap_oop(src); ++ } ++ ++ if (bytes == 1) { ++ __ st_b(src, dst); ++ } else if 
(bytes == 2) { ++ __ st_h(src, dst); ++ } else if (bytes == 4) { ++ __ st_w(src, dst); ++ } else if (bytes == 8) { ++ __ st_d(src, dst); ++ } else { ++ // Not the right size ++ ShouldNotReachHere(); ++ } ++} ++ ++void BarrierSetAssembler::copy_load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ FloatRegister dst, ++ Address src, ++ Register tmp1, ++ Register tmp2, ++ FloatRegister vec_tmp, ++ bool need_save_restore) { ++ assert(bytes > 8, "can only deal with vector registers"); ++ if (UseLSX && bytes == 16) { ++ __ vld(dst, src.base(), src.disp()); ++ } else if (UseLASX && bytes == 32) { ++ __ xvld(dst, src.base(), src.disp()); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++void BarrierSetAssembler::copy_store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Address dst, ++ FloatRegister src, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3, ++ Register tmp4, ++ FloatRegister vec_tmp1, ++ FloatRegister vec_tmp2, ++ bool need_save_restore) { ++ assert(bytes > 8, "can only deal with vector registers"); ++ if (UseLSX && bytes == 16) { ++ __ vst(src, dst.base(), dst.disp()); ++ } else if (UseLASX && bytes == 32) { ++ __ xvst(src, dst.base(), dst.disp()); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++void BarrierSetAssembler::obj_equals(MacroAssembler* masm, ++ Register obj1, Address obj2) { ++ Unimplemented(); ++} ++ ++void BarrierSetAssembler::obj_equals(MacroAssembler* masm, ++ Register obj1, Register obj2) { ++ Unimplemented(); ++} ++ ++void BarrierSetAssembler::try_resolve_jobject_in_native(MacroAssembler* masm, Register jni_env, ++ Register obj, Register tmp, Label& slowpath) { ++ STATIC_ASSERT(JNIHandles::tag_mask == 3); ++ __ addi_d(AT, R0, ~(int)JNIHandles::tag_mask); ++ __ andr(obj, obj, AT); ++ __ ld_d(obj, Address(obj, 0)); ++} ++ ++// Defines obj, preserves var_size_in_bytes, okay for t2 == var_size_in_bytes. 
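tlab_allocate(), defined just below, is the classic bump-pointer fast path: take the thread's TLAB top as the result, advance it by the object size, and bail out to the slow case if that would run past the TLAB end. In plain C++ (field names invented):

#include <cstddef>
#include <cstdio>

struct MiniTlab {
  char* top;    // stands in for JavaThread::tlab_top_offset()
  char* end;    // stands in for JavaThread::tlab_end_offset()
};

void* tlab_allocate(MiniTlab& tlab, size_t size_in_bytes) {
  char* obj     = tlab.top;              // ld_d obj, tlab_top
  char* new_top = obj + size_in_bytes;   // lea end, [obj + size]
  if (new_top > tlab.end) {              // blt tlab_end, end -> slow_case
    return nullptr;                      // slow case: refill the TLAB or use the shared heap
  }
  tlab.top = new_top;                    // st_d end, tlab_top
  return obj;
}

int main() {
  char buffer[256];
  MiniTlab tlab{buffer, buffer + sizeof(buffer)};
  void* a = tlab_allocate(tlab, 64);     // fits
  void* b = tlab_allocate(tlab, 512);    // does not fit: nullptr
  std::printf("%p %p\n", a, b);
  return 0;
}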
++void BarrierSetAssembler::tlab_allocate(MacroAssembler* masm, Register obj, ++ Register var_size_in_bytes, ++ int con_size_in_bytes, ++ Register t1, ++ Register t2, ++ Label& slow_case) { ++ assert_different_registers(obj, t2); ++ assert_different_registers(obj, var_size_in_bytes); ++ Register end = t2; ++ ++ // verify_tlab(); ++ ++ __ ld_d(obj, Address(TREG, JavaThread::tlab_top_offset())); ++ if (var_size_in_bytes == noreg) { ++ __ lea(end, Address(obj, con_size_in_bytes)); ++ } else { ++ __ lea(end, Address(obj, var_size_in_bytes, Address::no_scale, 0)); ++ } ++ __ ld_d(SCR1, Address(TREG, JavaThread::tlab_end_offset())); ++ __ blt_far(SCR1, end, slow_case, false); ++ ++ // update the tlab top pointer ++ __ st_d(end, Address(TREG, JavaThread::tlab_top_offset())); ++ ++ // recover var_size_in_bytes if necessary ++ if (var_size_in_bytes == end) { ++ __ sub_d(var_size_in_bytes, var_size_in_bytes, obj); ++ } ++ // verify_tlab(); ++} ++ ++void BarrierSetAssembler::incr_allocated_bytes(MacroAssembler* masm, ++ Register var_size_in_bytes, ++ int con_size_in_bytes, ++ Register t1) { ++ assert(t1->is_valid(), "need temp reg"); ++ ++ __ ld_d(t1, Address(TREG, JavaThread::allocated_bytes_offset())); ++ if (var_size_in_bytes->is_valid()) ++ __ add_d(t1, t1, var_size_in_bytes); ++ else ++ __ addi_d(t1, t1, con_size_in_bytes); ++ __ st_d(t1, Address(TREG, JavaThread::allocated_bytes_offset())); ++} ++ ++static volatile uint32_t _patching_epoch = 0; ++ ++address BarrierSetAssembler::patching_epoch_addr() { ++ return (address)&_patching_epoch; ++} ++ ++void BarrierSetAssembler::increment_patching_epoch() { ++ Atomic::inc(&_patching_epoch); ++} ++ ++void BarrierSetAssembler::clear_patching_epoch() { ++ _patching_epoch = 0; ++} ++ ++void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm, Label* slow_path, Label* continuation, Label* guard) { ++ BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod(); ++ ++ if (bs_nm == nullptr) { ++ return; ++ } ++ ++ Label local_guard; ++ NMethodPatchingType patching_type = nmethod_patching_type(); ++ ++ if (slow_path == nullptr) { ++ guard = &local_guard; ++ } ++ ++ __ lipc(SCR1, *guard); ++ __ ld_wu(SCR1, SCR1, 0); ++ ++ switch (patching_type) { ++ case NMethodPatchingType::conc_data_patch: ++ // Subsequent loads of oops must occur after load of guard value. ++ // BarrierSetNMethod::disarm sets guard with release semantics. ++ __ membar(__ LoadLoad); // fall through to stw_instruction_and_data_patch ++ case NMethodPatchingType::stw_instruction_and_data_patch: ++ { ++ // With STW patching, no data or instructions are updated concurrently, ++ // which means there isn't really any need for any fencing for neither ++ // data nor instruction modification happening concurrently. The ++ // instruction patching is synchronized with global icache_flush() by ++ // the write hart on riscv. So here we can do a plain conditional ++ // branch with no fencing. ++ Address thread_disarmed_addr(TREG, in_bytes(bs_nm->thread_disarmed_guard_value_offset())); ++ __ ld_wu(SCR2, thread_disarmed_addr); ++ break; ++ } ++ case NMethodPatchingType::conc_instruction_and_data_patch: ++ { ++ // If we patch code we need both a code patching and a loadload ++ // fence. It's not super cheap, so we use a global epoch mechanism ++ // to hide them in a slow path. ++ // The high level idea of the global epoch mechanism is to detect ++ // when any thread has performed the required fencing, after the ++ // last nmethod was disarmed. 
This implies that the required ++ // fencing has been performed for all preceding nmethod disarms ++ // as well. Therefore, we do not need any further fencing. ++ __ lea_long(SCR2, ExternalAddress((address)&_patching_epoch)); ++ // Embed an artificial data dependency to order the guard load ++ // before the epoch load. ++ __ srli_d(RA, SCR1, 32); ++ __ orr(SCR2, SCR2, RA); ++ // Read the global epoch value. ++ __ ld_wu(SCR2, SCR2, 0); ++ // Combine the guard value (low order) with the epoch value (high order). ++ __ slli_d(SCR2, SCR2, 32); ++ __ orr(SCR1, SCR1, SCR2); ++ // Compare the global values with the thread-local values ++ Address thread_disarmed_and_epoch_addr(TREG, in_bytes(bs_nm->thread_disarmed_guard_value_offset())); ++ __ ld_d(SCR2, thread_disarmed_and_epoch_addr); ++ break; ++ } ++ default: ++ ShouldNotReachHere(); ++ } ++ ++ if (slow_path == nullptr) { ++ Label skip_barrier; ++ __ beq(SCR1, SCR2, skip_barrier); ++ ++ __ call_long(StubRoutines::la::method_entry_barrier()); ++ __ b(skip_barrier); ++ ++ __ bind(local_guard); ++ __ emit_int32(0); // nmethod guard value. Skipped over in common case. ++ __ bind(skip_barrier); ++ } else { ++ __ xorr(SCR1, SCR1, SCR2); ++ __ bnez(SCR1, *slow_path); ++ __ bind(*continuation); ++ } ++} ++ ++void BarrierSetAssembler::c2i_entry_barrier(MacroAssembler* masm) { ++ BarrierSetNMethod* bs = BarrierSet::barrier_set()->barrier_set_nmethod(); ++ if (bs == nullptr) { ++ return; ++ } ++ ++ Label bad_call; ++ __ beqz(Rmethod, bad_call); ++ ++ // Pointer chase to the method holder to find out if the method is concurrently unloading. ++ Label method_live; ++ __ load_method_holder_cld(SCR2, Rmethod); ++ ++ // Is it a strong CLD? ++ __ ld_w(SCR1, Address(SCR2, ClassLoaderData::keep_alive_offset())); ++ __ bnez(SCR1, method_live); ++ ++ // Is it a weak but alive CLD? ++ __ push2(T2, T8); ++ __ ld_d(T8, Address(SCR2, ClassLoaderData::holder_offset())); ++ __ resolve_weak_handle(T8, T2, SCR2); // Assembler occupies SCR1. ++ __ move(SCR1, T8); ++ __ pop2(T2, T8); ++ __ bnez(SCR1, method_live); ++ ++ __ bind(bad_call); ++ ++ __ jmp(SharedRuntime::get_handle_wrong_method_stub(), relocInfo::runtime_call_type); ++ __ bind(method_live); ++} ++ ++void BarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error) { ++ // Check if the oop is in the right area of memory ++ __ li(tmp2, (intptr_t) Universe::verify_oop_mask()); ++ __ andr(tmp1, obj, tmp2); ++ __ li(tmp2, (intptr_t) Universe::verify_oop_bits()); ++ ++ // Compare tmp1 and tmp2. ++ __ bne(tmp1, tmp2, error); ++ ++ // make sure klass is 'reasonable', which is not zero. ++ __ load_klass(obj, obj); // get klass ++ __ beqz(obj, error); // if klass is null it is broken ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shared/barrierSetAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/shared/barrierSetAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/shared/barrierSetAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shared/barrierSetAssembler_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,147 @@ ++/* ++ * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_GC_SHARED_BARRIERSETASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_GC_SHARED_BARRIERSETASSEMBLER_LOONGARCH_HPP ++ ++#include "asm/macroAssembler.hpp" ++#include "gc/shared/barrierSet.hpp" ++#include "gc/shared/barrierSetNMethod.hpp" ++#include "memory/allocation.hpp" ++#include "oops/access.hpp" ++ ++class InterpreterMacroAssembler; ++ ++enum class NMethodPatchingType { ++ stw_instruction_and_data_patch, ++ conc_instruction_and_data_patch, ++ conc_data_patch ++}; ++ ++class BarrierSetAssembler: public CHeapObj { ++private: ++ void incr_allocated_bytes(MacroAssembler* masm, ++ Register var_size_in_bytes, ++ int con_size_in_bytes, ++ Register t1); ++ ++public: ++ virtual void arraycopy_prologue(MacroAssembler* masm, DecoratorSet decorators, bool is_oop, ++ Register src, Register dst, Register count, RegSet saved_regs) {} ++ virtual void arraycopy_epilogue(MacroAssembler* masm, DecoratorSet decorators, bool is_oop, ++ Register dst, Register count, Register scratch, RegSet saved_regs) {} ++ ++ virtual void copy_load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Register dst, ++ Address src, ++ Register tmp); ++ ++ virtual void copy_store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Address dst, ++ Register src, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3); ++ ++ virtual void copy_load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ FloatRegister dst, ++ Address src, ++ Register tmp1, ++ Register tmp2, ++ FloatRegister vec_tmp, ++ bool need_save_restore = true); ++ ++ virtual void copy_store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Address dst, ++ FloatRegister src, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3, ++ Register tmp4, ++ FloatRegister vec_tmp1, ++ FloatRegister vec_tmp2, ++ bool need_save_restore = true); ++ ++ virtual void load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Register dst, Address src, Register tmp1, Register tmp2); ++ virtual void store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3); ++ ++ ++ virtual void obj_equals(MacroAssembler* masm, ++ Register obj1, Register obj2); ++ virtual void obj_equals(MacroAssembler* masm, ++ Register obj1, Address obj2); ++ ++ virtual void resolve(MacroAssembler* masm, DecoratorSet decorators, Register obj) { ++ // Default implementation does not need to do 
anything. ++ } ++ ++ // Support for jniFastGetField to try resolving a jobject/jweak in native ++ virtual void try_resolve_jobject_in_native(MacroAssembler* masm, Register jni_env, ++ Register obj, Register tmp, Label& slowpath); ++ ++ virtual void tlab_allocate(MacroAssembler* masm, ++ Register obj, // result: pointer to object after successful allocation ++ Register var_size_in_bytes, // object size in bytes if unknown at compile time; invalid otherwise ++ int con_size_in_bytes, // object size in bytes if known at compile time ++ Register t1, // temp register ++ Register t2, // temp register ++ Label& slow_case // continuation point if fast allocation fails ++ ); ++ ++ virtual void barrier_stubs_init() {} ++ ++ virtual NMethodPatchingType nmethod_patching_type() { return NMethodPatchingType::stw_instruction_and_data_patch; } ++ ++ virtual void nmethod_entry_barrier(MacroAssembler* masm, Label* slow_path, Label* continuation, Label* guard); ++ virtual void c2i_entry_barrier(MacroAssembler* masm); ++ ++ virtual void check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error); ++ ++ virtual bool supports_instruction_patching() { ++ NMethodPatchingType patching_type = nmethod_patching_type(); ++ return patching_type == NMethodPatchingType::conc_instruction_and_data_patch || ++ patching_type == NMethodPatchingType::stw_instruction_and_data_patch; ++ } ++ ++ static address patching_epoch_addr(); ++ static void clear_patching_epoch(); ++ static void increment_patching_epoch(); ++}; ++ ++#endif // CPU_LOONGARCH_GC_SHARED_BARRIERSETASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shared/barrierSetNMethod_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/shared/barrierSetNMethod_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/shared/barrierSetNMethod_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shared/barrierSetNMethod_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,222 @@ ++/* ++ * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2019, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "code/codeCache.hpp" ++#include "code/nativeInst.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "gc/shared/barrierSetNMethod.hpp" ++#include "logging/log.hpp" ++#include "memory/resourceArea.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/javaThread.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/registerMap.hpp" ++#include "utilities/align.hpp" ++#include "utilities/debug.hpp" ++#if INCLUDE_JVMCI ++#include "jvmci/jvmciRuntime.hpp" ++#endif ++ ++static int slow_path_size(nmethod* nm) { ++ // The slow path code is out of line with C2. ++ // Leave a b to the stub in the fast path. ++ return nm->is_compiled_by_c2() ? 2 : 6; ++} ++ ++static int entry_barrier_offset(nmethod* nm) { ++ BarrierSetAssembler* bs_asm = BarrierSet::barrier_set()->barrier_set_assembler(); ++ switch (bs_asm->nmethod_patching_type()) { ++ case NMethodPatchingType::stw_instruction_and_data_patch: ++ return -4 * (3 + slow_path_size(nm)); ++ case NMethodPatchingType::conc_data_patch: ++ return -4 * (4 + slow_path_size(nm)); ++ case NMethodPatchingType::conc_instruction_and_data_patch: ++ return -4 * (10 + slow_path_size(nm)); ++ } ++ ShouldNotReachHere(); ++ return 0; ++} ++ ++class NativeNMethodBarrier { ++ address _instruction_address; ++ int* _guard_addr; ++ nmethod* _nm; ++ ++ address instruction_address() const { return _instruction_address; } ++ ++ int *guard_addr() { ++ return _guard_addr; ++ } ++ ++ int local_guard_offset(nmethod* nm) { ++ // It's the last instruction ++ return (-entry_barrier_offset(nm)) - 4; ++ } ++ ++public: ++ NativeNMethodBarrier(nmethod* nm): _nm(nm) { ++#if INCLUDE_JVMCI ++ if (nm->is_compiled_by_jvmci()) { ++ address pc = nm->code_begin() + nm->jvmci_nmethod_data()->nmethod_entry_patch_offset(); ++ RelocIterator iter(nm, pc, pc + 4); ++ guarantee(iter.next(), "missing relocs"); ++ guarantee(iter.type() == relocInfo::section_word_type, "unexpected reloc"); ++ ++ _guard_addr = (int*) iter.section_word_reloc()->target(); ++ _instruction_address = pc; ++ } else ++#endif ++ { ++ _instruction_address = nm->code_begin() + nm->frame_complete_offset() + entry_barrier_offset(nm); ++ if (nm->is_compiled_by_c2()) { ++ // With c2 compiled code, the guard is out-of-line in a stub ++ // We find it using the RelocIterator. ++ RelocIterator iter(nm); ++ while (iter.next()) { ++ if (iter.type() == relocInfo::entry_guard_type) { ++ entry_guard_Relocation* const reloc = iter.entry_guard_reloc(); ++ _guard_addr = reinterpret_cast(reloc->addr()); ++ return; ++ } ++ } ++ ShouldNotReachHere(); ++ } ++ _guard_addr = reinterpret_cast(instruction_address() + local_guard_offset(nm)); ++ } ++ } ++ ++ int get_value() { ++ return Atomic::load_acquire(guard_addr()); ++ } ++ ++ void set_value(int value) { ++ Atomic::release_store(guard_addr(), value); ++ } ++ ++ bool check_barrier(err_msg& msg) const; ++ void verify() const { ++ err_msg msg("%s", ""); ++ assert(check_barrier(msg), "%s", msg.buffer()); ++ } ++}; ++ ++// Store the instruction bitmask, bits and name for checking the barrier. ++struct CheckInsn { ++ uint32_t mask; ++ uint32_t bits; ++ const char *name; ++}; ++ ++static const struct CheckInsn barrierInsn[] = { ++ { 0xfe000000, 0x18000000, "pcaddi"}, ++ { 0xffc00000, 0x2a800000, "ld.wu"}, ++}; ++ ++// The encodings must match the instructions emitted by ++// BarrierSetAssembler::nmethod_entry_barrier. The matching ignores the specific ++// register numbers and immediate values in the encoding. 
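check_barrier(), defined just below, applies that table instruction by instruction: masking out register numbers and immediates and comparing what remains against the expected opcode bits. The same check in isolation:

#include <cstdint>
#include <cstdio>

struct CheckInsn { uint32_t mask; uint32_t bits; const char* name; };

bool matches(const uint32_t* code, const CheckInsn* table, int n) {
  for (int i = 0; i < n; i++) {
    if ((code[i] & table[i].mask) != table[i].bits) {
      std::printf("instruction %d is not a %s\n", i, table[i].name);
      return false;
    }
  }
  return true;
}

int main() {
  // The encodings from the table above: pcaddi, then ld.wu (operands vary).
  const CheckInsn table[] = {
    { 0xfe000000, 0x18000000, "pcaddi" },
    { 0xffc00000, 0x2a800000, "ld.wu"  },
  };
  uint32_t fake_code[] = { 0x18000085, 0x2a8000a5 };   // operand bits invented
  std::printf("%d\n", matches(fake_code, table, 2));   // prints 1
  return 0;
}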
++bool NativeNMethodBarrier::check_barrier(err_msg& msg) const { ++ intptr_t addr = (intptr_t) instruction_address(); ++ for(unsigned int i = 0; i < sizeof(barrierInsn)/sizeof(struct CheckInsn); i++ ) { ++ uint32_t inst = *((uint32_t*) addr); ++ if ((inst & barrierInsn[i].mask) != barrierInsn[i].bits) { ++ msg.print("Addr: " INTPTR_FORMAT " Code: 0x%x not an %s instruction", addr, inst, barrierInsn[i].name); ++ return false; ++ } ++ addr +=4; ++ } ++ return true; ++} ++ ++void BarrierSetNMethod::deoptimize(nmethod* nm, address* return_address_ptr) { ++ ++ typedef struct { ++ intptr_t *sp; intptr_t *fp; address ra; address pc; ++ } frame_pointers_t; ++ ++ frame_pointers_t *new_frame = (frame_pointers_t *)(return_address_ptr - 5); ++ ++ JavaThread *thread = JavaThread::current(); ++ RegisterMap reg_map(thread, ++ RegisterMap::UpdateMap::skip, ++ RegisterMap::ProcessFrames::include, ++ RegisterMap::WalkContinuation::skip); ++ frame frame = thread->last_frame(); ++ ++ assert(frame.is_compiled_frame() || frame.is_native_frame(), "must be"); ++ assert(frame.cb() == nm, "must be"); ++ frame = frame.sender(®_map); ++ ++ LogTarget(Trace, nmethod, barrier) out; ++ if (out.is_enabled()) { ++ ResourceMark mark; ++ log_trace(nmethod, barrier)("deoptimize(nmethod: %s(%p), return_addr: %p, osr: %d, thread: %p(%s), making rsp: %p) -> %p", ++ nm->method()->name_and_sig_as_C_string(), ++ nm, *(address *) return_address_ptr, nm->is_osr_method(), thread, ++ thread->name(), frame.sp(), nm->verified_entry_point()); ++ } ++ ++ new_frame->sp = frame.sp(); ++ new_frame->fp = frame.fp(); ++ new_frame->ra = frame.pc(); ++ new_frame->pc = SharedRuntime::get_handle_wrong_method_stub(); ++} ++ ++void BarrierSetNMethod::set_guard_value(nmethod* nm, int value) { ++ if (!supports_entry_barrier(nm)) { ++ return; ++ } ++ ++ if (value == disarmed_guard_value()) { ++ // The patching epoch is incremented before the nmethod is disarmed. Disarming ++ // is performed with a release store. In the nmethod entry barrier, the values ++ // are read in the opposite order, such that the load of the nmethod guard ++ // acquires the patching epoch. This way, the guard is guaranteed to block ++ // entries to the nmethod, until it has safely published the requirement for ++ // further fencing by mutators, before they are allowed to enter. ++ BarrierSetAssembler* bs_asm = BarrierSet::barrier_set()->barrier_set_assembler(); ++ bs_asm->increment_patching_epoch(); ++ } ++ ++ NativeNMethodBarrier barrier(nm); ++ barrier.set_value(value); ++} ++ ++int BarrierSetNMethod::guard_value(nmethod* nm) { ++ if (!supports_entry_barrier(nm)) { ++ return disarmed_guard_value(); ++ } ++ ++ NativeNMethodBarrier barrier(nm); ++ return barrier.get_value(); ++} ++ ++#if INCLUDE_JVMCI ++bool BarrierSetNMethod::verify_barrier(nmethod* nm, err_msg& msg) { ++ NativeNMethodBarrier barrier(nm); ++ return barrier.check_barrier(msg); ++} ++#endif +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shared/cardTableBarrierSetAssembler_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/shared/cardTableBarrierSetAssembler_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/shared/cardTableBarrierSetAssembler_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shared/cardTableBarrierSetAssembler_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,117 @@ ++/* ++ * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2023, Loongson Technology. 
All rights reserved.
++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
++ *
++ * This code is free software; you can redistribute it and/or modify it
++ * under the terms of the GNU General Public License version 2 only, as
++ * published by the Free Software Foundation.
++ *
++ * This code is distributed in the hope that it will be useful, but WITHOUT
++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
++ * version 2 for more details (a copy is included in the LICENSE file that
++ * accompanied this code).
++ *
++ * You should have received a copy of the GNU General Public License version
++ * 2 along with this work; if not, write to the Free Software Foundation,
++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
++ *
++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
++ * or visit www.oracle.com if you need additional information or have any
++ * questions.
++ *
++ */
++
++#include "precompiled.hpp"
++#include "asm/macroAssembler.inline.hpp"
++#include "gc/shared/barrierSet.hpp"
++#include "gc/shared/cardTable.hpp"
++#include "gc/shared/cardTableBarrierSet.hpp"
++#include "gc/shared/cardTableBarrierSetAssembler.hpp"
++
++#define __ masm->
++
++#ifdef PRODUCT
++#define BLOCK_COMMENT(str) /* nothing */
++#else
++#define BLOCK_COMMENT(str) __ block_comment(str)
++#endif
++
++#define BIND(label) bind(label); BLOCK_COMMENT(#label ":")
++
++#define TIMES_OOP (UseCompressedOops ? Address::times_4 : Address::times_8)
++
++void CardTableBarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators,
++ Register addr, Register count, Register tmp,
++ RegSet saved_regs) {
++ BarrierSet *bs = BarrierSet::barrier_set();
++ CardTableBarrierSet* ctbs = barrier_set_cast<CardTableBarrierSet*>(bs);
++ CardTable* ct = ctbs->card_table();
++ assert(sizeof(*ct->byte_map_base()) == sizeof(jbyte), "adjust this code");
++ intptr_t disp = (intptr_t) ct->byte_map_base();
++
++ Label L_loop, L_done;
++ const Register end = count;
++ assert_different_registers(addr, end);
++
++ __ beq(count, R0, L_done); // zero count - nothing to do
++
++ __ li(tmp, disp);
++
++ __ lea(end, Address(addr, count, TIMES_OOP, 0)); // end == addr+count*oop_size
++ __ addi_d(end, end, -BytesPerHeapOop); // end - 1 to make inclusive
++ __ srli_d(addr, addr, CardTable::card_shift());
++ __ srli_d(end, end, CardTable::card_shift());
++ __ sub_d(end, end, addr); // end --> cards count
++
++ __ add_d(addr, addr, tmp);
++
++ __ BIND(L_loop);
++ __ stx_b(R0, addr, count);
++ __ addi_d(count, count, -1);
++ __ bge(count, R0, L_loop);
++
++ __ BIND(L_done);
++}
++
++// Does a store check for the oop in register obj. 
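++// The card index is obj >> CardTable::card_shift(); the byte at
++// byte_map_base + index is set to dirty_card_val() (0). With UseCondCardMark
++// the card is loaded first and the store is skipped if it is already dirty,
++// which can reduce write traffic on heavily shared card table cache lines.
++// Note that the contents of obj are destroyed by the shift.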
++void CardTableBarrierSetAssembler::store_check(MacroAssembler* masm, Register obj, Register tmp) { ++ assert_different_registers(obj, tmp, SCR1); ++ ++ __ srli_d(obj, obj, CardTable::card_shift()); ++ ++ __ load_byte_map_base(tmp); ++ ++ assert(CardTable::dirty_card_val() == 0, "must be"); ++ ++ if (UseCondCardMark) { ++ Label L_already_dirty; ++ __ ldx_b(SCR1, obj, tmp); ++ __ beqz(SCR1, L_already_dirty); ++ __ stx_b(R0, obj, tmp); ++ __ bind(L_already_dirty); ++ } else { ++ __ stx_b(R0, obj, tmp); ++ } ++} ++ ++void CardTableBarrierSetAssembler::oop_store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3) { ++ bool in_heap = (decorators & IN_HEAP) != 0; ++ bool is_array = (decorators & IS_ARRAY) != 0; ++ bool on_anonymous = (decorators & ON_UNKNOWN_OOP_REF) != 0; ++ bool precise = is_array || on_anonymous; ++ ++ bool needs_post_barrier = val != noreg && in_heap; ++ BarrierSetAssembler::store_at(masm, decorators, type, dst, val, noreg, noreg, noreg); ++ if (needs_post_barrier) { ++ // flatten object address if needed ++ if (!precise || (dst.index() == noreg && dst.disp() == 0)) { ++ store_check(masm, dst.base(), tmp1); ++ } else { ++ __ lea(tmp1, dst); ++ store_check(masm, tmp1, tmp2); ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shared/cardTableBarrierSetAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/shared/cardTableBarrierSetAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/shared/cardTableBarrierSetAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shared/cardTableBarrierSetAssembler_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,44 @@ ++/* ++ * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_GC_SHARED_CARDTABLEBARRIERSETASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_GC_SHARED_CARDTABLEBARRIERSETASSEMBLER_LOONGARCH_HPP ++ ++#include "asm/macroAssembler.hpp" ++#include "gc/shared/modRefBarrierSetAssembler.hpp" ++ ++class CardTableBarrierSetAssembler: public ModRefBarrierSetAssembler { ++protected: ++ void store_check(MacroAssembler* masm, Register obj, Register tmp); ++ ++ virtual void gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators, ++ Register addr, Register count, Register tmp, ++ RegSet saved_regs); ++ ++ virtual void oop_store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3); ++}; ++ ++#endif // CPU_LOONGARCH_GC_SHARED_CARDTABLEBARRIERSETASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shared/modRefBarrierSetAssembler_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/shared/modRefBarrierSetAssembler_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/shared/modRefBarrierSetAssembler_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shared/modRefBarrierSetAssembler_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,53 @@ ++/* ++ * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "gc/shared/modRefBarrierSetAssembler.hpp" ++ ++#define __ masm-> ++ ++void ModRefBarrierSetAssembler::arraycopy_prologue(MacroAssembler* masm, DecoratorSet decorators, bool is_oop, ++ Register src, Register dst, Register count, RegSet saved_regs) { ++ if (is_oop) { ++ gen_write_ref_array_pre_barrier(masm, decorators, dst, count, saved_regs); ++ } ++} ++ ++void ModRefBarrierSetAssembler::arraycopy_epilogue(MacroAssembler* masm, DecoratorSet decorators, bool is_oop, ++ Register dst, Register count, Register scratch, RegSet saved_regs) { ++ if (is_oop) { ++ gen_write_ref_array_post_barrier(masm, decorators, dst, count, scratch, saved_regs); ++ } ++} ++ ++void ModRefBarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3) { ++ if (type == T_OBJECT || type == T_ARRAY) { ++ oop_store_at(masm, decorators, type, dst, val, tmp1, tmp2, tmp3); ++ } else { ++ BarrierSetAssembler::store_at(masm, decorators, type, dst, val, tmp1, tmp2, tmp3); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shared/modRefBarrierSetAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/shared/modRefBarrierSetAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/shared/modRefBarrierSetAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shared/modRefBarrierSetAssembler_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,54 @@ ++/* ++ * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_GC_SHARED_MODREFBARRIERSETASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_GC_SHARED_MODREFBARRIERSETASSEMBLER_LOONGARCH_HPP ++ ++#include "asm/macroAssembler.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++ ++// The ModRefBarrierSetAssembler filters away accesses on BasicTypes other ++// than T_OBJECT/T_ARRAY (oops). The oop accesses call one of the protected ++// accesses, which are overridden in the concrete BarrierSetAssembler. 
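++// For example, store_at() routes a T_OBJECT or T_ARRAY store to
++// oop_store_at(), while a T_INT store falls straight through to
++// BarrierSetAssembler::store_at(); likewise arraycopy_prologue() and
++// arraycopy_epilogue() only emit the array pre/post barriers when is_oop is
++// true.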
++ ++class ModRefBarrierSetAssembler: public BarrierSetAssembler { ++protected: ++ virtual void gen_write_ref_array_pre_barrier(MacroAssembler* masm, DecoratorSet decorators, ++ Register addr, Register count, RegSet saved_regs) {} ++ virtual void gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators, ++ Register addr, Register count, Register tmp, RegSet saved_regs) {} ++ virtual void oop_store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3) = 0; ++public: ++ virtual void arraycopy_prologue(MacroAssembler* masm, DecoratorSet decorators, bool is_oop, ++ Register src, Register dst, Register count, RegSet saved_regs); ++ virtual void arraycopy_epilogue(MacroAssembler* masm, DecoratorSet decorators, bool is_oop, ++ Register dst, Register count, Register scratch, RegSet saved_regs); ++ ++ virtual void store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3); ++}; ++ ++#endif // CPU_LOONGARCH_GC_SHARED_MODREFBARRIERSETASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shenandoah/c1/shenandoahBarrierSetC1_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/shenandoah/c1/shenandoahBarrierSetC1_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/shenandoah/c1/shenandoahBarrierSetC1_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shenandoah/c1/shenandoahBarrierSetC1_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,130 @@ ++/* ++ * Copyright (c) 2018, 2021, Red Hat, Inc. All rights reserved. ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "c1/c1_LIRAssembler.hpp" ++#include "c1/c1_MacroAssembler.hpp" ++#include "gc/shared/gc_globals.hpp" ++#include "gc/shenandoah/shenandoahBarrierSet.hpp" ++#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp" ++#include "gc/shenandoah/c1/shenandoahBarrierSetC1.hpp" ++ ++#define __ masm->masm()-> ++ ++void LIR_OpShenandoahCompareAndSwap::emit_code(LIR_Assembler* masm) { ++ Register addr = _addr->as_register_lo(); ++ Register newval = _new_value->as_register(); ++ Register cmpval = _cmp_value->as_register(); ++ Register result = result_opr()->as_register(); ++ ++ ShenandoahBarrierSet::assembler()->iu_barrier(masm->masm(), newval, SCR2); ++ ++ if (UseCompressedOops) { ++ Register tmp1 = _tmp1->as_register(); ++ Register tmp2 = _tmp2->as_register(); ++ ++ __ encode_heap_oop(tmp1, cmpval); ++ cmpval = tmp1; ++ __ encode_heap_oop(tmp2, newval); ++ newval = tmp2; ++ } ++ ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(masm->masm(), addr, cmpval, newval, /*acquire*/ true, /*is_cae*/ false, result); ++ ++ if (CompilerConfig::is_c1_only_no_jvmci()) { ++ // The membar here is necessary to prevent reordering between the ++ // release store in the CAS above and a subsequent volatile load. ++ // However for tiered compilation C1 inserts a full barrier before ++ // volatile loads which means we don't need an additional barrier ++ // here (see LIRGenerator::volatile_field_load()). ++ __ membar(__ AnyAny); ++ } ++} ++ ++#undef __ ++ ++#ifdef ASSERT ++#define __ gen->lir(__FILE__, __LINE__)-> ++#else ++#define __ gen->lir()-> ++#endif ++ ++LIR_Opr ShenandoahBarrierSetC1::atomic_cmpxchg_at_resolved(LIRAccess& access, LIRItem& cmp_value, LIRItem& new_value) { ++ if (access.is_oop()) { ++ LIRGenerator *gen = access.gen(); ++ if (ShenandoahSATBBarrier) { ++ pre_barrier(gen, access.access_emit_info(), access.decorators(), access.resolved_addr(), ++ LIR_OprFact::illegalOpr /* pre_val */); ++ } ++ if (ShenandoahCASBarrier) { ++ cmp_value.load_item(); ++ new_value.load_item(); ++ ++ LIR_Opr t1 = LIR_OprFact::illegalOpr; ++ LIR_Opr t2 = LIR_OprFact::illegalOpr; ++ LIR_Opr addr = access.resolved_addr()->as_address_ptr()->base(); ++ LIR_Opr result = gen->new_register(T_INT); ++ ++ if (UseCompressedOops) { ++ t1 = gen->new_register(T_OBJECT); ++ t2 = gen->new_register(T_OBJECT); ++ } ++ ++ __ append(new LIR_OpShenandoahCompareAndSwap(addr, cmp_value.result(), new_value.result(), t1, t2, result)); ++ return result; ++ } ++ } ++ return BarrierSetC1::atomic_cmpxchg_at_resolved(access, cmp_value, new_value); ++} ++ ++LIR_Opr ShenandoahBarrierSetC1::atomic_xchg_at_resolved(LIRAccess& access, LIRItem& value) { ++ LIRGenerator* gen = access.gen(); ++ BasicType type = access.type(); ++ ++ LIR_Opr result = gen->new_register(type); ++ value.load_item(); ++ LIR_Opr value_opr = value.result(); ++ ++ if (access.is_oop()) { ++ value_opr = iu_barrier(access.gen(), value_opr, access.access_emit_info(), access.decorators()); ++ } ++ ++ assert(type == T_INT || is_reference_type(type) LP64_ONLY( || type == T_LONG ), "unexpected type"); ++ LIR_Opr tmp = gen->new_register(T_INT); ++ __ xchg(access.resolved_addr(), value_opr, result, tmp); ++ ++ if (access.is_oop()) { ++ result = load_reference_barrier(access.gen(), result, LIR_OprFact::addressConst(0), access.decorators()); ++ LIR_Opr tmp = gen->new_register(type); ++ __ move(result, tmp); ++ result = tmp; ++ if (ShenandoahSATBBarrier) { ++ pre_barrier(access.gen(), access.access_emit_info(), access.decorators(), 
LIR_OprFact::illegalOpr, ++ result /* pre_val */); ++ } ++ } ++ ++ return result; ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoahBarrierSetAssembler_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoahBarrierSetAssembler_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoahBarrierSetAssembler_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoahBarrierSetAssembler_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,783 @@ ++/* ++ * Copyright (c) 2018, 2021, Red Hat, Inc. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "gc/shenandoah/shenandoahBarrierSet.hpp" ++#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp" ++#include "gc/shenandoah/shenandoahForwarding.hpp" ++#include "gc/shenandoah/shenandoahHeap.inline.hpp" ++#include "gc/shenandoah/shenandoahHeapRegion.hpp" ++#include "gc/shenandoah/shenandoahRuntime.hpp" ++#include "gc/shenandoah/shenandoahThreadLocalData.hpp" ++#include "gc/shenandoah/heuristics/shenandoahHeuristics.hpp" ++#include "interpreter/interpreter.hpp" ++#include "interpreter/interp_masm.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/thread.hpp" ++#ifdef COMPILER1 ++#include "c1/c1_LIRAssembler.hpp" ++#include "c1/c1_MacroAssembler.hpp" ++#include "gc/shenandoah/c1/shenandoahBarrierSetC1.hpp" ++#endif ++ ++#define __ masm-> ++ ++void ShenandoahBarrierSetAssembler::arraycopy_prologue(MacroAssembler* masm, DecoratorSet decorators, bool is_oop, ++ Register src, Register dst, Register count, RegSet saved_regs) { ++ if (is_oop) { ++ bool dest_uninitialized = (decorators & IS_DEST_UNINITIALIZED) != 0; ++ if ((ShenandoahSATBBarrier && !dest_uninitialized) || ShenandoahIUBarrier || ShenandoahLoadRefBarrier) { ++ Label done; ++ ++ // Avoid calling runtime if count == 0 ++ __ beqz(count, done); ++ ++ // Is GC active? 
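++ // gc_state is a per-thread copy of the collector's current phase bits; the
++ // runtime call below is only needed while marking is in progress or while
++ // forwarded objects may exist. For an uninitialized destination only
++ // HAS_FORWARDED is tested, since there are no previous values to snapshot.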
++ Address gc_state(TREG, in_bytes(ShenandoahThreadLocalData::gc_state_offset())); ++ __ ld_b(SCR1, gc_state); ++ if (ShenandoahSATBBarrier && dest_uninitialized) { ++ __ andi(SCR1, SCR1, ShenandoahHeap::HAS_FORWARDED); ++ __ beqz(SCR1, done); ++ } else { ++ __ andi(SCR1, SCR1, ShenandoahHeap::HAS_FORWARDED | ShenandoahHeap::MARKING); ++ __ beqz(SCR1, done); ++ } ++ ++ __ push(saved_regs); ++ if (UseCompressedOops) { ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::arraycopy_barrier_narrow_oop_entry), src, dst, count); ++ } else { ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::arraycopy_barrier_oop_entry), src, dst, count); ++ } ++ __ pop(saved_regs); ++ __ bind(done); ++ } ++ } ++} ++ ++void ShenandoahBarrierSetAssembler::shenandoah_write_barrier_pre(MacroAssembler* masm, ++ Register obj, ++ Register pre_val, ++ Register thread, ++ Register tmp, ++ bool tosca_live, ++ bool expand_call) { ++ if (ShenandoahSATBBarrier) { ++ satb_write_barrier_pre(masm, obj, pre_val, thread, tmp, SCR1, tosca_live, expand_call); ++ } ++} ++ ++void ShenandoahBarrierSetAssembler::satb_write_barrier_pre(MacroAssembler* masm, ++ Register obj, ++ Register pre_val, ++ Register thread, ++ Register tmp1, ++ Register tmp2, ++ bool tosca_live, ++ bool expand_call) { ++ // If expand_call is true then we expand the call_VM_leaf macro ++ // directly to skip generating the check by ++ // InterpreterMacroAssembler::call_VM_leaf_base that checks _last_sp. ++ ++ assert(thread == TREG, "must be"); ++ ++ Label done; ++ Label runtime; ++ ++ assert_different_registers(obj, pre_val, tmp1, tmp2); ++ assert(pre_val != noreg && tmp1 != noreg && tmp2 != noreg, "expecting a register"); ++ ++ Address in_progress(thread, in_bytes(ShenandoahThreadLocalData::satb_mark_queue_active_offset())); ++ Address index(thread, in_bytes(ShenandoahThreadLocalData::satb_mark_queue_index_offset())); ++ Address buffer(thread, in_bytes(ShenandoahThreadLocalData::satb_mark_queue_buffer_offset())); ++ ++ // Is marking active? ++ if (in_bytes(SATBMarkQueue::byte_width_of_active()) == 4) { ++ __ ld_w(tmp1, in_progress); ++ } else { ++ assert(in_bytes(SATBMarkQueue::byte_width_of_active()) == 1, "Assumption"); ++ __ ld_b(tmp1, in_progress); ++ } ++ __ beqz(tmp1, done); ++ ++ // Do we need to load the previous value? ++ if (obj != noreg) { ++ __ load_heap_oop(pre_val, Address(obj, 0), noreg, noreg, AS_RAW); ++ } ++ ++ // Is the previous value null? ++ __ beqz(pre_val, done); ++ ++ // Can we store original value in the thread's buffer? ++ // Is index == 0? ++ // (The index field is typed as size_t.) ++ ++ __ ld_d(tmp1, index); // tmp := *index_adr ++ __ beqz(tmp1, runtime); // tmp == 0? ++ // If yes, goto runtime ++ ++ __ addi_d(tmp1, tmp1, -wordSize); // tmp := tmp - wordSize ++ __ st_d(tmp1, index); // *index_adr := tmp ++ __ ld_d(tmp2, buffer); ++ // tmp := tmp + *buffer_adr ++ ++ // Record the previous value ++ __ stx_d(pre_val, tmp1, tmp2); ++ __ b(done); ++ ++ __ bind(runtime); ++ // save the live input values ++ RegSet saved = RegSet::of(pre_val); ++ if (tosca_live) saved += RegSet::of(V0); ++ if (obj != noreg) saved += RegSet::of(obj); ++ ++ __ push(saved); ++ ++ // Calling the runtime using the regular call_VM_leaf mechanism generates ++ // code (generated by InterpreterMacroAssember::call_VM_leaf_base) ++ // that checks that the *(rfp+frame::interpreter_frame_last_sp) == nullptr. ++ // ++ // If we care generating the pre-barrier without a frame (e.g. 
in the ++ // intrinsified Reference.get() routine) then ebp might be pointing to ++ // the caller frame and so this check will most likely fail at runtime. ++ // ++ // Expanding the call directly bypasses the generation of the check. ++ // So when we do not have have a full interpreter frame on the stack ++ // expand_call should be passed true. ++ ++ if (expand_call) { ++ assert(pre_val != A1, "smashed arg"); ++ __ super_call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::write_ref_field_pre_entry), pre_val, thread); ++ } else { ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::write_ref_field_pre_entry), pre_val, thread); ++ } ++ ++ __ pop(saved); ++ ++ __ bind(done); ++} ++ ++void ShenandoahBarrierSetAssembler::resolve_forward_pointer(MacroAssembler* masm, Register dst, Register tmp) { ++ assert(ShenandoahLoadRefBarrier || ShenandoahCASBarrier, "Should be enabled"); ++ Label is_null; ++ __ beqz(dst, is_null); ++ resolve_forward_pointer_not_null(masm, dst, tmp); ++ __ bind(is_null); ++} ++ ++// IMPORTANT: This must preserve all registers, even SCR1 and SCR2, except those explicitely ++// passed in. ++void ShenandoahBarrierSetAssembler::resolve_forward_pointer_not_null(MacroAssembler* masm, Register dst, Register tmp) { ++ assert(ShenandoahLoadRefBarrier || ShenandoahCASBarrier, "Should be enabled"); ++ // The below loads the mark word, checks if the lowest two bits are ++ // set, and if so, clear the lowest two bits and copy the result ++ // to dst. Otherwise it leaves dst alone. ++ // Implementing this is surprisingly awkward. I do it here by: ++ // - Inverting the mark word ++ // - Test lowest two bits == 0 ++ // - If so, set the lowest two bits ++ // - Invert the result back, and copy to dst ++ ++ Register scr = RA; ++ bool borrow_reg = (tmp == noreg); ++ if (borrow_reg) { ++ // No free registers available. Make one useful. ++ tmp = SCR1; ++ if (tmp == dst) { ++ tmp = SCR2; ++ } ++ __ push(tmp); ++ } ++ ++ assert_different_registers(tmp, scr, dst); ++ ++ Label done; ++ __ movgr2fr_d(fscratch, scr); ++ __ ld_d(tmp, dst, oopDesc::mark_offset_in_bytes()); ++ __ nor(tmp, tmp, R0); ++ __ andi(scr, tmp, markWord::lock_mask_in_place); ++ __ bnez(scr, done); ++ __ ori(tmp, tmp, markWord::marked_value); ++ __ nor(dst, tmp, R0); ++ __ bind(done); ++ __ movfr2gr_d(scr, fscratch); ++ ++ if (borrow_reg) { ++ __ pop(tmp); ++ } ++} ++ ++void ShenandoahBarrierSetAssembler::load_reference_barrier(MacroAssembler* masm, Register dst, Address load_addr, DecoratorSet decorators) { ++ assert(ShenandoahLoadRefBarrier, "Should be enabled"); ++ assert_different_registers(load_addr.base(), load_addr.index(), SCR1, SCR2); ++ ++ bool is_strong = ShenandoahBarrierSet::is_strong_access(decorators); ++ bool is_weak = ShenandoahBarrierSet::is_weak_access(decorators); ++ bool is_phantom = ShenandoahBarrierSet::is_phantom_access(decorators); ++ bool is_native = ShenandoahBarrierSet::is_native_access(decorators); ++ bool is_narrow = UseCompressedOops && !is_native; ++ ++ Label heap_stable, not_cset; ++ __ enter(); ++ __ bstrins_d(SP, R0, 3, 0); ++ Address gc_state(TREG, in_bytes(ShenandoahThreadLocalData::gc_state_offset())); ++ Register tmp = (dst == SCR1) ? 
SCR2 : SCR1; ++ ++ // Check for heap stability ++ if (is_strong) { ++ __ ld_b(tmp, gc_state); ++ __ andi(tmp, tmp, ShenandoahHeap::HAS_FORWARDED); ++ __ beqz(tmp, heap_stable); ++ } else { ++ Label lrb; ++ __ ld_b(tmp, gc_state); ++ __ andi(tmp, tmp, ShenandoahHeap::WEAK_ROOTS); ++ __ bnez(tmp, lrb); ++ ++ __ ld_b(tmp, gc_state); ++ __ andi(tmp, tmp, ShenandoahHeap::HAS_FORWARDED); ++ __ beqz(tmp, heap_stable); ++ __ bind(lrb); ++ } ++ ++ // use A1 for load address ++ Register result_dst = dst; ++ if (dst == A1) { ++ __ move(tmp, dst); ++ dst = tmp; ++ } ++ ++ // Save A0 and A1, unless it is an output register ++ __ push2(A0, A1); ++ __ lea(A1, load_addr); ++ __ move(A0, dst); ++ ++ // Test for in-cset ++ if (is_strong) { ++ __ li(SCR2, ShenandoahHeap::in_cset_fast_test_addr()); ++ __ srli_d(SCR1, A0, ShenandoahHeapRegion::region_size_bytes_shift_jint()); ++ __ ldx_b(SCR2, SCR2, SCR1); ++ __ beqz(SCR2, not_cset); ++ } ++ ++ __ push_call_clobbered_registers_except(RegSet::of(V0)); ++ if (is_strong) { ++ if (is_narrow) { ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_strong_narrow)); ++ } else { ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_strong)); ++ } ++ } else if (is_weak) { ++ if (is_narrow) { ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak_narrow)); ++ } else { ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak)); ++ } ++ } else { ++ assert(is_phantom, "only remaining strength"); ++ assert(!is_narrow, "phantom access cannot be narrow"); ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_phantom)); ++ } ++ __ jalr(RA); ++ __ pop_call_clobbered_registers_except(RegSet::of(V0)); ++ ++ __ bind(not_cset); ++ ++ __ move(result_dst, A0); ++ if (result_dst == A0) ++ __ pop2(R0, A1); ++ else ++ __ pop2(A0, A1); ++ ++ __ bind(heap_stable); ++ __ leave(); ++} ++ ++void ShenandoahBarrierSetAssembler::iu_barrier(MacroAssembler* masm, Register dst, Register tmp) { ++ if (ShenandoahIUBarrier) { ++ __ push_call_clobbered_registers(); ++ satb_write_barrier_pre(masm, noreg, dst, TREG, tmp, SCR1, true, false); ++ __ pop_call_clobbered_registers(); ++ } ++} ++ ++// ++// Arguments: ++// ++// Inputs: ++// src: oop location to load from, might be clobbered ++// ++// Output: ++// dst: oop loaded from src location ++// ++// Kill: ++// SCR1 (scratch reg) ++// ++// Alias: ++// dst: SCR1 (might use SCR1 as temporary output register to avoid clobbering src) ++// ++void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Register dst, Address src, Register tmp1, Register tmp2) { ++ // 1: non-reference load, no additional barrier is needed ++ if (!is_reference_type(type)) { ++ BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp2); ++ return; ++ } ++ ++ // 2: load a reference from src location and apply LRB if needed ++ if (ShenandoahBarrierSet::need_load_reference_barrier(decorators, type)) { ++ Register result_dst = dst; ++ ++ // Preserve src location for LRB ++ if (dst == src.base() || dst == src.index() || dst == SCR1) { ++ dst = SCR2; ++ } ++ assert_different_registers(dst, src.base(), src.index()); ++ ++ BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp2); ++ ++ load_reference_barrier(masm, dst, src, decorators); ++ ++ if (dst != result_dst) { ++ __ move(result_dst, dst); ++ dst = result_dst; ++ } ++ } else { ++ BarrierSetAssembler::load_at(masm, 
decorators, type, dst, src, tmp1, tmp2); ++ } ++ ++ // 3: apply keep-alive barrier if needed ++ if (ShenandoahBarrierSet::need_keep_alive_barrier(decorators, type)) { ++ __ enter(); ++ __ push_call_clobbered_registers(); ++ satb_write_barrier_pre(masm /* masm */, ++ noreg /* obj */, ++ dst /* pre_val */, ++ TREG /* thread */, ++ tmp1 /* tmp1 */, ++ tmp2 /* tmp2 */, ++ true /* tosca_live */, ++ true /* expand_call */); ++ __ pop_call_clobbered_registers(); ++ __ leave(); ++ } ++} ++ ++void ShenandoahBarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3) { ++ bool on_oop = is_reference_type(type); ++ if (!on_oop) { ++ BarrierSetAssembler::store_at(masm, decorators, type, dst, val, tmp1, tmp2, tmp3); ++ return; ++ } ++ ++ // flatten object address if needed ++ if (dst.index() == noreg && dst.disp() == 0) { ++ if (dst.base() != tmp3) { ++ __ move(tmp3, dst.base()); ++ } ++ } else { ++ __ lea(tmp3, dst); ++ } ++ ++ shenandoah_write_barrier_pre(masm, ++ tmp3 /* obj */, ++ tmp2 /* pre_val */, ++ TREG /* thread */, ++ tmp1 /* tmp */, ++ val != noreg /* tosca_live */, ++ false /* expand_call */); ++ ++ if (val == noreg) { ++ BarrierSetAssembler::store_at(masm, decorators, type, Address(tmp3, 0), noreg, noreg, noreg, noreg); ++ } else { ++ iu_barrier(masm, val, tmp1); ++ BarrierSetAssembler::store_at(masm, decorators, type, Address(tmp3, 0), val, noreg, noreg, noreg); ++ } ++} ++ ++void ShenandoahBarrierSetAssembler::try_resolve_jobject_in_native(MacroAssembler* masm, Register jni_env, ++ Register obj, Register tmp, Label& slowpath) { ++ Label done; ++ // Resolve jobject ++ BarrierSetAssembler::try_resolve_jobject_in_native(masm, jni_env, obj, tmp, slowpath); ++ ++ // Check for null. ++ __ beqz(obj, done); ++ ++ assert(obj != SCR1, "need SCR1"); ++ Address gc_state(jni_env, ShenandoahThreadLocalData::gc_state_offset() - JavaThread::jni_environment_offset()); ++ __ lea(SCR1, gc_state); ++ __ ld_b(SCR1, SCR1, 0); ++ ++ // Check for heap in evacuation phase ++ __ andi(SCR1, SCR1, ShenandoahHeap::EVACUATION); ++ __ bnez(SCR1, slowpath); ++ ++ __ bind(done); ++} ++ ++// Special Shenandoah CAS implementation that handles false negatives due ++// to concurrent evacuation. The service is more complex than a ++// traditional CAS operation because the CAS operation is intended to ++// succeed if the reference at addr exactly matches expected or if the ++// reference at addr holds a pointer to a from-space object that has ++// been relocated to the location named by expected. There are two ++// races that must be addressed: ++// a) A parallel thread may mutate the contents of addr so that it points ++// to a different object. In this case, the CAS operation should fail. ++// b) A parallel thread may heal the contents of addr, replacing a ++// from-space pointer held in addr with the to-space pointer ++// representing the new location of the object. ++// Upon entry to cmpxchg_oop, it is assured that new_val equals null ++// or it refers to an object that is not being evacuated out of ++// from-space, or it refers to the to-space version of an object that ++// is being evacuated out of from-space. ++// ++// By default the value held in the result register following execution ++// of the generated code sequence is 0 to indicate failure of CAS, ++// non-zero to indicate success. If is_cae, the result is the value most ++// recently fetched from addr rather than a boolean success indicator. 
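++// Roughly, the generated sequence is:
++//   step 1: CAS(addr, expected, new_val); if the fetched value equals
++//           expected, we are done.
++//   step 2: resolve the forwardee of the fetched value; if it does not match
++//           expected, the failure is legitimate and we are done.
++//   step 3: retry the CAS using the fetched from-space value as the expected
++//           value; if this also fails, another thread may have healed the
++//           location in the meantime, so start over at step 1 (the step4
++//           label below).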
++// ++// Clobbers SCR1, SCR2 ++void ShenandoahBarrierSetAssembler::cmpxchg_oop(MacroAssembler* masm, ++ Address addr, ++ Register expected, ++ Register new_val, ++ bool acquire, bool is_cae, ++ Register result) { ++ Register tmp1 = SCR2; ++ Register tmp2 = SCR1; ++ bool is_narrow = UseCompressedOops; ++ ++ assert_different_registers(addr.base(), expected, tmp1, tmp2); ++ assert_different_registers(addr.base(), new_val, tmp1, tmp2); ++ ++ Label step4, done_succ, done_fail, done, is_null; ++ ++ // There are two ways to reach this label. Initial entry into the ++ // cmpxchg_oop code expansion starts at step1 (which is equivalent ++ // to label step4). Additionally, in the rare case that four steps ++ // are required to perform the requested operation, the fourth step ++ // is the same as the first. On a second pass through step 1, ++ // control may flow through step 2 on its way to failure. It will ++ // not flow from step 2 to step 3 since we are assured that the ++ // memory at addr no longer holds a from-space pointer. ++ // ++ // The comments that immediately follow the step4 label apply only ++ // to the case in which control reaches this label by branch from ++ // step 3. ++ ++ __ bind (step4); ++ ++ // Step 4. CAS has failed because the value most recently fetched ++ // from addr is no longer the from-space pointer held in tmp2. If a ++ // different thread replaced the in-memory value with its equivalent ++ // to-space pointer, then CAS may still be able to succeed. The ++ // value held in the expected register has not changed. ++ // ++ // It is extremely rare we reach this point. For this reason, the ++ // implementation opts for smaller rather than potentially faster ++ // code. Ultimately, smaller code for this rare case most likely ++ // delivers higher overall throughput by enabling improved icache ++ // performance. ++ ++ // Step 1. Fast-path. ++ // ++ // Try to CAS with given arguments. If successful, then we are done. ++ // ++ // No label required for step 1. ++ ++ if (is_narrow) { ++ __ cmpxchg32(addr, expected, new_val, tmp2, false /* sign */, false /* retold */, ++ acquire /* acquire */, false /* weak */, true /* exchange */); ++ } else { ++ __ cmpxchg(addr, expected, new_val, tmp2, false /* retold */, acquire /* acquire */, ++ false /* weak */, true /* exchange */); ++ } ++ // tmp2 holds value fetched. ++ ++ // If expected equals null but tmp2 does not equal null, the ++ // following branches to done to report failure of CAS. If both ++ // expected and tmp2 equal null, the following branches to done to ++ // report success of CAS. There's no need for a special test of ++ // expected equal to null. ++ ++ __ beq(tmp2, expected, done_succ); ++ // if CAS failed, fall through to step 2 ++ ++ // Step 2. CAS has failed because the value held at addr does not ++ // match expected. This may be a false negative because the value fetched ++ // from addr (now held in tmp2) may be a from-space pointer to the ++ // original copy of same object referenced by to-space pointer expected. ++ // ++ // To resolve this, it suffices to find the forward pointer associated ++ // with fetched value. If this matches expected, retry CAS with new ++ // parameters. If this mismatches, then we have a legitimate ++ // failure, and we're done. ++ // ++ // No need for step2 label. 
++ ++ // overwrite tmp1 with from-space pointer fetched from memory ++ __ move(tmp1, tmp2); ++ ++ if (is_narrow) { ++ __ beqz(tmp1, is_null); ++ // Decode tmp1 in order to resolve its forward pointer ++ __ decode_heap_oop_not_null(tmp1); ++ resolve_forward_pointer_not_null(masm, tmp1); ++ // Encode tmp1 to compare against expected. ++ __ encode_heap_oop_not_null(tmp1); ++ __ bind(is_null); ++ } else { ++ resolve_forward_pointer(masm, tmp1); ++ } ++ ++ // Does forwarded value of fetched from-space pointer match original ++ // value of expected? If tmp1 holds null, this comparison will fail ++ // because we know from step1 that expected is not null. There is ++ // no need for a separate test for tmp1 (the value originally held ++ // in memory) equal to null. ++ ++ // If not, then the failure was legitimate and we're done. ++ // Branching to done with NE condition denotes failure. ++ __ bne(tmp1, expected, done_fail); ++ ++ // Fall through to step 3. No need for step3 label. ++ ++ // Step 3. We've confirmed that the value originally held in memory ++ // (now held in tmp2) pointed to from-space version of original ++ // expected value. Try the CAS again with the from-space expected ++ // value. If it now succeeds, we're good. ++ // ++ // Note: tmp2 holds encoded from-space pointer that matches to-space ++ // object residing at expected. tmp2 is the new "expected". ++ ++ // Note that macro implementation of __cmpxchg cannot use same register ++ // tmp2 for result and expected since it overwrites result before it ++ // compares result with expected. ++ if (is_narrow) { ++ __ cmpxchg32(addr, tmp2, new_val, tmp1, false /* sign */, false /* retold */, ++ acquire /* acquire */, false /* weak */, false /* exchange */); ++ } else { ++ __ cmpxchg(addr, tmp2, new_val, tmp1, false /* retold */, acquire /* acquire */, ++ false /* weak */, false /* exchange */); ++ } ++ // tmp1 set iff success, tmp2 holds value fetched. ++ ++ // If fetched value did not equal the new expected, this could ++ // still be a false negative because some other thread may have ++ // newly overwritten the memory value with its to-space equivalent. ++ __ beqz(tmp1, step4); ++ ++ if (is_cae) { ++ // We're falling through to done to indicate success. ++ __ move(tmp2, expected); ++ } ++ ++ __ bind(done_succ); ++ if (!is_cae) { ++ __ li(tmp2, 1L); ++ } ++ __ b(done); ++ ++ __ bind(done_fail); ++ if (!is_cae) { ++ __ li(tmp2, 0L); ++ } ++ ++ __ bind(done); ++ __ move(result, tmp2); ++} ++ ++#undef __ ++ ++#ifdef COMPILER1 ++ ++#define __ ce->masm()-> ++ ++void ShenandoahBarrierSetAssembler::gen_pre_barrier_stub(LIR_Assembler* ce, ShenandoahPreBarrierStub* stub) { ++ ShenandoahBarrierSetC1* bs = (ShenandoahBarrierSetC1*)BarrierSet::barrier_set()->barrier_set_c1(); ++ // At this point we know that marking is in progress. ++ // If do_load() is true then we have to emit the ++ // load of the previous value; otherwise it has already ++ // been loaded into _pre_val. 
++ ++ __ bind(*stub->entry()); ++ ++ assert(stub->pre_val()->is_register(), "Precondition."); ++ ++ Register pre_val_reg = stub->pre_val()->as_register(); ++ ++ if (stub->do_load()) { ++ ce->mem2reg(stub->addr(), stub->pre_val(), T_OBJECT, stub->patch_code(), stub->info(), false /*wide*/); ++ } ++ __ beqz(pre_val_reg, *stub->continuation()); ++ ce->store_parameter(stub->pre_val()->as_register(), 0); ++ __ call(bs->pre_barrier_c1_runtime_code_blob()->code_begin(), relocInfo::runtime_call_type); ++ __ b(*stub->continuation()); ++} ++ ++void ShenandoahBarrierSetAssembler::gen_load_reference_barrier_stub(LIR_Assembler* ce, ShenandoahLoadReferenceBarrierStub* stub) { ++ ShenandoahBarrierSetC1* bs = (ShenandoahBarrierSetC1*)BarrierSet::barrier_set()->barrier_set_c1(); ++ __ bind(*stub->entry()); ++ ++ DecoratorSet decorators = stub->decorators(); ++ bool is_strong = ShenandoahBarrierSet::is_strong_access(decorators); ++ bool is_weak = ShenandoahBarrierSet::is_weak_access(decorators); ++ bool is_phantom = ShenandoahBarrierSet::is_phantom_access(decorators); ++ bool is_native = ShenandoahBarrierSet::is_native_access(decorators); ++ ++ Register obj = stub->obj()->as_register(); ++ Register res = stub->result()->as_register(); ++ Register addr = stub->addr()->as_pointer_register(); ++ Register tmp1 = stub->tmp1()->as_register(); ++ Register tmp2 = stub->tmp2()->as_register(); ++ ++ assert(res == V0, "result must arrive in V0"); ++ ++ if (res != obj) { ++ __ move(res, obj); ++ } ++ ++ if (is_strong) { ++ // Check for object in cset. ++ __ li(tmp2, ShenandoahHeap::in_cset_fast_test_addr()); ++ __ srli_d(tmp1, res, ShenandoahHeapRegion::region_size_bytes_shift_jint()); ++ __ ldx_b(tmp2, tmp2, tmp1); ++ __ beqz(tmp2, *stub->continuation()); ++ } ++ ++ ce->store_parameter(res, 0); ++ ce->store_parameter(addr, 1); ++ if (is_strong) { ++ if (is_native) { ++ __ call(bs->load_reference_barrier_strong_native_rt_code_blob()->code_begin(), relocInfo::runtime_call_type); ++ } else { ++ __ call(bs->load_reference_barrier_strong_rt_code_blob()->code_begin(), relocInfo::runtime_call_type); ++ } ++ } else if (is_weak) { ++ __ call(bs->load_reference_barrier_weak_rt_code_blob()->code_begin(), relocInfo::runtime_call_type); ++ } else { ++ assert(is_phantom, "only remaining strength"); ++ __ call(bs->load_reference_barrier_phantom_rt_code_blob()->code_begin(), relocInfo::runtime_call_type); ++ } ++ ++ __ b(*stub->continuation()); ++} ++ ++#undef __ ++ ++#define __ sasm-> ++ ++void ShenandoahBarrierSetAssembler::generate_c1_pre_barrier_runtime_stub(StubAssembler* sasm) { ++ __ prologue("shenandoah_pre_barrier", false); ++ ++ // arg0 : previous value of memory ++ ++ BarrierSet* bs = BarrierSet::barrier_set(); ++ ++ const Register pre_val = A0; ++ const Register thread = TREG; ++ const Register tmp = SCR1; ++ ++ Address queue_index(thread, in_bytes(ShenandoahThreadLocalData::satb_mark_queue_index_offset())); ++ Address buffer(thread, in_bytes(ShenandoahThreadLocalData::satb_mark_queue_buffer_offset())); ++ ++ Label done; ++ Label runtime; ++ ++ // Is marking still active? ++ Address gc_state(thread, in_bytes(ShenandoahThreadLocalData::gc_state_offset())); ++ __ ld_b(tmp, gc_state); ++ __ andi(tmp, tmp, ShenandoahHeap::MARKING); ++ __ beqz(tmp, done); ++ ++ // Can we store original value in the thread's buffer? 
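++ // The SATB queue index counts down in bytes; zero means the buffer is full
++ // and the value must be handed to the runtime instead. Otherwise the index
++ // is moved down by one word and the previous value (arg0) is stored at
++ // buffer + index.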
++ __ ld_d(tmp, queue_index); ++ __ beqz(tmp, runtime); ++ ++ __ addi_d(tmp, tmp, -wordSize); ++ __ st_d(tmp, queue_index); ++ __ ld_d(SCR2, buffer); ++ __ add_d(tmp, tmp, SCR2); ++ __ load_parameter(0, SCR2); ++ __ st_d(SCR2, tmp, 0); ++ __ b(done); ++ ++ __ bind(runtime); ++ __ push_call_clobbered_registers(); ++ __ load_parameter(0, pre_val); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, ShenandoahRuntime::write_ref_field_pre_entry), pre_val, thread); ++ __ pop_call_clobbered_registers(); ++ __ bind(done); ++ ++ __ epilogue(); ++} ++ ++void ShenandoahBarrierSetAssembler::generate_c1_load_reference_barrier_runtime_stub(StubAssembler* sasm, DecoratorSet decorators) { ++ __ prologue("shenandoah_load_reference_barrier", false); ++ __ bstrins_d(SP, R0, 3, 0); ++ // arg0 : object to be resolved ++ ++ __ push_call_clobbered_registers_except(RegSet::of(V0)); ++ __ load_parameter(0, A0); ++ __ load_parameter(1, A1); ++ ++ bool is_strong = ShenandoahBarrierSet::is_strong_access(decorators); ++ bool is_weak = ShenandoahBarrierSet::is_weak_access(decorators); ++ bool is_phantom = ShenandoahBarrierSet::is_phantom_access(decorators); ++ bool is_native = ShenandoahBarrierSet::is_native_access(decorators); ++ if (is_strong) { ++ if (is_native) { ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_strong)); ++ } else { ++ if (UseCompressedOops) { ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_strong_narrow)); ++ } else { ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_strong)); ++ } ++ } ++ } else if (is_weak) { ++ assert(!is_native, "weak must not be called off-heap"); ++ if (UseCompressedOops) { ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak_narrow)); ++ } else { ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak)); ++ } ++ } else { ++ assert(is_phantom, "only remaining strength"); ++ assert(is_native, "phantom must only be called off-heap"); ++ __ li(RA, CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_phantom)); ++ } ++ __ jalr(RA); ++ __ pop_call_clobbered_registers_except(RegSet::of(V0)); ++ ++ __ epilogue(); ++} ++ ++#undef __ ++ ++#endif // COMPILER1 +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoahBarrierSetAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoahBarrierSetAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoahBarrierSetAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoahBarrierSetAssembler_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,88 @@ ++/* ++ * Copyright (c) 2018, 2021, Red Hat, Inc. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_GC_SHENANDOAH_SHENANDOAHBARRIERSETASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_GC_SHENANDOAH_SHENANDOAHBARRIERSETASSEMBLER_LOONGARCH_HPP ++ ++#include "asm/macroAssembler.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "gc/shenandoah/shenandoahBarrierSet.hpp" ++#ifdef COMPILER1 ++class LIR_Assembler; ++class ShenandoahPreBarrierStub; ++class ShenandoahLoadReferenceBarrierStub; ++class StubAssembler; ++#endif ++class StubCodeGenerator; ++ ++class ShenandoahBarrierSetAssembler: public BarrierSetAssembler { ++private: ++ ++ void satb_write_barrier_pre(MacroAssembler* masm, ++ Register obj, ++ Register pre_val, ++ Register thread, ++ Register tmp1, ++ Register tmp2, ++ bool tosca_live, ++ bool expand_call); ++ void shenandoah_write_barrier_pre(MacroAssembler* masm, ++ Register obj, ++ Register pre_val, ++ Register thread, ++ Register tmp, ++ bool tosca_live, ++ bool expand_call); ++ ++ void resolve_forward_pointer(MacroAssembler* masm, Register dst, Register tmp = noreg); ++ void resolve_forward_pointer_not_null(MacroAssembler* masm, Register dst, Register tmp = noreg); ++ void load_reference_barrier(MacroAssembler* masm, Register dst, Address load_addr, DecoratorSet decorators); ++ ++public: ++ ++ void iu_barrier(MacroAssembler* masm, Register dst, Register tmp); ++ ++ virtual NMethodPatchingType nmethod_patching_type() { return NMethodPatchingType::conc_data_patch; } ++ ++#ifdef COMPILER1 ++ void gen_pre_barrier_stub(LIR_Assembler* ce, ShenandoahPreBarrierStub* stub); ++ void gen_load_reference_barrier_stub(LIR_Assembler* ce, ShenandoahLoadReferenceBarrierStub* stub); ++ void generate_c1_pre_barrier_runtime_stub(StubAssembler* sasm); ++ void generate_c1_load_reference_barrier_runtime_stub(StubAssembler* sasm, DecoratorSet decorators); ++#endif ++ ++ virtual void arraycopy_prologue(MacroAssembler* masm, DecoratorSet decorators, bool is_oop, ++ Register src, Register dst, Register count, RegSet saved_regs); ++ virtual void load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Register dst, Address src, Register tmp1, Register tmp2); ++ virtual void store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, ++ Address dst, Register val, Register tmp1, Register tmp2, Register tmp3); ++ virtual void try_resolve_jobject_in_native(MacroAssembler* masm, Register jni_env, ++ Register obj, Register tmp, Label& slowpath); ++ void cmpxchg_oop(MacroAssembler* masm, Address mem, Register expected, Register new_val, ++ bool acquire, bool is_cae, Register result); ++}; ++ ++#endif // CPU_LOONGARCH_GC_SHENANDOAH_SHENANDOAHBARRIERSETASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoah_loongarch_64.ad b/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoah_loongarch_64.ad +--- a/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoah_loongarch_64.ad 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/shenandoah/shenandoah_loongarch_64.ad 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,232 @@ ++// ++// Copyright (c) 2018, 
Red Hat, Inc. All rights reserved. ++// Copyright (c) 2023, Loongson Technology. All rights reserved. ++// DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++// ++// This code is free software; you can redistribute it and/or modify it ++// under the terms of the GNU General Public License version 2 only, as ++// published by the Free Software Foundation. ++// ++// This code is distributed in the hope that it will be useful, but WITHOUT ++// ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++// version 2 for more details (a copy is included in the LICENSE file that ++// accompanied this code). ++// ++// You should have received a copy of the GNU General Public License version ++// 2 along with this work; if not, write to the Free Software Foundation, ++// Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++// ++// Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++// or visit www.oracle.com if you need additional information or have any ++// questions. ++// ++// ++ ++source_hpp %{ ++#include "gc/shenandoah/shenandoahBarrierSet.hpp" ++#include "gc/shenandoah/shenandoahBarrierSetAssembler.hpp" ++%} ++ ++encode %{ ++ enc_class loongarch_enc_cmpxchg_oop_shenandoah(indirect mem, mRegP oldval, mRegP newval, mRegI res) %{ ++ MacroAssembler _masm(&cbuf); ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ false, /*is_cae*/ false, $res$$Register); ++ %} ++ ++ enc_class loongarch_enc_cmpxchg_acq_oop_shenandoah(indirect mem, mRegP oldval, mRegP newval, mRegI res) %{ ++ MacroAssembler _masm(&cbuf); ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ true, /*is_cae*/ false, $res$$Register); ++ %} ++%} ++ ++instruct compareAndSwapP_shenandoah(mRegI res, indirect mem, mRegP oldval, mRegP newval) %{ ++ match(Set res (ShenandoahCompareAndSwapP mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchg_shenandoah $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode(loongarch_enc_cmpxchg_oop_shenandoah(mem, oldval, newval, res)); ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct compareAndSwapN_shenandoah(mRegI res, indirect mem, mRegN oldval, mRegN newval) %{ ++ match(Set res (ShenandoahCompareAndSwapN mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchgw_shenandoah $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode %{ ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ false, /*is_cae*/ false, $res$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct compareAndSwapPAcq_shenandoah(mRegI res, indirect mem, mRegP oldval, mRegP newval) %{ ++ match(Set res (ShenandoahCompareAndSwapP mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchg_acq_shenandoah_oop $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode(loongarch_enc_cmpxchg_acq_oop_shenandoah(mem, oldval, newval, res)); ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct compareAndSwapNAcq_shenandoah(mRegI res, indirect mem, mRegN oldval, mRegN newval) %{ ++ match(Set res (ShenandoahCompareAndSwapN mem (Binary oldval newval))); ++ ++ format %{ ++ 
"cmpxchgw_acq_shenandoah_narrow_oop $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode %{ ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ true, /*is_cae*/ false, $res$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct compareAndExchangeN_shenandoah(mRegN res, indirect mem, mRegN oldval, mRegN newval) %{ ++ match(Set res (ShenandoahCompareAndExchangeN mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchgw_shenandoah $res = $mem, $oldval, $newval\t# (narrow oop, weak) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode %{ ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ false, /*is_cae*/ true, $res$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct compareAndExchangeP_shenandoah(mRegP res, indirect mem, mRegP oldval, mRegP newval) %{ ++ match(Set res (ShenandoahCompareAndExchangeP mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchg_shenandoah $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode %{ ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ false, /*is_cae*/ true, $res$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct compareAndExchangeNAcq_shenandoah(mRegN res, indirect mem, mRegN oldval, mRegN newval) %{ ++ match(Set res (ShenandoahCompareAndExchangeN mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchgw_acq_shenandoah $res = $mem, $oldval, $newval\t# (narrow oop, weak) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode %{ ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ true, /*is_cae*/ true, $res$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct compareAndExchangePAcq_shenandoah(mRegP res, indirect mem, mRegP oldval, mRegP newval) %{ ++ match(Set res (ShenandoahCompareAndExchangeP mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchg_acq_shenandoah $mem, $oldval, $newval\t# (ptr) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode %{ ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ true, /*is_cae*/ true, $res$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct weakCompareAndSwapN_shenandoah(mRegI res, indirect mem, mRegN oldval, mRegN newval) %{ ++ match(Set res (ShenandoahWeakCompareAndSwapN mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchgw_shenandoah $res = $mem, $oldval, $newval\t# (narrow oop, weak) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ins_encode %{ ++ // Weak is not currently supported by ShenandoahBarrierSet::cmpxchg_oop ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ false, /*is_cae*/ false, $res$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct weakCompareAndSwapP_shenandoah(mRegI res, indirect mem, mRegP oldval, mRegP newval) %{ ++ match(Set res (ShenandoahWeakCompareAndSwapP mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchg_shenandoah $res = $mem, 
$oldval, $newval\t# (ptr, weak) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode %{ ++ // Weak is not currently supported by ShenandoahBarrierSet::cmpxchg_oop ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ false, /*is_cae*/ false, $res$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct weakCompareAndSwapNAcq_shenandoah(mRegI res, indirect mem, mRegN oldval, mRegN newval) %{ ++ match(Set res (ShenandoahWeakCompareAndSwapN mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchgw_acq_shenandoah $res = $mem, $oldval, $newval\t# (narrow oop, weak) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode %{ ++ // Weak is not currently supported by ShenandoahBarrierSet::cmpxchg_oop ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ true, /*is_cae*/ false, $res$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct weakCompareAndSwapPAcq_shenandoah(mRegI res, indirect mem, mRegP oldval, mRegP newval) %{ ++ match(Set res (ShenandoahWeakCompareAndSwapP mem (Binary oldval newval))); ++ ++ format %{ ++ "cmpxchg_acq_shenandoah $res = $mem, $oldval, $newval\t# (ptr, weak) if $mem == $oldval then $mem <-- $newval" ++ %} ++ ++ ins_encode %{ ++ // Weak is not currently supported by ShenandoahBarrierSet::cmpxchg_oop ++ Address addr(as_Register($mem$$base), 0); ++ ShenandoahBarrierSet::assembler()->cmpxchg_oop(&_masm, addr, $oldval$$Register, $newval$$Register, ++ /*acquire*/ true, /*is_cae*/ false, $res$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/x/xBarrierSetAssembler_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/x/xBarrierSetAssembler_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/x/xBarrierSetAssembler_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/x/xBarrierSetAssembler_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,471 @@ ++/* ++ * Copyright (c) 2019, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
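
Editor's note on the Shenandoah patterns above: every instruct funnels into ShenandoahBarrierSet::assembler()->cmpxchg_oop(), differing only in the acquire and is_cae flags (boolean result versus returning the witness value), and the weak forms deliberately reuse the strong encoding, as the in-line comments say. As rough orientation only, the C++ sketch below shows the kind of false-negative retry a Shenandoah CAS barrier performs when the field still holds a stale from-space copy of the expected oop; resolve_forwardee() and its identity body are placeholders so the sketch compiles, not HotSpot code.

#include <atomic>

// Placeholder for the GC's forwarding-pointer lookup; the real barrier
// consults the object's header. Identity here only so the sketch compiles.
static void* resolve_forwardee(void* obj) { return obj; }

// Boolean CAS (the is_cae == false case) with one forwarding-aware retry.
static bool cas_oop_bool(std::atomic<void*>& field, void* expected, void* new_val) {
  void* witness = expected;
  if (field.compare_exchange_strong(witness, new_val)) {
    return true;                                   // field held 'expected' directly
  }
  // The failure may be spurious: the field can hold a from-space copy whose
  // forwardee is 'expected'. If so, swap out that stale pointer instead.
  if (witness != nullptr && resolve_forwardee(witness) == expected) {
    return field.compare_exchange_strong(witness, new_val);
  }
  return false;                                    // genuine mismatch
}
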
++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "code/codeBlob.hpp" ++#include "code/vmreg.inline.hpp" ++#include "gc/x/xBarrier.inline.hpp" ++#include "gc/x/xBarrierSet.hpp" ++#include "gc/x/xBarrierSetAssembler.hpp" ++#include "gc/x/xBarrierSetRuntime.hpp" ++#include "gc/x/xThreadLocalData.hpp" ++#include "memory/resourceArea.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "utilities/macros.hpp" ++#ifdef COMPILER1 ++#include "c1/c1_LIRAssembler.hpp" ++#include "c1/c1_MacroAssembler.hpp" ++#include "gc/x/c1/xBarrierSetC1.hpp" ++#endif // COMPILER1 ++#ifdef COMPILER2 ++#include "gc/x/c2/xBarrierSetC2.hpp" ++#endif // COMPILER2 ++ ++#ifdef PRODUCT ++#define BLOCK_COMMENT(str) /* nothing */ ++#else ++#define BLOCK_COMMENT(str) __ block_comment(str) ++#endif ++ ++#undef __ ++#define __ masm-> ++ ++void XBarrierSetAssembler::load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ Register dst, ++ Address src, ++ Register tmp1, ++ Register tmp2) { ++ if (!XBarrierSet::barrier_needed(decorators, type)) { ++ // Barrier not needed ++ BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp2); ++ return; ++ } ++ ++ // Allocate scratch register ++ Register scratch = tmp1; ++ ++ assert_different_registers(dst, scratch, SCR1); ++ ++ Label done; ++ ++ // ++ // Fast Path ++ // ++ ++ // Load address ++ __ lea(scratch, src); ++ ++ // Load oop at address ++ __ ld_d(dst, scratch, 0); ++ ++ // Test address bad mask ++ __ ld_d(SCR1, address_bad_mask_from_thread(TREG)); ++ __ andr(SCR1, dst, SCR1); ++ __ beqz(SCR1, done); ++ ++ // ++ // Slow path ++ // ++ __ enter(); ++ ++ if (dst != V0) { ++ __ push(V0); ++ } ++ __ push_call_clobbered_registers_except(RegSet::of(V0)); ++ ++ if (dst != A0) { ++ __ move(A0, dst); ++ } ++ __ move(A1, scratch); ++ __ MacroAssembler::call_VM_leaf_base(XBarrierSetRuntime::load_barrier_on_oop_field_preloaded_addr(decorators), 2); ++ ++ __ pop_call_clobbered_registers_except(RegSet::of(V0)); ++ ++ // Make sure dst has the return value. ++ if (dst != V0) { ++ __ move(dst, V0); ++ __ pop(V0); ++ } ++ __ leave(); ++ ++ __ bind(done); ++} ++ ++#ifdef ASSERT ++ ++void XBarrierSetAssembler::store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ Address dst, ++ Register val, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3) { ++ // Verify value ++ if (is_reference_type(type)) { ++ // Note that src could be noreg, which means we ++ // are storing null and can skip verification. ++ if (val != noreg) { ++ Label done; ++ ++ // tmp1, tmp2 and tmp3 are often set to noreg. ++ ++ __ ld_d(AT, address_bad_mask_from_thread(TREG)); ++ __ andr(AT, val, AT); ++ __ beqz(AT, done); ++ __ stop("Verify oop store failed"); ++ __ should_not_reach_here(); ++ __ bind(done); ++ } ++ } ++ ++ // Store value ++ BarrierSetAssembler::store_at(masm, decorators, type, dst, val, tmp1, tmp2, noreg); ++} ++ ++#endif // ASSERT ++ ++void XBarrierSetAssembler::arraycopy_prologue(MacroAssembler* masm, ++ DecoratorSet decorators, ++ bool is_oop, ++ Register src, ++ Register dst, ++ Register count, ++ RegSet saved_regs) { ++ if (!is_oop) { ++ // Barrier not needed ++ return; ++ } ++ ++ BLOCK_COMMENT("XBarrierSetAssembler::arraycopy_prologue {"); ++ ++ __ push(saved_regs); ++ ++ if (count == A0) { ++ if (src == A1) { ++ // exactly backwards!! 
++ __ move(AT, A0); ++ __ move(A0, A1); ++ __ move(A1, AT); ++ } else { ++ __ move(A1, count); ++ __ move(A0, src); ++ } ++ } else { ++ __ move(A0, src); ++ __ move(A1, count); ++ } ++ ++ __ call_VM_leaf(XBarrierSetRuntime::load_barrier_on_oop_array_addr(), 2); ++ ++ __ pop(saved_regs); ++ ++ BLOCK_COMMENT("} XBarrierSetAssembler::arraycopy_prologue"); ++} ++ ++void XBarrierSetAssembler::try_resolve_jobject_in_native(MacroAssembler* masm, ++ Register jni_env, ++ Register robj, ++ Register tmp, ++ Label& slowpath) { ++ BLOCK_COMMENT("XBarrierSetAssembler::try_resolve_jobject_in_native {"); ++ ++ assert_different_registers(jni_env, robj, tmp); ++ ++ // Resolve jobject ++ BarrierSetAssembler::try_resolve_jobject_in_native(masm, jni_env, robj, tmp, slowpath); ++ ++ // The Address offset is too large to direct load - -784. Our range is +127, -128. ++ __ li(tmp, (int64_t)(in_bytes(XThreadLocalData::address_bad_mask_offset()) - ++ in_bytes(JavaThread::jni_environment_offset()))); ++ ++ // Load address bad mask ++ __ ldx_d(tmp, jni_env, tmp); ++ ++ // Check address bad mask ++ __ andr(AT, robj, tmp); ++ __ bnez(AT, slowpath); ++ ++ BLOCK_COMMENT("} XBarrierSetAssembler::try_resolve_jobject_in_native"); ++} ++ ++#ifdef COMPILER1 ++ ++#undef __ ++#define __ ce->masm()-> ++ ++void XBarrierSetAssembler::generate_c1_load_barrier_test(LIR_Assembler* ce, ++ LIR_Opr ref) const { ++ assert_different_registers(SCR1, TREG, ref->as_register()); ++ __ ld_d(SCR1, address_bad_mask_from_thread(TREG)); ++ __ andr(SCR1, SCR1, ref->as_register()); ++} ++ ++void XBarrierSetAssembler::generate_c1_load_barrier_stub(LIR_Assembler* ce, ++ XLoadBarrierStubC1* stub) const { ++ // Stub entry ++ __ bind(*stub->entry()); ++ ++ Register ref = stub->ref()->as_register(); ++ Register ref_addr = noreg; ++ Register tmp = noreg; ++ ++ if (stub->tmp()->is_valid()) { ++ // Load address into tmp register ++ ce->leal(stub->ref_addr(), stub->tmp()); ++ ref_addr = tmp = stub->tmp()->as_pointer_register(); ++ } else { ++ // Address already in register ++ ref_addr = stub->ref_addr()->as_address_ptr()->base()->as_pointer_register(); ++ } ++ ++ assert_different_registers(ref, ref_addr, noreg); ++ ++ // Save V0 unless it is the result or tmp register ++ // Set up SP to accommodate parameters and maybe V0. 
++ if (ref != V0 && tmp != V0) { ++ __ addi_d(SP, SP, -32); ++ __ st_d(V0, SP, 16); ++ } else { ++ __ addi_d(SP, SP, -16); ++ } ++ ++ // Setup arguments and call runtime stub ++ ce->store_parameter(ref_addr, 1); ++ ce->store_parameter(ref, 0); ++ ++ __ call(stub->runtime_stub(), relocInfo::runtime_call_type); ++ ++ // Verify result ++ __ verify_oop(V0); ++ ++ // Move result into place ++ if (ref != V0) { ++ __ move(ref, V0); ++ } ++ ++ // Restore V0 unless it is the result or tmp register ++ if (ref != V0 && tmp != V0) { ++ __ ld_d(V0, SP, 16); ++ __ addi_d(SP, SP, 32); ++ } else { ++ __ addi_d(SP, SP, 16); ++ } ++ ++ // Stub exit ++ __ b(*stub->continuation()); ++} ++ ++#undef __ ++#define __ sasm-> ++ ++void XBarrierSetAssembler::generate_c1_load_barrier_runtime_stub(StubAssembler* sasm, ++ DecoratorSet decorators) const { ++ __ prologue("zgc_load_barrier stub", false); ++ ++ __ push_call_clobbered_registers_except(RegSet::of(V0)); ++ ++ // Setup arguments ++ __ load_parameter(0, A0); ++ __ load_parameter(1, A1); ++ ++ __ call_VM_leaf(XBarrierSetRuntime::load_barrier_on_oop_field_preloaded_addr(decorators), 2); ++ ++ __ pop_call_clobbered_registers_except(RegSet::of(V0)); ++ ++ __ epilogue(); ++} ++#endif // COMPILER1 ++ ++#ifdef COMPILER2 ++ ++OptoReg::Name XBarrierSetAssembler::refine_register(const Node* node, OptoReg::Name opto_reg) { ++ if (!OptoReg::is_reg(opto_reg)) { ++ return OptoReg::Bad; ++ } ++ ++ const VMReg vm_reg = OptoReg::as_VMReg(opto_reg); ++ if (vm_reg->is_FloatRegister()) { ++ return opto_reg & ~1; ++ } ++ ++ return opto_reg; ++} ++ ++#undef __ ++#define __ _masm-> ++ ++class XSaveLiveRegisters { ++private: ++ MacroAssembler* const _masm; ++ RegSet _gp_regs; ++ FloatRegSet _fp_regs; ++ FloatRegSet _lsx_vp_regs; ++ FloatRegSet _lasx_vp_regs; ++ ++public: ++ void initialize(XLoadBarrierStubC2* stub) { ++ // Record registers that needs to be saved/restored ++ RegMaskIterator rmi(stub->live()); ++ while (rmi.has_next()) { ++ const OptoReg::Name opto_reg = rmi.next(); ++ if (OptoReg::is_reg(opto_reg)) { ++ const VMReg vm_reg = OptoReg::as_VMReg(opto_reg); ++ if (vm_reg->is_Register()) { ++ _gp_regs += RegSet::of(vm_reg->as_Register()); ++ } else if (vm_reg->is_FloatRegister()) { ++ if (UseLASX && vm_reg->next(7)) ++ _lasx_vp_regs += FloatRegSet::of(vm_reg->as_FloatRegister()); ++ else if (UseLSX && vm_reg->next(3)) ++ _lsx_vp_regs += FloatRegSet::of(vm_reg->as_FloatRegister()); ++ else ++ _fp_regs += FloatRegSet::of(vm_reg->as_FloatRegister()); ++ } else { ++ fatal("Unknown register type"); ++ } ++ } ++ } ++ ++ // Remove C-ABI SOE registers, scratch regs and _ref register that will be updated ++ _gp_regs -= RegSet::range(S0, S7) + RegSet::of(SP, SCR1, SCR2, stub->ref()); ++ } ++ ++ XSaveLiveRegisters(MacroAssembler* masm, XLoadBarrierStubC2* stub) : ++ _masm(masm), ++ _gp_regs(), ++ _fp_regs(), ++ _lsx_vp_regs(), ++ _lasx_vp_regs() { ++ ++ // Figure out what registers to save/restore ++ initialize(stub); ++ ++ // Save registers ++ __ push(_gp_regs); ++ __ push_fpu(_fp_regs); ++ __ push_vp(_lsx_vp_regs /* UseLSX */); ++ __ push_vp(_lasx_vp_regs /* UseLASX */); ++ } ++ ++ ~XSaveLiveRegisters() { ++ // Restore registers ++ __ pop_vp(_lasx_vp_regs /* UseLASX */); ++ __ pop_vp(_lsx_vp_regs /* UseLSX */); ++ __ pop_fpu(_fp_regs); ++ __ pop(_gp_regs); ++ } ++}; ++ ++#undef __ ++#define __ _masm-> ++ ++class XSetupArguments { ++private: ++ MacroAssembler* const _masm; ++ const Register _ref; ++ const Address _ref_addr; ++ ++public: ++ XSetupArguments(MacroAssembler* masm, 
XLoadBarrierStubC2* stub) : ++ _masm(masm), ++ _ref(stub->ref()), ++ _ref_addr(stub->ref_addr()) { ++ ++ // Setup arguments ++ if (_ref_addr.base() == noreg) { ++ // No self healing ++ if (_ref != A0) { ++ __ move(A0, _ref); ++ } ++ __ move(A1, R0); ++ } else { ++ // Self healing ++ if (_ref == A0) { ++ // _ref is already at correct place ++ __ lea(A1, _ref_addr); ++ } else if (_ref != A1) { ++ // _ref is in wrong place, but not in A1, so fix it first ++ __ lea(A1, _ref_addr); ++ __ move(A0, _ref); ++ } else if (_ref_addr.base() != A0 && _ref_addr.index() != A0) { ++ assert(_ref == A1, "Mov ref first, vacating A0"); ++ __ move(A0, _ref); ++ __ lea(A1, _ref_addr); ++ } else { ++ assert(_ref == A1, "Need to vacate A1 and _ref_addr is using A0"); ++ if (_ref_addr.base() == A0 || _ref_addr.index() == A0) { ++ __ move(T4, A1); ++ __ lea(A1, _ref_addr); ++ __ move(A0, T4); ++ } else { ++ ShouldNotReachHere(); ++ } ++ } ++ } ++ } ++ ++ ~XSetupArguments() { ++ // Transfer result ++ if (_ref != V0) { ++ __ move(_ref, V0); ++ } ++ } ++}; ++ ++#undef __ ++#define __ masm-> ++ ++void XBarrierSetAssembler::generate_c2_load_barrier_stub(MacroAssembler* masm, XLoadBarrierStubC2* stub) const { ++ BLOCK_COMMENT("XLoadBarrierStubC2"); ++ ++ // Stub entry ++ __ bind(*stub->entry()); ++ ++ { ++ XSaveLiveRegisters save_live_registers(masm, stub); ++ XSetupArguments setup_arguments(masm, stub); ++ __ call_VM_leaf(stub->slow_path(), 2); ++ } ++ // Stub exit ++ __ b(*stub->continuation()); ++} ++#endif // COMPILER2 ++ ++#undef __ ++#define __ masm-> ++ ++void XBarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error) { ++ // Check if mask is good. ++ // verifies that XAddressBadMask & obj == 0 ++ __ ld_d(tmp2, Address(TREG, XThreadLocalData::address_bad_mask_offset())); ++ __ andr(tmp1, obj, tmp2); ++ __ bnez(tmp1, error); ++ ++ BarrierSetAssembler::check_oop(masm, obj, tmp1, tmp2, error); ++} ++ ++#undef __ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/x/xBarrierSetAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/x/xBarrierSetAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/x/xBarrierSetAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/x/xBarrierSetAssembler_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,105 @@ ++/* ++ * Copyright (c) 2019, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
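
Editor's note on the stub generator just above: XSaveLiveRegisters and XSetupArguments are scope guards, so their constructors emit the spill and argument-marshalling instructions and their destructors emit the result transfer and restore instructions; generate_c2_load_barrier_stub() only has to open a scope around the runtime call. A minimal sketch of that idiom, with made-up names and puts() calls standing in for the real MacroAssembler emission:

#include <cstdio>

// Scope guards whose constructors/destructors "emit" code around a call.
struct SaveLiveRegisters {
  SaveLiveRegisters()  { std::puts("emit: push live registers"); }
  ~SaveLiveRegisters() { std::puts("emit: pop live registers"); }
};

struct SetupArguments {
  SetupArguments()  { std::puts("emit: A0 = ref, A1 = &field"); }
  ~SetupArguments() { std::puts("emit: ref = V0 (healed result)"); }
};

static void emit_load_barrier_stub() {
  std::puts("emit: bind(stub entry)");
  {
    SaveLiveRegisters save;     // spill first ...
    SetupArguments    args;     // ... then marshal the arguments
    std::puts("emit: call slow path");
  }                             // destructors run in reverse order: result transfer, then restore
  std::puts("emit: branch to continuation");
}

The reverse destruction order matters: the result is moved out of V0 before the saved registers are popped, which is exactly the order the real guards above produce.
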
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++#ifndef CPU_LOONGARCH_GC_X_XBARRIERSETASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_GC_X_XBARRIERSETASSEMBLER_LOONGARCH_HPP ++ ++#include "code/vmreg.hpp" ++#include "oops/accessDecorators.hpp" ++#ifdef COMPILER2 ++#include "opto/optoreg.hpp" ++#endif // COMPILER2 ++ ++#ifdef COMPILER1 ++class LIR_Assembler; ++class LIR_Opr; ++class StubAssembler; ++class XLoadBarrierStubC1; ++#endif // COMPILER1 ++ ++#ifdef COMPILER2 ++class Node; ++class XLoadBarrierStubC2; ++#endif // COMPILER2 ++ ++class XBarrierSetAssembler : public XBarrierSetAssemblerBase { ++public: ++ virtual void load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ Register dst, ++ Address src, ++ Register tmp1, ++ Register tmp2); ++ ++#ifdef ASSERT ++ virtual void store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ Address dst, ++ Register val, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3); ++#endif // ASSERT ++ ++ virtual void arraycopy_prologue(MacroAssembler* masm, ++ DecoratorSet decorators, ++ bool is_oop, ++ Register src, ++ Register dst, ++ Register count, ++ RegSet saved_regs); ++ ++ virtual void try_resolve_jobject_in_native(MacroAssembler* masm, ++ Register jni_env, ++ Register robj, ++ Register tmp, ++ Label& slowpath); ++ ++ virtual NMethodPatchingType nmethod_patching_type() { return NMethodPatchingType::conc_data_patch; } ++ ++#ifdef COMPILER1 ++ void generate_c1_load_barrier_test(LIR_Assembler* ce, ++ LIR_Opr ref) const; ++ ++ void generate_c1_load_barrier_stub(LIR_Assembler* ce, ++ XLoadBarrierStubC1* stub) const; ++ ++ void generate_c1_load_barrier_runtime_stub(StubAssembler* sasm, ++ DecoratorSet decorators) const; ++#endif // COMPILER1 ++ ++#ifdef COMPILER2 ++ OptoReg::Name refine_register(const Node* node, ++ OptoReg::Name opto_reg); ++ ++ void generate_c2_load_barrier_stub(MacroAssembler* masm, ++ XLoadBarrierStubC2* stub) const; ++#endif // COMPILER2 ++ ++ void check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error); ++}; ++ ++#endif // CPU_LOONGARCH_GC_X_XBARRIERSETASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/x/xGlobals_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/x/xGlobals_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/x/xGlobals_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/x/xGlobals_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,211 @@ ++/* ++ * Copyright (c) 2017, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++#include "precompiled.hpp" ++#include "gc/shared/gcLogPrecious.hpp" ++#include "gc/shared/gc_globals.hpp" ++#include "gc/x/xGlobals.hpp" ++#include "runtime/globals.hpp" ++#include "runtime/os.hpp" ++#include "utilities/globalDefinitions.hpp" ++#include "utilities/powerOfTwo.hpp" ++ ++#ifdef LINUX ++#include ++#endif // LINUX ++ ++// ++// The heap can have three different layouts, depending on the max heap size. ++// ++// Address Space & Pointer Layout 1 ++// -------------------------------- ++// ++// +--------------------------------+ 0x00007FFFFFFFFFFF (127TB) ++// . . ++// . . ++// . . ++// +--------------------------------+ 0x0000014000000000 (20TB) ++// | Remapped View | ++// +--------------------------------+ 0x0000010000000000 (16TB) ++// . . ++// +--------------------------------+ 0x00000c0000000000 (12TB) ++// | Marked1 View | ++// +--------------------------------+ 0x0000080000000000 (8TB) ++// | Marked0 View | ++// +--------------------------------+ 0x0000040000000000 (4TB) ++// . . ++// +--------------------------------+ 0x0000000000000000 ++// ++// 6 4 4 4 4 ++// 3 6 5 2 1 0 ++// +--------------------+----+-----------------------------------------------+ ++// |00000000 00000000 00|1111|11 11111111 11111111 11111111 11111111 11111111| ++// +--------------------+----+-----------------------------------------------+ ++// | | | ++// | | * 41-0 Object Offset (42-bits, 4TB address space) ++// | | ++// | * 45-42 Metadata Bits (4-bits) 0001 = Marked0 (Address view 4-8TB) ++// | 0010 = Marked1 (Address view 8-12TB) ++// | 0100 = Remapped (Address view 16-20TB) ++// | 1000 = Finalizable (Address view N/A) ++// | ++// * 63-46 Fixed (18-bits, always zero) ++// ++// ++// Address Space & Pointer Layout 2 ++// -------------------------------- ++// ++// +--------------------------------+ 0x00007FFFFFFFFFFF (127TB) ++// . . ++// . . ++// . . ++// +--------------------------------+ 0x0000280000000000 (40TB) ++// | Remapped View | ++// +--------------------------------+ 0x0000200000000000 (32TB) ++// . . ++// +--------------------------------+ 0x0000180000000000 (24TB) ++// | Marked1 View | ++// +--------------------------------+ 0x0000100000000000 (16TB) ++// | Marked0 View | ++// +--------------------------------+ 0x0000080000000000 (8TB) ++// . . ++// +--------------------------------+ 0x0000000000000000 ++// ++// 6 4 4 4 4 ++// 3 7 6 3 2 0 ++// +------------------+-----+------------------------------------------------+ ++// |00000000 00000000 0|1111|111 11111111 11111111 11111111 11111111 11111111| ++// +-------------------+----+------------------------------------------------+ ++// | | | ++// | | * 42-0 Object Offset (43-bits, 8TB address space) ++// | | ++// | * 46-43 Metadata Bits (4-bits) 0001 = Marked0 (Address view 8-16TB) ++// | 0010 = Marked1 (Address view 16-24TB) ++// | 0100 = Remapped (Address view 32-40TB) ++// | 1000 = Finalizable (Address view N/A) ++// | ++// * 63-47 Fixed (17-bits, always zero) ++// ++// ++// Address Space & Pointer Layout 3 ++// -------------------------------- ++// ++// +--------------------------------+ 0x00007FFFFFFFFFFF (127TB) ++// . . ++// . . 
++// . . ++// +--------------------------------+ 0x0000500000000000 (80TB) ++// | Remapped View | ++// +--------------------------------+ 0x0000400000000000 (64TB) ++// . . ++// +--------------------------------+ 0x0000300000000000 (48TB) ++// | Marked1 View | ++// +--------------------------------+ 0x0000200000000000 (32TB) ++// | Marked0 View | ++// +--------------------------------+ 0x0000100000000000 (16TB) ++// . . ++// +--------------------------------+ 0x0000000000000000 ++// ++// 6 4 4 4 4 ++// 3 8 7 4 3 0 ++// +------------------+----+-------------------------------------------------+ ++// |00000000 00000000 |1111|1111 11111111 11111111 11111111 11111111 11111111| ++// +------------------+----+-------------------------------------------------+ ++// | | | ++// | | * 43-0 Object Offset (44-bits, 16TB address space) ++// | | ++// | * 47-44 Metadata Bits (4-bits) 0001 = Marked0 (Address view 16-32TB) ++// | 0010 = Marked1 (Address view 32-48TB) ++// | 0100 = Remapped (Address view 64-80TB) ++// | 1000 = Finalizable (Address view N/A) ++// | ++// * 63-48 Fixed (16-bits, always zero) ++// ++ ++// Default value if probing is not implemented for a certain platform: 128TB ++static const size_t DEFAULT_MAX_ADDRESS_BIT = 47; ++// Minimum value returned, if probing fails: 64GB ++static const size_t MINIMUM_MAX_ADDRESS_BIT = 36; ++ ++static size_t probe_valid_max_address_bit() { ++#ifdef LINUX ++ size_t max_address_bit = 0; ++ const size_t page_size = os::vm_page_size(); ++ for (size_t i = DEFAULT_MAX_ADDRESS_BIT; i > MINIMUM_MAX_ADDRESS_BIT; --i) { ++ const uintptr_t base_addr = ((uintptr_t) 1U) << i; ++ if (msync((void*)base_addr, page_size, MS_ASYNC) == 0) { ++ // msync suceeded, the address is valid, and maybe even already mapped. ++ max_address_bit = i; ++ break; ++ } ++ if (errno != ENOMEM) { ++ // Some error occured. This should never happen, but msync ++ // has some undefined behavior, hence ignore this bit. ++#ifdef ASSERT ++ fatal("Received '%s' while probing the address space for the highest valid bit", os::errno_name(errno)); ++#else // ASSERT ++ log_warning_p(gc)("Received '%s' while probing the address space for the highest valid bit", os::errno_name(errno)); ++#endif // ASSERT ++ continue; ++ } ++ // Since msync failed with ENOMEM, the page might not be mapped. ++ // Try to map it, to see if the address is valid. 
++ void* const result_addr = mmap((void*) base_addr, page_size, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0); ++ if (result_addr != MAP_FAILED) { ++ munmap(result_addr, page_size); ++ } ++ if ((uintptr_t) result_addr == base_addr) { ++ // address is valid ++ max_address_bit = i; ++ break; ++ } ++ } ++ if (max_address_bit == 0) { ++ // probing failed, allocate a very high page and take that bit as the maximum ++ const uintptr_t high_addr = ((uintptr_t) 1U) << DEFAULT_MAX_ADDRESS_BIT; ++ void* const result_addr = mmap((void*) high_addr, page_size, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0); ++ if (result_addr != MAP_FAILED) { ++ max_address_bit = BitsPerSize_t - count_leading_zeros((size_t) result_addr) - 1; ++ munmap(result_addr, page_size); ++ } ++ } ++ log_info_p(gc, init)("Probing address space for the highest valid bit: " SIZE_FORMAT, max_address_bit); ++ return MAX2(max_address_bit, MINIMUM_MAX_ADDRESS_BIT); ++#else // LINUX ++ return DEFAULT_MAX_ADDRESS_BIT; ++#endif // LINUX ++} ++ ++size_t XPlatformAddressOffsetBits() { ++ const static size_t valid_max_address_offset_bits = probe_valid_max_address_bit() + 1; ++ const size_t max_address_offset_bits = valid_max_address_offset_bits - 3; ++ const size_t min_address_offset_bits = max_address_offset_bits - 2; ++ const size_t address_offset = round_up_power_of_2(MaxHeapSize * XVirtualToPhysicalRatio); ++ const size_t address_offset_bits = log2i_exact(address_offset); ++ return clamp(address_offset_bits, min_address_offset_bits, max_address_offset_bits); ++} ++ ++size_t XPlatformAddressMetadataShift() { ++ return XPlatformAddressOffsetBits(); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/x/xGlobals_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/x/xGlobals_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/x/xGlobals_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/x/xGlobals_loongarch.hpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,34 @@ ++/* ++ * Copyright (c) 2015, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
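
Editor's note on XPlatformAddressOffsetBits() above: it turns the probed bit into a small window and clamps the heap-derived request into it. The sketch below redoes that arithmetic in isolation; the virtual-to-physical ratio is passed in rather than restated, since its value is defined elsewhere in HotSpot.

#include <algorithm>
#include <cstddef>

// ceil(log2(v)) for v > 0, i.e. log2 of round_up_power_of_2(v).
static size_t ceil_log2(size_t v) {
  size_t bits = 0;
  while ((static_cast<size_t>(1) << bits) < v) {
    ++bits;
  }
  return bits;
}

// Mirrors the clamp in XPlatformAddressOffsetBits(): probe_bits is the result
// of probe_valid_max_address_bit(), max_heap_bytes plays the role of
// MaxHeapSize, ratio the virtual-to-physical ratio (not restated here).
static size_t address_offset_bits(size_t probe_bits, size_t max_heap_bytes, size_t ratio) {
  const size_t valid_max_bits = probe_bits + 1;
  const size_t max_bits       = valid_max_bits - 3;   // same -3 as above
  const size_t min_bits       = max_bits - 2;         // same -2 as above
  const size_t requested_bits = ceil_log2(max_heap_bytes * ratio);
  return std::clamp(requested_bits, min_bits, max_bits);
}

For example, with probe_bits = 46, as a 47-bit user address space would yield, the window is 42 to 44 offset bits, which lines up with the three layouts drawn at the top of the file.
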
++ */ ++ ++#ifndef CPU_LOONGARCH_GC_X_XGLOBALS_LOONGARCH_HPP ++#define CPU_LOONGARCH_GC_X_XGLOBALS_LOONGARCH_HPP ++ ++const size_t XPlatformHeapViews = 3; ++const size_t XPlatformCacheLineSize = 64; ++ ++size_t XPlatformAddressOffsetBits(); ++size_t XPlatformAddressMetadataShift(); ++ ++#endif // CPU_LOONGARCH_GC_X_XGLOBALS_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/x/x_loongarch_64.ad b/src/hotspot/cpu/loongarch/gc/x/x_loongarch_64.ad +--- a/src/hotspot/cpu/loongarch/gc/x/x_loongarch_64.ad 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/x/x_loongarch_64.ad 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,256 @@ ++// ++// Copyright (c) 2019, 2021, Oracle and/or its affiliates. All rights reserved. ++// Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++// DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++// ++// This code is free software; you can redistribute it and/or modify it ++// under the terms of the GNU General Public License version 2 only, as ++// published by the Free Software Foundation. ++// ++// This code is distributed in the hope that it will be useful, but WITHOUT ++// ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++// version 2 for more details (a copy is included in the LICENSE file that ++// accompanied this code). ++// ++// You should have received a copy of the GNU General Public License version ++// 2 along with this work; if not, write to the Free Software Foundation, ++// Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++// ++// Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++// or visit www.oracle.com if you need additional information or have any ++// questions. ++// ++ ++source_hpp %{ ++ ++#include "gc/shared/gc_globals.hpp" ++#include "gc/x/c2/xBarrierSetC2.hpp" ++#include "gc/x/xThreadLocalData.hpp" ++ ++%} ++ ++source %{ ++ ++static void x_load_barrier(MacroAssembler& _masm, const MachNode* node, Address ref_addr, Register ref, Register tmp, uint8_t barrier_data) { ++ if (barrier_data == XLoadBarrierElided) { ++ return; ++ } ++ XLoadBarrierStubC2* const stub = XLoadBarrierStubC2::create(node, ref_addr, ref, tmp, barrier_data); ++ __ ld_d(tmp, Address(TREG, XThreadLocalData::address_bad_mask_offset())); ++ __ andr(tmp, tmp, ref); ++ __ bnez(tmp, *stub->entry()); ++ __ bind(*stub->continuation()); ++} ++ ++static void x_load_barrier_slow_path(MacroAssembler& _masm, const MachNode* node, Address ref_addr, Register ref, Register tmp) { ++ XLoadBarrierStubC2* const stub = XLoadBarrierStubC2::create(node, ref_addr, ref, tmp, XLoadBarrierStrong); ++ __ b(*stub->entry()); ++ __ bind(*stub->continuation()); ++} ++ ++static void x_compare_and_swap(MacroAssembler& _masm, const MachNode* node, ++ Register res, Register mem, Register oldval, Register newval, ++ Register tmp, bool weak, bool acquire) { ++ // z-specific load barrier requires strong CAS operations. ++ // Weak CAS operations are thus only emitted if the barrier is elided. 
++ Address addr(mem); ++ if (node->barrier_data() == XLoadBarrierElided) { ++ __ cmpxchg(addr, oldval, newval, tmp, false /* retold */, acquire /* acquire */, ++ weak /* weak */, false /* exchange */); ++ __ move(res, tmp); ++ } else { ++ __ move(tmp, oldval); ++ __ cmpxchg(addr, tmp, newval, AT, true /* retold */, acquire /* acquire */, ++ false /* weak */, false /* exchange */); ++ __ move(res, AT); ++ ++ Label good; ++ __ ld_d(AT, Address(TREG, XThreadLocalData::address_bad_mask_offset())); ++ __ andr(AT, AT, tmp); ++ __ beqz(AT, good); ++ x_load_barrier_slow_path(_masm, node, addr, tmp, res /* used as tmp */); ++ __ cmpxchg(addr, oldval, newval, tmp, false /* retold */, acquire /* acquire */, weak /* weak */, false /* exchange */); ++ __ move(res, tmp); ++ __ bind(good); ++ } ++} ++ ++static void x_compare_and_exchange(MacroAssembler& _masm, const MachNode* node, ++ Register res, Register mem, Register oldval, Register newval, Register tmp, ++ bool weak, bool acquire) { ++ // z-specific load barrier requires strong CAS operations. ++ // Weak CAS operations are thus only emitted if the barrier is elided. ++ Address addr(mem); ++ __ cmpxchg(addr, oldval, newval, res, false /* retold */, acquire /* barrier */, ++ weak && node->barrier_data() == XLoadBarrierElided /* weak */, true /* exchange */); ++ if (node->barrier_data() != XLoadBarrierElided) { ++ Label good; ++ __ ld_d(tmp, Address(TREG, XThreadLocalData::address_bad_mask_offset())); ++ __ andr(tmp, tmp, res); ++ __ beqz(tmp, good); ++ x_load_barrier_slow_path(_masm, node, addr, res /* ref */, tmp); ++ __ cmpxchg(addr, oldval, newval, res, false /* retold */, acquire /* barrier */, weak /* weak */, true /* exchange */); ++ __ bind(good); ++ } ++} ++ ++%} ++ ++// Load Pointer ++instruct xLoadP(mRegP dst, memory mem, mRegP tmp) ++%{ ++ match(Set dst (LoadP mem)); ++ effect(TEMP_DEF dst, TEMP tmp); ++ ins_cost(125);//must be equal loadP in loongarch_64.ad ++ ++ predicate(UseZGC && !ZGenerational && n->as_Load()->barrier_data() != 0); ++ ++ format %{ "xLoadP $dst, $mem" %} ++ ++ ins_encode %{ ++ Address ref_addr = Address(as_Register($mem$$base), as_Register($mem$$index), Address::no_scale, $mem$$disp); ++ __ block_comment("xLoadP"); ++ __ ld_d($dst$$Register, ref_addr); ++ x_load_barrier(_masm, this, ref_addr, $dst$$Register, $tmp$$Register, barrier_data()); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct xCompareAndSwapP(mRegI res, mRegP mem, mRegP oldval, mRegP newval, mRegP tmp) %{ ++ match(Set res (CompareAndSwapP mem (Binary oldval newval))); ++ effect(TEMP_DEF res, TEMP tmp); ++ ++ predicate((UseZGC && !ZGenerational && n->as_LoadStore()->barrier_data() == XLoadBarrierStrong) ++ && (((CompareAndSwapNode*)n)->order() != MemNode::acquire && ((CompareAndSwapNode*) n)->order() != MemNode::seqcst)); ++ ins_cost(3 * MEMORY_REF_COST);//must be equal compareAndSwapP in loongarch_64.ad ++ ++ format %{ "CMPXCHG $res, $mem, $oldval, $newval; as bool; ptr" %} ++ ins_encode %{ ++ __ block_comment("xCompareAndSwapP"); ++ x_compare_and_swap(_masm, this, ++ $res$$Register, $mem$$Register, $oldval$$Register, $newval$$Register, ++ $tmp$$Register, false /* weak */, false /* acquire */); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct xCompareAndSwapP_acq(mRegI res, mRegP mem, mRegP oldval, mRegP newval, mRegP tmp) %{ ++ match(Set res (CompareAndSwapP mem (Binary oldval newval))); ++ effect(TEMP_DEF res, TEMP tmp); ++ ++ predicate((UseZGC && !ZGenerational && n->as_LoadStore()->barrier_data() == XLoadBarrierStrong)); ++ ins_cost(4 * 
MEMORY_REF_COST);//must be larger than xCompareAndSwapP ++ ++ format %{ "CMPXCHG acq $res, $mem, $oldval, $newval; as bool; ptr" %} ++ ins_encode %{ ++ __ block_comment("xCompareAndSwapP_acq"); ++ x_compare_and_swap(_masm, this, ++ $res$$Register, $mem$$Register, $oldval$$Register, $newval$$Register, ++ $tmp$$Register, false /* weak */, true /* acquire */); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct xCompareAndSwapPWeak(mRegI res, mRegP mem, mRegP oldval, mRegP newval, mRegP tmp) %{ ++ match(Set res (WeakCompareAndSwapP mem (Binary oldval newval))); ++ effect(TEMP_DEF res, TEMP tmp); ++ ++ predicate((UseZGC && !ZGenerational && n->as_LoadStore()->barrier_data() == XLoadBarrierStrong) ++ && ((CompareAndSwapNode*)n)->order() != MemNode::acquire && ((CompareAndSwapNode*) n)->order() != MemNode::seqcst); ++ ++ ins_cost(MEMORY_REF_COST);//must be equal weakCompareAndSwapP in loongarch_64.ad ++ ++ format %{ "weak CMPXCHG $res, $mem, $oldval, $newval; as bool; ptr" %} ++ ins_encode %{ ++ __ block_comment("xCompareAndSwapPWeak"); ++ x_compare_and_swap(_masm, this, ++ $res$$Register, $mem$$Register, $oldval$$Register, $newval$$Register, ++ $tmp$$Register, true /* weak */, false /* acquire */); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct xCompareAndSwapPWeak_acq(mRegI res, mRegP mem, mRegP oldval, mRegP newval, mRegP tmp) %{ ++ match(Set res (WeakCompareAndSwapP mem (Binary oldval newval))); ++ effect(TEMP_DEF res, TEMP tmp); ++ ++ predicate((UseZGC && !ZGenerational && n->as_LoadStore()->barrier_data() == XLoadBarrierStrong)); ++ ins_cost(2* MEMORY_REF_COST);//must be equal weakCompareAndSwapP_acq in loongarch_64.ad ++ ++ format %{ "weak CMPXCHG acq $res, $mem, $oldval, $newval; as bool; ptr" %} ++ ins_encode %{ ++ __ block_comment("xCompareAndSwapPWeak_acq"); ++ x_compare_and_swap(_masm, this, ++ $res$$Register, $mem$$Register, $oldval$$Register, $newval$$Register, ++ $tmp$$Register, true /* weak */, true /* acquire */); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct xCompareAndExchangeP(mRegP res, mRegP mem, mRegP oldval, mRegP newval, mRegP tmp) %{ ++ match(Set res (CompareAndExchangeP mem (Binary oldval newval))); ++ effect(TEMP_DEF res, TEMP tmp); ++ ins_cost(2* MEMORY_REF_COST);//must be equal compareAndExchangeP in loongarch_64.ad ++ ++ predicate((UseZGC && !ZGenerational && n->as_LoadStore()->barrier_data() == XLoadBarrierStrong) ++ && ( ++ ((CompareAndSwapNode*)n)->order() != MemNode::acquire ++ && ((CompareAndSwapNode*)n)->order() != MemNode::seqcst ++ )); ++ ++ format %{ "CMPXCHG $res, $mem, $oldval, $newval; as ptr; ptr" %} ++ ins_encode %{ ++ __ block_comment("xCompareAndExchangeP"); ++ x_compare_and_exchange(_masm, this, ++ $res$$Register, $mem$$Register, $oldval$$Register, $newval$$Register, $tmp$$Register, ++ false /* weak */, false /* acquire */); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct xCompareAndExchangeP_acq(mRegP res, mRegP mem, mRegP oldval, mRegP newval, mRegP tmp) %{ ++ match(Set res (CompareAndExchangeP mem (Binary oldval newval))); ++ effect(TEMP_DEF res, TEMP tmp); ++ ++ predicate((UseZGC && !ZGenerational && n->as_LoadStore()->barrier_data() == XLoadBarrierStrong) ++ && ( ++ ((CompareAndSwapNode*)n)->order() == MemNode::acquire ++ || ((CompareAndSwapNode*)n)->order() == MemNode::seqcst ++ )); ++ ++ format %{ "CMPXCHG acq $res, $mem, $oldval, $newval; as ptr; ptr" %} ++ ins_encode %{ ++ __ block_comment("xCompareAndExchangeP_acq"); ++ x_compare_and_exchange(_masm, this, ++ $res$$Register, $mem$$Register, $oldval$$Register, 
$newval$$Register, $tmp$$Register, ++ false /* weak */, true /* acquire */); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct xGetAndSetP(mRegP mem, mRegP newv, mRegP prev, mRegP tmp) %{ ++ match(Set prev (GetAndSetP mem newv)); ++ effect(TEMP_DEF prev, TEMP tmp); ++ ++ predicate(UseZGC && !ZGenerational && n->as_LoadStore()->barrier_data() != 0); ++ ++ format %{ "GetAndSetP $prev, $mem, $newv" %} ++ ins_encode %{ ++ Register prev = $prev$$Register; ++ Register newv = $newv$$Register; ++ Register addr = $mem$$Register; ++ __ block_comment("xGetAndSetP"); ++ __ amswap_db_d(prev, newv, addr); ++ x_load_barrier(_masm, this, Address(noreg, 0), prev, $tmp$$Register, barrier_data()); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.cpp 2024-02-20 10:42:36.155530119 +0800 +@@ -0,0 +1,109 @@ ++/* ++ * Copyright (c) 2017, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++#include "precompiled.hpp" ++#include "gc/shared/gcLogPrecious.hpp" ++#include "gc/shared/gc_globals.hpp" ++#include "gc/z/zAddress.hpp" ++#include "gc/z/zBarrierSetAssembler.hpp" ++#include "gc/z/zGlobals.hpp" ++#include "runtime/globals.hpp" ++#include "runtime/os.hpp" ++#include "utilities/globalDefinitions.hpp" ++#include "utilities/powerOfTwo.hpp" ++ ++#ifdef LINUX ++#include ++#endif // LINUX ++ ++// Default value if probing is not implemented for a certain platform: 128TB ++static const size_t DEFAULT_MAX_ADDRESS_BIT = 47; ++// Minimum value returned, if probing fails: 64GB ++static const size_t MINIMUM_MAX_ADDRESS_BIT = 36; ++ ++static size_t probe_valid_max_address_bit() { ++#ifdef LINUX ++ size_t max_address_bit = 0; ++ const size_t page_size = os::vm_page_size(); ++ for (size_t i = DEFAULT_MAX_ADDRESS_BIT; i > MINIMUM_MAX_ADDRESS_BIT; --i) { ++ const uintptr_t base_addr = ((uintptr_t) 1U) << i; ++ if (msync((void*)base_addr, page_size, MS_ASYNC) == 0) { ++ // msync suceeded, the address is valid, and maybe even already mapped. ++ max_address_bit = i; ++ break; ++ } ++ if (errno != ENOMEM) { ++ // Some error occured. This should never happen, but msync ++ // has some undefined behavior, hence ignore this bit. 
++#ifdef ASSERT ++ fatal("Received '%s' while probing the address space for the highest valid bit", os::errno_name(errno)); ++#else // ASSERT ++ log_warning_p(gc)("Received '%s' while probing the address space for the highest valid bit", os::errno_name(errno)); ++#endif // ASSERT ++ continue; ++ } ++ // Since msync failed with ENOMEM, the page might not be mapped. ++ // Try to map it, to see if the address is valid. ++ void* const result_addr = mmap((void*) base_addr, page_size, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0); ++ if (result_addr != MAP_FAILED) { ++ munmap(result_addr, page_size); ++ } ++ if ((uintptr_t) result_addr == base_addr) { ++ // address is valid ++ max_address_bit = i; ++ break; ++ } ++ } ++ if (max_address_bit == 0) { ++ // probing failed, allocate a very high page and take that bit as the maximum ++ const uintptr_t high_addr = ((uintptr_t) 1U) << DEFAULT_MAX_ADDRESS_BIT; ++ void* const result_addr = mmap((void*) high_addr, page_size, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0); ++ if (result_addr != MAP_FAILED) { ++ max_address_bit = BitsPerSize_t - count_leading_zeros((size_t) result_addr) - 1; ++ munmap(result_addr, page_size); ++ } ++ } ++ log_info_p(gc, init)("Probing address space for the highest valid bit: " SIZE_FORMAT, max_address_bit); ++ return MAX2(max_address_bit, MINIMUM_MAX_ADDRESS_BIT); ++#else // LINUX ++ return DEFAULT_MAX_ADDRESS_BIT; ++#endif // LINUX ++} ++ ++size_t ZPlatformAddressOffsetBits() { ++ const static size_t valid_max_address_offset_bits = probe_valid_max_address_bit() + 1; ++ const size_t max_address_offset_bits = valid_max_address_offset_bits - 3; ++ const size_t min_address_offset_bits = max_address_offset_bits - 2; ++ const size_t address_offset = round_up_power_of_2(MaxHeapSize * ZVirtualToPhysicalRatio); ++ const size_t address_offset_bits = log2i_exact(address_offset); ++ return clamp(address_offset_bits, min_address_offset_bits, max_address_offset_bits); ++} ++ ++size_t ZPlatformAddressHeapBaseShift() { ++ return ZPlatformAddressOffsetBits(); ++} ++ ++void ZGlobalsPointers::pd_set_good_masks() { ++ BarrierSetAssembler::clear_patching_epoch(); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,35 @@ ++/* ++ * Copyright (c) 2015, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++#ifndef CPU_LOONGARCH64_GC_Z_ZADDRESS_LOONGARCH64_HPP ++#define CPU_LOONGARCH64_GC_Z_ZADDRESS_LOONGARCH64_HPP ++ ++#include "utilities/globalDefinitions.hpp" ++ ++const size_t ZPointerLoadShift = 16; ++ ++size_t ZPlatformAddressOffsetBits(); ++size_t ZPlatformAddressHeapBaseShift(); ++ ++#endif // CPU_LOONGARCH64_GC_Z_ZADDRESS_LOONGARCH64_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/z/zAddress_loongarch.inline.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,38 @@ ++/* ++ * Copyright (c) 2019, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++#ifndef CPU_LOONGARCH64_GC_Z_ZADDRESS_LOONGARCH64_INLINE_HPP ++#define CPU_LOONGARCH64_GC_Z_ZADDRESS_LOONGARCH64_INLINE_HPP ++ ++#include "utilities/globalDefinitions.hpp" ++ ++inline uintptr_t ZPointer::remap_bits(uintptr_t colored) { ++ return colored & ZPointerRemappedMask; ++} ++ ++inline constexpr int ZPointer::load_shift_lookup(uintptr_t value) { ++ return ZPointerLoadShift; ++} ++ ++#endif // CPU_LOONGARCH64_GC_Z_ZADDRESS_LOONGARCH64_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/z/zBarrierSetAssembler_loongarch.cpp b/src/hotspot/cpu/loongarch/gc/z/zBarrierSetAssembler_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/gc/z/zBarrierSetAssembler_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/z/zBarrierSetAssembler_loongarch.cpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,1226 @@ ++/* ++ * Copyright (c) 2019, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
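
Editor's note: zAddress_loongarch.inline.hpp above fixes ZPointerLoadShift at 16, and the assembler code later in this file (see store_barrier_fast) builds a store-good zpointer by shifting the address left by that amount and OR-ing in a mask read from thread-local data. A compact sketch of that encode/decode, with the mask passed in as a plain argument instead of the runtime-patched value the real code uses:

#include <cstdint>

constexpr unsigned kLoadShift = 16;   // mirrors ZPointerLoadShift above

// Encode: place the untagged address in the upper bits and the GC color
// (for stores, the current store-good bits) in the low kLoadShift bits.
static inline uint64_t z_color(uint64_t zaddress, uint64_t good_mask) {
  return (zaddress << kLoadShift) | good_mask;
}

// Decode: dropping the low bits recovers the untagged address.
static inline uint64_t z_uncolor(uint64_t zpointer) {
  return zpointer >> kLoadShift;
}

The load path in the generated code is the mirror image: test the colored value against the load-bad mask and, if it is clean, shift the color bits away.
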
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "code/codeBlob.hpp" ++#include "code/vmreg.inline.hpp" ++#include "gc/z/zBarrier.inline.hpp" ++#include "gc/z/zBarrierSet.hpp" ++#include "gc/z/zBarrierSetAssembler.hpp" ++#include "gc/z/zBarrierSetRuntime.hpp" ++#include "gc/z/zThreadLocalData.hpp" ++#include "memory/resourceArea.hpp" ++#include "runtime/jniHandles.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "utilities/macros.hpp" ++#ifdef COMPILER1 ++#include "c1/c1_LIRAssembler.hpp" ++#include "c1/c1_MacroAssembler.hpp" ++#include "gc/z/c1/zBarrierSetC1.hpp" ++#endif // COMPILER1 ++#ifdef COMPILER2 ++#include "gc/z/c2/zBarrierSetC2.hpp" ++#include "opto/output.hpp" ++#endif // COMPILER2 ++ ++#ifdef PRODUCT ++#define BLOCK_COMMENT(str) /* nothing */ ++#else ++#define BLOCK_COMMENT(str) __ block_comment(str) ++#endif ++ ++#undef __ ++#define __ masm-> ++ ++// Helper for saving and restoring registers across a runtime call that does ++// not have any live vector registers. ++class ZRuntimeCallSpill { ++private: ++ MacroAssembler* _masm; ++ Register _result; ++ ++ void save() { ++ MacroAssembler* masm = _masm; ++ ++ __ enter(); ++ if (_result != noreg) { ++ __ push_call_clobbered_registers_except(RegSet::of(_result)); ++ } else { ++ __ push_call_clobbered_registers(); ++ } ++ } ++ ++ void restore() { ++ MacroAssembler* masm = _masm; ++ ++ if (_result != noreg) { ++ // Make sure _result has the return value. ++ if (_result != V0) { ++ __ move(_result, V0); ++ } ++ ++ __ pop_call_clobbered_registers_except(RegSet::of(_result)); ++ } else { ++ __ pop_call_clobbered_registers(); ++ } ++ __ leave(); ++ } ++ ++public: ++ ZRuntimeCallSpill(MacroAssembler* masm, Register result) ++ : _masm(masm), ++ _result(result) { ++ save(); ++ } ++ ++ ~ZRuntimeCallSpill() { ++ restore(); ++ } ++}; ++ ++void ZBarrierSetAssembler::check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error) { ++ // C1 calls verify_oop in the middle of barriers, before they have been uncolored ++ // and after being colored. Therefore, we must deal with colored oops as well. 
++ Label done; ++ Label check_oop; ++ Label check_zaddress; ++ int color_bits = ZPointerRemappedShift + ZPointerRemappedBits; ++ ++ uintptr_t shifted_base_start_mask = (UCONST64(1) << (ZAddressHeapBaseShift + color_bits + 1)) - 1; ++ uintptr_t shifted_base_end_mask = (UCONST64(1) << (ZAddressHeapBaseShift + 1)) - 1; ++ uintptr_t shifted_base_mask = shifted_base_start_mask ^ shifted_base_end_mask; ++ ++ uintptr_t shifted_address_end_mask = (UCONST64(1) << (color_bits + 1)) - 1; ++ uintptr_t shifted_address_mask = shifted_base_end_mask ^ (uintptr_t)CONST64(-1); ++ ++ // Check colored null ++ __ li(tmp1, shifted_address_mask); ++ __ andr(tmp1, tmp1, obj); ++ __ beqz(tmp1, done); ++ ++ // Check for zpointer ++ __ li(tmp1, shifted_base_mask); ++ __ andr(tmp1, tmp1, obj); ++ __ beqz(tmp1, check_oop); ++ ++ // Uncolor presumed zpointer ++ __ z_uncolor(obj); ++ ++ __ b(check_zaddress); ++ ++ __ bind(check_oop); ++ ++ // make sure klass is 'reasonable', which is not zero. ++ __ load_klass(tmp1, obj); // get klass ++ __ beqz(tmp1, error); // if klass is null it is broken ++ ++ __ bind(check_zaddress); ++ // Check if the oop is in the right area of memory ++ __ li(tmp1, (intptr_t) Universe::verify_oop_mask()); ++ __ andr(tmp1, tmp1, obj); ++ __ li(obj, (intptr_t) Universe::verify_oop_bits()); ++ __ bne(tmp1, obj, error); ++ ++ __ bind(done); ++} ++ ++void ZBarrierSetAssembler::load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ Register dst, ++ Address src, ++ Register tmp1, ++ Register tmp2) { ++ if (!ZBarrierSet::barrier_needed(decorators, type)) { ++ // Barrier not needed ++ BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp2); ++ return; ++ } ++ ++ BLOCK_COMMENT("ZBarrierSetAssembler::load_at {"); ++ ++ assert_different_registers(tmp1, tmp2, src.base(), noreg); ++ assert_different_registers(tmp1, tmp2, src.index()); ++ assert_different_registers(tmp1, tmp2, dst, noreg); ++ ++ Label done; ++ Label uncolor; ++ ++ // Load bad mask into scratch register. ++ const bool on_non_strong = ++ (decorators & ON_WEAK_OOP_REF) != 0 || ++ (decorators & ON_PHANTOM_OOP_REF) != 0; ++ ++ // Test address bad mask ++ if (on_non_strong) { ++ __ ld_d(tmp1, mark_bad_mask_from_thread(TREG)); ++ } else { ++ __ ld_d(tmp1, load_bad_mask_from_thread(TREG)); ++ } ++ ++ __ lea(tmp2, src); ++ __ ld_d(dst, tmp2, 0); ++ ++ // Test reference against bad mask. If mask bad, then we need to fix it up. 
++ __ andr(tmp1, dst, tmp1); ++ __ beqz(tmp1, uncolor); ++ ++ { ++ // Call VM ++ ZRuntimeCallSpill rcs(masm, dst); ++ ++ if (A0 != dst) { ++ __ move(A0, dst); ++ } ++ __ move(A1, tmp2); ++ __ MacroAssembler::call_VM_leaf_base(ZBarrierSetRuntime::load_barrier_on_oop_field_preloaded_addr(decorators), 2); ++ } ++ ++ // Slow-path has already uncolored ++ __ b(done); ++ ++ __ bind(uncolor); ++ ++ // Remove the color bits ++ __ z_uncolor(dst); ++ ++ __ bind(done); ++ ++ BLOCK_COMMENT("} ZBarrierSetAssembler::load_at"); ++} ++ ++void ZBarrierSetAssembler::store_barrier_fast(MacroAssembler* masm, ++ Address ref_addr, ++ Register rnew_zaddress, ++ Register rnew_zpointer, ++ Register rtmp, ++ bool in_nmethod, ++ bool is_atomic, ++ Label& medium_path, ++ Label& medium_path_continuation) const { ++ assert_different_registers(ref_addr.base(), rnew_zpointer, rtmp); ++ assert_different_registers(ref_addr.index(), rnew_zpointer, rtmp); ++ assert_different_registers(rnew_zaddress, rnew_zpointer, rtmp); ++ ++ if (in_nmethod) { ++ if (is_atomic) { ++ __ ld_hu(rtmp, ref_addr); ++ // Atomic operations must ensure that the contents of memory are store-good before ++ // an atomic operation can execute. ++ // A not relocatable object could have spurious raw null pointers in its fields after ++ // getting promoted to the old generation. ++ __ relocate(barrier_Relocation::spec(), ZBarrierRelocationFormatStoreGoodBits); ++ __ patchable_li16(rnew_zpointer, barrier_Relocation::unpatched); ++ __ bne(rtmp, rnew_zpointer, medium_path); ++ } else { ++ __ ld_d(rtmp, ref_addr); ++ // Stores on relocatable objects never need to deal with raw null pointers in fields. ++ // Raw null pointers may only exist in the young generation, as they get pruned when ++ // the object is relocated to old. And no pre-write barrier needs to perform any action ++ // in the young generation. ++ __ relocate(barrier_Relocation::spec(), ZBarrierRelocationFormatStoreBadMask); ++ __ patchable_li16(rnew_zpointer, barrier_Relocation::unpatched); ++ __ andr(rtmp, rtmp, rnew_zpointer); ++ __ bnez(rtmp, medium_path); ++ } ++ __ bind(medium_path_continuation); ++ __ z_color(rnew_zpointer, rnew_zaddress, rtmp); ++ } else { ++ assert(!is_atomic, "atomics outside of nmethods not supported"); ++ __ lea(rtmp, ref_addr); ++ __ ld_d(rtmp, rtmp, 0); ++ __ ld_d(rnew_zpointer, Address(TREG, ZThreadLocalData::store_bad_mask_offset())); ++ __ andr(rtmp, rtmp, rnew_zpointer); ++ __ bnez(rtmp, medium_path); ++ __ bind(medium_path_continuation); ++ if (rnew_zaddress == noreg) { ++ __ move(rnew_zpointer, R0); ++ } else { ++ __ move(rnew_zpointer, rnew_zaddress); ++ } ++ ++ // Load the current good shift, and add the color bits ++ __ slli_d(rnew_zpointer, rnew_zpointer, ZPointerLoadShift); ++ __ ld_d(rtmp, Address(TREG, ZThreadLocalData::store_good_mask_offset())); ++ __ orr(rnew_zpointer, rnew_zpointer, rtmp); ++ } ++} ++ ++static void store_barrier_buffer_add(MacroAssembler* masm, ++ Address ref_addr, ++ Register tmp1, ++ Register tmp2, ++ Label& slow_path) { ++ Address buffer(TREG, ZThreadLocalData::store_barrier_buffer_offset()); ++ assert_different_registers(ref_addr.base(), ref_addr.index(), tmp1, tmp2); ++ ++ __ ld_d(tmp1, buffer); ++ ++ // Combined pointer bump and check if the buffer is disabled or full ++ // Tune ZStoreBarrierBuffer length to decrease the opportunity goto ++ // copy_store_at slow-path. 
++ __ ld_d(tmp2, Address(tmp1, ZStoreBarrierBuffer::current_offset())); ++ __ beqz(tmp2, slow_path); ++ ++ // Bump the pointer ++ __ addi_d(tmp2, tmp2, - (int) sizeof(ZStoreBarrierEntry)); ++ __ st_d(tmp2, Address(tmp1, ZStoreBarrierBuffer::current_offset())); ++ ++ // Compute the buffer entry address ++ __ lea(tmp2, Address(tmp2, ZStoreBarrierBuffer::buffer_offset())); ++ __ add_d(tmp2, tmp2, tmp1); ++ ++ // Compute and log the store address ++ __ lea(tmp1, ref_addr); ++ __ st_d(tmp1, Address(tmp2, in_bytes(ZStoreBarrierEntry::p_offset()))); ++ ++ // Load and log the prev value ++ __ ld_d(tmp1, tmp1, 0); ++ __ st_d(tmp1, Address(tmp2, in_bytes(ZStoreBarrierEntry::prev_offset()))); ++} ++ ++void ZBarrierSetAssembler::store_barrier_medium(MacroAssembler* masm, ++ Address ref_addr, ++ Register rtmp1, ++ Register rtmp2, ++ Register rtmp3, ++ bool is_native, ++ bool is_atomic, ++ Label& medium_path_continuation, ++ Label& slow_path, ++ Label& slow_path_continuation) const { ++ assert_different_registers(ref_addr.base(), ref_addr.index(), rtmp1, rtmp2); ++ ++ // The reason to end up in the medium path is that the pre-value was not 'good'. ++ if (is_native) { ++ __ b(slow_path); ++ __ bind(slow_path_continuation); ++ __ b(medium_path_continuation); ++ } else if (is_atomic) { ++ // Atomic accesses can get to the medium fast path because the value was a ++ // raw null value. If it was not null, then there is no doubt we need to take a slow path. ++ ++ __ lea(rtmp2, ref_addr); ++ __ ld_d(rtmp1, rtmp2, 0); ++ __ bnez(rtmp1, slow_path); ++ ++ // If we get this far, we know there is a young raw null value in the field. ++ __ relocate(barrier_Relocation::spec(), ZBarrierRelocationFormatStoreGoodBits); ++ __ patchable_li16(rtmp1, barrier_Relocation::unpatched); ++ __ cmpxchg(Address(rtmp2, 0), R0, rtmp1, SCR1, ++ false /* retold */, false /* barrier */, true /* weak */, false /* exchange */); ++ __ beqz(SCR1, slow_path); ++ ++ __ bind(slow_path_continuation); ++ __ b(medium_path_continuation); ++ } else { ++ // A non-atomic relocatable object won't get to the medium fast path due to a ++ // raw null in the young generation. We only get here because the field is bad. ++ // In this path we don't need any self healing, so we can avoid a runtime call ++ // most of the time by buffering the store barrier to be applied lazily. 
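++ // So for plain stores the medium path only records the store address and the
++ // previous value in the per-thread buffer via store_barrier_buffer_add(), and
++ // takes the runtime slow path only when that buffer cannot accept the entry.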
++ store_barrier_buffer_add(masm, ++ ref_addr, ++ rtmp1, ++ rtmp2, ++ slow_path); ++ __ bind(slow_path_continuation); ++ __ b(medium_path_continuation); ++ } ++} ++ ++void ZBarrierSetAssembler::store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ Address dst, ++ Register val, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3) { ++ if (!ZBarrierSet::barrier_needed(decorators, type)) { ++ BarrierSetAssembler::store_at(masm, decorators, type, dst, val, tmp1, tmp2, tmp3); ++ return; ++ } ++ ++ bool dest_uninitialized = (decorators & IS_DEST_UNINITIALIZED) != 0; ++ ++ assert_different_registers(val, tmp1, dst.base()); ++ ++ if (dest_uninitialized) { ++ if (val == noreg) { ++ __ move(tmp1, R0); ++ } else { ++ __ move(tmp1, val); ++ } ++ // Add the color bits ++ __ slli_d(tmp1, tmp1, ZPointerLoadShift); ++ __ ld_d(tmp2, Address(TREG, ZThreadLocalData::store_good_mask_offset())); ++ __ orr(tmp1, tmp2, tmp1); ++ } else { ++ Label done; ++ Label medium; ++ Label medium_continuation; ++ Label slow; ++ Label slow_continuation; ++ store_barrier_fast(masm, dst, val, tmp1, tmp2, false, false, medium, medium_continuation); ++ ++ __ b(done); ++ __ bind(medium); ++ store_barrier_medium(masm, ++ dst, ++ tmp1, ++ tmp2, ++ noreg /* tmp3 */, ++ false /* is_native */, ++ false /* is_atomic */, ++ medium_continuation, ++ slow, ++ slow_continuation); ++ ++ __ bind(slow); ++ { ++ // Call VM ++ ZRuntimeCallSpill rcs(masm, noreg); ++ __ lea(A0, dst); ++ __ MacroAssembler::call_VM_leaf_base(ZBarrierSetRuntime::store_barrier_on_oop_field_without_healing_addr(), 1); ++ } ++ ++ __ b(slow_continuation); ++ __ bind(done); ++ } ++ ++ // Store value ++ BarrierSetAssembler::store_at(masm, decorators, type, dst, tmp1, tmp2, tmp3, noreg); ++} ++ ++// Reference to stub generate_disjoint|conjoint_large_copy_lsx|lasx and generate_long_small_copy ++static FloatRegister z_copy_load_bad_vreg = FT10; ++static FloatRegister z_copy_store_good_vreg = FT11; ++static FloatRegister z_copy_store_bad_vreg = FT12; ++static FloatRegSet z_arraycopy_saved_vregs = FloatRegSet::of(F0, F1) + ++ FloatRegSet::range(FT0, FT7) + ++ FloatRegSet::of(z_copy_load_bad_vreg, ++ z_copy_store_good_vreg, ++ z_copy_store_bad_vreg); ++ ++static void load_wide_arraycopy_masks(MacroAssembler* masm) { ++ __ lea_long(SCR1, ExternalAddress((address)&ZPointerVectorLoadBadMask)); ++ if (UseLASX) { ++ __ xvld(z_copy_load_bad_vreg, SCR1, 0); ++ } else if (UseLSX) { ++ __ vld(z_copy_load_bad_vreg, SCR1, 0); ++ } ++ ++ __ lea_long(SCR1, ExternalAddress((address)&ZPointerVectorStoreBadMask)); ++ if (UseLASX) { ++ __ xvld(z_copy_store_bad_vreg, SCR1, 0); ++ } else if (UseLSX) { ++ __ vld(z_copy_store_bad_vreg, SCR1, 0); ++ } ++ ++ __ lea_long(SCR1, ExternalAddress((address)&ZPointerVectorStoreGoodMask)); ++ if (UseLASX) { ++ __ xvld(z_copy_store_good_vreg, SCR1, 0); ++ } else if (UseLSX) { ++ __ vld(z_copy_store_good_vreg, SCR1, 0); ++ } ++} ++ ++void ZBarrierSetAssembler::arraycopy_prologue(MacroAssembler* masm, ++ DecoratorSet decorators, ++ bool is_oop, ++ Register src, ++ Register dst, ++ Register count, ++ RegSet saved_regs) { ++ if (!is_oop) { ++ // Barrier not needed ++ return; ++ } ++ ++ BLOCK_COMMENT("ZBarrierSetAssembler::arraycopy_prologue {"); ++ ++ load_wide_arraycopy_masks(masm); ++ ++ BLOCK_COMMENT("} ZBarrierSetAssembler::arraycopy_prologue"); ++} ++ ++void ZBarrierSetAssembler::copy_load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Register dst, ++ Address src, ++ Register tmp) { ++ 
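++ // Summary for the oop case: load the raw zpointer, heal it through the load
++ // barrier runtime call if its color is bad, then strip the low metadata bits so
++ // the store side only has to OR in the store-good color.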
if (!is_reference_type(type)) { ++ BarrierSetAssembler::copy_load_at(masm, decorators, type, bytes, dst, src, noreg); ++ return; ++ } ++ ++ Label load_done; ++ ++ // Load oop at address ++ BarrierSetAssembler::copy_load_at(masm, decorators, type, bytes, dst, src, noreg); ++ ++ assert_different_registers(dst, tmp); ++ ++ // Test address bad mask ++ __ ld_d(tmp, Address(TREG, ZThreadLocalData::load_bad_mask_offset())); ++ __ andr(tmp, dst, tmp); ++ __ beqz(tmp, load_done); ++ ++ { ++ // Call VM ++ ZRuntimeCallSpill rsc(masm, dst); ++ ++ __ lea(A1, src); ++ ++ if (A0 != dst) { ++ __ move(A0, dst); ++ } ++ ++ __ MacroAssembler::call_VM_leaf_base(ZBarrierSetRuntime::load_barrier_on_oop_field_preloaded_store_good_addr(), 2); ++ } ++ ++ __ bind(load_done); ++ ++ // Remove metadata bits so that the store side (vectorized or non-vectorized) can ++ // inject the store-good color with an or instruction. ++ __ bstrins_d(dst, R0, 15, 0); ++ ++ if ((decorators & ARRAYCOPY_CHECKCAST) != 0) { ++ __ z_uncolor(dst); ++ } ++} ++ ++void ZBarrierSetAssembler::copy_store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Address dst, ++ Register src, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3) { ++ if (!is_reference_type(type)) { ++ BarrierSetAssembler::copy_store_at(masm, decorators, type, bytes, dst, src, noreg, noreg, noreg); ++ return; ++ } ++ ++ bool is_dest_uninitialized = (decorators & IS_DEST_UNINITIALIZED) != 0; ++ ++ assert_different_registers(src, tmp1, tmp2, tmp3); ++ ++ if (!is_dest_uninitialized) { ++ Label store, store_bad; ++ __ ld_d(tmp3, dst); ++ // Test reference against bad mask. If mask bad, then we need to fix it up. ++ __ ld_d(tmp1, Address(TREG, ZThreadLocalData::store_bad_mask_offset())); ++ __ andr(tmp1, tmp3, tmp1); ++ __ beqz(tmp1, store); ++ ++ store_barrier_buffer_add(masm, dst, tmp1, tmp2, store_bad); ++ __ b(store); ++ ++ __ bind(store_bad); ++ { ++ // Call VM ++ ZRuntimeCallSpill rcs(masm, noreg); ++ ++ __ lea(A0, dst); ++ ++ __ MacroAssembler::call_VM_leaf_base(ZBarrierSetRuntime::store_barrier_on_oop_field_without_healing_addr(), 1); ++ } ++ ++ __ bind(store); ++ } ++ ++ if ((decorators & ARRAYCOPY_CHECKCAST) != 0) { ++ __ slli_d(src, src, ZPointerLoadShift); ++ } ++ ++ // Set store-good color, replacing whatever color was there before ++ __ ld_d(tmp1, Address(TREG, ZThreadLocalData::store_good_mask_offset())); ++ __ bstrins_d(src, tmp1, 15, 0); ++ ++ // Store value ++ BarrierSetAssembler::copy_store_at(masm, decorators, type, bytes, dst, src, noreg, noreg, noreg); ++} ++ ++void ZBarrierSetAssembler::copy_load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ FloatRegister dst, ++ Address src, ++ Register tmp1, ++ Register tmp2, ++ FloatRegister vec_tmp, ++ bool need_save_restore) { ++ if (!is_reference_type(type)) { ++ BarrierSetAssembler::copy_load_at(masm, decorators, type, bytes, dst, src, noreg, noreg, fnoreg); ++ return; ++ } ++ ++ // Load source vector ++ BarrierSetAssembler::copy_load_at(masm, decorators, type, bytes, dst, src, noreg, noreg, fnoreg); ++ ++ assert_different_registers(dst, vec_tmp); ++ ++ Label done, fallback; ++ ++ // Test reference against bad mask. If mask bad, then we need to fix it up. 
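++ // The vector check tests all lanes at once: if any loaded zpointer has a bit set
++ // under the vectorized load-bad mask, FCC0 becomes non-zero and the copy falls
++ // back to the scalar per-element path below, which can call into the runtime.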
++ if (UseLASX) { ++ __ xvand_v(vec_tmp, dst, z_copy_load_bad_vreg); ++ __ xvsetnez_v(FCC0, vec_tmp); ++ } else if (UseLSX) { ++ __ vand_v(vec_tmp, dst, z_copy_load_bad_vreg); ++ __ vsetnez_v(FCC0, vec_tmp); ++ } ++ __ movcf2gr(SCR1, FCC0); ++ __ bnez(SCR1, fallback); // vec_tmp not equal 0.0, then goto fallback ++ ++ // Remove bad metadata bits so that the store can colour the pointers with an or instruction. ++ // This makes the fast path and slow path formats look the same, in the sense that they don't ++ // have any of the store bad bits. ++ if (UseLASX) { ++ __ xvandn_v(dst, z_copy_store_bad_vreg, dst); ++ } else if (UseLSX) { ++ __ vandn_v(dst, z_copy_store_bad_vreg, dst); ++ } ++ __ b(done); ++ ++ __ bind(fallback); ++ ++ Address src0(src.base(), src.disp() + 0); ++ Address src1(src.base(), src.disp() + 8); ++ Address src2(src.base(), src.disp() + 16); ++ Address src3(src.base(), src.disp() + 24); ++ ++ if (need_save_restore) { ++ __ push_vp(z_arraycopy_saved_vregs - FloatRegSet::of(dst)); ++ } ++ ++ assert_different_registers(tmp1, tmp2); ++ ++ if (UseLASX) { ++ __ addi_d(SP, SP, - wordSize * 4); ++ ++ // The lower 64 bits. ++ ZBarrierSetAssembler::copy_load_at(masm, decorators, type, 8, tmp2, src0, tmp1); ++ __ st_d(tmp2, SP, 0); ++ ++ // The mid-lower 64 bits. ++ ZBarrierSetAssembler::copy_load_at(masm, decorators, type, 8, tmp2, src1, tmp1); ++ __ st_d(tmp2, SP, 8); ++ ++ // The mid-higher 64 bits. ++ ZBarrierSetAssembler::copy_load_at(masm, decorators, type, 8, tmp2, src2, tmp1); ++ __ st_d(tmp2, SP, 16); ++ ++ // The higher 64 bits. ++ ZBarrierSetAssembler::copy_load_at(masm, decorators, type, 8, tmp2, src3, tmp1); ++ __ st_d(tmp2, SP, 24); ++ ++ __ xvld(dst, SP, 0); ++ __ addi_d(SP, SP, wordSize * 4); ++ } else if (UseLSX) { ++ __ addi_d(SP, SP, - wordSize * 2); ++ ++ // The lower 64 bits. ++ ZBarrierSetAssembler::copy_load_at(masm, decorators, type, 8, tmp2, src0, tmp1); ++ __ st_d(tmp2, SP, 0); ++ ++ // The higher 64 bits. ++ ZBarrierSetAssembler::copy_load_at(masm, decorators, type, 8, tmp2, src1, tmp1); ++ __ st_d(tmp2, SP, 8); ++ ++ __ vld(dst, SP, 0); ++ __ addi_d(SP, SP, wordSize * 2); ++ } ++ ++ if (need_save_restore) { ++ __ pop_vp(z_arraycopy_saved_vregs - FloatRegSet::of(dst)); ++ } ++ ++ __ bind(done); ++} ++ ++void ZBarrierSetAssembler::copy_store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Address dst, ++ FloatRegister src, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3, ++ Register tmp4, ++ FloatRegister vec_tmp1, ++ FloatRegister vec_tmp2, ++ bool need_save_restore) { ++ if (!is_reference_type(type)) { ++ BarrierSetAssembler::copy_store_at(masm, decorators, type, bytes, dst, src, noreg, noreg, noreg, noreg, fnoreg, fnoreg); ++ return; ++ } ++ ++ bool is_dest_uninitialized = (decorators & IS_DEST_UNINITIALIZED) != 0; ++ ++ Label done, fallback; ++ ++ if (!is_dest_uninitialized) { ++ // Load pre values ++ BarrierSetAssembler::copy_load_at(masm, decorators, type, bytes, vec_tmp1, dst, noreg, noreg, fnoreg); ++ ++ assert_different_registers(vec_tmp1, vec_tmp2); ++ ++ // Test reference against bad mask. If mask bad, then we need to fix it up. 
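++ // Same scheme as the vectorized load above, but against the store-bad mask on the
++ // pre-existing destination values: any bad lane may need a pre-write barrier, so
++ // the copy falls back to the scalar copy_store_at path for this chunk.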
++ if (UseLASX) { ++ __ xvand_v(vec_tmp2, vec_tmp1, z_copy_store_bad_vreg); ++ __ xvsetnez_v(FCC0, vec_tmp2); ++ } else if (UseLSX) { ++ __ vand_v(vec_tmp2, vec_tmp1, z_copy_store_bad_vreg); ++ __ vsetnez_v(FCC0, vec_tmp2); ++ } ++ __ movcf2gr(SCR1, FCC0); ++ __ bnez(SCR1, fallback); // vec_tmp1 not equal 0.0, then goto fallback ++ } ++ ++ // Color source ++ if (UseLASX) { ++ __ xvor_v(src, src, z_copy_store_good_vreg); ++ } else if (UseLSX) { ++ __ vor_v(src, src, z_copy_store_good_vreg); ++ } ++ // Store colored source in destination ++ BarrierSetAssembler::copy_store_at(masm, decorators, type, bytes, dst, src, noreg, noreg, noreg, noreg, fnoreg, fnoreg); ++ __ b(done); ++ ++ __ bind(fallback); ++ ++ Address dst0(dst.base(), dst.disp() + 0); ++ Address dst1(dst.base(), dst.disp() + 8); ++ Address dst2(dst.base(), dst.disp() + 16); ++ Address dst3(dst.base(), dst.disp() + 24); ++ ++ if (need_save_restore) { ++ __ push_vp(z_arraycopy_saved_vregs - FloatRegSet::of(src)); ++ } ++ ++ assert_different_registers(tmp4, tmp1, tmp2, tmp3); ++ ++ if (UseLASX) { ++ __ addi_d(SP, SP, - wordSize * 4); ++ __ xvst(src, SP, 0); ++ ++ // The lower 64 bits. ++ __ ld_d(tmp4, SP, 0); ++ ZBarrierSetAssembler::copy_store_at(masm, decorators, type, 8, dst0, tmp4, tmp1, tmp2, tmp3); ++ ++ // The mid-lower 64 bits. ++ __ ld_d(tmp4, SP, 8); ++ ZBarrierSetAssembler::copy_store_at(masm, decorators, type, 8, dst1, tmp4, tmp1, tmp2, tmp3); ++ ++ // The mid-higher 64 bits. ++ __ ld_d(tmp4, SP, 16); ++ ZBarrierSetAssembler::copy_store_at(masm, decorators, type, 8, dst2, tmp4, tmp1, tmp2, tmp3); ++ ++ // The higher 64 bits. ++ __ ld_d(tmp4, SP, 24); ++ ZBarrierSetAssembler::copy_store_at(masm, decorators, type, 8, dst3, tmp4, tmp1, tmp2, tmp3); ++ ++ __ addi_d(SP, SP, wordSize * 4); ++ } else if (UseLSX) { ++ // Extract the 2 oops from the src vector register ++ __ addi_d(SP, SP, - wordSize * 2); ++ __ vst(src, SP, 0); ++ ++ // The lower 64 bits. ++ __ ld_d(tmp4, SP, 0); ++ ZBarrierSetAssembler::copy_store_at(masm, decorators, type, 8, dst0, tmp4, tmp1, tmp2, tmp3); ++ ++ // The higher 64 bits. 
++ __ ld_d(tmp4, SP, 8); ++ ZBarrierSetAssembler::copy_store_at(masm, decorators, type, 8, dst1, tmp4, tmp1, tmp2, tmp3); ++ ++ __ addi_d(SP, SP, wordSize * 2); ++ } ++ ++ if (need_save_restore) { ++ __ pop_vp(z_arraycopy_saved_vregs - FloatRegSet::of(src)); ++ } ++ ++ __ bind(done); ++} ++ ++void ZBarrierSetAssembler::try_resolve_jobject_in_native(MacroAssembler* masm, ++ Register jni_env, ++ Register robj, ++ Register tmp, ++ Label& slowpath) { ++ BLOCK_COMMENT("ZBarrierSetAssembler::try_resolve_jobject_in_native {"); ++ ++ Label done, tagged, weak_tagged, uncolor; ++ ++ // Test for tag ++ __ li(tmp, JNIHandles::tag_mask); ++ __ andr(tmp, robj, tmp); ++ __ bnez(tmp, tagged); ++ ++ // Resolve local handle ++ __ ld_d(robj, robj, 0); ++ __ b(done); ++ ++ __ bind(tagged); ++ ++ // Test for weak tag ++ __ li(tmp, JNIHandles::TypeTag::weak_global); ++ __ andr(tmp, robj, tmp); ++ __ bnez(tmp, weak_tagged); ++ ++ // Resolve global handle ++ __ ld_d(robj, Address(robj, -JNIHandles::TypeTag::global)); ++ __ lea(tmp, load_bad_mask_from_jni_env(jni_env)); ++ __ ld_d(tmp, tmp, 0); ++ __ andr(tmp, robj, tmp); ++ __ bnez(tmp, slowpath); ++ __ b(uncolor); ++ ++ __ bind(weak_tagged); ++ ++ // Resolve weak handle ++ __ ld_d(robj, Address(robj, -JNIHandles::TypeTag::weak_global)); ++ __ lea(tmp, mark_bad_mask_from_jni_env(jni_env)); ++ __ ld_d(tmp, tmp, 0); ++ __ andr(tmp, robj, tmp); ++ __ bnez(tmp, slowpath); ++ ++ __ bind(uncolor); ++ ++ // Uncolor ++ __ z_uncolor(robj); ++ ++ __ bind(done); ++ ++ BLOCK_COMMENT("} ZBarrierSetAssembler::try_resolve_jobject_in_native"); ++} ++ ++static uint16_t patch_barrier_relocation_value(int format) { ++ switch (format) { ++ case ZBarrierRelocationFormatLoadBadMask: ++ return (uint16_t)ZPointerLoadBadMask; ++ case ZBarrierRelocationFormatMarkBadMask: ++ return (uint16_t)ZPointerMarkBadMask; ++ case ZBarrierRelocationFormatStoreGoodBits: ++ return (uint16_t)ZPointerStoreGoodMask; ++ case ZBarrierRelocationFormatStoreBadMask: ++ return (uint16_t)ZPointerStoreBadMask; ++ default: ++ ShouldNotReachHere(); ++ return 0; ++ } ++} ++ ++void ZBarrierSetAssembler::patch_barrier_relocation(address addr, int format) { ++ int inst = *(int*)addr; ++ int size = 2 * BytesPerInstWord; ++ CodeBuffer cb(addr, size); ++ MacroAssembler masm(&cb); ++ masm.patchable_li16(as_Register(inst & 0x1f), patch_barrier_relocation_value(format)); ++ ICache::invalidate_range(addr, size); ++} ++ ++#ifdef COMPILER1 ++ ++#undef __ ++#define __ ce->masm()-> ++ ++void ZBarrierSetAssembler::generate_c1_uncolor(LIR_Assembler* ce, LIR_Opr ref) const { ++ __ z_uncolor(ref->as_register()); ++} ++ ++void ZBarrierSetAssembler::generate_c1_color(LIR_Assembler* ce, LIR_Opr ref) const { ++ __ z_color(ref->as_register(), ref->as_register(), SCR1); ++} ++ ++void ZBarrierSetAssembler::generate_c1_load_barrier(LIR_Assembler* ce, ++ LIR_Opr ref, ++ ZLoadBarrierStubC1* stub, ++ bool on_non_strong) const { ++ Label good; ++ __ check_color(ref->as_register(), SCR1, on_non_strong); ++ __ beqz(SCR1, good); ++ __ b(*stub->entry()); ++ ++ __ bind(good); ++ __ z_uncolor(ref->as_register()); ++ __ bind(*stub->continuation()); ++} ++ ++void ZBarrierSetAssembler::generate_c1_load_barrier_stub(LIR_Assembler* ce, ++ ZLoadBarrierStubC1* stub) const { ++ // Stub entry ++ __ bind(*stub->entry()); ++ ++ Register ref = stub->ref()->as_register(); ++ Register ref_addr = noreg; ++ Register tmp = noreg; ++ ++ if (stub->tmp()->is_valid()) { ++ // Load address into tmp register ++ ce->leal(stub->ref_addr(), stub->tmp()); ++ ref_addr = tmp = 
stub->tmp()->as_pointer_register(); ++ } else { ++ // Address already in register ++ ref_addr = stub->ref_addr()->as_address_ptr()->base()->as_pointer_register(); ++ } ++ ++ assert_different_registers(ref, ref_addr, noreg); ++ ++ // Save V0 unless it is the result or tmp register ++ // Set up SP to accommodate parameters and maybe V0. ++ if (ref != V0 && tmp != V0) { ++ __ addi_d(SP, SP, -32); ++ __ st_d(V0, SP, 16); ++ } else { ++ __ addi_d(SP, SP, -16); ++ } ++ ++ // Setup arguments and call runtime stub ++ ce->store_parameter(ref_addr, 1); ++ ce->store_parameter(ref, 0); ++ ++ __ call(stub->runtime_stub(), relocInfo::runtime_call_type); ++ ++ // Verify result ++ __ verify_oop(V0); ++ ++ // Move result into place ++ if (ref != V0) { ++ __ move(ref, V0); ++ } ++ ++ // Restore V0 unless it is the result or tmp register ++ if (ref != V0 && tmp != V0) { ++ __ ld_d(V0, SP, 16); ++ __ addi_d(SP, SP, 32); ++ } else { ++ __ addi_d(SP, SP, 16); ++ } ++ ++ // Stub exit ++ __ b(*stub->continuation()); ++} ++ ++void ZBarrierSetAssembler::generate_c1_store_barrier(LIR_Assembler* ce, ++ LIR_Address* addr, ++ LIR_Opr new_zaddress, ++ LIR_Opr new_zpointer, ++ ZStoreBarrierStubC1* stub) const { ++ Register rnew_zaddress = new_zaddress->as_register(); ++ Register rnew_zpointer = new_zpointer->as_register(); ++ ++ store_barrier_fast(ce->masm(), ++ ce->as_Address(addr), ++ rnew_zaddress, ++ rnew_zpointer, ++ SCR2, ++ true, ++ stub->is_atomic(), ++ *stub->entry(), ++ *stub->continuation()); ++} ++ ++void ZBarrierSetAssembler::generate_c1_store_barrier_stub(LIR_Assembler* ce, ++ ZStoreBarrierStubC1* stub) const { ++ // Stub entry ++ __ bind(*stub->entry()); ++ Label slow; ++ Label slow_continuation; ++ store_barrier_medium(ce->masm(), ++ ce->as_Address(stub->ref_addr()->as_address_ptr()), ++ SCR2, ++ stub->new_zpointer()->as_register(), ++ stub->tmp()->as_pointer_register(), ++ false /* is_native */, ++ stub->is_atomic(), ++ *stub->continuation(), ++ slow, ++ slow_continuation); ++ ++ __ bind(slow); ++ ++ __ lea(stub->new_zpointer()->as_register(), ce->as_Address(stub->ref_addr()->as_address_ptr())); ++ ++ __ addi_d(SP, SP, -16); ++ // Setup arguments and call runtime stub ++ assert(stub->new_zpointer()->is_valid(), "invariant"); ++ ce->store_parameter(stub->new_zpointer()->as_register(), 0); ++ __ call(stub->runtime_stub(), relocInfo::runtime_call_type); ++ __ addi_d(SP, SP, 16); ++ ++ // Stub exit ++ __ b(slow_continuation); ++} ++ ++#undef __ ++#define __ sasm-> ++ ++void ZBarrierSetAssembler::generate_c1_load_barrier_runtime_stub(StubAssembler* sasm, ++ DecoratorSet decorators) const { ++ __ prologue("zgc_load_barrier stub", false); ++ ++ __ push_call_clobbered_registers_except(RegSet::of(V0)); ++ ++ // Setup arguments ++ __ load_parameter(0, A0); ++ __ load_parameter(1, A1); ++ ++ __ call_VM_leaf(ZBarrierSetRuntime::load_barrier_on_oop_field_preloaded_addr(decorators), 2); ++ ++ __ pop_call_clobbered_registers_except(RegSet::of(V0)); ++ ++ __ epilogue(); ++} ++ ++void ZBarrierSetAssembler::generate_c1_store_barrier_runtime_stub(StubAssembler* sasm, ++ bool self_healing) const { ++ __ prologue("zgc_store_barrier stub", false); ++ ++ __ push_call_clobbered_registers(); ++ ++ // Setup arguments ++ __ load_parameter(0, c_rarg0); ++ ++ if (self_healing) { ++ __ call_VM_leaf(ZBarrierSetRuntime::store_barrier_on_oop_field_with_healing_addr(), 1); ++ } else { ++ __ call_VM_leaf(ZBarrierSetRuntime::store_barrier_on_oop_field_without_healing_addr(), 1); ++ } ++ ++ __ pop_call_clobbered_registers(); ++ ++ __ 
epilogue(); ++} ++ ++#endif // COMPILER1 ++ ++#ifdef COMPILER2 ++ ++OptoReg::Name ZBarrierSetAssembler::refine_register(const Node* node, OptoReg::Name opto_reg) { ++ if (!OptoReg::is_reg(opto_reg)) { ++ return OptoReg::Bad; ++ } ++ ++ const VMReg vm_reg = OptoReg::as_VMReg(opto_reg); ++ if (vm_reg->is_FloatRegister()) { ++ return opto_reg & ~1; ++ } ++ ++ return opto_reg; ++} ++ ++#undef __ ++#define __ _masm-> ++ ++class ZSaveLiveRegisters { ++private: ++ MacroAssembler* const _masm; ++ RegSet _gp_regs; ++ FloatRegSet _fp_regs; ++ FloatRegSet _lsx_vp_regs; ++ FloatRegSet _lasx_vp_regs; ++ ++public: ++ void initialize(ZBarrierStubC2* stub) { ++ // Record registers that needs to be saved/restored ++ RegMaskIterator rmi(stub->live()); ++ while (rmi.has_next()) { ++ const OptoReg::Name opto_reg = rmi.next(); ++ if (OptoReg::is_reg(opto_reg)) { ++ const VMReg vm_reg = OptoReg::as_VMReg(opto_reg); ++ if (vm_reg->is_Register()) { ++ _gp_regs += RegSet::of(vm_reg->as_Register()); ++ } else if (vm_reg->is_FloatRegister()) { ++ if (UseLASX && vm_reg->next(7)) ++ _lasx_vp_regs += FloatRegSet::of(vm_reg->as_FloatRegister()); ++ else if (UseLSX && vm_reg->next(3)) ++ _lsx_vp_regs += FloatRegSet::of(vm_reg->as_FloatRegister()); ++ else ++ _fp_regs += FloatRegSet::of(vm_reg->as_FloatRegister()); ++ } else { ++ fatal("Unknown register type"); ++ } ++ } ++ } ++ ++ // Remove C-ABI SOE registers, scratch regs and _ref register that will be updated ++ if (stub->result() != noreg) { ++ _gp_regs -= RegSet::range(S0, S7) + RegSet::of(SP, SCR1, SCR2, stub->result()); ++ } else { ++ _gp_regs -= RegSet::range(S0, S7) + RegSet::of(SP, SCR1, SCR2); ++ } ++ } ++ ++ ZSaveLiveRegisters(MacroAssembler* masm, ZBarrierStubC2* stub) : ++ _masm(masm), ++ _gp_regs(), ++ _fp_regs(), ++ _lsx_vp_regs(), ++ _lasx_vp_regs() { ++ ++ // Figure out what registers to save/restore ++ initialize(stub); ++ ++ // Save registers ++ __ push(_gp_regs); ++ __ push_fpu(_fp_regs); ++ __ push_vp(_lsx_vp_regs /* UseLSX */); ++ __ push_vp(_lasx_vp_regs /* UseLASX */); ++ } ++ ++ ~ZSaveLiveRegisters() { ++ // Restore registers ++ __ pop_vp(_lasx_vp_regs /* UseLASX */); ++ __ pop_vp(_lsx_vp_regs /* UseLSX */); ++ __ pop_fpu(_fp_regs); ++ __ pop(_gp_regs); ++ } ++}; ++ ++#undef __ ++#define __ _masm-> ++ ++class ZSetupArguments { ++private: ++ MacroAssembler* const _masm; ++ const Register _ref; ++ const Address _ref_addr; ++ ++public: ++ ZSetupArguments(MacroAssembler* masm, ZLoadBarrierStubC2* stub) : ++ _masm(masm), ++ _ref(stub->ref()), ++ _ref_addr(stub->ref_addr()) { ++ ++ // Setup arguments ++ if (_ref_addr.base() == noreg) { ++ // No self healing ++ if (_ref != A0) { ++ __ move(A0, _ref); ++ } ++ __ move(A1, R0); ++ } else { ++ // Self healing ++ if (_ref == A0) { ++ // _ref is already at correct place ++ __ lea(A1, _ref_addr); ++ } else if (_ref != A1) { ++ // _ref is in wrong place, but not in A1, so fix it first ++ __ lea(A1, _ref_addr); ++ __ move(A0, _ref); ++ } else if (_ref_addr.base() != A0 && _ref_addr.index() != A0) { ++ assert(_ref == A1, "Mov ref first, vacating A0"); ++ __ move(A0, _ref); ++ __ lea(A1, _ref_addr); ++ } else { ++ assert(_ref == A1, "Need to vacate A1 and _ref_addr is using A0"); ++ if (_ref_addr.base() == A0 || _ref_addr.index() == A0) { ++ __ move(SCR2, A1); ++ __ lea(A1, _ref_addr); ++ __ move(A0, SCR2); ++ } else { ++ ShouldNotReachHere(); ++ } ++ } ++ } ++ } ++ ++ ~ZSetupArguments() { ++ // Transfer result ++ if (_ref != V0) { ++ __ move(_ref, V0); ++ } ++ } ++}; ++ ++#undef __ ++#define __ masm-> ++ ++void 
ZBarrierSetAssembler::generate_c2_load_barrier_stub(MacroAssembler* masm, ZLoadBarrierStubC2* stub) const { ++ BLOCK_COMMENT("ZLoadBarrierStubC2"); ++ ++ // Stub entry ++ if (!Compile::current()->output()->in_scratch_emit_size()) { ++ __ bind(*stub->entry()); ++ } ++ ++ { ++ ZSaveLiveRegisters save_live_registers(masm, stub); ++ ZSetupArguments setup_arguments(masm, stub); ++ __ MacroAssembler::call_VM_leaf_base(stub->slow_path(), 2); ++ } ++ // Stub exit ++ __ b(*stub->continuation()); ++} ++ ++void ZBarrierSetAssembler::generate_c2_store_barrier_stub(MacroAssembler* masm, ZStoreBarrierStubC2* stub) const { ++ BLOCK_COMMENT("ZStoreBarrierStubC2"); ++ ++ // Stub entry ++ __ bind(*stub->entry()); ++ ++ Label slow; ++ Label slow_continuation; ++ store_barrier_medium(masm, ++ stub->ref_addr(), ++ stub->new_zpointer(), ++ SCR2, ++ SCR1, ++ stub->is_native(), ++ stub->is_atomic(), ++ *stub->continuation(), ++ slow, ++ slow_continuation); ++ ++ __ bind(slow); ++ ++ { ++ ZSaveLiveRegisters save_live_registers(masm, stub); ++ __ lea(A0, stub->ref_addr()); ++ ++ if (stub->is_native()) { ++ __ MacroAssembler::call_VM_leaf_base(ZBarrierSetRuntime::store_barrier_on_native_oop_field_without_healing_addr(), 1); ++ } else if (stub->is_atomic()) { ++ __ MacroAssembler::call_VM_leaf_base(ZBarrierSetRuntime::store_barrier_on_oop_field_with_healing_addr(), 1); ++ } else { ++ __ MacroAssembler::call_VM_leaf_base(ZBarrierSetRuntime::store_barrier_on_oop_field_without_healing_addr(), 1); ++ } ++ } ++ ++ // Stub exit ++ __ b(slow_continuation); ++} ++ ++#endif // COMPILER2 +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/z/zBarrierSetAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/z/zBarrierSetAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/z/zBarrierSetAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/z/zBarrierSetAssembler_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,192 @@ ++/* ++ * Copyright (c) 2019, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ */ ++ ++#ifndef CPU_LOONGARCH_GC_Z_ZBARRIERSETASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_GC_Z_ZBARRIERSETASSEMBLER_LOONGARCH_HPP ++ ++#include "code/vmreg.hpp" ++#include "oops/accessDecorators.hpp" ++#ifdef COMPILER2 ++#include "gc/z/c2/zBarrierSetC2.hpp" ++#include "opto/optoreg.hpp" ++#endif // COMPILER2 ++ ++#ifdef COMPILER1 ++class LIR_Address; ++class LIR_Assembler; ++class LIR_Opr; ++class StubAssembler; ++class ZLoadBarrierStubC1; ++class ZStoreBarrierStubC1; ++#endif // COMPILER1 ++ ++#ifdef COMPILER2 ++class MachNode; ++class Node; ++#endif // COMPILER2 ++ ++class ZBarrierSetAssembler : public ZBarrierSetAssemblerBase { ++public: ++ virtual void load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ Register dst, ++ Address src, ++ Register tmp1, ++ Register tmp2); ++ ++ void store_barrier_fast(MacroAssembler* masm, ++ Address ref_addr, ++ Register rnew_zaddress, ++ Register rnew_zpointer, ++ Register rtmp, ++ bool in_nmethod, ++ bool is_atomic, ++ Label& medium_path, ++ Label& medium_path_continuation) const; ++ ++ void store_barrier_medium(MacroAssembler* masm, ++ Address ref_addr, ++ Register rtmp1, ++ Register rtmp2, ++ Register rtmp3, ++ bool is_native, ++ bool is_atomic, ++ Label& medium_path_continuation, ++ Label& slow_path, ++ Label& slow_path_continuation) const; ++ ++ virtual void store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ Address dst, ++ Register val, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3); ++ ++ virtual void arraycopy_prologue(MacroAssembler* masm, ++ DecoratorSet decorators, ++ bool is_oop, ++ Register src, ++ Register dst, ++ Register count, ++ RegSet saved_regs); ++ ++ virtual void copy_load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Register dst, ++ Address src, ++ Register tmp); ++ ++ virtual void copy_store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Address dst, ++ Register src, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3); ++ ++ virtual void copy_load_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ FloatRegister dst, ++ Address src, ++ Register tmp1, ++ Register tmp2, ++ FloatRegister vec_tmp, ++ bool need_save_restore = true); ++ ++ virtual void copy_store_at(MacroAssembler* masm, ++ DecoratorSet decorators, ++ BasicType type, ++ size_t bytes, ++ Address dst, ++ FloatRegister src, ++ Register tmp1, ++ Register tmp2, ++ Register tmp3, ++ Register tmp4, ++ FloatRegister vec_tmp1, ++ FloatRegister vec_tmp2, ++ bool need_save_restore = true); ++ ++ virtual void try_resolve_jobject_in_native(MacroAssembler* masm, ++ Register jni_env, ++ Register robj, ++ Register tmp, ++ Label& slowpath); ++ ++ virtual void check_oop(MacroAssembler* masm, Register obj, Register tmp1, Register tmp2, Label& error); ++ ++ virtual NMethodPatchingType nmethod_patching_type() { return NMethodPatchingType::conc_instruction_and_data_patch; } ++ ++ void patch_barrier_relocation(address addr, int format); ++ ++ void patch_barriers() {} ++ ++#ifdef COMPILER1 ++ void generate_c1_color(LIR_Assembler* ce, LIR_Opr ref) const; ++ void generate_c1_uncolor(LIR_Assembler* ce, LIR_Opr ref) const; ++ ++ void generate_c1_load_barrier(LIR_Assembler* ce, ++ LIR_Opr ref, ++ ZLoadBarrierStubC1* stub, ++ bool on_non_strong) const; ++ ++ void generate_c1_load_barrier_stub(LIR_Assembler* ce, ++ ZLoadBarrierStubC1* stub) const; ++ ++ void 
generate_c1_load_barrier_runtime_stub(StubAssembler* sasm, ++ DecoratorSet decorators) const; ++ ++ void generate_c1_store_barrier(LIR_Assembler* ce, ++ LIR_Address* addr, ++ LIR_Opr new_zaddress, ++ LIR_Opr new_zpointer, ++ ZStoreBarrierStubC1* stub) const; ++ ++ void generate_c1_store_barrier_stub(LIR_Assembler* ce, ++ ZStoreBarrierStubC1* stub) const; ++ void generate_c1_store_barrier_runtime_stub(StubAssembler* sasm, ++ bool self_healing) const; ++#endif // COMPILER1 ++ ++#ifdef COMPILER2 ++ OptoReg::Name refine_register(const Node* node, ++ OptoReg::Name opto_reg); ++ ++ void generate_c2_load_barrier_stub(MacroAssembler* masm, ++ ZLoadBarrierStubC2* stub) const; ++ void generate_c2_store_barrier_stub(MacroAssembler* masm, ++ ZStoreBarrierStubC2* stub) const; ++#endif // COMPILER2 ++}; ++ ++#endif // CPU_LOONGARCH_GC_Z_ZBARRIERSETASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/z/zGlobals_loongarch.hpp b/src/hotspot/cpu/loongarch/gc/z/zGlobals_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/gc/z/zGlobals_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/z/zGlobals_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,30 @@ ++/* ++ * Copyright (c) 2015, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++#ifndef CPU_LOONGARCH_GC_Z_ZGLOBALS_LOONGARCH_HPP ++#define CPU_LOONGARCH_GC_Z_ZGLOBALS_LOONGARCH_HPP ++ ++const size_t ZPlatformCacheLineSize = 64; ++ ++#endif // CPU_LOONGARCH_GC_Z_ZGLOBALS_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/gc/z/z_loongarch_64.ad b/src/hotspot/cpu/loongarch/gc/z/z_loongarch_64.ad +--- a/src/hotspot/cpu/loongarch/gc/z/z_loongarch_64.ad 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/gc/z/z_loongarch_64.ad 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,250 @@ ++// ++// Copyright (c) 2019, 2021, Oracle and/or its affiliates. All rights reserved. ++// Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++// DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++// ++// This code is free software; you can redistribute it and/or modify it ++// under the terms of the GNU General Public License version 2 only, as ++// published by the Free Software Foundation. 
++// ++// This code is distributed in the hope that it will be useful, but WITHOUT ++// ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++// version 2 for more details (a copy is included in the LICENSE file that ++// accompanied this code). ++// ++// You should have received a copy of the GNU General Public License version ++// 2 along with this work; if not, write to the Free Software Foundation, ++// Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++// ++// Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++// or visit www.oracle.com if you need additional information or have any ++// questions. ++// ++ ++source_hpp %{ ++ ++#include "gc/shared/gc_globals.hpp" ++#include "gc/z/c2/zBarrierSetC2.hpp" ++#include "gc/z/zThreadLocalData.hpp" ++ ++%} ++ ++source %{ ++ ++#include "gc/z/zBarrierSetAssembler.hpp" ++ ++static void z_load_barrier(MacroAssembler& _masm, const MachNode* node, Address ref_addr, Register ref, Register tmp) { ++ if (node->barrier_data() == ZBarrierElided) { ++ __ z_uncolor(ref); ++ } else { ++ const bool on_non_strong = ++ ((node->barrier_data() & ZBarrierWeak) != 0) || ++ ((node->barrier_data() & ZBarrierPhantom) != 0); ++ ++ ZLoadBarrierStubC2* const stub = ZLoadBarrierStubC2::create(node, ref_addr, ref); ++ Label good; ++ __ check_color(ref, tmp, on_non_strong); ++ __ beqz(tmp, good); ++ __ b(*stub->entry()); ++ ++ __ bind(good); ++ __ z_uncolor(ref); ++ __ bind(*stub->continuation()); ++ } ++} ++ ++static void z_store_barrier(MacroAssembler& _masm, const MachNode* node, Address ref_addr, Register rnew_zaddress, Register rnew_zpointer, Register tmp, bool is_atomic) { ++ if (node->barrier_data() == ZBarrierElided) { ++ __ z_color(rnew_zpointer, rnew_zaddress, tmp); ++ } else { ++ bool is_native = (node->barrier_data() & ZBarrierNative) != 0; ++ ZStoreBarrierStubC2* const stub = ZStoreBarrierStubC2::create(node, ref_addr, rnew_zaddress, rnew_zpointer, is_native, is_atomic); ++ ZBarrierSetAssembler* bs_asm = ZBarrierSet::assembler(); ++ bs_asm->store_barrier_fast(&_masm, ref_addr, rnew_zaddress, rnew_zpointer, tmp, true /* in_nmethod */, is_atomic, *stub->entry(), *stub->continuation()); ++ } ++} ++ ++static void z_compare_and_swap(MacroAssembler& _masm, const MachNode* node, ++ Register res, Register mem, Register oldval, Register newval, ++ Register oldval_tmp, Register newval_tmp, Register tmp, bool acquire) { ++ Address addr(mem); ++ __ z_color(oldval_tmp, oldval, tmp); ++ z_store_barrier(_masm, node, addr, newval, newval_tmp, tmp, true /* is_atomic */); ++ __ cmpxchg(addr, oldval_tmp, newval_tmp, res, false /* retold */, acquire /* acquire */, ++ false /* weak */, false /* exchange */); ++} ++ ++static void z_compare_and_exchange(MacroAssembler& _masm, const MachNode* node, ++ Register res, Register mem, Register oldval, Register newval, ++ Register oldval_tmp, Register newval_tmp, Register tmp, bool acquire) { ++ Address addr(mem); ++ __ z_color(oldval_tmp, oldval, tmp); ++ z_store_barrier(_masm, node, addr, newval, newval_tmp, tmp, true /* is_atomic */); ++ __ cmpxchg(addr, oldval_tmp, newval_tmp, res, false /* retold */, acquire /* acquire */, ++ false /* weak */, true /* exchange */); ++ __ z_uncolor(res); ++} ++ ++%} ++ ++// Load Pointer ++instruct zLoadP(mRegP dst, memory mem, mRegP tmp) ++%{ ++ match(Set dst (LoadP mem)); ++ effect(TEMP_DEF dst, TEMP tmp); ++ ins_cost(125);//must be equal loadP in loongarch_64.ad ++ ++ 
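++ // Only selected when the load carries ZGC barrier data (barrier_data() != 0);
++ // ordinary loads without barrier data keep matching the regular loadP rule,
++ // which is why ins_cost above must stay equal to loadP in loongarch_64.ad.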
predicate(UseZGC && ZGenerational && n->as_Load()->barrier_data() != 0); ++ ++ format %{ "zLoadP $dst, $mem" %} ++ ++ ins_encode %{ ++ Address ref_addr = Address(as_Register($mem$$base), as_Register($mem$$index), Address::no_scale, $mem$$disp); ++ __ block_comment("zLoadP"); ++ __ ld_d($dst$$Register, ref_addr); ++ z_load_barrier(_masm, this, ref_addr, $dst$$Register, $tmp$$Register); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++// Store Pointer ++instruct zStoreP(memory mem, mRegP src, mRegP tmp1, mRegP tmp2) ++%{ ++ predicate(UseZGC && ZGenerational && n->as_Store()->barrier_data() != 0); ++ match(Set mem (StoreP mem src)); ++ effect(TEMP_DEF tmp1, TEMP tmp2); ++ ++ ins_cost(125); // XXX ++ format %{ "zStoreP $mem, $src\t# ptr" %} ++ ins_encode %{ ++ Address ref_addr = Address(as_Register($mem$$base), as_Register($mem$$index), Address::no_scale, $mem$$disp); ++ __ block_comment("zStoreP"); ++ z_store_barrier(_masm, this, ref_addr, $src$$Register, $tmp1$$Register, $tmp2$$Register, false /* is_atomic */); ++ __ st_d($tmp1$$Register, ref_addr); ++ %} ++ ins_pipe(pipe_slow); ++%} ++ ++// Store Null Pointer ++instruct zStorePNull(memory mem, immP_0 zero, mRegP tmp1, mRegP tmp2) ++%{ ++ predicate(UseZGC && ZGenerational && n->as_Store()->barrier_data() != 0); ++ match(Set mem (StoreP mem zero)); ++ effect(TEMP_DEF tmp1, TEMP tmp2); ++ ++ ins_cost(125); // XXX ++ format %{ "zStoreP $mem, null\t# ptr" %} ++ ins_encode %{ ++ Address ref_addr = Address(as_Register($mem$$base), as_Register($mem$$index), Address::no_scale, $mem$$disp); ++ __ block_comment("zStoreP null"); ++ z_store_barrier(_masm, this, ref_addr, noreg, $tmp1$$Register, $tmp2$$Register, false /* is_atomic */); ++ __ st_d($tmp1$$Register, ref_addr); ++ %} ++ ins_pipe(pipe_slow); ++%} ++ ++instruct zCompareAndSwapP(mRegI res, mRegP mem, mRegP oldval, mRegP newval, mRegP oldval_tmp, mRegP newval_tmp, mRegP tmp) %{ ++ match(Set res (CompareAndSwapP mem (Binary oldval newval))); ++ match(Set res (WeakCompareAndSwapP mem (Binary oldval newval))); ++ effect(TEMP_DEF res, TEMP oldval_tmp, TEMP newval_tmp, TEMP tmp); ++ ++ predicate((UseZGC && ZGenerational && n->as_LoadStore()->barrier_data() != 0) ++ && (((CompareAndSwapNode*)n)->order() != MemNode::acquire && ((CompareAndSwapNode*) n)->order() != MemNode::seqcst)); ++ ins_cost(3 * MEMORY_REF_COST);//must be equal compareAndSwapP in loongarch_64.ad ++ ++ format %{ "zCompareAndSwapP $res, $mem, $oldval, $newval; as bool; ptr" %} ++ ins_encode %{ ++ __ block_comment("zCompareAndSwapP"); ++ z_compare_and_swap(_masm, this, ++ $res$$Register, $mem$$Register, $oldval$$Register, $newval$$Register, ++ $oldval_tmp$$Register, $newval_tmp$$Register, $tmp$$Register, false /* acquire */); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct zCompareAndSwapP_acq(mRegI res, mRegP mem, mRegP oldval, mRegP newval, mRegP oldval_tmp, mRegP newval_tmp, mRegP tmp) %{ ++ match(Set res (CompareAndSwapP mem (Binary oldval newval))); ++ match(Set res (WeakCompareAndSwapP mem (Binary oldval newval))); ++ effect(TEMP_DEF res, TEMP oldval_tmp, TEMP newval_tmp, TEMP tmp); ++ ++ predicate((UseZGC && ZGenerational && n->as_LoadStore()->barrier_data() != 0) ++ && (((CompareAndSwapNode*)n)->order() == MemNode::acquire || ((CompareAndSwapNode*) n)->order() == MemNode::seqcst)); ++ ins_cost(4 * MEMORY_REF_COST);//must be larger than zCompareAndSwapP ++ ++ format %{ "zCompareAndSwapP acq $res, $mem, $oldval, $newval; as bool; ptr" %} ++ ins_encode %{ ++ __ block_comment("zCompareAndSwapP_acq"); ++ z_compare_and_swap(_masm, 
this, ++ $res$$Register, $mem$$Register, $oldval$$Register, $newval$$Register, ++ $oldval_tmp$$Register, $newval_tmp$$Register, $tmp$$Register, true /* acquire */); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct zCompareAndExchangeP(mRegP res, mRegP mem, mRegP oldval, mRegP newval, mRegP oldval_tmp, mRegP newval_tmp, mRegP tmp) %{ ++ match(Set res (CompareAndExchangeP mem (Binary oldval newval))); ++ effect(TEMP_DEF res, TEMP oldval_tmp, TEMP newval_tmp, TEMP tmp); ++ ins_cost(2* MEMORY_REF_COST);//must be equal compareAndExchangeP in loongarch_64.ad ++ ++ predicate((UseZGC && ZGenerational && n->as_LoadStore()->barrier_data() != 0) ++ && ( ++ ((CompareAndSwapNode*)n)->order() != MemNode::acquire ++ && ((CompareAndSwapNode*)n)->order() != MemNode::seqcst ++ )); ++ ++ format %{ "zCompareAndExchangeP $res, $mem, $oldval, $newval; as ptr; ptr" %} ++ ins_encode %{ ++ __ block_comment("zCompareAndExchangeP"); ++ z_compare_and_exchange(_masm, this, ++ $res$$Register, $mem$$Register, $oldval$$Register, $newval$$Register, ++ $oldval_tmp$$Register, $newval_tmp$$Register, $tmp$$Register, false /* acquire */); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct zCompareAndExchangeP_acq(mRegP res, mRegP mem, mRegP oldval, mRegP newval, mRegP oldval_tmp, mRegP newval_tmp, mRegP tmp) %{ ++ match(Set res (CompareAndExchangeP mem (Binary oldval newval))); ++ effect(TEMP_DEF res, TEMP oldval_tmp, TEMP newval_tmp, TEMP tmp); ++ ++ predicate((UseZGC && ZGenerational && n->as_LoadStore()->barrier_data() != 0) ++ && ( ++ ((CompareAndSwapNode*)n)->order() == MemNode::acquire ++ || ((CompareAndSwapNode*)n)->order() == MemNode::seqcst ++ )); ++ ++ format %{ "zCompareAndExchangeP acq $res, $mem, $oldval, $newval; as ptr; ptr" %} ++ ins_encode %{ ++ __ block_comment("zCompareAndExchangeP_acq"); ++ z_compare_and_exchange(_masm, this, ++ $res$$Register, $mem$$Register, $oldval$$Register, $newval$$Register, ++ $oldval_tmp$$Register, $newval_tmp$$Register, $tmp$$Register, true /* acquire */); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} ++ ++instruct zGetAndSetP(mRegP mem, mRegP newv, mRegP prev, mRegP tmp1, mRegP tmp2) %{ ++ match(Set prev (GetAndSetP mem newv)); ++ effect(TEMP_DEF prev, TEMP tmp1, TEMP tmp2); ++ ++ predicate(UseZGC && ZGenerational && n->as_LoadStore()->barrier_data() != 0); ++ ++ format %{ "zGetAndSetP $prev, $mem, $newv" %} ++ ins_encode %{ ++ Register prev = $prev$$Register; ++ Register newv = $newv$$Register; ++ Register addr = $mem$$Register; ++ __ block_comment("zGetAndSetP"); ++ z_store_barrier(_masm, this, Address(addr, 0), newv, prev, $tmp1$$Register, true /* is_atomic */); ++ __ amswap_db_d($tmp2$$Register, prev, addr); ++ __ move(prev, $tmp2$$Register); ++ __ z_uncolor(prev); ++ %} ++ ++ ins_pipe(pipe_slow); ++%} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/globalDefinitions_loongarch.hpp b/src/hotspot/cpu/loongarch/globalDefinitions_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/globalDefinitions_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/globalDefinitions_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,55 @@ ++/* ++ * Copyright (c) 1999, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_GLOBALDEFINITIONS_LOONGARCH_HPP ++#define CPU_LOONGARCH_GLOBALDEFINITIONS_LOONGARCH_HPP ++// Size of LoongArch Instructions ++const int BytesPerInstWord = 4; ++ ++const int StackAlignmentInBytes = (2*wordSize); ++ ++// Indicates whether the C calling conventions require that ++// 32-bit integer argument values are properly extended to 64 bits. ++// If set, SharedRuntime::c_calling_convention() must adapt ++// signatures accordingly. ++const bool CCallingConventionRequiresIntsAsLongs = false; ++ ++#define SUPPORTS_NATIVE_CX8 ++ ++#define SUPPORT_MONITOR_COUNT ++ ++// FIXME: LA ++// This makes the games we play when patching difficult, so when we ++// come across an access that needs patching we deoptimize. There are ++// ways we can avoid this, but these would slow down C1-compiled code ++// in the default case. We could revisit this decision if we get any ++// evidence that it's worth doing. ++#define DEOPTIMIZE_WHEN_PATCHING ++ ++#define SUPPORT_RESERVED_STACK_AREA ++ ++#define USE_POINTERS_TO_REGISTER_IMPL_ARRAY ++ ++#endif // CPU_LOONGARCH_GLOBALDEFINITIONS_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/globals_loongarch.hpp b/src/hotspot/cpu/loongarch/globals_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/globals_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/globals_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,125 @@ ++/* ++ * Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_GLOBALS_LOONGARCH_HPP ++#define CPU_LOONGARCH_GLOBALS_LOONGARCH_HPP ++ ++#include "utilities/globalDefinitions.hpp" ++#include "utilities/macros.hpp" ++ ++// Sets the default values for platform dependent flags used by the runtime system. ++// (see globals.hpp) ++ ++define_pd_global(bool, ImplicitNullChecks, true); // Generate code for implicit null checks ++define_pd_global(bool, TrapBasedNullChecks, false); ++define_pd_global(bool, UncommonNullCast, true); // Uncommon-trap nulls passed to check cast ++ ++define_pd_global(bool, DelayCompilerStubsGeneration, COMPILER2_OR_JVMCI); ++ ++define_pd_global(uintx, CodeCacheSegmentSize, 64 COMPILER1_AND_COMPILER2_PRESENT(+64)); // Tiered compilation has large code-entry alignment. ++ ++// Ideally, this should be cache line size, ++// which keeps code end data on separate lines. ++define_pd_global(intx, CodeEntryAlignment, 64); ++define_pd_global(intx, OptoLoopAlignment, 16); ++define_pd_global(intx, InlineSmallCode, 2000); ++ ++#define DEFAULT_STACK_YELLOW_PAGES (2) ++#define DEFAULT_STACK_RED_PAGES (1) ++#define DEFAULT_STACK_SHADOW_PAGES (20 DEBUG_ONLY(+4)) ++#define DEFAULT_STACK_RESERVED_PAGES (1) ++ ++#define MIN_STACK_YELLOW_PAGES DEFAULT_STACK_YELLOW_PAGES ++#define MIN_STACK_RED_PAGES DEFAULT_STACK_RED_PAGES ++#define MIN_STACK_SHADOW_PAGES DEFAULT_STACK_SHADOW_PAGES ++#define MIN_STACK_RESERVED_PAGES (0) ++ ++define_pd_global(intx, StackYellowPages, DEFAULT_STACK_YELLOW_PAGES); ++define_pd_global(intx, StackRedPages, DEFAULT_STACK_RED_PAGES); ++define_pd_global(intx, StackShadowPages, DEFAULT_STACK_SHADOW_PAGES); ++define_pd_global(intx, StackReservedPages, DEFAULT_STACK_RESERVED_PAGES); ++ ++define_pd_global(bool, VMContinuations, true); ++ ++define_pd_global(bool, RewriteBytecodes, true); ++define_pd_global(bool, RewriteFrequentPairs, true); ++ ++define_pd_global(uintx, TypeProfileLevel, 111); ++ ++define_pd_global(bool, CompactStrings, true); ++ ++define_pd_global(bool, PreserveFramePointer, false); ++ ++define_pd_global(intx, InitArrayShortSize, 8*BytesPerLong); ++ ++#define ARCH_FLAGS(develop, \ ++ product, \ ++ notproduct, \ ++ range, \ ++ constraint) \ ++ \ ++ product(bool, UseCodeCacheAllocOpt, true, \ ++ "Allocate code cache within 32-bit memory address space") \ ++ \ ++ product(bool, UseLSX, false, \ ++ "Use LSX 128-bit vector instructions") \ ++ \ ++ product(bool, UseLASX, false, \ ++ "Use LASX 256-bit vector instructions") \ ++ \ ++ product(bool, UseCF2GR, false, \ ++ "Use CFR to GR instructions") \ ++ \ ++ product(bool, UseGR2CF, false, \ ++ "Use GR to CFR instructions") \ ++ \ ++ product(bool, UseAMBH, false, \ ++ "Use AM{SWAP/ADD}{_DB}.{B/H} instructions") \ ++ \ ++ product(bool, UseAMCAS, false, \ ++ "Use AMCAS{_DB}.{B/H/W/D} instructions") \ ++ \ ++ product(bool, UseBarriersForVolatile, false, \ ++ "Use memory barriers to implement volatile accesses") \ ++ \ ++ product(bool, UseCRC32, false, \ ++ "Use CRC32 instructions for CRC32 computation") \ ++ \ ++ product(bool, UseBigIntegerShiftIntrinsic, false, \ ++ "Enables intrinsification of BigInteger.shiftLeft/Right()") \ ++ \ ++ product(bool, UseActiveCoresMP, false, \ ++ "Eliminate barriers for single active cpu") \ ++ \ ++ product(uintx, NUMAMinHeapSizePerNode, 128 * M, \ ++ "The minimum heap size required for each NUMA node to init VM") \ ++ \ ++ product(bool, 
TraceTraps, false, "Trace all traps the signal handler") \ ++ product(uintx, NUMAMinG1RegionNumberPerNode, 8, \ ++ "Min initial region number for per NUMA node while using G1GC") ++ ++// end of ARCH_FLAGS ++ ++#endif // CPU_LOONGARCH_GLOBALS_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/icache_loongarch.cpp b/src/hotspot/cpu/loongarch/icache_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/icache_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/icache_loongarch.cpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,42 @@ ++/* ++ * Copyright (c) 1997, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "runtime/icache.hpp" ++ ++void ICacheStubGenerator::generate_icache_flush(ICache::flush_icache_stub_t* flush_icache_stub) ++{ ++#define __ _masm-> ++ StubCodeMark mark(this, "ICache", "flush_icache_stub"); ++ address start = __ pc(); ++ ++ __ ibar(0); ++ __ ori(V0, A2, 0); ++ __ jr(RA); ++ ++ *flush_icache_stub = (ICache::flush_icache_stub_t)start; ++#undef __ ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/icache_loongarch.hpp b/src/hotspot/cpu/loongarch/icache_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/icache_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/icache_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,41 @@ ++/* ++ * Copyright (c) 1997, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_ICACHE_LOONGARCH_HPP ++#define CPU_LOONGARCH_ICACHE_LOONGARCH_HPP ++ ++// Interface for updating the instruction cache. Whenever the VM modifies ++// code, part of the processor instruction cache potentially has to be flushed. ++ ++class ICache : public AbstractICache { ++ public: ++ enum { ++ stub_size = 3 * BytesPerInstWord, // Size of the icache flush stub in bytes ++ line_size = 32, // flush instruction affects a dword ++ log2_line_size = 5 // log2(line_size) ++ }; ++}; ++ ++#endif // CPU_LOONGARCH_ICACHE_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/icBuffer_loongarch.cpp b/src/hotspot/cpu/loongarch/icBuffer_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/icBuffer_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/icBuffer_loongarch.cpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,82 @@ ++/* ++ * Copyright (c) 1997, 2012, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "code/icBuffer.hpp" ++#include "gc/shared/collectedHeap.inline.hpp" ++#include "interpreter/bytecodes.hpp" ++#include "memory/resourceArea.hpp" ++#include "nativeInst_loongarch.hpp" ++#include "oops/oop.inline.hpp" ++ ++ ++int InlineCacheBuffer::ic_stub_code_size() { ++ return NativeMovConstReg::instruction_size + // patchable_li52() == 3 ins ++ NativeGeneralJump::instruction_size; // patchable_jump() == 2 ins ++} ++ ++ ++// The use IC_Klass refer to SharedRuntime::gen_i2c2i_adapters ++void InlineCacheBuffer::assemble_ic_buffer_code(address code_begin, ++ void* cached_value, ++ address entry_point) { ++ ResourceMark rm; ++ CodeBuffer code(code_begin, ic_stub_code_size()); ++ MacroAssembler* masm = new MacroAssembler(&code); ++ // Note: even though the code contains an embedded value, we do not need reloc info ++ // because ++ // (1) the value is old (i.e., doesn't matter for scavenges) ++ // (2) these ICStubs are removed *before* a GC happens, so the roots disappear ++ ++#define __ masm-> ++ address start = __ pc(); ++ __ patchable_li52(IC_Klass, (long)cached_value); ++ __ jmp(entry_point, relocInfo::runtime_call_type); ++ ++ ICache::invalidate_range(code_begin, InlineCacheBuffer::ic_stub_code_size()); ++ assert(__ pc() - start == ic_stub_code_size(), "must be"); ++#undef __ ++} ++ ++ ++address InlineCacheBuffer::ic_buffer_entry_point(address code_begin) { ++ // move -> jump -> entry ++ NativeMovConstReg* move = nativeMovConstReg_at(code_begin); ++ NativeGeneralJump* jump = nativeGeneralJump_at(move->next_instruction_address()); ++ return jump->jump_destination(); ++} ++ ++ ++void* InlineCacheBuffer::ic_buffer_cached_value(address code_begin) { ++ // double check the instructions flow ++ NativeMovConstReg* move = nativeMovConstReg_at(code_begin); ++ NativeGeneralJump* jump = nativeGeneralJump_at(move->next_instruction_address()); ++ ++ // cached value is the data arg of NativeMovConstReg ++ void* cached_value = (void*)move->data(); ++ return cached_value; ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/interp_masm_loongarch_64.cpp b/src/hotspot/cpu/loongarch/interp_masm_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/interp_masm_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/interp_masm_loongarch_64.cpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,1963 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
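A layout sketch of the inline-cache stub assembled above, using only the sizes and accessors already defined in this file:

// code_begin: patchable_li52(IC_Klass, cached_value)   // NativeMovConstReg, 3 instructions
//             patchable_jump(entry_point)              // NativeGeneralJump,  2 instructions
// ic_buffer_cached_value() reads cached_value back out of the li52 sequence (move->data());
// ic_buffer_entry_point() reads entry_point back out of the jump (jump->jump_destination()).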
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "gc/shared/barrierSet.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "interp_masm_loongarch.hpp" ++#include "interpreter/interpreter.hpp" ++#include "interpreter/interpreterRuntime.hpp" ++#include "oops/arrayOop.hpp" ++#include "oops/markWord.hpp" ++#include "oops/methodData.hpp" ++#include "oops/method.hpp" ++#include "prims/jvmtiExport.hpp" ++#include "prims/jvmtiThreadState.hpp" ++#include "runtime/basicLock.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/javaThread.hpp" ++#include "runtime/safepointMechanism.hpp" ++#include "runtime/sharedRuntime.hpp" ++ ++// Implementation of InterpreterMacroAssembler ++ ++void InterpreterMacroAssembler::get_2_byte_integer_at_bcp(Register reg, Register tmp, int offset) { ++ if (UseUnalignedAccesses) { ++ ld_hu(reg, BCP, offset); ++ } else { ++ ld_bu(reg, BCP, offset); ++ ld_bu(tmp, BCP, offset + 1); ++ bstrins_d(reg, tmp, 15, 8); ++ } ++} ++ ++void InterpreterMacroAssembler::get_4_byte_integer_at_bcp(Register reg, int offset) { ++ if (UseUnalignedAccesses) { ++ ld_wu(reg, BCP, offset); ++ } else { ++ ldr_w(reg, BCP, offset); ++ ldl_w(reg, BCP, offset + 3); ++ lu32i_d(reg, 0); ++ } ++} ++ ++void InterpreterMacroAssembler::jump_to_entry(address entry) { ++ assert(entry, "Entry must have been generated by now"); ++ jmp(entry); ++} ++ ++void InterpreterMacroAssembler::call_VM_leaf_base(address entry_point, ++ int number_of_arguments) { ++ // interpreter specific ++ // ++ // Note: No need to save/restore bcp & locals pointer ++ // since these are callee saved registers and no blocking/ ++ // GC can happen in leaf calls. ++ // Further Note: DO NOT save/restore bcp/locals. If a caller has ++ // already saved them so that it can use BCP/LVP as temporaries ++ // then a save/restore here will DESTROY the copy the caller ++ // saved! There used to be a save_bcp() that only happened in ++ // the ASSERT path (no restore_bcp). Which caused bizarre failures ++ // when jvm built with ASSERTs. ++#ifdef ASSERT ++ save_bcp(); ++ { ++ Label L; ++ ld_d(AT,FP,frame::interpreter_frame_last_sp_offset * wordSize); ++ beq(AT,R0,L); ++ stop("InterpreterMacroAssembler::call_VM_leaf_base: last_sp != nullptr"); ++ bind(L); ++ } ++#endif ++ // super call ++ MacroAssembler::call_VM_leaf_base(entry_point, number_of_arguments); ++ // interpreter specific ++ // Used to ASSERT that BCP/LVP were equal to frame's bcp/locals ++ // but since they may not have been saved (and we don't want to ++ // save them here (see note above) the assert is invalid. ++} ++ ++void InterpreterMacroAssembler::call_VM_base(Register oop_result, ++ Register java_thread, ++ Register last_java_sp, ++ address entry_point, ++ int number_of_arguments, ++ bool check_exceptions) { ++ // interpreter specific ++ // ++ // Note: Could avoid restoring locals ptr (callee saved) - however doesn't ++ // really make a difference for these runtime calls, since they are ++ // slow anyway. Btw., bcp must be saved/restored since it may change ++ // due to GC. 
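A minimal C++ sketch of the byte-wise fallback used by get_2_byte_integer_at_bcp above when UseUnalignedAccesses is false, assuming the little-endian operand layout that the bstrins_d placement implies:

#include <cstdint>

static uint32_t get_2_byte_integer(const uint8_t* bcp, int offset) {
  uint32_t lo = bcp[offset];      // ld_bu(reg, BCP, offset)
  uint32_t hi = bcp[offset + 1];  // ld_bu(tmp, BCP, offset + 1)
  return lo | (hi << 8);          // bstrins_d(reg, tmp, 15, 8)
}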
++ assert(java_thread == noreg , "not expecting a precomputed java thread"); ++ save_bcp(); ++#ifdef ASSERT ++ { ++ Label L; ++ ld_d(AT, FP, frame::interpreter_frame_last_sp_offset * wordSize); ++ beq(AT, R0, L); ++ stop("InterpreterMacroAssembler::call_VM_base: last_sp != nullptr"); ++ bind(L); ++ } ++#endif /* ASSERT */ ++ // super call ++ MacroAssembler::call_VM_base(oop_result, java_thread, last_java_sp, ++ entry_point, number_of_arguments, ++ check_exceptions); ++ // interpreter specific ++ restore_bcp(); ++ restore_locals(); ++} ++ ++ ++void InterpreterMacroAssembler::check_and_handle_popframe(Register java_thread) { ++ if (JvmtiExport::can_pop_frame()) { ++ Label L; ++ // Initiate popframe handling only if it is not already being ++ // processed. If the flag has the popframe_processing bit set, it ++ // means that this code is called *during* popframe handling - we ++ // don't want to reenter. ++ // This method is only called just after the call into the vm in ++ // call_VM_base, so the arg registers are available. ++ // Not clear if any other register is available, so load AT twice ++ assert(AT != java_thread, "check"); ++ ld_w(AT, java_thread, in_bytes(JavaThread::popframe_condition_offset())); ++ andi(AT, AT, JavaThread::popframe_pending_bit); ++ beq(AT, R0, L); ++ ++ ld_w(AT, java_thread, in_bytes(JavaThread::popframe_condition_offset())); ++ andi(AT, AT, JavaThread::popframe_processing_bit); ++ bne(AT, R0, L); ++ call_VM_leaf(CAST_FROM_FN_PTR(address, Interpreter::remove_activation_preserving_args_entry)); ++ jr(V0); ++ bind(L); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::load_earlyret_value(TosState state) { ++ ld_d(T8, Address(TREG, JavaThread::jvmti_thread_state_offset())); ++ const Address tos_addr (T8, in_bytes(JvmtiThreadState::earlyret_tos_offset())); ++ const Address oop_addr (T8, in_bytes(JvmtiThreadState::earlyret_oop_offset())); ++ const Address val_addr (T8, in_bytes(JvmtiThreadState::earlyret_value_offset())); ++ //V0, oop_addr,V1,val_addr ++ switch (state) { ++ case atos: ++ ld_d(V0, oop_addr); ++ st_d(R0, oop_addr); ++ verify_oop(V0); ++ break; ++ case ltos: ++ ld_d(V0, val_addr); // fall through ++ break; ++ case btos: // fall through ++ case ztos: // fall through ++ case ctos: // fall through ++ case stos: // fall through ++ case itos: ++ ld_w(V0, val_addr); ++ break; ++ case ftos: ++ fld_s(F0, T8, in_bytes(JvmtiThreadState::earlyret_value_offset())); ++ break; ++ case dtos: ++ fld_d(F0, T8, in_bytes(JvmtiThreadState::earlyret_value_offset())); ++ break; ++ case vtos: /* nothing to do */ break; ++ default : ShouldNotReachHere(); ++ } ++ // Clean up tos value in the thread object ++ li(AT, (int)ilgl); ++ st_w(AT, tos_addr); ++ st_w(R0, T8, in_bytes(JvmtiThreadState::earlyret_value_offset())); ++} ++ ++ ++void InterpreterMacroAssembler::check_and_handle_earlyret(Register java_thread) { ++ if (JvmtiExport::can_force_early_return()) { ++ assert(java_thread != AT, "check"); ++ ++ Label L; ++ ld_d(AT, Address(java_thread, JavaThread::jvmti_thread_state_offset())); ++ beqz(AT, L); ++ ++ // Initiate earlyret handling only if it is not already being processed. ++ // If the flag has the earlyret_processing bit set, it means that this code ++ // is called *during* earlyret handling - we don't want to reenter. 
++ ld_w(AT, AT, in_bytes(JvmtiThreadState::earlyret_state_offset())); ++ addi_w(AT, AT, -JvmtiThreadState::earlyret_pending); ++ bnez(AT, L); ++ ++ // Call Interpreter::remove_activation_early_entry() to get the address of the ++ // same-named entrypoint in the generated interpreter code. ++ ld_d(A0, Address(java_thread, JavaThread::jvmti_thread_state_offset())); ++ ld_w(A0, A0, in_bytes(JvmtiThreadState::earlyret_tos_offset())); ++ call_VM_leaf(CAST_FROM_FN_PTR(address, Interpreter::remove_activation_early_entry), A0); ++ jr(A0); ++ bind(L); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::get_unsigned_2_byte_index_at_bcp(Register reg, ++ int bcp_offset) { ++ assert(bcp_offset >= 0, "bcp is still pointing to start of bytecode"); ++ ld_bu(AT, BCP, bcp_offset); ++ ld_bu(reg, BCP, bcp_offset + 1); ++ bstrins_w(reg, AT, 15, 8); ++} ++ ++void InterpreterMacroAssembler::get_dispatch() { ++ li(Rdispatch, (long)Interpreter::dispatch_table()); ++} ++ ++void InterpreterMacroAssembler::get_cache_index_at_bcp(Register index, ++ int bcp_offset, ++ size_t index_size) { ++ assert(bcp_offset > 0, "bcp is still pointing to start of bytecode"); ++ if (index_size == sizeof(u2)) { ++ get_2_byte_integer_at_bcp(index, AT, bcp_offset); ++ } else if (index_size == sizeof(u4)) { ++ get_4_byte_integer_at_bcp(index, bcp_offset); ++ // Check if the secondary index definition is still ~x, otherwise ++ // we have to change the following assembler code to calculate the ++ // plain index. ++ assert(ConstantPool::decode_invokedynamic_index(~123) == 123, "else change next line"); ++ nor(index, index, R0); ++ slli_w(index, index, 0); ++ } else if (index_size == sizeof(u1)) { ++ ld_bu(index, BCP, bcp_offset); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::get_cache_and_index_at_bcp(Register cache, ++ Register index, ++ int bcp_offset, ++ size_t index_size) { ++ assert_different_registers(cache, index); ++ get_cache_index_at_bcp(index, bcp_offset, index_size); ++ ld_d(cache, FP, frame::interpreter_frame_cache_offset * wordSize); ++ assert(sizeof(ConstantPoolCacheEntry) == 4 * wordSize, "adjust code below"); ++ assert(exact_log2(in_words(ConstantPoolCacheEntry::size())) == 2, "else change next line"); ++ slli_d(index, index, 2); ++} ++ ++ ++void InterpreterMacroAssembler::get_cache_and_index_and_bytecode_at_bcp(Register cache, ++ Register index, ++ Register bytecode, ++ int byte_no, ++ int bcp_offset, ++ size_t index_size) { ++ get_cache_and_index_at_bcp(cache, index, bcp_offset, index_size); ++ // We use a 32-bit load here since the layout of 64-bit words on ++ // little-endian machines allow us that. 
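The extraction that follows can be read as a plain shift-and-mask; a sketch in terms of the same named constants (no concrete values assumed):

// bytecode = (indices >> shift_count) & ConstantPoolCacheEntry::bytecode_1_mask
// where shift_count = (1 + byte_no) * BitsPerByte, i.e. byte_no selects which
// resolved-bytecode byte of the cache entry's indices word is examined.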
++ alsl_d(AT, index, cache, Address::times_ptr - 1); ++ ld_w(bytecode, AT, in_bytes(ConstantPoolCache::base_offset() + ConstantPoolCacheEntry::indices_offset())); ++ if(os::is_MP()) { ++ membar(Assembler::Membar_mask_bits(LoadLoad|LoadStore)); ++ } ++ ++ const int shift_count = (1 + byte_no) * BitsPerByte; ++ assert((byte_no == TemplateTable::f1_byte && shift_count == ConstantPoolCacheEntry::bytecode_1_shift) || ++ (byte_no == TemplateTable::f2_byte && shift_count == ConstantPoolCacheEntry::bytecode_2_shift), ++ "correct shift count"); ++ srli_d(bytecode, bytecode, shift_count); ++ assert(ConstantPoolCacheEntry::bytecode_1_mask == ConstantPoolCacheEntry::bytecode_2_mask, "common mask"); ++ li(AT, ConstantPoolCacheEntry::bytecode_1_mask); ++ andr(bytecode, bytecode, AT); ++} ++ ++void InterpreterMacroAssembler::get_cache_entry_pointer_at_bcp(Register cache, ++ Register tmp, ++ int bcp_offset, ++ size_t index_size) { ++ assert(bcp_offset > 0, "bcp is still pointing to start of bytecode"); ++ assert(cache != tmp, "must use different register"); ++ get_cache_index_at_bcp(tmp, bcp_offset, index_size); ++ assert(sizeof(ConstantPoolCacheEntry) == 4 * wordSize, "adjust code below"); ++ // convert from field index to ConstantPoolCacheEntry index ++ // and from word offset to byte offset ++ assert(exact_log2(in_bytes(ConstantPoolCacheEntry::size_in_bytes())) == 2 + LogBytesPerWord, "else change next line"); ++ slli_d(tmp, tmp, 2 + LogBytesPerWord); ++ ld_d(cache, FP, frame::interpreter_frame_cache_offset * wordSize); ++ // skip past the header ++ addi_d(cache, cache, in_bytes(ConstantPoolCache::base_offset())); ++ add_d(cache, cache, tmp); ++} ++ ++void InterpreterMacroAssembler::get_method_counters(Register method, ++ Register mcs, Label& skip) { ++ Label has_counters; ++ ld_d(mcs, method, in_bytes(Method::method_counters_offset())); ++ bne(mcs, R0, has_counters); ++ call_VM(noreg, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::build_method_counters), method); ++ ld_d(mcs, method, in_bytes(Method::method_counters_offset())); ++ beq(mcs, R0, skip); // No MethodCounters allocated, OutOfMemory ++ bind(has_counters); ++} ++ ++void InterpreterMacroAssembler::load_resolved_indy_entry(Register cache, Register index) { ++ // Get index out of bytecode pointer, get_cache_entry_pointer_at_bcp ++ get_cache_index_at_bcp(index, 1, sizeof(u4)); ++ // Get address of invokedynamic array ++ ld_d(cache, FP, frame::interpreter_frame_cache_offset * wordSize); ++ ld_d(cache, Address(cache, in_bytes(ConstantPoolCache::invokedynamic_entries_offset()))); ++ // Scale the index to be the entry index * sizeof(ResolvedInvokeDynamicInfo) ++ slli_d(index, index, log2i_exact(sizeof(ResolvedIndyEntry))); ++ addi_d(cache, cache, Array::base_offset_in_bytes()); ++ add_d(cache, cache, index); ++} ++ ++// Load object from cpool->resolved_references(index) ++void InterpreterMacroAssembler::load_resolved_reference_at_index( ++ Register result, Register index, Register tmp) { ++ assert_different_registers(result, index); ++ ++ get_constant_pool(result); ++ // load pointer for resolved_references[] objArray ++ ld_d(result, Address(result, ConstantPool::cache_offset())); ++ ld_d(result, Address(result, ConstantPoolCache::resolved_references_offset())); ++ resolve_oop_handle(result, tmp, SCR1); ++ // Add in the index ++ alsl_d(result, index, result, LogBytesPerHeapOop - 1); ++ load_heap_oop(result, Address(result, arrayOopDesc::base_offset_in_bytes(T_OBJECT)), tmp, SCR1); ++} ++ ++// load cpool->resolved_klass_at(index) ++void 
InterpreterMacroAssembler::load_resolved_klass_at_index(Register cpool, ++ Register index, Register klass) { ++ alsl_d(AT, index, cpool, Address::times_ptr - 1); ++ ld_h(index, AT, sizeof(ConstantPool)); ++ Register resolved_klasses = cpool; ++ ld_d(resolved_klasses, Address(cpool, ConstantPool::resolved_klasses_offset())); ++ alsl_d(AT, index, resolved_klasses, Address::times_ptr - 1); ++ ld_d(klass, AT, Array::base_offset_in_bytes()); ++} ++ ++void InterpreterMacroAssembler::load_resolved_method_at_index(int byte_no, ++ Register method, ++ Register cache, ++ Register index) { ++ const int method_offset = in_bytes( ++ ConstantPoolCache::base_offset() + ++ ((byte_no == TemplateTable::f2_byte) ++ ? ConstantPoolCacheEntry::f2_offset() ++ : ConstantPoolCacheEntry::f1_offset())); ++ ++ ld_d(method, Address(cache, index, Address::times_ptr, method_offset)); // get f1 Method* ++} ++ ++// Resets LVP to locals. Register sub_klass cannot be any of the above. ++void InterpreterMacroAssembler::gen_subtype_check( Register Rsup_klass, Register Rsub_klass, Label &ok_is_subtype ) { ++ ++ assert( Rsub_klass != Rsup_klass, "Rsup_klass holds superklass" ); ++ assert( Rsub_klass != T1, "T1 holds 2ndary super array length" ); ++ assert( Rsub_klass != T0, "T0 holds 2ndary super array scan ptr" ); ++ // Profile the not-null value's klass. ++ // Here T4 and T1 are used as temporary registers. ++ profile_typecheck(T4, Rsub_klass, T1); // blows T4, reloads T1 ++ ++ // Do the check. ++ check_klass_subtype(Rsub_klass, Rsup_klass, T1, ok_is_subtype); // blows T1 ++ ++ // Profile the failure of the check. ++ profile_typecheck_failed(T4); // blows T4 ++ ++} ++ ++ ++ ++// Java Expression Stack ++ ++void InterpreterMacroAssembler::pop_ptr(Register r) { ++ ld_d(r, SP, 0); ++ addi_d(SP, SP, Interpreter::stackElementSize); ++} ++ ++void InterpreterMacroAssembler::pop_i(Register r) { ++ ld_w(r, SP, 0); ++ addi_d(SP, SP, Interpreter::stackElementSize); ++} ++ ++void InterpreterMacroAssembler::pop_l(Register r) { ++ ld_d(r, SP, 0); ++ addi_d(SP, SP, 2 * Interpreter::stackElementSize); ++} ++ ++void InterpreterMacroAssembler::pop_f(FloatRegister r) { ++ fld_s(r, SP, 0); ++ addi_d(SP, SP, Interpreter::stackElementSize); ++} ++ ++void InterpreterMacroAssembler::pop_d(FloatRegister r) { ++ fld_d(r, SP, 0); ++ addi_d(SP, SP, 2 * Interpreter::stackElementSize); ++} ++ ++void InterpreterMacroAssembler::push_ptr(Register r) { ++ addi_d(SP, SP, - Interpreter::stackElementSize); ++ st_d(r, SP, 0); ++} ++ ++void InterpreterMacroAssembler::push_i(Register r) { ++ // For compatibility reason, don't change to sw. 
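As the push_/pop_ pairs here and just below show, every expression-stack element occupies one full stackElementSize slot; a short sketch of the convention (the slot width itself is not assumed here):

// push_i / push_ptr :  SP -= stackElementSize;      slot[0] = value
// push_l / push_d   :  SP -= 2 * stackElementSize;  slot[0] = value, slot[1] = 0
// pop_l  / pop_d    :  value = slot[0];             SP += 2 * stackElementSize
// Category-2 values (long/double) therefore always own two slots, upper slot zeroed.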
++ addi_d(SP, SP, - Interpreter::stackElementSize); ++ st_d(r, SP, 0); ++} ++ ++void InterpreterMacroAssembler::push_l(Register r) { ++ addi_d(SP, SP, -2 * Interpreter::stackElementSize); ++ st_d(r, SP, 0); ++ st_d(R0, SP, Interpreter::stackElementSize); ++} ++ ++void InterpreterMacroAssembler::push_f(FloatRegister r) { ++ addi_d(SP, SP, - Interpreter::stackElementSize); ++ fst_s(r, SP, 0); ++} ++ ++void InterpreterMacroAssembler::push_d(FloatRegister r) { ++ addi_d(SP, SP, -2 * Interpreter::stackElementSize); ++ fst_d(r, SP, 0); ++ st_d(R0, SP, Interpreter::stackElementSize); ++} ++ ++void InterpreterMacroAssembler::pop(TosState state) { ++ switch (state) { ++ case atos: ++ pop_ptr(); ++ verify_oop(FSR); ++ break; ++ case btos: ++ case ztos: ++ case ctos: ++ case stos: ++ case itos: pop_i(); break; ++ case ltos: pop_l(); break; ++ case ftos: pop_f(); break; ++ case dtos: pop_d(); break; ++ case vtos: /* nothing to do */ break; ++ default: ShouldNotReachHere(); ++ } ++} ++ ++void InterpreterMacroAssembler::push(TosState state) { ++ switch (state) { ++ case atos: ++ verify_oop(FSR); ++ push_ptr(); ++ break; ++ case btos: ++ case ztos: ++ case ctos: ++ case stos: ++ case itos: push_i(); break; ++ case ltos: push_l(); break; ++ case ftos: push_f(); break; ++ case dtos: push_d(); break; ++ case vtos: /* nothing to do */ break; ++ default : ShouldNotReachHere(); ++ } ++} ++ ++void InterpreterMacroAssembler::load_ptr(int n, Register val) { ++ ld_d(val, SP, Interpreter::expr_offset_in_bytes(n)); ++} ++ ++void InterpreterMacroAssembler::store_ptr(int n, Register val) { ++ st_d(val, SP, Interpreter::expr_offset_in_bytes(n)); ++} ++ ++void InterpreterMacroAssembler::prepare_to_jump_from_interpreted() { ++ // set sender sp ++ move(Rsender, SP); ++ // record last_sp ++ st_d(SP, FP, frame::interpreter_frame_last_sp_offset * wordSize); ++} ++ ++// Jump to from_interpreted entry of a call unless single stepping is possible ++// in this thread in which case we must call the i2i entry ++void InterpreterMacroAssembler::jump_from_interpreted(Register method) { ++ prepare_to_jump_from_interpreted(); ++ if (JvmtiExport::can_post_interpreter_events()) { ++ Label run_compiled_code; ++ // JVMTI events, such as single-stepping, are implemented partly by avoiding running ++ // compiled code in threads for which the event is enabled. Check here for ++ // interp_only_mode if these events CAN be enabled. ++ ld_wu(AT, Address(TREG, JavaThread::interp_only_mode_offset())); ++ beqz(AT, run_compiled_code); ++ ld_d(AT, Address(method, Method::interpreter_entry_offset())); ++ jr(AT); ++ bind(run_compiled_code); ++ } ++ ++ ld_d(AT, Address(method, Method::from_interpreted_offset())); ++ jr(AT); ++} ++ ++ ++// The following two routines provide a hook so that an implementation ++// can schedule the dispatch in two parts. LoongArch64 does not do this. 
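Ahead of dispatch_base below, a sketch of how the dispatch-table index is formed; it assumes DispatchTable::length is 256 (one slot per bytecode), which is what makes the `ori` act as an addition:

// entry = table[state * DispatchTable::length + bytecode];  jr(entry);
// Rnext is loaded with ld_bu, so bytecode < 256, and state * length then has its
// low 8 bits clear; ori(SCR2, Rnext, state * DispatchTable::length) is therefore
// equivalent to the addition above before the scaled ld_d from Rdispatch.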
++void InterpreterMacroAssembler::dispatch_prolog(TosState state, int step) { ++ // Nothing LoongArch64 specific to be done here ++} ++ ++void InterpreterMacroAssembler::dispatch_epilog(TosState state, int step) { ++ dispatch_next(state, step); ++} ++ ++void InterpreterMacroAssembler::dispatch_base(TosState state, ++ address* table, ++ bool verifyoop, ++ bool generate_poll) { ++ if (VerifyActivationFrameSize) { ++ Label L; ++ sub_d(SCR1, FP, SP); ++ int min_frame_size = (frame::sender_sp_offset - ++ frame::interpreter_frame_initial_sp_offset) * wordSize; ++ addi_d(SCR1, SCR1, -min_frame_size); ++ bge(SCR1, R0, L); ++ stop("broken stack frame"); ++ bind(L); ++ } ++ ++ if (verifyoop && state == atos) { ++ verify_oop(A0); ++ } ++ ++ Label safepoint; ++ address* const safepoint_table = Interpreter::safept_table(state); ++ bool needs_thread_local_poll = generate_poll && table != safepoint_table; ++ ++ if (needs_thread_local_poll) { ++ NOT_PRODUCT(block_comment("Thread-local Safepoint poll")); ++ ld_d(SCR1, TREG, in_bytes(JavaThread::polling_word_offset())); ++ andi(SCR1, SCR1, SafepointMechanism::poll_bit()); ++ bnez(SCR1, safepoint); ++ } ++ ++ if (table == Interpreter::dispatch_table(state)) { ++ ori(SCR2, Rnext, state * DispatchTable::length); ++ ld_d(SCR2, Address(Rdispatch, SCR2, Address::times_8)); ++ } else { ++ li(SCR2, (long)table); ++ ld_d(SCR2, Address(SCR2, Rnext, Address::times_8)); ++ } ++ jr(SCR2); ++ ++ if (needs_thread_local_poll) { ++ bind(safepoint); ++ li(SCR2, (long)safepoint_table); ++ ld_d(SCR2, Address(SCR2, Rnext, Address::times_8)); ++ jr(SCR2); ++ } ++} ++ ++void InterpreterMacroAssembler::dispatch_only(TosState state, bool generate_poll) { ++ dispatch_base(state, Interpreter::dispatch_table(state), true, generate_poll); ++} ++ ++void InterpreterMacroAssembler::dispatch_only_normal(TosState state) { ++ dispatch_base(state, Interpreter::normal_table(state), true, false); ++} ++ ++void InterpreterMacroAssembler::dispatch_only_noverify(TosState state) { ++ dispatch_base(state, Interpreter::normal_table(state), false, false); ++} ++ ++ ++void InterpreterMacroAssembler::dispatch_next(TosState state, int step, bool generate_poll) { ++ // load next bytecode ++ ld_bu(Rnext, BCP, step); ++ increment(BCP, step); ++ dispatch_base(state, Interpreter::dispatch_table(state), true, generate_poll); ++} ++ ++void InterpreterMacroAssembler::dispatch_via(TosState state, address* table) { ++ // load current bytecode ++ ld_bu(Rnext, BCP, 0); ++ dispatch_base(state, table); ++} ++ ++// remove activation ++// ++// Apply stack watermark barrier. ++// Unlock the receiver if this is a synchronized method. ++// Unlock any Java monitors from synchronized blocks. ++// Remove the activation from the stack. ++// ++// If there are locked Java monitors ++// If throw_monitor_exception ++// throws IllegalMonitorStateException ++// Else if install_monitor_exception ++// installs IllegalMonitorStateException ++// Else ++// no error processing ++void InterpreterMacroAssembler::remove_activation(TosState state, ++ bool throw_monitor_exception, ++ bool install_monitor_exception, ++ bool notify_jvmdi) { ++ const Register monitor_reg = j_rarg0; ++ ++ Label unlocked, unlock, no_unlock; ++ ++ // The below poll is for the stack watermark barrier. It allows fixing up frames lazily, ++ // that would normally not be safe to use. Such bad returns into unsafe territory of ++ // the stack, will call InterpreterRuntime::at_unwind. 
++ Label slow_path; ++ Label fast_path; ++ safepoint_poll(slow_path, TREG, true /* at_return */, false /* acquire */, false /* in_nmethod */); ++ b(fast_path); ++ ++ bind(slow_path); ++ push(state); ++ Label L; ++ address the_pc = pc(); ++ bind(L); ++ set_last_Java_frame(TREG, SP, FP, L); ++ super_call_VM_leaf(CAST_FROM_FN_PTR(address, InterpreterRuntime::at_unwind), TREG); ++ reset_last_Java_frame(true); ++ pop(state); ++ ++ bind(fast_path); ++ ++ // get the value of _do_not_unlock_if_synchronized and then reset the flag ++ const Address do_not_unlock_if_synchronized(TREG, ++ in_bytes(JavaThread::do_not_unlock_if_synchronized_offset())); ++ ld_bu(TSR, do_not_unlock_if_synchronized); ++ st_b(R0, do_not_unlock_if_synchronized); ++ ++ // get method access flags ++ ld_d(AT, FP, frame::interpreter_frame_method_offset * wordSize); ++ ld_wu(AT, AT, in_bytes(Method::access_flags_offset())); ++ andi(AT, AT, JVM_ACC_SYNCHRONIZED); ++ beqz(AT, unlocked); ++ ++ // Don't unlock anything if the _do_not_unlock_if_synchronized flag is set. ++ bnez(TSR, no_unlock); ++ ++ // unlock monitor ++ push(state); // save result ++ ++ // BasicObjectLock will be first in list, since this is a ++ // synchronized method. However, need to check that the object has ++ // not been unlocked by an explicit monitorexit bytecode. ++ addi_d(monitor_reg, FP, frame::interpreter_frame_initial_sp_offset * wordSize ++ - (int) sizeof(BasicObjectLock)); ++ ++ // address of first monitor ++ ld_d(AT, Address(monitor_reg, BasicObjectLock::obj_offset())); ++ bnez(AT, unlock); ++ ++ pop(state); ++ if (throw_monitor_exception) { ++ // Entry already unlocked, need to throw exception ++ call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::throw_illegal_monitor_state_exception)); ++ should_not_reach_here(); ++ } else { ++ // Monitor already unlocked during a stack unroll. If requested, ++ // install an illegal_monitor_state_exception. Continue with ++ // stack unrolling. ++ if (install_monitor_exception) { ++ call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::new_illegal_monitor_state_exception)); ++ } ++ b(unlocked); ++ } ++ ++ bind(unlock); ++ unlock_object(monitor_reg); ++ pop(state); ++ ++ // Check that for block-structured locking (i.e., that all locked ++ // objects has been unlocked) ++ bind(unlocked); ++ ++ // A0: Might contain return value ++ ++ // Check that all monitors are unlocked ++ { ++ Label loop, exception, entry, restart; ++ const int entry_size = frame::interpreter_frame_monitor_size_in_bytes(); ++ const Address monitor_block_top(FP, ++ frame::interpreter_frame_monitor_block_top_offset * wordSize); ++ ++ bind(restart); ++ // points to current entry, starting with top-most entry ++ ld_d(monitor_reg, monitor_block_top); ++ // points to word before bottom of monitor block, should be callee-saved ++ addi_d(TSR, FP, frame::interpreter_frame_initial_sp_offset * wordSize); ++ b(entry); ++ ++ // Entry already locked, need to throw exception ++ bind(exception); ++ ++ if (throw_monitor_exception) { ++ // Throw exception ++ MacroAssembler::call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::throw_illegal_monitor_state_exception)); ++ should_not_reach_here(); ++ } else { ++ // Stack unrolling. 
Unlock object and install illegal_monitor_exception ++ // Unlock does not block, so don't have to worry about the frame ++ // We don't have to preserve the monitor_reg, since we are going to ++ // throw an exception ++ ++ push(state); ++ unlock_object(monitor_reg); ++ pop(state); ++ ++ if (install_monitor_exception) { ++ call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::new_illegal_monitor_state_exception)); ++ } ++ ++ b(restart); ++ } ++ ++ bind(loop); ++ // check if current entry is used ++ ld_d(AT, Address(monitor_reg, BasicObjectLock::obj_offset())); ++ bnez(AT, exception); ++ ++ // otherwise advance to next entry ++ addi_d(monitor_reg, monitor_reg, entry_size); ++ bind(entry); ++ bne(monitor_reg, TSR, loop); // check if bottom reached ++ } ++ ++ bind(no_unlock); ++ ++ // jvmpi support ++ if (notify_jvmdi) { ++ notify_method_exit(state, NotifyJVMTI); // preserve TOSCA ++ } else { ++ notify_method_exit(state, SkipNotifyJVMTI); // preserve TOSCA ++ } ++ ++ // remove activation ++ ld_d(Rsender, FP, frame::interpreter_frame_sender_sp_offset * wordSize); ++ if (StackReservedPages > 0) { ++ // testing if reserved zone needs to be re-enabled ++ Label no_reserved_zone_enabling; ++ ++ // check if already enabled - if so no re-enabling needed ++ assert(sizeof(StackOverflow::StackGuardState) == 4, "unexpected size"); ++ ld_w(AT, Address(TREG, JavaThread::stack_guard_state_offset())); ++ addi_w(AT, AT, StackOverflow::stack_guard_enabled); ++ beqz(AT, no_reserved_zone_enabling); ++ ++ ld_d(AT, Address(TREG, JavaThread::reserved_stack_activation_offset())); ++ bge(AT, Rsender, no_reserved_zone_enabling); ++ ++ call_VM_leaf( ++ CAST_FROM_FN_PTR(address, SharedRuntime::enable_stack_reserved_zone), TREG); ++ call_VM(noreg, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::throw_delayed_StackOverflowError)); ++ should_not_reach_here(); ++ ++ bind(no_reserved_zone_enabling); ++ } ++ ++ // remove frame anchor ++ leave(); ++ ++ // set sp to sender sp ++ move(SP, Rsender); ++} ++ ++// Lock object ++// ++// Args: ++// T0: BasicObjectLock to be used for locking ++// ++// Kills: ++// T1 ++// T2 ++void InterpreterMacroAssembler::lock_object(Register lock_reg) { ++ assert(lock_reg == T0, "The argument is only for looks. 
It must be T0"); ++ ++ if (LockingMode == LM_MONITOR) { ++ call_VM(NOREG, CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorenter), lock_reg); ++ } else { ++ Label count, done, slow_case; ++ const Register tmp_reg = T2; ++ const Register scr_reg = T1; ++ const int obj_offset = in_bytes(BasicObjectLock::obj_offset()); ++ const int lock_offset = in_bytes(BasicObjectLock::lock_offset()); ++ const int mark_offset = lock_offset + BasicLock::displaced_header_offset_in_bytes(); ++ ++ // Load object pointer into scr_reg ++ ld_d(scr_reg, lock_reg, obj_offset); ++ ++ if (DiagnoseSyncOnValueBasedClasses != 0) { ++ load_klass(tmp_reg, scr_reg); ++ ld_w(tmp_reg, Address(tmp_reg, Klass::access_flags_offset())); ++ li(AT, JVM_ACC_IS_VALUE_BASED_CLASS); ++ andr(AT, AT, tmp_reg); ++ bnez(AT, slow_case); ++ } ++ ++ if (LockingMode == LM_LIGHTWEIGHT) { ++ ld_d(tmp_reg, Address(scr_reg, oopDesc::mark_offset_in_bytes())); ++ lightweight_lock(scr_reg, tmp_reg, SCR1, SCR2, slow_case); ++ b(count); ++ } else if (LockingMode == LM_LEGACY) { ++ // Load (object->mark() | 1) into tmp_reg ++ ld_d(AT, scr_reg, 0); ++ ori(tmp_reg, AT, 1); ++ ++ // Save (object->mark() | 1) into BasicLock's displaced header ++ st_d(tmp_reg, lock_reg, mark_offset); ++ ++ assert(lock_offset == 0, "displached header must be first word in BasicObjectLock"); ++ ++ cmpxchg(Address(scr_reg, 0), tmp_reg, lock_reg, AT, true, true /* acquire */, count); ++ ++ // Test if the oopMark is an obvious stack pointer, i.e., ++ // 1) (mark & 3) == 0, and ++ // 2) SP <= mark < SP + os::pagesize() ++ // ++ // These 3 tests can be done by evaluating the following ++ // expression: ((mark - sp) & (3 - os::vm_page_size())), ++ // assuming both stack pointer and pagesize have their ++ // least significant 2 bits clear. ++ // NOTE: the oopMark is in tmp_reg as the result of cmpxchg ++ sub_d(tmp_reg, tmp_reg, SP); ++ li(AT, 7 - (int)os::vm_page_size()); ++ andr(tmp_reg, tmp_reg, AT); ++ // Save the test result, for recursive case, the result is zero ++ st_d(tmp_reg, lock_reg, mark_offset); ++ beqz(tmp_reg, count); ++ } ++ ++ bind(slow_case); ++ // Call the runtime routine for slow case ++ if (LockingMode == LM_LIGHTWEIGHT) { ++ call_VM(NOREG, CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorenter_obj), scr_reg); ++ } else { ++ call_VM(NOREG, CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorenter), lock_reg); ++ } ++ b(done); ++ ++ bind(count); ++ increment(Address(TREG, JavaThread::held_monitor_count_offset()), 1); ++ ++ bind(done); ++ } ++} ++ ++// Unlocks an object. Used in monitorexit bytecode and ++// remove_activation. Throws an IllegalMonitorException if object is ++// not locked by current thread. ++// ++// Args: ++// T0: BasicObjectLock for lock ++// ++// Kills: ++// T1 ++// T2 ++// T3 ++// Throw an IllegalMonitorException if object is not locked by current thread ++void InterpreterMacroAssembler::unlock_object(Register lock_reg) { ++ assert(lock_reg == T0, "The argument is only for looks. 
It must be T0"); ++ ++ if (LockingMode == LM_MONITOR) { ++ call_VM_leaf(CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorexit), lock_reg); ++ } else { ++ Label count, done; ++ const Register tmp_reg = T1; ++ const Register scr_reg = T2; ++ const Register hdr_reg = T3; ++ ++ save_bcp(); // Save in case of exception ++ ++ if (LockingMode != LM_LIGHTWEIGHT) { ++ // Convert from BasicObjectLock structure to object and BasicLock ++ // structure Store the BasicLock address into tmp_reg ++ lea(tmp_reg, Address(lock_reg, BasicObjectLock::lock_offset())); ++ } ++ ++ // Load oop into scr_reg ++ ld_d(scr_reg, Address(lock_reg, BasicObjectLock::obj_offset())); ++ // free entry ++ st_d(R0, Address(lock_reg, BasicObjectLock::obj_offset())); ++ ++ if (LockingMode == LM_LIGHTWEIGHT) { ++ Label slow_case; ++ ++ // Check for non-symmetric locking. This is allowed by the spec and the interpreter ++ // must handle it. ++ Register tmp = SCR1; ++ // First check for lock-stack underflow. ++ ld_wu(tmp, Address(TREG, JavaThread::lock_stack_top_offset())); ++ li(AT, (unsigned)LockStack::start_offset()); ++ bgeu(AT, tmp, slow_case); ++ // Then check if the top of the lock-stack matches the unlocked object. ++ addi_w(tmp, tmp, -oopSize); ++ ldx_d(tmp, TREG, tmp); ++ bne(scr_reg, tmp, slow_case); ++ ++ ld_d(hdr_reg, Address(scr_reg, oopDesc::mark_offset_in_bytes())); ++ andi(AT, hdr_reg, markWord::monitor_value); ++ bnez(AT, slow_case); ++ lightweight_unlock(scr_reg, hdr_reg, tmp_reg, SCR1, slow_case); ++ b(count); ++ bind(slow_case); ++ } else if (LockingMode == LM_LEGACY) { ++ // Load the old header from BasicLock structure ++ ld_d(hdr_reg, tmp_reg, BasicLock::displaced_header_offset_in_bytes()); ++ // zero for recursive case ++ beqz(hdr_reg, count); ++ ++ // Atomic swap back the old header ++ cmpxchg(Address(scr_reg, 0), tmp_reg, hdr_reg, AT, false, true /* acquire */, count); ++ } ++ // Call the runtime routine for slow case. ++ st_d(scr_reg, Address(lock_reg, BasicObjectLock::obj_offset())); // restore obj ++ call_VM_leaf(CAST_FROM_FN_PTR(address, InterpreterRuntime::monitorexit), lock_reg); ++ b(done); ++ ++ bind(count); ++ decrement(Address(TREG, JavaThread::held_monitor_count_offset()), 1); ++ ++ bind(done); ++ restore_bcp(); ++ } ++} ++ ++void InterpreterMacroAssembler::test_method_data_pointer(Register mdp, ++ Label& zero_continue) { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++ ld_d(mdp, Address(FP, frame::interpreter_frame_mdp_offset * wordSize)); ++ beqz(mdp, zero_continue); ++} ++ ++// Set the method data pointer for the current bcp. 
++void InterpreterMacroAssembler::set_method_data_pointer_for_bcp() { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++ Label get_continue; ++ ++ // load mdp into callee-saved register and test it before call ++ ld_d(TSR, Address(Rmethod, in_bytes(Method::method_data_offset()))); ++ beqz(TSR, get_continue); ++ ++ // convert chain: bcp -> bci -> dp (data pointer) -> di (data index) ++ call_VM_leaf(CAST_FROM_FN_PTR(address, InterpreterRuntime::bcp_to_di), Rmethod, BCP); ++ ++ // mdp (new) = mdp (old) + data_offset + di (returned value) ++ addi_d(TSR, TSR, in_bytes(MethodData::data_offset())); ++ add_d(TSR, TSR, A0); ++ st_d(TSR, Address(FP, frame::interpreter_frame_mdp_offset * wordSize)); ++ ++ bind(get_continue); ++} ++ ++void InterpreterMacroAssembler::verify_method_data_pointer() { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++#ifdef ASSERT ++ Label verify_continue; ++ ++ Register method = c_rarg0; ++ Register mdp = c_rarg2; ++ ++ push2(method, mdp); // verification should not blows c_rarg0 ++ ++ test_method_data_pointer(mdp, verify_continue); // If mdp is zero, continue ++ get_method(method); ++ ++ // If the mdp is valid, it will point to a DataLayout header which is ++ // consistent with the bcp. The converse is highly probable also. ++ ld_hu(SCR1, mdp, in_bytes(DataLayout::bci_offset())); ++ ld_d(SCR2, method, in_bytes(Method::const_offset())); ++ add_d(SCR1, SCR1, SCR2); ++ addi_d(SCR1, SCR1, in_bytes(ConstMethod::codes_offset())); ++ beq(SCR1, BCP, verify_continue); ++ ++ call_VM_leaf(CAST_FROM_FN_PTR(address, InterpreterRuntime::verify_mdp), ++ method, BCP, mdp); ++ ++ bind(verify_continue); ++ ++ pop2(method, mdp); ++#endif // ASSERT ++} ++ ++ ++void InterpreterMacroAssembler::set_mdp_data_at(Register mdp_in, ++ int constant, ++ Register value) { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++ Address data(mdp_in, constant); ++ st_d(value, data); ++} ++ ++ ++void InterpreterMacroAssembler::increment_mdp_data_at(Register mdp_in, ++ int constant, ++ bool decrement) { ++ increment_mdp_data_at(mdp_in, noreg, constant, decrement); ++} ++ ++void InterpreterMacroAssembler::increment_mdp_data_at(Register mdp_in, ++ Register reg, ++ int constant, ++ bool decrement) { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++ // %%% this does 64bit counters at best it is wasting space ++ // at worst it is a rare bug when counters overflow ++ ++ assert_different_registers(AT, TSR, mdp_in, reg); ++ ++ Address addr1(mdp_in, constant); ++ Address addr2(TSR, 0); ++ Address &addr = addr1; ++ if (reg != noreg) { ++ lea(TSR, addr1); ++ add_d(TSR, TSR, reg); ++ addr = addr2; ++ } ++ ++ if (decrement) { ++ ld_d(AT, addr); ++ addi_d(AT, AT, -DataLayout::counter_increment); ++ Label L; ++ blt(AT, R0, L); // skip store if counter underflow ++ st_d(AT, addr); ++ bind(L); ++ } else { ++ assert(DataLayout::counter_increment == 1, ++ "flow-free idiom only works with 1"); ++ ld_d(AT, addr); ++ addi_d(AT, AT, DataLayout::counter_increment); ++ Label L; ++ bge(R0, AT, L); // skip store if counter overflow ++ st_d(AT, addr); ++ bind(L); ++ } ++} ++ ++void InterpreterMacroAssembler::set_mdp_flag_at(Register mdp_in, ++ int flag_byte_constant) { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++ int flags_offset = in_bytes(DataLayout::flags_offset()); ++ // Set the flag ++ ld_bu(AT, Address(mdp_in, flags_offset)); ++ ori(AT, AT, flag_byte_constant); ++ st_b(AT, Address(mdp_in, flags_offset)); ++} ++ ++ ++void 
InterpreterMacroAssembler::test_mdp_data_at(Register mdp_in, ++ int offset, ++ Register value, ++ Register test_value_out, ++ Label& not_equal_continue) { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++ if (test_value_out == noreg) { ++ ld_d(AT, Address(mdp_in, offset)); ++ bne(AT, value, not_equal_continue); ++ } else { ++ // Put the test value into a register, so caller can use it: ++ ld_d(test_value_out, Address(mdp_in, offset)); ++ bne(value, test_value_out, not_equal_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::update_mdp_by_offset(Register mdp_in, ++ int offset_of_disp) { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++ ld_d(AT, Address(mdp_in, offset_of_disp)); ++ add_d(mdp_in, mdp_in, AT); ++ st_d(mdp_in, Address(FP, frame::interpreter_frame_mdp_offset * wordSize)); ++} ++ ++ ++void InterpreterMacroAssembler::update_mdp_by_offset(Register mdp_in, ++ Register reg, ++ int offset_of_disp) { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++ add_d(AT, mdp_in, reg); ++ ld_d(AT, AT, offset_of_disp); ++ add_d(mdp_in, mdp_in, AT); ++ st_d(mdp_in, Address(FP, frame::interpreter_frame_mdp_offset * wordSize)); ++} ++ ++ ++void InterpreterMacroAssembler::update_mdp_by_constant(Register mdp_in, ++ int constant) { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++ lea(mdp_in, Address(mdp_in, constant)); ++ st_d(mdp_in, Address(FP, frame::interpreter_frame_mdp_offset * wordSize)); ++} ++ ++ ++void InterpreterMacroAssembler::update_mdp_for_ret(Register return_bci) { ++ assert(ProfileInterpreter, "must be profiling interpreter"); ++ push(return_bci); // save/restore across call_VM ++ call_VM(noreg, ++ CAST_FROM_FN_PTR(address, InterpreterRuntime::update_mdp_for_ret), ++ return_bci); ++ pop(return_bci); ++} ++ ++ ++void InterpreterMacroAssembler::profile_taken_branch(Register mdp, ++ Register bumped_count) { ++ if (ProfileInterpreter) { ++ Label profile_continue; ++ ++ // If no method data exists, go to profile_continue. ++ // Otherwise, assign to mdp ++ test_method_data_pointer(mdp, profile_continue); ++ ++ // We are taking a branch. Increment the taken count. ++ // We inline increment_mdp_data_at to return bumped_count in a register ++ //increment_mdp_data_at(mdp, in_bytes(JumpData::taken_offset())); ++ ld_d(bumped_count, mdp, in_bytes(JumpData::taken_offset())); ++ assert(DataLayout::counter_increment == 1, "flow-free idiom only works with 1"); ++ addi_d(AT, bumped_count, DataLayout::counter_increment); ++ sltu(AT, R0, AT); ++ add_d(bumped_count, bumped_count, AT); ++ st_d(bumped_count, mdp, in_bytes(JumpData::taken_offset())); // Store back out ++ // The method data pointer needs to be updated to reflect the new target. ++ update_mdp_by_offset(mdp, in_bytes(JumpData::displacement_offset())); ++ bind(profile_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::profile_not_taken_branch(Register mdp) { ++ if (ProfileInterpreter) { ++ Label profile_continue; ++ ++ // If no method data exists, go to profile_continue. ++ test_method_data_pointer(mdp, profile_continue); ++ ++ // We are taking a branch. Increment the not taken count. 
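The "flow-free idiom" used for the taken counter in profile_taken_branch above is a saturating increment; the same arithmetic in plain C++:

#include <cstdint>

static uint64_t bump_taken_count(uint64_t count) {
  uint64_t t = count + 1;                   // addi_d(AT, bumped_count, counter_increment)
  uint64_t not_wrapped = (t != 0) ? 1 : 0;  // sltu(AT, R0, AT): 0 only on wrap-around
  return count + not_wrapped;               // add_d: the counter sticks at its maximum
}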
++ increment_mdp_data_at(mdp, in_bytes(BranchData::not_taken_offset())); ++ ++ // The method data pointer needs to be updated to correspond to ++ // the next bytecode ++ update_mdp_by_constant(mdp, in_bytes(BranchData::branch_data_size())); ++ bind(profile_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::profile_call(Register mdp) { ++ if (ProfileInterpreter) { ++ Label profile_continue; ++ ++ // If no method data exists, go to profile_continue. ++ test_method_data_pointer(mdp, profile_continue); ++ ++ // We are making a call. Increment the count. ++ increment_mdp_data_at(mdp, in_bytes(CounterData::count_offset())); ++ ++ // The method data pointer needs to be updated to reflect the new target. ++ update_mdp_by_constant(mdp, in_bytes(CounterData::counter_data_size())); ++ bind(profile_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::profile_final_call(Register mdp) { ++ if (ProfileInterpreter) { ++ Label profile_continue; ++ ++ // If no method data exists, go to profile_continue. ++ test_method_data_pointer(mdp, profile_continue); ++ ++ // We are making a call. Increment the count. ++ increment_mdp_data_at(mdp, in_bytes(CounterData::count_offset())); ++ ++ // The method data pointer needs to be updated to reflect the new target. ++ update_mdp_by_constant(mdp, ++ in_bytes(VirtualCallData:: ++ virtual_call_data_size())); ++ bind(profile_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::profile_virtual_call(Register receiver, ++ Register mdp, ++ Register reg2, ++ bool receiver_can_be_null) { ++ if (ProfileInterpreter) { ++ Label profile_continue; ++ ++ // If no method data exists, go to profile_continue. ++ test_method_data_pointer(mdp, profile_continue); ++ ++ Label skip_receiver_profile; ++ if (receiver_can_be_null) { ++ Label not_null; ++ bnez(receiver, not_null); ++ // We are making a call. Increment the count. ++ increment_mdp_data_at(mdp, in_bytes(CounterData::count_offset())); ++ b(skip_receiver_profile); ++ bind(not_null); ++ } ++ ++ // Record the receiver type. ++ record_klass_in_profile(receiver, mdp, reg2, true); ++ bind(skip_receiver_profile); ++ ++ // The method data pointer needs to be updated to reflect the new target. ++ update_mdp_by_constant(mdp, ++ in_bytes(VirtualCallData:: ++ virtual_call_data_size())); ++ bind(profile_continue); ++ } ++} ++ ++// This routine creates a state machine for updating the multi-row ++// type profile at a virtual call site (or other type-sensitive bytecode). ++// The machine visits each row (of receiver/count) until the receiver type ++// is found, or until it runs out of rows. At the same time, it remembers ++// the location of the first empty row. (An empty row records null for its ++// receiver, and can be allocated for a newly-observed receiver type.) ++// Because there are two degrees of freedom in the state, a simple linear ++// search will not work; it must be a decision tree. Hence this helper ++// function is recursive, to generate the required tree structured code. ++// It's the interpreter, so we are trading off code space for speed. ++// See below for example code. 
++void InterpreterMacroAssembler::record_klass_in_profile_helper( ++ Register receiver, Register mdp, ++ Register reg2, int start_row, ++ Label& done, bool is_virtual_call) { ++ if (TypeProfileWidth == 0) { ++ if (is_virtual_call) { ++ increment_mdp_data_at(mdp, in_bytes(CounterData::count_offset())); ++ } ++#if INCLUDE_JVMCI ++ else if (EnableJVMCI) { ++ increment_mdp_data_at(mdp, in_bytes(ReceiverTypeData::nonprofiled_receiver_count_offset())); ++ } ++#endif // INCLUDE_JVMCI ++ } else { ++ int non_profiled_offset = -1; ++ if (is_virtual_call) { ++ non_profiled_offset = in_bytes(CounterData::count_offset()); ++ } ++#if INCLUDE_JVMCI ++ else if (EnableJVMCI) { ++ non_profiled_offset = in_bytes(ReceiverTypeData::nonprofiled_receiver_count_offset()); ++ } ++#endif // INCLUDE_JVMCI ++ ++ record_item_in_profile_helper(receiver, mdp, reg2, 0, done, TypeProfileWidth, ++ &VirtualCallData::receiver_offset, &VirtualCallData::receiver_count_offset, non_profiled_offset); ++ } ++} ++ ++void InterpreterMacroAssembler::record_item_in_profile_helper(Register item, Register mdp, ++ Register reg2, int start_row, Label& done, int total_rows, ++ OffsetFunction item_offset_fn, OffsetFunction item_count_offset_fn, ++ int non_profiled_offset) { ++ int last_row = total_rows - 1; ++ assert(start_row <= last_row, "must be work left to do"); ++ // Test this row for both the item and for null. ++ // Take any of three different outcomes: ++ // 1. found item => increment count and goto done ++ // 2. found null => keep looking for case 1, maybe allocate this cell ++ // 3. found something else => keep looking for cases 1 and 2 ++ // Case 3 is handled by a recursive call. ++ for (int row = start_row; row <= last_row; row++) { ++ Label next_test; ++ bool test_for_null_also = (row == start_row); ++ ++ // See if the receiver is item[n]. ++ int item_offset = in_bytes(item_offset_fn(row)); ++ test_mdp_data_at(mdp, item_offset, item, ++ (test_for_null_also ? reg2 : noreg), ++ next_test); ++ // (Reg2 now contains the item from the CallData.) ++ ++ // The receiver is item[n]. Increment count[n]. ++ int count_offset = in_bytes(item_count_offset_fn(row)); ++ increment_mdp_data_at(mdp, count_offset); ++ b(done); ++ bind(next_test); ++ ++ if (test_for_null_also) { ++ Label found_null; ++ // Failed the equality check on item[n]... Test for null. ++ if (start_row == last_row) { ++ // The only thing left to do is handle the null case. ++ if (non_profiled_offset >= 0) { ++ beqz(reg2, found_null); ++ // Item did not match any saved item and there is no empty row for it. ++ // Increment total counter to indicate polymorphic case. ++ increment_mdp_data_at(mdp, non_profiled_offset); ++ b(done); ++ bind(found_null); ++ } else { ++ bnez(reg2, done); ++ } ++ break; ++ } ++ // Since null is rare, make it be the branch-taken case. ++ beqz(reg2, found_null); ++ ++ // Put all the "Case 3" tests here. ++ record_item_in_profile_helper(item, mdp, reg2, start_row + 1, done, total_rows, ++ item_offset_fn, item_count_offset_fn, non_profiled_offset); ++ ++ // Found a null. Keep searching for a matching item, ++ // but remember that this is an empty (unused) slot. ++ bind(found_null); ++ } ++ } ++ ++ // In the fall-through case, we found no matching item, but we ++ // observed the item[start_row] is null. ++ ++ // Fill in the item field and increment the count. 
++ int item_offset = in_bytes(item_offset_fn(start_row)); ++ set_mdp_data_at(mdp, item_offset, item); ++ int count_offset = in_bytes(item_count_offset_fn(start_row)); ++ li(reg2, DataLayout::counter_increment); ++ set_mdp_data_at(mdp, count_offset, reg2); ++ if (start_row > 0) { ++ b(done); ++ } ++} ++ ++// Example state machine code for three profile rows: ++// // main copy of decision tree, rooted at row[1] ++// if (row[0].rec == rec) { row[0].incr(); goto done; } ++// if (row[0].rec != nullptr) { ++// // inner copy of decision tree, rooted at row[1] ++// if (row[1].rec == rec) { row[1].incr(); goto done; } ++// if (row[1].rec != nullptr) { ++// // degenerate decision tree, rooted at row[2] ++// if (row[2].rec == rec) { row[2].incr(); goto done; } ++// if (row[2].rec != nullptr) { goto done; } // overflow ++// row[2].init(rec); goto done; ++// } else { ++// // remember row[1] is empty ++// if (row[2].rec == rec) { row[2].incr(); goto done; } ++// row[1].init(rec); goto done; ++// } ++// } else { ++// // remember row[0] is empty ++// if (row[1].rec == rec) { row[1].incr(); goto done; } ++// if (row[2].rec == rec) { row[2].incr(); goto done; } ++// row[0].init(rec); goto done; ++// } ++// done: ++ ++void InterpreterMacroAssembler::record_klass_in_profile(Register receiver, ++ Register mdp, Register reg2, ++ bool is_virtual_call) { ++ assert(ProfileInterpreter, "must be profiling"); ++ Label done; ++ ++ record_klass_in_profile_helper(receiver, mdp, reg2, 0, done, is_virtual_call); ++ ++ bind (done); ++} ++ ++void InterpreterMacroAssembler::profile_ret(Register return_bci, ++ Register mdp) { ++ if (ProfileInterpreter) { ++ Label profile_continue; ++ uint row; ++ ++ // If no method data exists, go to profile_continue. ++ test_method_data_pointer(mdp, profile_continue); ++ ++ // Update the total ret count. ++ increment_mdp_data_at(mdp, in_bytes(CounterData::count_offset())); ++ ++ for (row = 0; row < RetData::row_limit(); row++) { ++ Label next_test; ++ ++ // See if return_bci is equal to bci[n]: ++ test_mdp_data_at(mdp, ++ in_bytes(RetData::bci_offset(row)), ++ return_bci, noreg, ++ next_test); ++ ++ // return_bci is equal to bci[n]. Increment the count. ++ increment_mdp_data_at(mdp, in_bytes(RetData::bci_count_offset(row))); ++ ++ // The method data pointer needs to be updated to reflect the new target. ++ update_mdp_by_offset(mdp, ++ in_bytes(RetData::bci_displacement_offset(row))); ++ b(profile_continue); ++ bind(next_test); ++ } ++ ++ update_mdp_for_ret(return_bci); ++ ++ bind(profile_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::profile_null_seen(Register mdp) { ++ if (ProfileInterpreter) { ++ Label profile_continue; ++ ++ // If no method data exists, go to profile_continue. ++ test_method_data_pointer(mdp, profile_continue); ++ ++ set_mdp_flag_at(mdp, BitData::null_seen_byte_constant()); ++ ++ // The method data pointer needs to be updated. ++ int mdp_delta = in_bytes(BitData::bit_data_size()); ++ if (TypeProfileCasts) { ++ mdp_delta = in_bytes(VirtualCallData::virtual_call_data_size()); ++ } ++ update_mdp_by_constant(mdp, mdp_delta); ++ ++ bind(profile_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::profile_typecheck_failed(Register mdp) { ++ if (ProfileInterpreter && TypeProfileCasts) { ++ Label profile_continue; ++ ++ // If no method data exists, go to profile_continue. ++ test_method_data_pointer(mdp, profile_continue); ++ ++ int count_offset = in_bytes(CounterData::count_offset()); ++ // Back up the address, since we have already bumped the mdp. 
++ count_offset -= in_bytes(VirtualCallData::virtual_call_data_size()); ++ ++ // *Decrement* the counter. We expect to see zero or small negatives. ++ increment_mdp_data_at(mdp, count_offset, true); ++ ++ bind (profile_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::profile_typecheck(Register mdp, Register klass, Register reg2) { ++ if (ProfileInterpreter) { ++ Label profile_continue; ++ ++ // If no method data exists, go to profile_continue. ++ test_method_data_pointer(mdp, profile_continue); ++ ++ // The method data pointer needs to be updated. ++ int mdp_delta = in_bytes(BitData::bit_data_size()); ++ if (TypeProfileCasts) { ++ mdp_delta = in_bytes(VirtualCallData::virtual_call_data_size()); ++ ++ // Record the object type. ++ record_klass_in_profile(klass, mdp, reg2, false); ++ } ++ update_mdp_by_constant(mdp, mdp_delta); ++ ++ bind(profile_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::profile_switch_default(Register mdp) { ++ if (ProfileInterpreter) { ++ Label profile_continue; ++ ++ // If no method data exists, go to profile_continue. ++ test_method_data_pointer(mdp, profile_continue); ++ ++ // Update the default case count ++ increment_mdp_data_at(mdp, ++ in_bytes(MultiBranchData::default_count_offset())); ++ ++ // The method data pointer needs to be updated. ++ update_mdp_by_offset(mdp, ++ in_bytes(MultiBranchData:: ++ default_displacement_offset())); ++ ++ bind(profile_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::profile_switch_case(Register index, ++ Register mdp, ++ Register reg2) { ++ if (ProfileInterpreter) { ++ Label profile_continue; ++ ++ // If no method data exists, go to profile_continue. ++ test_method_data_pointer(mdp, profile_continue); ++ ++ // Build the base (index * per_case_size_in_bytes()) + ++ // case_array_offset_in_bytes() ++ li(reg2, in_bytes(MultiBranchData::per_case_size())); ++ mul_d(index, index, reg2); ++ addi_d(index, index, in_bytes(MultiBranchData::case_array_offset())); ++ ++ // Update the case count ++ increment_mdp_data_at(mdp, ++ index, ++ in_bytes(MultiBranchData::relative_count_offset())); ++ ++ // The method data pointer needs to be updated. ++ update_mdp_by_offset(mdp, ++ index, ++ in_bytes(MultiBranchData:: ++ relative_displacement_offset())); ++ ++ bind(profile_continue); ++ } ++} ++ ++ ++void InterpreterMacroAssembler::narrow(Register result) { ++ // Get method->_constMethod->_result_type ++ ld_d(T4, FP, frame::interpreter_frame_method_offset * wordSize); ++ ld_d(T4, T4, in_bytes(Method::const_offset())); ++ ld_bu(T4, T4, in_bytes(ConstMethod::result_type_offset())); ++ ++ Label done, notBool, notByte, notChar; ++ ++ // common case first ++ addi_d(AT, T4, -T_INT); ++ beq(AT, R0, done); ++ ++ // mask integer result to narrower return type. 
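A compact view of the narrowing that narrow() performs around this point, one case per result type (the common T_INT case was already dispatched to done above):

// T_BOOLEAN: result &= 0x1;               // andi(result, result, 0x1)
// T_BYTE   : result = (int8_t)  result;   // ext_w_b(result, result)
// T_CHAR   : result = (uint16_t)result;   // bstrpick_d(result, result, 15, 0)
// T_SHORT  : result = (int16_t) result;   // ext_w_h(result, result)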
++ addi_d(AT, T4, -T_BOOLEAN); ++ bne(AT, R0, notBool); ++ andi(result, result, 0x1); ++ beq(R0, R0, done); ++ ++ bind(notBool); ++ addi_d(AT, T4, -T_BYTE); ++ bne(AT, R0, notByte); ++ ext_w_b(result, result); ++ beq(R0, R0, done); ++ ++ bind(notByte); ++ addi_d(AT, T4, -T_CHAR); ++ bne(AT, R0, notChar); ++ bstrpick_d(result, result, 15, 0); ++ beq(R0, R0, done); ++ ++ bind(notChar); ++ ext_w_h(result, result); ++ ++ // Nothing to do for T_INT ++ bind(done); ++} ++ ++ ++void InterpreterMacroAssembler::profile_obj_type(Register obj, const Address& mdo_addr) { ++ Label update, next, none; ++ ++ verify_oop(obj); ++ ++ if (mdo_addr.index() != noreg) { ++ guarantee(T0 != mdo_addr.base(), "The base register will be corrupted !"); ++ guarantee(T0 != mdo_addr.index(), "The index register will be corrupted !"); ++ push(T0); ++ alsl_d(T0, mdo_addr.index(), mdo_addr.base(), mdo_addr.scale() - 1); ++ } ++ ++ bnez(obj, update); ++ ++ if (mdo_addr.index() == noreg) { ++ ld_d(AT, mdo_addr); ++ } else { ++ ld_d(AT, T0, mdo_addr.disp()); ++ } ++ ori(AT, AT, TypeEntries::null_seen); ++ if (mdo_addr.index() == noreg) { ++ st_d(AT, mdo_addr); ++ } else { ++ st_d(AT, T0, mdo_addr.disp()); ++ } ++ ++ b(next); ++ ++ bind(update); ++ load_klass(obj, obj); ++ ++ if (mdo_addr.index() == noreg) { ++ ld_d(AT, mdo_addr); ++ } else { ++ ld_d(AT, T0, mdo_addr.disp()); ++ } ++ xorr(obj, obj, AT); ++ ++ assert(TypeEntries::type_klass_mask == -4, "must be"); ++ bstrpick_d(AT, obj, 63, 2); ++ beqz(AT, next); ++ ++ andi(AT, obj, TypeEntries::type_unknown); ++ bnez(AT, next); ++ ++ if (mdo_addr.index() == noreg) { ++ ld_d(AT, mdo_addr); ++ } else { ++ ld_d(AT, T0, mdo_addr.disp()); ++ } ++ beqz(AT, none); ++ ++ addi_d(AT, AT, -(TypeEntries::null_seen)); ++ beqz(AT, none); ++ ++ // There is a chance that the checks above (re-reading profiling ++ // data from memory) fail if another thread has just set the ++ // profiling to this obj's klass ++ if (mdo_addr.index() == noreg) { ++ ld_d(AT, mdo_addr); ++ } else { ++ ld_d(AT, T0, mdo_addr.disp()); ++ } ++ xorr(obj, obj, AT); ++ assert(TypeEntries::type_klass_mask == -4, "must be"); ++ bstrpick_d(AT, obj, 63, 2); ++ beqz(AT, next); ++ ++ // different than before. Cannot keep accurate profile. ++ if (mdo_addr.index() == noreg) { ++ ld_d(AT, mdo_addr); ++ } else { ++ ld_d(AT, T0, mdo_addr.disp()); ++ } ++ ori(AT, AT, TypeEntries::type_unknown); ++ if (mdo_addr.index() == noreg) { ++ st_d(AT, mdo_addr); ++ } else { ++ st_d(AT, T0, mdo_addr.disp()); ++ } ++ b(next); ++ ++ bind(none); ++ // first time here. Set profile type. ++ if (mdo_addr.index() == noreg) { ++ st_d(obj, mdo_addr); ++ } else { ++ st_d(obj, T0, mdo_addr.disp()); ++ } ++ ++ bind(next); ++ if (mdo_addr.index() != noreg) { ++ pop(T0); ++ } ++} ++ ++void InterpreterMacroAssembler::profile_arguments_type(Register mdp, Register callee, Register tmp, bool is_virtual) { ++ if (!ProfileInterpreter) { ++ return; ++ } ++ ++ if (MethodData::profile_arguments() || MethodData::profile_return()) { ++ Label profile_continue; ++ ++ test_method_data_pointer(mdp, profile_continue); ++ ++ int off_to_start = is_virtual ? in_bytes(VirtualCallData::virtual_call_data_size()) : in_bytes(CounterData::counter_data_size()); ++ ++ ld_b(AT, mdp, in_bytes(DataLayout::tag_offset()) - off_to_start); ++ li(tmp, is_virtual ? 
DataLayout::virtual_call_type_data_tag : DataLayout::call_type_data_tag); ++ bne(tmp, AT, profile_continue); ++ ++ ++ if (MethodData::profile_arguments()) { ++ Label done; ++ int off_to_args = in_bytes(TypeEntriesAtCall::args_data_offset()); ++ if (Assembler::is_simm(off_to_args, 12)) { ++ addi_d(mdp, mdp, off_to_args); ++ } else { ++ li(AT, off_to_args); ++ add_d(mdp, mdp, AT); ++ } ++ ++ ++ for (int i = 0; i < TypeProfileArgsLimit; i++) { ++ if (i > 0 || MethodData::profile_return()) { ++ // If return value type is profiled we may have no argument to profile ++ ld_d(tmp, mdp, in_bytes(TypeEntriesAtCall::cell_count_offset())-off_to_args); ++ ++ if (Assembler::is_simm(-1 * i * TypeStackSlotEntries::per_arg_count(), 12)) { ++ addi_w(tmp, tmp, -1 * i * TypeStackSlotEntries::per_arg_count()); ++ } else { ++ li(AT, i*TypeStackSlotEntries::per_arg_count()); ++ sub_w(tmp, tmp, AT); ++ } ++ ++ li(AT, TypeStackSlotEntries::per_arg_count()); ++ blt(tmp, AT, done); ++ } ++ ld_d(tmp, callee, in_bytes(Method::const_offset())); ++ ++ ld_hu(tmp, tmp, in_bytes(ConstMethod::size_of_parameters_offset())); ++ ++ // stack offset o (zero based) from the start of the argument ++ // list, for n arguments translates into offset n - o - 1 from ++ // the end of the argument list ++ ld_d(AT, mdp, in_bytes(TypeEntriesAtCall::stack_slot_offset(i))-off_to_args); ++ sub_d(tmp, tmp, AT); ++ ++ addi_w(tmp, tmp, -1); ++ ++ Address arg_addr = argument_address(tmp); ++ ld_d(tmp, arg_addr); ++ ++ Address mdo_arg_addr(mdp, in_bytes(TypeEntriesAtCall::argument_type_offset(i))-off_to_args); ++ profile_obj_type(tmp, mdo_arg_addr); ++ ++ int to_add = in_bytes(TypeStackSlotEntries::per_arg_size()); ++ if (Assembler::is_simm(to_add, 12)) { ++ addi_d(mdp, mdp, to_add); ++ } else { ++ li(AT, to_add); ++ add_d(mdp, mdp, AT); ++ } ++ ++ off_to_args += to_add; ++ } ++ ++ if (MethodData::profile_return()) { ++ ld_d(tmp, mdp, in_bytes(TypeEntriesAtCall::cell_count_offset())-off_to_args); ++ ++ int tmp_arg_counts = TypeProfileArgsLimit*TypeStackSlotEntries::per_arg_count(); ++ if (Assembler::is_simm(-1 * tmp_arg_counts, 12)) { ++ addi_w(tmp, tmp, -1 * tmp_arg_counts); ++ } else { ++ li(AT, tmp_arg_counts); ++ sub_w(mdp, mdp, AT); ++ } ++ } ++ ++ bind(done); ++ ++ if (MethodData::profile_return()) { ++ // We're right after the type profile for the last ++ // argument. tmp is the number of cells left in the ++ // CallTypeData/VirtualCallTypeData to reach its end. Non null ++ // if there's a return to profile. 
++        assert(ReturnTypeEntry::static_cell_count() < TypeStackSlotEntries::per_arg_count(), "can't move past ret type");
++        slli_w(tmp, tmp, exact_log2(DataLayout::cell_size));
++        add_d(mdp, mdp, tmp);
++      }
++      st_d(mdp, FP, frame::interpreter_frame_mdp_offset * wordSize);
++    } else {
++      assert(MethodData::profile_return(), "either profile call args or call ret");
++      update_mdp_by_constant(mdp, in_bytes(TypeEntriesAtCall::return_only_size()));
++    }
++
++    // mdp points right after the end of the
++    // CallTypeData/VirtualCallTypeData, right after the cells for the
++    // return value type if there's one
++
++    bind(profile_continue);
++  }
++}
++
++void InterpreterMacroAssembler::profile_return_type(Register mdp, Register ret, Register tmp) {
++  assert_different_registers(mdp, ret, tmp, _bcp_register);
++  if (ProfileInterpreter && MethodData::profile_return()) {
++    Label profile_continue, done;
++
++    test_method_data_pointer(mdp, profile_continue);
++
++    if (MethodData::profile_return_jsr292_only()) {
++      assert(Method::intrinsic_id_size_in_bytes() == 2, "assuming Method::_intrinsic_id is u2");
++
++      // If we don't profile all invoke bytecodes we must make sure
++      // it's a bytecode we indeed profile. We can't go back to the
++      // beginning of the ProfileData we intend to update to check its
++      // type because we're right after it and we don't know its
++      // length
++      Label do_profile;
++      ld_b(tmp, _bcp_register, 0);
++      addi_d(AT, tmp, -1 * Bytecodes::_invokedynamic);
++      beqz(AT, do_profile);
++      addi_d(AT, tmp, -1 * Bytecodes::_invokehandle);
++      beqz(AT, do_profile);
++
++      get_method(tmp);
++      ld_hu(tmp, Address(tmp, Method::intrinsic_id_offset()));
++      li(AT, static_cast<int>(vmIntrinsics::_compiledLambdaForm));
++      bne(tmp, AT, profile_continue);
++
++      bind(do_profile);
++    }
++
++    Address mdo_ret_addr(mdp, -in_bytes(ReturnTypeEntry::size()));
++    add_d(tmp, ret, R0);
++    profile_obj_type(tmp, mdo_ret_addr);
++
++    bind(profile_continue);
++  }
++}
++
++void InterpreterMacroAssembler::profile_parameters_type(Register mdp, Register tmp1, Register tmp2) {
++  guarantee(T4 == tmp1, "You are required to use T4 as the index register for LoongArch !");
++
++  if (ProfileInterpreter && MethodData::profile_parameters()) {
++    Label profile_continue, done;
++
++    test_method_data_pointer(mdp, profile_continue);
++
++    // Load the offset of the area within the MDO used for
++    // parameters. If it's negative we're not profiling any parameters
++    ld_w(tmp1, mdp, in_bytes(MethodData::parameters_type_data_di_offset()) - in_bytes(MethodData::data_offset()));
++    blt(tmp1, R0, profile_continue);
++
++    // Compute a pointer to the area for parameters from the offset
++    // and move the pointer to the slot for the last
++    // parameter. Collect profiling from last parameter down.
++ // mdo start + parameters offset + array length - 1 ++ add_d(mdp, mdp, tmp1); ++ ld_d(tmp1, mdp, in_bytes(ArrayData::array_len_offset())); ++ decrement(tmp1, TypeStackSlotEntries::per_arg_count()); ++ ++ ++ Label loop; ++ bind(loop); ++ ++ int off_base = in_bytes(ParametersTypeData::stack_slot_offset(0)); ++ int type_base = in_bytes(ParametersTypeData::type_offset(0)); ++ Address::ScaleFactor per_arg_scale = Address::times(DataLayout::cell_size); ++ Address arg_type(mdp, tmp1, per_arg_scale, type_base); ++ ++ // load offset on the stack from the slot for this parameter ++ alsl_d(AT, tmp1, mdp, per_arg_scale - 1); ++ ld_d(tmp2, AT, off_base); ++ ++ sub_d(tmp2, R0, tmp2); ++ ++ // read the parameter from the local area ++ alsl_d(AT, tmp2, _locals_register, Interpreter::logStackElementSize - 1); ++ ld_d(tmp2, AT, 0); ++ ++ // profile the parameter ++ profile_obj_type(tmp2, arg_type); ++ ++ // go to next parameter ++ decrement(tmp1, TypeStackSlotEntries::per_arg_count()); ++ blt(R0, tmp1, loop); ++ ++ bind(profile_continue); ++ } ++} ++ ++void InterpreterMacroAssembler::verify_FPU(int stack_depth, TosState state) { ++} ++ ++void InterpreterMacroAssembler::notify_method_entry() { ++ // Whenever JVMTI is interp_only_mode, method entry/exit events are sent to ++ // track stack depth. If it is possible to enter interp_only_mode we add ++ // the code to check if the event should be sent. ++ if (JvmtiExport::can_post_interpreter_events()) { ++ Label L; ++ ld_wu(AT, Address(TREG, JavaThread::interp_only_mode_offset())); ++ beqz(AT, L); ++ call_VM(noreg, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::post_method_entry)); ++ bind(L); ++ } ++ ++ { ++ SkipIfEqual skip(this, &DTraceMethodProbes, false); ++ get_method(c_rarg1); ++ call_VM_leaf( ++ CAST_FROM_FN_PTR(address, SharedRuntime::dtrace_method_entry), ++ TREG, c_rarg1); ++ } ++ ++ // RedefineClasses() tracing support for obsolete method entry ++ if (log_is_enabled(Trace, redefine, class, obsolete)) { ++ get_method(c_rarg1); ++ call_VM_leaf( ++ CAST_FROM_FN_PTR(address, SharedRuntime::rc_trace_method_entry), ++ TREG, c_rarg1); ++ } ++} ++ ++void InterpreterMacroAssembler::notify_method_exit( ++ TosState state, NotifyMethodExitMode mode) { ++ // Whenever JVMTI is interp_only_mode, method entry/exit events are sent to ++ // track stack depth. If it is possible to enter interp_only_mode we add ++ // the code to check if the event should be sent. ++ if (mode == NotifyJVMTI && JvmtiExport::can_post_interpreter_events()) { ++ Label skip; ++ // Note: frame::interpreter_frame_result has a dependency on how the ++ // method result is saved across the call to post_method_exit. If this ++ // is changed then the interpreter_frame_result implementation will ++ // need to be updated too. ++ ++ // template interpreter will leave the result on the top of the stack. ++ push(state); ++ ld_wu(AT, Address(TREG, JavaThread::interp_only_mode_offset())); ++ beqz(AT, skip); ++ call_VM(noreg, ++ CAST_FROM_FN_PTR(address, InterpreterRuntime::post_method_exit)); ++ bind(skip); ++ pop(state); ++ } ++ ++ { ++ SkipIfEqual skip(this, &DTraceMethodProbes, false); ++ push(state); ++ get_method(c_rarg1); ++ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::dtrace_method_exit), ++ TREG, c_rarg1); ++ pop(state); ++ } ++} ++ ++// Jump if ((*counter_addr += increment) & mask) satisfies the condition. 
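
The routine that follows implements this check. In C terms it behaves roughly like the helper below; this is a minimal sketch with illustrative names and constants (the real counter and mask live in the MethodCounters/MethodData, and only the Assembler::zero condition is implemented).

    #include <cstdint>

    // Bump a 32-bit counter in memory, AND it with a mask, and report whether
    // the masked value reached zero (the condition used to take the jump).
    static bool increment_mask_and_hit(int32_t* counter_addr, int increment, int32_t mask) {
      int32_t value = *counter_addr + increment;  // ld_w + addi_w
      *counter_addr = value;                      // st_w
      return (value & mask) == 0;                 // ld_w(mask) + andr + beq ..., R0
    }

    int main() {
      int32_t counter = 0;
      int hits = 0;
      for (int i = 0; i < 256; i++) {
        if (increment_mask_and_hit(&counter, 8, 0x3f8)) {  // step and mask values are only examples
          hits++;
        }
      }
      return hits > 0 ? 0 : 1;
    }
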
++void InterpreterMacroAssembler::increment_mask_and_jump(Address counter_addr, ++ int increment, Address mask, ++ Register scratch, bool preloaded, ++ Condition cond, Label* where) { ++ assert_different_registers(scratch, AT); ++ ++ if (!preloaded) { ++ ld_w(scratch, counter_addr); ++ } ++ addi_w(scratch, scratch, increment); ++ st_w(scratch, counter_addr); ++ ++ ld_w(AT, mask); ++ andr(scratch, scratch, AT); ++ ++ if (cond == Assembler::zero) { ++ beq(scratch, R0, *where); ++ } else { ++ unimplemented(); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/interp_masm_loongarch.hpp b/src/hotspot/cpu/loongarch/interp_masm_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/interp_masm_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/interp_masm_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,272 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_INTERP_MASM_LOONGARCH_64_HPP ++#define CPU_LOONGARCH_INTERP_MASM_LOONGARCH_64_HPP ++ ++#include "asm/assembler.hpp" ++#include "asm/macroAssembler.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "interpreter/invocationCounter.hpp" ++#include "runtime/frame.hpp" ++ ++// This file specializes the assembler with interpreter-specific macros ++ ++typedef ByteSize (*OffsetFunction)(uint); ++ ++class InterpreterMacroAssembler: public MacroAssembler { ++ private: ++ ++ Register _locals_register; // register that contains the pointer to the locals ++ Register _bcp_register; // register that contains the bcp ++ ++ protected: ++ // Interpreter specific version of call_VM_base ++ virtual void call_VM_leaf_base(address entry_point, ++ int number_of_arguments); ++ ++ virtual void call_VM_base(Register oop_result, ++ Register java_thread, ++ Register last_java_sp, ++ address entry_point, ++ int number_of_arguments, ++ bool check_exceptions); ++ ++ // base routine for all dispatches ++ void dispatch_base(TosState state, address* table, bool verifyoop = true, bool generate_poll = false); ++ ++ public: ++ void jump_to_entry(address entry); ++ // narrow int return value ++ void narrow(Register result); ++ ++ InterpreterMacroAssembler(CodeBuffer* code) : MacroAssembler(code), _locals_register(LVP), _bcp_register(BCP) {} ++ ++ void get_2_byte_integer_at_bcp(Register reg, Register tmp, int offset); ++ void get_4_byte_integer_at_bcp(Register reg, int offset); ++ ++ virtual void check_and_handle_popframe(Register java_thread); ++ virtual void check_and_handle_earlyret(Register java_thread); ++ ++ void load_earlyret_value(TosState state); ++ ++ // Interpreter-specific registers ++ void save_bcp() { ++ st_d(BCP, FP, frame::interpreter_frame_bcp_offset * wordSize); ++ } ++ ++ void restore_bcp() { ++ ld_d(BCP, FP, frame::interpreter_frame_bcp_offset * wordSize); ++ } ++ ++ void restore_locals() { ++ ld_d(LVP, FP, frame::interpreter_frame_locals_offset * wordSize); ++ alsl_d(LVP, LVP, FP, LogBytesPerWord-1); ++ } ++ ++ void get_dispatch(); ++ ++ // Helpers for runtime call arguments/results ++ void get_method(Register reg) { ++ ld_d(reg, FP, frame::interpreter_frame_method_offset * wordSize); ++ } ++ ++ void get_const(Register reg){ ++ get_method(reg); ++ ld_d(reg, reg, in_bytes(Method::const_offset())); ++ } ++ ++ void get_constant_pool(Register reg) { ++ get_const(reg); ++ ld_d(reg, reg, in_bytes(ConstMethod::constants_offset())); ++ } ++ ++ void get_constant_pool_cache(Register reg) { ++ get_constant_pool(reg); ++ ld_d(reg, reg, in_bytes(ConstantPool::cache_offset())); ++ } ++ ++ void get_cpool_and_tags(Register cpool, Register tags) { ++ get_constant_pool(cpool); ++ ld_d(tags, cpool, in_bytes(ConstantPool::tags_offset())); ++ } ++ ++ void get_unsigned_2_byte_index_at_bcp(Register reg, int bcp_offset); ++ void get_cache_and_index_at_bcp(Register cache, Register index, int bcp_offset, size_t index_size = sizeof(u2)); ++ void get_cache_and_index_and_bytecode_at_bcp(Register cache, Register index, Register bytecode, int byte_no, int bcp_offset, size_t index_size = sizeof(u2)); ++ void get_cache_entry_pointer_at_bcp(Register cache, Register tmp, int bcp_offset, size_t index_size = sizeof(u2)); ++ void get_cache_index_at_bcp(Register index, int bcp_offset, size_t index_size = sizeof(u2)); ++ void get_method_counters(Register method, Register mcs, Label& skip); ++ ++ void load_resolved_indy_entry(Register cache, Register index); ++ ++ // load 
cpool->resolved_references(index); ++ void load_resolved_reference_at_index(Register result, Register index, Register tmp); ++ ++ // load cpool->resolved_klass_at(index) ++ void load_resolved_klass_at_index(Register cpool, // the constant pool (corrupted on return) ++ Register index, // the constant pool index (corrupted on return) ++ Register klass); // contains the Klass on return ++ ++ void load_resolved_method_at_index(int byte_no, ++ Register method, ++ Register cache, ++ Register index); ++ ++ void pop_ptr( Register r = FSR); ++ void pop_i( Register r = FSR); ++ void pop_l( Register r = FSR); ++ void pop_f(FloatRegister r = FSF); ++ void pop_d(FloatRegister r = FSF); ++ ++ void push_ptr( Register r = FSR); ++ void push_i( Register r = FSR); ++ void push_l( Register r = FSR); ++ void push_f(FloatRegister r = FSF); ++ void push_d(FloatRegister r = FSF); ++ ++ void pop(Register r ) { ((MacroAssembler*)this)->pop(r); } ++ ++ void push(Register r ) { ((MacroAssembler*)this)->push(r); } ++ ++ void pop(TosState state); // transition vtos -> state ++ void push(TosState state); // transition state -> vtos ++ ++ void empty_expression_stack() { ++ ld_d(SP, FP, frame::interpreter_frame_monitor_block_top_offset * wordSize); ++ // null last_sp until next java call ++ st_d(R0, FP, frame::interpreter_frame_last_sp_offset * wordSize); ++ } ++ ++ // Super call_VM calls - correspond to MacroAssembler::call_VM(_leaf) calls ++ void load_ptr(int n, Register val); ++ void store_ptr(int n, Register val); ++ ++ // Generate a subtype check: branch to ok_is_subtype if sub_klass is ++ // a subtype of super_klass. ++ //void gen_subtype_check( Register sub_klass, Label &ok_is_subtype ); ++ void gen_subtype_check( Register Rsup_klass, Register sub_klass, Label &ok_is_subtype ); ++ ++ // Dispatching ++ void dispatch_prolog(TosState state, int step = 0); ++ void dispatch_epilog(TosState state, int step = 0); ++ void dispatch_only(TosState state, bool generate_poll = false); ++ void dispatch_only_normal(TosState state); ++ void dispatch_only_noverify(TosState state); ++ void dispatch_next(TosState state, int step = 0, bool generate_poll = false); ++ void dispatch_via (TosState state, address* table); ++ ++ // jump to an invoked target ++ void prepare_to_jump_from_interpreted(); ++ void jump_from_interpreted(Register method); ++ ++ ++ // Returning from interpreted functions ++ // ++ // Removes the current activation (incl. unlocking of monitors) ++ // and sets up the return address. This code is also used for ++ // exception unwindwing. In that case, we do not want to throw ++ // IllegalMonitorStateExceptions, since that might get us into an ++ // infinite rethrow exception loop. ++ // Additionally this code is used for popFrame and earlyReturn. ++ // In popFrame case we want to skip throwing an exception, ++ // installing an exception, and notifying jvmdi. ++ // In earlyReturn case we only want to skip throwing an exception ++ // and installing an exception. 
++ void remove_activation(TosState state, ++ bool throw_monitor_exception = true, ++ bool install_monitor_exception = true, ++ bool notify_jvmdi = true); ++ ++ // Object locking ++ void lock_object (Register lock_reg); ++ void unlock_object(Register lock_reg); ++ ++ // Interpreter profiling operations ++ void set_method_data_pointer_for_bcp(); ++ void test_method_data_pointer(Register mdp, Label& zero_continue); ++ void verify_method_data_pointer(); ++ ++ void set_mdp_data_at(Register mdp_in, int constant, Register value); ++ void increment_mdp_data_at(Register mdp_in, int constant, ++ bool decrement = false); ++ void increment_mdp_data_at(Register mdp_in, Register reg, int constant, ++ bool decrement = false); ++ void increment_mask_and_jump(Address counter_addr, ++ int increment, Address mask, ++ Register scratch, bool preloaded, ++ Condition cond, Label* where); ++ void set_mdp_flag_at(Register mdp_in, int flag_constant); ++ void test_mdp_data_at(Register mdp_in, int offset, Register value, ++ Register test_value_out, ++ Label& not_equal_continue); ++ ++ void record_klass_in_profile(Register receiver, Register mdp, ++ Register reg2, bool is_virtual_call); ++ void record_klass_in_profile_helper(Register receiver, Register mdp, ++ Register reg2, int start_row, ++ Label& done, bool is_virtual_call); ++ ++ void record_item_in_profile_helper(Register item, Register mdp, ++ Register reg2, int start_row, Label& done, int total_rows, ++ OffsetFunction item_offset_fn, OffsetFunction item_count_offset_fn, ++ int non_profiled_offset); ++ void update_mdp_by_offset(Register mdp_in, int offset_of_offset); ++ void update_mdp_by_offset(Register mdp_in, Register reg, int offset_of_disp); ++ void update_mdp_by_constant(Register mdp_in, int constant); ++ void update_mdp_for_ret(Register return_bci); ++ ++ void profile_taken_branch(Register mdp, Register bumped_count); ++ void profile_not_taken_branch(Register mdp); ++ void profile_call(Register mdp); ++ void profile_final_call(Register mdp); ++ void profile_virtual_call(Register receiver, Register mdp, ++ Register scratch2, ++ bool receiver_can_be_null = false); ++ void profile_ret(Register return_bci, Register mdp); ++ void profile_null_seen(Register mdp); ++ void profile_typecheck(Register mdp, Register klass, Register scratch); ++ void profile_typecheck_failed(Register mdp); ++ void profile_switch_default(Register mdp); ++ void profile_switch_case(Register index_in_scratch, Register mdp, ++ Register scratch2); ++ ++ // Debugging ++ // only if +VerifyFPU && (state == ftos || state == dtos) ++ void verify_FPU(int stack_depth, TosState state = ftos); ++ ++ void profile_obj_type(Register obj, const Address& mdo_addr); ++ void profile_arguments_type(Register mdp, Register callee, Register tmp, bool is_virtual); ++ void profile_return_type(Register mdp, Register ret, Register tmp); ++ void profile_parameters_type(Register mdp, Register tmp1, Register tmp2); ++ ++ typedef enum { NotifyJVMTI, SkipNotifyJVMTI } NotifyMethodExitMode; ++ ++ // support for jvmti/dtrace ++ void notify_method_entry(); ++ void notify_method_exit(TosState state, NotifyMethodExitMode mode); ++}; ++ ++#endif // CPU_LOONGARCH_INTERP_MASM_LOONGARCH_64_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/interpreterRT_loongarch_64.cpp b/src/hotspot/cpu/loongarch/interpreterRT_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/interpreterRT_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ 
b/src/hotspot/cpu/loongarch/interpreterRT_loongarch_64.cpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,264 @@ ++/* ++ * Copyright (c) 2003, 2012, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "interpreter/interp_masm.hpp" ++#include "interpreter/interpreter.hpp" ++#include "interpreter/interpreterRuntime.hpp" ++#include "memory/allocation.inline.hpp" ++#include "memory/universe.hpp" ++#include "oops/method.hpp" ++#include "oops/oop.inline.hpp" ++#include "runtime/handles.inline.hpp" ++#include "runtime/icache.hpp" ++#include "runtime/interfaceSupport.inline.hpp" ++#include "runtime/signature.hpp" ++ ++#define __ _masm-> ++ ++// Implementation of SignatureHandlerGenerator ++InterpreterRuntime::SignatureHandlerGenerator::SignatureHandlerGenerator( ++ const methodHandle& method, CodeBuffer* buffer) : NativeSignatureIterator(method) { ++ _masm = new MacroAssembler(buffer); ++ _num_int_args = (method->is_static() ? 1 : 0); ++ _num_fp_args = 0; ++ _stack_offset = 0; ++} ++ ++void InterpreterRuntime::SignatureHandlerGenerator::move(int from_offset, int to_offset) { ++ __ ld_d(temp(), from(), Interpreter::local_offset_in_bytes(from_offset)); ++ __ st_d(temp(), to(), to_offset * longSize); ++} ++ ++void InterpreterRuntime::SignatureHandlerGenerator::box(int from_offset, int to_offset) { ++ __ addi_d(temp(), from(),Interpreter::local_offset_in_bytes(from_offset) ); ++ __ ld_w(AT, from(), Interpreter::local_offset_in_bytes(from_offset) ); ++ ++ __ maskeqz(temp(), temp(), AT); ++ __ st_w(temp(), to(), to_offset * wordSize); ++} ++ ++void InterpreterRuntime::SignatureHandlerGenerator::generate(uint64_t fingerprint) { ++ // generate code to handle arguments ++ iterate(fingerprint); ++ // return result handler ++ __ li(V0, AbstractInterpreter::result_handler(method()->result_type())); ++ // return ++ __ jr(RA); ++ ++ __ flush(); ++} ++ ++void InterpreterRuntime::SignatureHandlerGenerator::pass_int() { ++ if (_num_int_args < Argument::n_int_register_parameters_c - 1) { ++ __ ld_w(as_Register(++_num_int_args + A0->encoding()), from(), Interpreter::local_offset_in_bytes(offset())); ++ } else { ++ __ ld_w(AT, from(), Interpreter::local_offset_in_bytes(offset())); ++ __ st_w(AT, to(), _stack_offset); ++ _stack_offset += wordSize; ++ } ++} ++ ++// the jvm specifies that long type takes 2 stack spaces, so in do_long(), _offset += 2. 
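
pass_int() above and pass_long() below follow the LP64 native calling convention: integer arguments are placed in A1..A7 (A0 already carries the JNIEnv*), and anything beyond that is stored into the outgoing stack area. A stand-alone sketch of that bookkeeping, with illustrative names, might look like this:

    #include <cstdio>

    // Rough model of SignatureHandlerGenerator's integer-argument placement.
    // kIntArgRegs stands in for Argument::n_int_register_parameters_c (A0..A7).
    const int kIntArgRegs = 8;

    struct HandlerState {
      int num_int_args;   // integer args already placed (starts at 1 for static methods)
      int stack_offset;   // byte offset of the next outgoing stack slot
    };

    static void place_int_arg(HandlerState& s) {
      if (s.num_int_args < kIntArgRegs - 1) {
        std::printf("-> A%d\n", ++s.num_int_args);    // as_Register(++_num_int_args + A0->encoding())
      } else {
        std::printf("-> SP + %d\n", s.stack_offset);  // st_w/st_d into the native out-args area
        s.stack_offset += 8;                          // one word per spilled argument
      }
    }

    int main() {
      HandlerState s = {1, 0};   // static method: the A1 slot is left for the class argument
      for (int i = 0; i < 9; i++) {
        place_int_arg(s);        // prints A2..A7, then SP + 0, SP + 8, SP + 16
      }
      return 0;
    }
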
++void InterpreterRuntime::SignatureHandlerGenerator::pass_long() { ++ if (_num_int_args < Argument::n_int_register_parameters_c - 1) { ++ __ ld_d(as_Register(++_num_int_args + A0->encoding()), from(), Interpreter::local_offset_in_bytes(offset() + 1)); ++ } else { ++ __ ld_d(AT, from(), Interpreter::local_offset_in_bytes(offset() + 1)); ++ __ st_d(AT, to(), _stack_offset); ++ _stack_offset += wordSize; ++ } ++} ++ ++void InterpreterRuntime::SignatureHandlerGenerator::pass_object() { ++ if (_num_int_args < Argument::n_int_register_parameters_c - 1) { ++ Register reg = as_Register(++_num_int_args + A0->encoding()); ++ if (_num_int_args == 1) { ++ assert(offset() == 0, "argument register 1 can only be (non-null) receiver"); ++ __ addi_d(reg, from(), Interpreter::local_offset_in_bytes(offset())); ++ } else { ++ __ ld_d(reg, from(), Interpreter::local_offset_in_bytes(offset())); ++ __ addi_d(AT, from(), Interpreter::local_offset_in_bytes(offset())); ++ __ maskeqz(reg, AT, reg); ++ } ++ } else { ++ __ ld_d(temp(), from(), Interpreter::local_offset_in_bytes(offset())); ++ __ addi_d(AT, from(), Interpreter::local_offset_in_bytes(offset())); ++ __ maskeqz(temp(), AT, temp()); ++ __ st_d(temp(), to(), _stack_offset); ++ _stack_offset += wordSize; ++ } ++} ++ ++void InterpreterRuntime::SignatureHandlerGenerator::pass_float() { ++ if (_num_fp_args < Argument::n_float_register_parameters_c) { ++ __ fld_s(as_FloatRegister(_num_fp_args++), from(), Interpreter::local_offset_in_bytes(offset())); ++ } else if (_num_int_args < Argument::n_int_register_parameters_c - 1) { ++ __ ld_w(as_Register(++_num_int_args + A0->encoding()), from(), Interpreter::local_offset_in_bytes(offset())); ++ } else { ++ __ ld_w(AT, from(), Interpreter::local_offset_in_bytes(offset())); ++ __ st_w(AT, to(), _stack_offset); ++ _stack_offset += wordSize; ++ } ++} ++ ++// the jvm specifies that double type takes 2 stack spaces, so in do_double(), _offset += 2. 
++void InterpreterRuntime::SignatureHandlerGenerator::pass_double() { ++ if (_num_fp_args < Argument::n_float_register_parameters_c) { ++ __ fld_d(as_FloatRegister(_num_fp_args++), from(), Interpreter::local_offset_in_bytes(offset() + 1)); ++ } else if (_num_int_args < Argument::n_int_register_parameters_c - 1) { ++ __ ld_d(as_Register(++_num_int_args + A0->encoding()), from(), Interpreter::local_offset_in_bytes(offset() + 1)); ++ } else { ++ __ ld_d(AT, from(), Interpreter::local_offset_in_bytes(offset() + 1)); ++ __ st_d(AT, to(), _stack_offset); ++ _stack_offset += wordSize; ++ } ++} ++ ++ ++Register InterpreterRuntime::SignatureHandlerGenerator::from() { return LVP; } ++Register InterpreterRuntime::SignatureHandlerGenerator::to() { return SP; } ++Register InterpreterRuntime::SignatureHandlerGenerator::temp() { return T8; } ++ ++// Implementation of SignatureHandlerLibrary ++ ++void SignatureHandlerLibrary::pd_set_handler(address handler) {} ++ ++ ++class SlowSignatureHandler ++ : public NativeSignatureIterator { ++ private: ++ address _from; ++ intptr_t* _to; ++ intptr_t* _int_args; ++ intptr_t* _fp_args; ++ intptr_t* _fp_identifiers; ++ unsigned int _num_int_args; ++ unsigned int _num_fp_args; ++ ++ virtual void pass_int() ++ { ++ jint from_obj = *(jint *)(_from+Interpreter::local_offset_in_bytes(0)); ++ _from -= Interpreter::stackElementSize; ++ ++ if (_num_int_args < Argument::n_int_register_parameters_c - 1) { ++ *_int_args++ = from_obj; ++ _num_int_args++; ++ } else { ++ *_to++ = from_obj; ++ } ++ } ++ ++ virtual void pass_long() ++ { ++ intptr_t from_obj = *(intptr_t*)(_from+Interpreter::local_offset_in_bytes(1)); ++ _from -= 2 * Interpreter::stackElementSize; ++ ++ if (_num_int_args < Argument::n_int_register_parameters_c - 1) { ++ *_int_args++ = from_obj; ++ _num_int_args++; ++ } else { ++ *_to++ = from_obj; ++ } ++ } ++ ++ virtual void pass_object() ++ { ++ intptr_t *from_addr = (intptr_t*)(_from + Interpreter::local_offset_in_bytes(0)); ++ _from -= Interpreter::stackElementSize; ++ ++ if (_num_int_args < Argument::n_int_register_parameters_c - 1) { ++ *_int_args++ = (*from_addr == 0) ? (intptr_t)0 : (intptr_t)from_addr; ++ _num_int_args++; ++ } else { ++ *_to++ = (*from_addr == 0) ? (intptr_t)0 : (intptr_t)from_addr; ++ } ++ } ++ ++ virtual void pass_float() ++ { ++ jint from_obj = *(jint *)(_from+Interpreter::local_offset_in_bytes(0)); ++ _from -= Interpreter::stackElementSize; ++ ++ if (_num_fp_args < Argument::n_float_register_parameters_c) { ++ *_fp_args++ = from_obj; ++ _num_fp_args++; ++ } else if (_num_int_args < Argument::n_int_register_parameters_c - 1) { ++ *_int_args++ = from_obj; ++ _num_int_args++; ++ } else { ++ *_to++ = from_obj; ++ } ++ } ++ ++ virtual void pass_double() ++ { ++ intptr_t from_obj = *(intptr_t*)(_from+Interpreter::local_offset_in_bytes(1)); ++ _from -= 2*Interpreter::stackElementSize; ++ ++ if (_num_fp_args < Argument::n_float_register_parameters_c) { ++ *_fp_args++ = from_obj; ++ *_fp_identifiers |= (1 << _num_fp_args); // mark as double ++ _num_fp_args++; ++ } else if (_num_int_args < Argument::n_int_register_parameters_c - 1) { ++ *_int_args++ = from_obj; ++ _num_int_args++; ++ } else { ++ *_to++ = from_obj; ++ } ++ } ++ ++ public: ++ SlowSignatureHandler(methodHandle method, address from, intptr_t* to) ++ : NativeSignatureIterator(method) ++ { ++ _from = from; ++ _to = to; ++ ++ // see TemplateInterpreterGenerator::generate_slow_signature_handler() ++ _int_args = to - (method->is_static() ? 
15 : 16); ++ _fp_args = to - 8; ++ _fp_identifiers = to - 9; ++ *(int*) _fp_identifiers = 0; ++ _num_int_args = (method->is_static() ? 1 : 0); ++ _num_fp_args = 0; ++ } ++}; ++ ++ ++JRT_ENTRY(address, ++ InterpreterRuntime::slow_signature_handler(JavaThread* current, ++ Method* method, ++ intptr_t* from, ++ intptr_t* to)) ++ methodHandle m(current, (Method*)method); ++ assert(m->is_native(), "sanity check"); ++ ++ // handle arguments ++ SlowSignatureHandler(m, (address)from, to).iterate(UCONST64(-1)); ++ ++ // return result handler ++ return Interpreter::result_handler(m->result_type()); ++JRT_END +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/interpreterRT_loongarch.hpp b/src/hotspot/cpu/loongarch/interpreterRT_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/interpreterRT_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/interpreterRT_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,62 @@ ++/* ++ * Copyright (c) 1998, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_INTERPRETERRT_LOONGARCH_HPP ++#define CPU_LOONGARCH_INTERPRETERRT_LOONGARCH_HPP ++ ++// This is included in the middle of class Interpreter. ++// Do not include files here. 
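
For orientation: the SignatureHandlerGenerator declared below is driven by NativeSignatureIterator, whose iterate(fingerprint) walks the parameter types and invokes one pass_* hook per parameter; the code emitted by those hooks later copies arguments from the interpreter locals into the native calling convention. A loose illustration of that dispatch (descriptor characters instead of the real iterator; names are hypothetical):

    #include <cstdio>
    #include <string>

    // Map a method descriptor such as "(IJFLjava/lang/String;)V" onto the
    // pass_int/pass_long/pass_float/pass_double/pass_object hooks.
    // Purely illustrative: arrays and the receiver slot are ignored.
    static void iterate_parameters(const std::string& sig) {
      for (std::size_t i = 1; i < sig.size() && sig[i] != ')'; i++) {
        switch (sig[i]) {
          case 'Z': case 'B': case 'C': case 'S': case 'I':
            std::printf("pass_int\n");    break;
          case 'J': std::printf("pass_long\n");   break;
          case 'F': std::printf("pass_float\n");  break;
          case 'D': std::printf("pass_double\n"); break;
          case 'L':
            std::printf("pass_object\n");
            i = sig.find(';', i);         // skip the class name
            break;
          default:  std::printf("(unhandled)\n"); break;
        }
      }
    }

    int main() {
      iterate_parameters("(IJFLjava/lang/String;)V");  // pass_int, pass_long, pass_float, pass_object
      return 0;
    }
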
++ ++// native method calls ++ ++class SignatureHandlerGenerator: public NativeSignatureIterator { ++ private: ++ MacroAssembler* _masm; ++ unsigned int _num_fp_args; ++ unsigned int _num_int_args; ++ int _stack_offset; ++ ++ void move(int from_offset, int to_offset); ++ void box(int from_offset, int to_offset); ++ void pass_int(); ++ void pass_long(); ++ void pass_object(); ++ void pass_float(); ++ void pass_double(); ++ ++ public: ++ // Creation ++ SignatureHandlerGenerator(const methodHandle& method, CodeBuffer* buffer); ++ ++ // Code generation ++ void generate(uint64_t fingerprint); ++ ++ // Code generation support ++ static Register from(); ++ static Register to(); ++ static Register temp(); ++}; ++ ++#endif // CPU_LOONGARCH_INTERPRETERRT_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/javaFrameAnchor_loongarch.hpp b/src/hotspot/cpu/loongarch/javaFrameAnchor_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/javaFrameAnchor_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/javaFrameAnchor_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,86 @@ ++/* ++ * Copyright (c) 2002, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_JAVAFRAMEANCHOR_LOONGARCH_HPP ++#define CPU_LOONGARCH_JAVAFRAMEANCHOR_LOONGARCH_HPP ++ ++private: ++ ++ // FP value associated with _last_Java_sp: ++ intptr_t* volatile _last_Java_fp; // pointer is volatile not what it points to ++ ++public: ++ // Each arch must define reset, save, restore ++ // These are used by objects that only care about: ++ // 1 - initializing a new state (thread creation, javaCalls) ++ // 2 - saving a current state (javaCalls) ++ // 3 - restoring an old state (javaCalls) ++ ++ void clear(void) { ++ // clearing _last_Java_sp must be first ++ _last_Java_sp = nullptr; ++ // fence? 
++ _last_Java_fp = nullptr; ++ _last_Java_pc = nullptr; ++ } ++ ++ void copy(JavaFrameAnchor* src) { ++ // In order to make sure the transition state is valid for "this" ++ // We must clear _last_Java_sp before copying the rest of the new data ++ // ++ // Hack Alert: Temporary bugfix for 4717480/4721647 ++ // To act like previous version (pd_cache_state) don't null _last_Java_sp ++ // unless the value is changing ++ // ++ if (_last_Java_sp != src->_last_Java_sp) ++ _last_Java_sp = nullptr; ++ ++ _last_Java_fp = src->_last_Java_fp; ++ _last_Java_pc = src->_last_Java_pc; ++ // Must be last so profiler will always see valid frame if has_last_frame() is true ++ _last_Java_sp = src->_last_Java_sp; ++ } ++ ++ bool walkable(void) { return _last_Java_sp != nullptr && _last_Java_pc != nullptr; } ++ ++ void make_walkable(); ++ ++ intptr_t* last_Java_sp(void) const { return _last_Java_sp; } ++ ++ const address last_Java_pc(void) { return _last_Java_pc; } ++ ++private: ++ ++ static ByteSize last_Java_fp_offset() { return byte_offset_of(JavaFrameAnchor, _last_Java_fp); } ++ ++public: ++ ++ void set_last_Java_sp(intptr_t* java_sp) { _last_Java_sp = java_sp; } ++ ++ intptr_t* last_Java_fp(void) { return _last_Java_fp; } ++ // Assert (last_Java_sp == nullptr || fp == nullptr) ++ void set_last_Java_fp(intptr_t* fp) { _last_Java_fp = fp; } ++ ++#endif // CPU_LOONGARCH_JAVAFRAMEANCHOR_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/jniFastGetField_loongarch_64.cpp b/src/hotspot/cpu/loongarch/jniFastGetField_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/jniFastGetField_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/jniFastGetField_loongarch_64.cpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,179 @@ ++/* ++ * Copyright (c) 2004, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "code/codeBlob.hpp" ++#include "gc/shared/barrierSet.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "memory/resourceArea.hpp" ++#include "prims/jniFastGetField.hpp" ++#include "prims/jvm_misc.hpp" ++#include "prims/jvmtiExport.hpp" ++#include "runtime/safepoint.hpp" ++ ++#define __ masm-> ++ ++#define BUFFER_SIZE 30*wordSize ++ ++// Instead of issuing membar for LoadLoad barrier, we create address dependency ++// between loads, which is more efficient than membar. ++ ++address JNI_FastGetField::generate_fast_get_int_field0(BasicType type) { ++ const char *name = nullptr; ++ switch (type) { ++ case T_BOOLEAN: name = "jni_fast_GetBooleanField"; break; ++ case T_BYTE: name = "jni_fast_GetByteField"; break; ++ case T_CHAR: name = "jni_fast_GetCharField"; break; ++ case T_SHORT: name = "jni_fast_GetShortField"; break; ++ case T_INT: name = "jni_fast_GetIntField"; break; ++ case T_LONG: name = "jni_fast_GetLongField"; break; ++ case T_FLOAT: name = "jni_fast_GetFloatField"; break; ++ case T_DOUBLE: name = "jni_fast_GetDoubleField"; break; ++ default: ShouldNotReachHere(); ++ } ++ ResourceMark rm; ++ BufferBlob* blob = BufferBlob::create(name, BUFFER_SIZE); ++ CodeBuffer cbuf(blob); ++ MacroAssembler* masm = new MacroAssembler(&cbuf); ++ address fast_entry = __ pc(); ++ Label slow; ++ ++ const Register env = A0; ++ const Register obj = A1; ++ const Register fid = A2; ++ const Register tmp1 = AT; ++ const Register tmp2 = T4; ++ const Register obj_addr = T0; ++ const Register field_val = T0; ++ const Register field_addr = T0; ++ const Register counter_addr = T2; ++ const Register counter_prev_val = T1; ++ ++ __ li(counter_addr, SafepointSynchronize::safepoint_counter_addr()); ++ __ ld_w(counter_prev_val, counter_addr, 0); ++ ++ // Parameters(A0~A3) should not be modified, since they will be used in slow path ++ __ andi(tmp1, counter_prev_val, 1); ++ __ bnez(tmp1, slow); ++ ++ if (JvmtiExport::can_post_field_access()) { ++ // Check to see if a field access watch has been set before we ++ // take the fast path. ++ __ li(tmp2, JvmtiExport::get_field_access_count_addr()); ++ // address dependency ++ __ XOR(tmp1, counter_prev_val, counter_prev_val); ++ __ ldx_w(tmp1, tmp2, tmp1); ++ __ bnez(tmp1, slow); ++ } ++ ++ __ move(obj_addr, obj); ++ // Both obj_addr and tmp2 are clobbered by try_resolve_jobject_in_native. 
++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ bs->try_resolve_jobject_in_native(masm, env, obj_addr, tmp2, slow); ++ ++ __ srli_d(tmp1, fid, 2); // offset ++ __ add_d(field_addr, obj_addr, tmp1); ++ // address dependency ++ __ XOR(tmp1, counter_prev_val, counter_prev_val); ++ ++ assert(count < LIST_CAPACITY, "LIST_CAPACITY too small"); ++ speculative_load_pclist[count] = __ pc(); ++ switch (type) { ++ case T_BOOLEAN: __ ldx_bu (field_val, field_addr, tmp1); break; ++ case T_BYTE: __ ldx_b (field_val, field_addr, tmp1); break; ++ case T_CHAR: __ ldx_hu (field_val, field_addr, tmp1); break; ++ case T_SHORT: __ ldx_h (field_val, field_addr, tmp1); break; ++ case T_INT: __ ldx_w (field_val, field_addr, tmp1); break; ++ case T_LONG: __ ldx_d (field_val, field_addr, tmp1); break; ++ case T_FLOAT: __ ldx_wu (field_val, field_addr, tmp1); break; ++ case T_DOUBLE: __ ldx_d (field_val, field_addr, tmp1); break; ++ default: ShouldNotReachHere(); ++ } ++ ++ // address dependency ++ __ XOR(tmp1, field_val, field_val); ++ __ ldx_w(tmp1, counter_addr, tmp1); ++ __ bne(counter_prev_val, tmp1, slow); ++ ++ switch (type) { ++ case T_FLOAT: __ movgr2fr_w(F0, field_val); break; ++ case T_DOUBLE: __ movgr2fr_d(F0, field_val); break; ++ default: __ move(V0, field_val); break; ++ } ++ ++ __ jr(RA); ++ ++ slowcase_entry_pclist[count++] = __ pc(); ++ __ bind (slow); ++ address slow_case_addr = nullptr; ++ switch (type) { ++ case T_BOOLEAN: slow_case_addr = jni_GetBooleanField_addr(); break; ++ case T_BYTE: slow_case_addr = jni_GetByteField_addr(); break; ++ case T_CHAR: slow_case_addr = jni_GetCharField_addr(); break; ++ case T_SHORT: slow_case_addr = jni_GetShortField_addr(); break; ++ case T_INT: slow_case_addr = jni_GetIntField_addr(); break; ++ case T_LONG: slow_case_addr = jni_GetLongField_addr(); break; ++ case T_FLOAT: slow_case_addr = jni_GetFloatField_addr(); break; ++ case T_DOUBLE: slow_case_addr = jni_GetDoubleField_addr(); break; ++ default: ShouldNotReachHere(); ++ } ++ __ jmp(slow_case_addr); ++ ++ __ flush (); ++ return fast_entry; ++} ++ ++address JNI_FastGetField::generate_fast_get_boolean_field() { ++ return generate_fast_get_int_field0(T_BOOLEAN); ++} ++ ++address JNI_FastGetField::generate_fast_get_byte_field() { ++ return generate_fast_get_int_field0(T_BYTE); ++} ++ ++address JNI_FastGetField::generate_fast_get_char_field() { ++ return generate_fast_get_int_field0(T_CHAR); ++} ++ ++address JNI_FastGetField::generate_fast_get_short_field() { ++ return generate_fast_get_int_field0(T_SHORT); ++} ++ ++address JNI_FastGetField::generate_fast_get_int_field() { ++ return generate_fast_get_int_field0(T_INT); ++} ++ ++address JNI_FastGetField::generate_fast_get_long_field() { ++ return generate_fast_get_int_field0(T_LONG); ++} ++ ++address JNI_FastGetField::generate_fast_get_float_field() { ++ return generate_fast_get_int_field0(T_FLOAT); ++} ++ ++address JNI_FastGetField::generate_fast_get_double_field() { ++ return generate_fast_get_int_field0(T_DOUBLE); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/jniTypes_loongarch.hpp b/src/hotspot/cpu/loongarch/jniTypes_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/jniTypes_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/jniTypes_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,143 @@ ++/* ++ * Copyright (c) 1998, 2010, Oracle and/or its affiliates. All rights reserved. 
++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_JNITYPES_LOONGARCH_HPP ++#define CPU_LOONGARCH_JNITYPES_LOONGARCH_HPP ++ ++#include "jni.h" ++#include "memory/allStatic.hpp" ++#include "oops/oop.hpp" ++ ++// This file holds platform-dependent routines used to write primitive jni ++// types to the array of arguments passed into JavaCalls::call ++ ++class JNITypes : AllStatic { ++ // These functions write a java primitive type (in native format) ++ // to a java stack slot array to be passed as an argument to JavaCalls:calls. ++ // I.e., they are functionally 'push' operations if they have a 'pos' ++ // formal parameter. Note that jlong's and jdouble's are written ++ // _in reverse_ of the order in which they appear in the interpreter ++ // stack. This is because call stubs (see stubGenerator_sparc.cpp) ++ // reverse the argument list constructed by JavaCallArguments (see ++ // javaCalls.hpp). ++ ++private: ++ ++ // 32bit Helper routines. ++ static inline void put_int2r(jint *from, intptr_t *to) { *(jint *)(to++) = from[1]; ++ *(jint *)(to ) = from[0]; } ++ static inline void put_int2r(jint *from, intptr_t *to, int& pos) { put_int2r(from, to + pos); pos += 2; } ++ ++public: ++ // In LoongArch64, the sizeof intptr_t is 8 bytes, and each unit in JavaCallArguments::_value_buffer[] ++ // is 8 bytes. ++ // If we only write the low 4 bytes with (jint *), the high 4-bits will be left with uncertain values. ++ // Then, in JavaCallArguments::parameters(), the whole 8 bytes of a T_INT parameter is loaded. ++ // This error occurs in ReflectInvoke.java ++ // The parameter of DD(int) should be 4 instead of 0x550000004. ++ // ++ // See: [runtime/javaCalls.hpp] ++ ++ static inline void put_int(jint from, intptr_t *to) { *(intptr_t *)(to + 0 ) = from; } ++ static inline void put_int(jint from, intptr_t *to, int& pos) { *(intptr_t *)(to + pos++) = from; } ++ static inline void put_int(jint *from, intptr_t *to, int& pos) { *(intptr_t *)(to + pos++) = *from; } ++ ++ // Longs are stored in native format in one JavaCallArgument slot at ++ // *(to). ++ // In theory, *(to + 1) is an empty slot. But, for several Java2D testing programs (TestBorderLayout, SwingTest), ++ // *(to + 1) must contains a copy of the long value. Otherwise it will corrupts. ++ static inline void put_long(jlong from, intptr_t *to) { ++ *(jlong*) (to + 1) = from; ++ *(jlong*) (to) = from; ++ } ++ ++ // A long parameter occupies two slot. ++ // It must fit the layout rule in methodHandle. 
++ // ++ // See: [runtime/reflection.cpp] Reflection::invoke() ++ // assert(java_args.size_of_parameters() == method->size_of_parameters(), "just checking"); ++ ++ static inline void put_long(jlong from, intptr_t *to, int& pos) { ++ *(jlong*) (to + 1 + pos) = from; ++ *(jlong*) (to + pos) = from; ++ pos += 2; ++ } ++ ++ static inline void put_long(jlong *from, intptr_t *to, int& pos) { ++ *(jlong*) (to + 1 + pos) = *from; ++ *(jlong*) (to + pos) = *from; ++ pos += 2; ++ } ++ ++ // Oops are stored in native format in one JavaCallArgument slot at *to. ++ static inline void put_obj(const Handle& from_handle, intptr_t *to, int& pos) { *(to + pos++) = (intptr_t)from_handle.raw_value(); } ++ static inline void put_obj(jobject from_handle, intptr_t *to, int& pos) { *(to + pos++) = (intptr_t)from_handle; } ++ ++ // Floats are stored in native format in one JavaCallArgument slot at *to. ++ static inline void put_float(jfloat from, intptr_t *to) { *(jfloat *)(to + 0 ) = from; } ++ static inline void put_float(jfloat from, intptr_t *to, int& pos) { *(jfloat *)(to + pos++) = from; } ++ static inline void put_float(jfloat *from, intptr_t *to, int& pos) { *(jfloat *)(to + pos++) = *from; } ++ ++#undef _JNI_SLOT_OFFSET ++#define _JNI_SLOT_OFFSET 0 ++ ++ // Longs are stored in native format in one JavaCallArgument slot at ++ // *(to). ++ // In theory, *(to + 1) is an empty slot. But, for several Java2D testing programs (TestBorderLayout, SwingTest), ++ // *(to + 1) must contains a copy of the long value. Otherwise it will corrupts. ++ static inline void put_double(jdouble from, intptr_t *to) { ++ *(jdouble*) (to + 1) = from; ++ *(jdouble*) (to) = from; ++ } ++ ++ // A long parameter occupies two slot. ++ // It must fit the layout rule in methodHandle. ++ // ++ // See: [runtime/reflection.cpp] Reflection::invoke() ++ // assert(java_args.size_of_parameters() == method->size_of_parameters(), "just checking"); ++ ++ static inline void put_double(jdouble from, intptr_t *to, int& pos) { ++ *(jdouble*) (to + 1 + pos) = from; ++ *(jdouble*) (to + pos) = from; ++ pos += 2; ++ } ++ ++ static inline void put_double(jdouble *from, intptr_t *to, int& pos) { ++ *(jdouble*) (to + 1 + pos) = *from; ++ *(jdouble*) (to + pos) = *from; ++ pos += 2; ++ } ++ ++ // The get_xxx routines, on the other hand, actually _do_ fetch ++ // java primitive types from the interpreter stack. ++ static inline jint get_int (intptr_t *from) { return *(jint *) from; } ++ static inline jlong get_long (intptr_t *from) { return *(jlong *) (from + _JNI_SLOT_OFFSET); } ++ static inline oop get_obj (intptr_t *from) { return *(oop *) from; } ++ static inline jfloat get_float (intptr_t *from) { return *(jfloat *) from; } ++ static inline jdouble get_double(intptr_t *from) { return *(jdouble *)(from + _JNI_SLOT_OFFSET); } ++#undef _JNI_SLOT_OFFSET ++}; ++ ++#endif // CPU_LOONGARCH_JNITYPES_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/jvmciCodeInstaller_loongarch.cpp b/src/hotspot/cpu/loongarch/jvmciCodeInstaller_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/jvmciCodeInstaller_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/jvmciCodeInstaller_loongarch.cpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,194 @@ ++/* ++ * Copyright (c) 2015, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "jvmci/jvmci.hpp" ++#include "jvmci/jvmciCodeInstaller.hpp" ++#include "jvmci/jvmciRuntime.hpp" ++#include "jvmci/jvmciCompilerToVM.hpp" ++#include "jvmci/jvmciJavaClasses.hpp" ++#include "oops/oop.inline.hpp" ++#include "runtime/handles.inline.hpp" ++#include "runtime/jniHandles.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "vmreg_loongarch.inline.hpp" ++ ++jint CodeInstaller::pd_next_offset(NativeInstruction* inst, jint pc_offset, JVMCI_TRAPS) { ++ address pc = (address) inst; ++ if (inst->is_int_branch() || inst->is_float_branch()) { ++ return pc_offset + NativeInstruction::nop_instruction_size; ++ } else if (inst->is_call()) { ++ return pc_offset + NativeCall::instruction_size; ++ } else if (inst->is_far_call()) { ++ return pc_offset + NativeFarCall::instruction_size; ++ } else if (inst->is_jump()) { ++ return pc_offset + NativeGeneralJump::instruction_size; ++ } else if (inst->is_lu12iw_lu32id()) { ++ // match LoongArch64TestAssembler.java emitCall ++ // lu12i_w; lu32i_d; jirl ++ return pc_offset + 3 * NativeInstruction::nop_instruction_size; ++ } else { ++ JVMCI_ERROR_0("unsupported type of instruction for call site"); ++ } ++ return 0; ++} ++ ++void CodeInstaller::pd_patch_OopConstant(int pc_offset, Handle& obj, bool compressed, JVMCI_TRAPS) { ++ address pc = _instructions->start() + pc_offset; ++ jobject value = JNIHandles::make_local(obj()); ++ if (compressed) { ++ NativeMovConstReg* move = nativeMovConstReg_at(pc); ++ move->set_data((intptr_t)(CompressedOops::encode(cast_to_oop(cast_from_oop
(obj()))))); ++ int oop_index = _oop_recorder->find_index(value); ++ RelocationHolder rspec = oop_Relocation::spec(oop_index); ++ _instructions->relocate(pc, rspec, Assembler::narrow_oop_operand); ++ } else { ++ NativeMovConstReg* move = nativeMovConstReg_at(pc); ++ move->set_data((intptr_t)(cast_from_oop
(obj()))); ++ int oop_index = _oop_recorder->find_index(value); ++ RelocationHolder rspec = oop_Relocation::spec(oop_index); ++ _instructions->relocate(pc, rspec); ++ } ++} ++ ++void CodeInstaller::pd_patch_MetaspaceConstant(int pc_offset, HotSpotCompiledCodeStream* stream, u1 tag, JVMCI_TRAPS) { ++ address pc = _instructions->start() + pc_offset; ++ if (tag == PATCH_NARROW_KLASS) { ++ NativeMovConstReg* move = nativeMovConstReg_at(pc); ++ narrowKlass narrowOop = record_narrow_metadata_reference(_instructions, pc, stream, tag, JVMCI_CHECK); ++ move->set_data((intptr_t) narrowOop); ++ JVMCI_event_3("relocating (narrow metaspace constant) at " PTR_FORMAT "/0x%x", p2i(pc), narrowOop); ++ } else { ++ NativeMovConstReg* move = nativeMovConstReg_at(pc); ++ void* reference = record_metadata_reference(_instructions, pc, stream, tag, JVMCI_CHECK); ++ move->set_data((intptr_t) reference); ++ JVMCI_event_3("relocating (metaspace constant) at " PTR_FORMAT "/" PTR_FORMAT, p2i(pc), p2i(reference)); ++ } ++} ++ ++void CodeInstaller::pd_patch_DataSectionReference(int pc_offset, int data_offset, JVMCI_TRAPS) { ++ address pc = _instructions->start() + pc_offset; ++ NativeInstruction* inst = nativeInstruction_at(pc); ++ if (inst->is_pcaddu12i_add()) { ++ address dest = _constants->start() + data_offset; ++ _instructions->relocate(pc, section_word_Relocation::spec((address) dest, CodeBuffer::SECT_CONSTS)); ++ JVMCI_event_3("relocating at " PTR_FORMAT " (+%d) with destination at %d", p2i(pc), pc_offset, data_offset); ++ } else { ++ JVMCI_ERROR("unknown load or move instruction at " PTR_FORMAT, p2i(pc)); ++ } ++} ++ ++void CodeInstaller::pd_relocate_ForeignCall(NativeInstruction* inst, jlong foreign_call_destination, JVMCI_TRAPS) { ++ address pc = (address) inst; ++ if (inst->is_call()) { ++ NativeCall* call = nativeCall_at(pc); ++ call->set_destination((address) foreign_call_destination); ++ _instructions->relocate(call->instruction_address(), runtime_call_Relocation::spec()); ++ } else if (inst->is_far_call()) { ++ NativeFarCall* call = nativeFarCall_at(pc); ++ call->set_destination((address) foreign_call_destination); ++ _instructions->relocate(call->instruction_address(), runtime_call_Relocation::spec()); ++ } else if (inst->is_jump()) { ++ NativeGeneralJump* jump = nativeGeneralJump_at(pc); ++ jump->set_jump_destination((address) foreign_call_destination); ++ _instructions->relocate(jump->instruction_address(), runtime_call_Relocation::spec()); ++ } else if (inst->is_lu12iw_lu32id()) { ++ // match emitCall of LoongArch64TestAssembler.java ++ // lu12i_w; lu32i_d; jirl ++ MacroAssembler::pd_patch_instruction((address)inst, (address)foreign_call_destination); ++ } else { ++ JVMCI_ERROR("unknown call or jump instruction at " PTR_FORMAT, p2i(pc)); ++ } ++ JVMCI_event_3("relocating (foreign call) at " PTR_FORMAT, p2i(inst)); ++} ++ ++void CodeInstaller::pd_relocate_JavaMethod(CodeBuffer &cbuf, methodHandle& method, jint pc_offset, JVMCI_TRAPS) { ++ switch (_next_call_type) { ++ case INLINE_INVOKE: ++ break; ++ case INVOKEVIRTUAL: ++ case INVOKEINTERFACE: { ++ assert(!method->is_static(), "cannot call static method with invokeinterface"); ++ NativeCall* call = nativeCall_at(_instructions->start() + pc_offset); ++ _instructions->relocate(call->instruction_address(), virtual_call_Relocation::spec(_invoke_mark_pc)); ++ call->trampoline_jump(cbuf, SharedRuntime::get_resolve_virtual_call_stub()); ++ break; ++ } ++ case INVOKESTATIC: { ++ assert(method->is_static(), "cannot call non-static method with invokestatic"); 
++ NativeCall* call = nativeCall_at(_instructions->start() + pc_offset); ++ _instructions->relocate(call->instruction_address(), relocInfo::static_call_type); ++ call->trampoline_jump(cbuf, SharedRuntime::get_resolve_static_call_stub()); ++ break; ++ } ++ case INVOKESPECIAL: { ++ assert(!method->is_static(), "cannot call static method with invokespecial"); ++ NativeCall* call = nativeCall_at(_instructions->start() + pc_offset); ++ _instructions->relocate(call->instruction_address(), relocInfo::opt_virtual_call_type); ++ call->trampoline_jump(cbuf, SharedRuntime::get_resolve_opt_virtual_call_stub()); ++ break; ++ } ++ default: ++ JVMCI_ERROR("invalid _next_call_type value"); ++ break; ++ } ++} ++ ++void CodeInstaller::pd_relocate_poll(address pc, jint mark, JVMCI_TRAPS) { ++ switch (mark) { ++ case POLL_NEAR: ++ JVMCI_ERROR("unimplemented"); ++ break; ++ case POLL_FAR: ++ _instructions->relocate(pc, relocInfo::poll_type); ++ break; ++ case POLL_RETURN_NEAR: ++ JVMCI_ERROR("unimplemented"); ++ break; ++ case POLL_RETURN_FAR: ++ _instructions->relocate(pc, relocInfo::poll_return_type); ++ break; ++ default: ++ JVMCI_ERROR("invalid mark value"); ++ break; ++ } ++} ++ ++// convert JVMCI register indices (as used in oop maps) to HotSpot registers ++VMReg CodeInstaller::get_hotspot_reg(jint jvmci_reg, JVMCI_TRAPS) { ++ if (jvmci_reg < Register::number_of_registers) { ++ return as_Register(jvmci_reg)->as_VMReg(); ++ } else { ++ jint floatRegisterNumber = jvmci_reg - Register::number_of_registers; ++ if (floatRegisterNumber >= 0 && floatRegisterNumber < FloatRegister::number_of_registers) { ++ return as_FloatRegister(floatRegisterNumber)->as_VMReg(); ++ } ++ JVMCI_ERROR_NULL("invalid register number: %d", jvmci_reg); ++ } ++} ++ ++bool CodeInstaller::is_general_purpose_reg(VMReg hotspotRegister) { ++ return !hotspotRegister->is_FloatRegister(); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/loongarch_64.ad b/src/hotspot/cpu/loongarch/loongarch_64.ad +--- a/src/hotspot/cpu/loongarch/loongarch_64.ad 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/loongarch_64.ad 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,15770 @@ ++// ++// Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++// Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++// DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++// ++// This code is free software; you can redistribute it and/or modify it ++// under the terms of the GNU General Public License version 2 only, as ++// published by the Free Software Foundation. ++// ++// This code is distributed in the hope that it will be useful, but WITHOUT ++// ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++// version 2 for more details (a copy is included in the LICENSE file that ++// accompanied this code). ++// ++// You should have received a copy of the GNU General Public License version ++// 2 along with this work; if not, write to the Free Software Foundation, ++// Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++// ++// Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++// or visit www.oracle.com if you need additional information or have any ++// questions. 
++// ++// ++ ++// GodSon3 Architecture Description File ++ ++//----------REGISTER DEFINITION BLOCK------------------------------------------ ++// This information is used by the matcher and the register allocator to ++// describe individual registers and classes of registers within the target ++// architecture. ++ ++// format: ++// reg_def name (call convention, c-call convention, ideal type, encoding); ++// call convention : ++// NS = No-Save ++// SOC = Save-On-Call ++// SOE = Save-On-Entry ++// AS = Always-Save ++// ideal type : ++// see opto/opcodes.hpp for more info ++// reg_class name (reg, ...); ++// alloc_class name (reg, ...); ++register %{ ++ ++// General Registers ++// Integer Registers ++ reg_def R0 ( NS, NS, Op_RegI, 0, R0->as_VMReg()); ++ reg_def R0_H ( NS, NS, Op_RegI, 0, R0->as_VMReg()->next()); ++ reg_def RA ( NS, NS, Op_RegI, 1, RA->as_VMReg()); ++ reg_def RA_H ( NS, NS, Op_RegI, 1, RA->as_VMReg()->next()); ++ reg_def TP ( NS, NS, Op_RegI, 2, TP->as_VMReg()); ++ reg_def TP_H ( NS, NS, Op_RegI, 2, TP->as_VMReg()->next()); ++ reg_def SP ( NS, NS, Op_RegI, 3, SP->as_VMReg()); ++ reg_def SP_H ( NS, NS, Op_RegI, 3, SP->as_VMReg()->next()); ++ reg_def A0 (SOC, SOC, Op_RegI, 4, A0->as_VMReg()); ++ reg_def A0_H (SOC, SOC, Op_RegI, 4, A0->as_VMReg()->next()); ++ reg_def A1 (SOC, SOC, Op_RegI, 5, A1->as_VMReg()); ++ reg_def A1_H (SOC, SOC, Op_RegI, 5, A1->as_VMReg()->next()); ++ reg_def A2 (SOC, SOC, Op_RegI, 6, A2->as_VMReg()); ++ reg_def A2_H (SOC, SOC, Op_RegI, 6, A2->as_VMReg()->next()); ++ reg_def A3 (SOC, SOC, Op_RegI, 7, A3->as_VMReg()); ++ reg_def A3_H (SOC, SOC, Op_RegI, 7, A3->as_VMReg()->next()); ++ reg_def A4 (SOC, SOC, Op_RegI, 8, A4->as_VMReg()); ++ reg_def A4_H (SOC, SOC, Op_RegI, 8, A4->as_VMReg()->next()); ++ reg_def A5 (SOC, SOC, Op_RegI, 9, A5->as_VMReg()); ++ reg_def A5_H (SOC, SOC, Op_RegI, 9, A5->as_VMReg()->next()); ++ reg_def A6 (SOC, SOC, Op_RegI, 10, A6->as_VMReg()); ++ reg_def A6_H (SOC, SOC, Op_RegI, 10, A6->as_VMReg()->next()); ++ reg_def A7 (SOC, SOC, Op_RegI, 11, A7->as_VMReg()); ++ reg_def A7_H (SOC, SOC, Op_RegI, 11, A7->as_VMReg()->next()); ++ reg_def T0 (SOC, SOC, Op_RegI, 12, T0->as_VMReg()); ++ reg_def T0_H (SOC, SOC, Op_RegI, 12, T0->as_VMReg()->next()); ++ reg_def T1 (SOC, SOC, Op_RegI, 13, T1->as_VMReg()); ++ reg_def T1_H (SOC, SOC, Op_RegI, 13, T1->as_VMReg()->next()); ++ reg_def T2 (SOC, SOC, Op_RegI, 14, T2->as_VMReg()); ++ reg_def T2_H (SOC, SOC, Op_RegI, 14, T2->as_VMReg()->next()); ++ reg_def T3 (SOC, SOC, Op_RegI, 15, T3->as_VMReg()); ++ reg_def T3_H (SOC, SOC, Op_RegI, 15, T3->as_VMReg()->next()); ++ reg_def T4 (SOC, SOC, Op_RegI, 16, T4->as_VMReg()); ++ reg_def T4_H (SOC, SOC, Op_RegI, 16, T4->as_VMReg()->next()); ++ reg_def T5 (SOC, SOC, Op_RegI, 17, T5->as_VMReg()); ++ reg_def T5_H (SOC, SOC, Op_RegI, 17, T5->as_VMReg()->next()); ++ reg_def T6 (SOC, SOC, Op_RegI, 18, T6->as_VMReg()); ++ reg_def T6_H (SOC, SOC, Op_RegI, 18, T6->as_VMReg()->next()); ++ reg_def T7 (SOC, SOC, Op_RegI, 19, T7->as_VMReg()); ++ reg_def T7_H (SOC, SOC, Op_RegI, 19, T7->as_VMReg()->next()); ++ reg_def T8 (SOC, SOC, Op_RegI, 20, T8->as_VMReg()); ++ reg_def T8_H (SOC, SOC, Op_RegI, 20, T8->as_VMReg()->next()); ++ reg_def RX ( NS, NS, Op_RegI, 21, RX->as_VMReg()); ++ reg_def RX_H ( NS, NS, Op_RegI, 21, RX->as_VMReg()->next()); ++ reg_def FP ( NS, NS, Op_RegI, 22, FP->as_VMReg()); ++ reg_def FP_H ( NS, NS, Op_RegI, 22, FP->as_VMReg()->next()); ++ reg_def S0 (SOC, SOE, Op_RegI, 23, S0->as_VMReg()); ++ reg_def S0_H (SOC, SOE, Op_RegI, 23, S0->as_VMReg()->next()); 
++ reg_def S1 (SOC, SOE, Op_RegI, 24, S1->as_VMReg()); ++ reg_def S1_H (SOC, SOE, Op_RegI, 24, S1->as_VMReg()->next()); ++ reg_def S2 (SOC, SOE, Op_RegI, 25, S2->as_VMReg()); ++ reg_def S2_H (SOC, SOE, Op_RegI, 25, S2->as_VMReg()->next()); ++ reg_def S3 (SOC, SOE, Op_RegI, 26, S3->as_VMReg()); ++ reg_def S3_H (SOC, SOE, Op_RegI, 26, S3->as_VMReg()->next()); ++ reg_def S4 (SOC, SOE, Op_RegI, 27, S4->as_VMReg()); ++ reg_def S4_H (SOC, SOE, Op_RegI, 27, S4->as_VMReg()->next()); ++ reg_def S5 (SOC, SOE, Op_RegI, 28, S5->as_VMReg()); ++ reg_def S5_H (SOC, SOE, Op_RegI, 28, S5->as_VMReg()->next()); ++ reg_def S6 (SOC, SOE, Op_RegI, 29, S6->as_VMReg()); ++ reg_def S6_H (SOC, SOE, Op_RegI, 29, S6->as_VMReg()->next()); ++ reg_def S7 (SOC, SOE, Op_RegI, 30, S7->as_VMReg()); ++ reg_def S7_H (SOC, SOE, Op_RegI, 30, S7->as_VMReg()->next()); ++ reg_def S8 (SOC, SOE, Op_RegI, 31, S8->as_VMReg()); ++ reg_def S8_H (SOC, SOE, Op_RegI, 31, S8->as_VMReg()->next()); ++ ++ ++// Floating/Vector registers. ++ reg_def F0 ( SOC, SOC, Op_RegF, 0, F0->as_VMReg() ); ++ reg_def F0_H ( SOC, SOC, Op_RegF, 0, F0->as_VMReg()->next() ); ++ reg_def F0_J ( SOC, SOC, Op_RegF, 0, F0->as_VMReg()->next(2) ); ++ reg_def F0_K ( SOC, SOC, Op_RegF, 0, F0->as_VMReg()->next(3) ); ++ reg_def F0_L ( SOC, SOC, Op_RegF, 0, F0->as_VMReg()->next(4) ); ++ reg_def F0_M ( SOC, SOC, Op_RegF, 0, F0->as_VMReg()->next(5) ); ++ reg_def F0_N ( SOC, SOC, Op_RegF, 0, F0->as_VMReg()->next(6) ); ++ reg_def F0_O ( SOC, SOC, Op_RegF, 0, F0->as_VMReg()->next(7) ); ++ ++ reg_def F1 ( SOC, SOC, Op_RegF, 1, F1->as_VMReg() ); ++ reg_def F1_H ( SOC, SOC, Op_RegF, 1, F1->as_VMReg()->next() ); ++ reg_def F1_J ( SOC, SOC, Op_RegF, 1, F1->as_VMReg()->next(2) ); ++ reg_def F1_K ( SOC, SOC, Op_RegF, 1, F1->as_VMReg()->next(3) ); ++ reg_def F1_L ( SOC, SOC, Op_RegF, 1, F1->as_VMReg()->next(4) ); ++ reg_def F1_M ( SOC, SOC, Op_RegF, 1, F1->as_VMReg()->next(5) ); ++ reg_def F1_N ( SOC, SOC, Op_RegF, 1, F1->as_VMReg()->next(6) ); ++ reg_def F1_O ( SOC, SOC, Op_RegF, 1, F1->as_VMReg()->next(7) ); ++ ++ reg_def F2 ( SOC, SOC, Op_RegF, 2, F2->as_VMReg() ); ++ reg_def F2_H ( SOC, SOC, Op_RegF, 2, F2->as_VMReg()->next() ); ++ reg_def F2_J ( SOC, SOC, Op_RegF, 2, F2->as_VMReg()->next(2) ); ++ reg_def F2_K ( SOC, SOC, Op_RegF, 2, F2->as_VMReg()->next(3) ); ++ reg_def F2_L ( SOC, SOC, Op_RegF, 2, F2->as_VMReg()->next(4) ); ++ reg_def F2_M ( SOC, SOC, Op_RegF, 2, F2->as_VMReg()->next(5) ); ++ reg_def F2_N ( SOC, SOC, Op_RegF, 2, F2->as_VMReg()->next(6) ); ++ reg_def F2_O ( SOC, SOC, Op_RegF, 2, F2->as_VMReg()->next(7) ); ++ ++ reg_def F3 ( SOC, SOC, Op_RegF, 3, F3->as_VMReg() ); ++ reg_def F3_H ( SOC, SOC, Op_RegF, 3, F3->as_VMReg()->next() ); ++ reg_def F3_J ( SOC, SOC, Op_RegF, 3, F3->as_VMReg()->next(2) ); ++ reg_def F3_K ( SOC, SOC, Op_RegF, 3, F3->as_VMReg()->next(3) ); ++ reg_def F3_L ( SOC, SOC, Op_RegF, 3, F3->as_VMReg()->next(4) ); ++ reg_def F3_M ( SOC, SOC, Op_RegF, 3, F3->as_VMReg()->next(5) ); ++ reg_def F3_N ( SOC, SOC, Op_RegF, 3, F3->as_VMReg()->next(6) ); ++ reg_def F3_O ( SOC, SOC, Op_RegF, 3, F3->as_VMReg()->next(7) ); ++ ++ reg_def F4 ( SOC, SOC, Op_RegF, 4, F4->as_VMReg() ); ++ reg_def F4_H ( SOC, SOC, Op_RegF, 4, F4->as_VMReg()->next() ); ++ reg_def F4_J ( SOC, SOC, Op_RegF, 4, F4->as_VMReg()->next(2) ); ++ reg_def F4_K ( SOC, SOC, Op_RegF, 4, F4->as_VMReg()->next(3) ); ++ reg_def F4_L ( SOC, SOC, Op_RegF, 4, F4->as_VMReg()->next(4) ); ++ reg_def F4_M ( SOC, SOC, Op_RegF, 4, F4->as_VMReg()->next(5) ); ++ reg_def F4_N ( SOC, SOC, Op_RegF, 4, 
F4->as_VMReg()->next(6) ); ++ reg_def F4_O ( SOC, SOC, Op_RegF, 4, F4->as_VMReg()->next(7) ); ++ ++ reg_def F5 ( SOC, SOC, Op_RegF, 5, F5->as_VMReg() ); ++ reg_def F5_H ( SOC, SOC, Op_RegF, 5, F5->as_VMReg()->next() ); ++ reg_def F5_J ( SOC, SOC, Op_RegF, 5, F5->as_VMReg()->next(2) ); ++ reg_def F5_K ( SOC, SOC, Op_RegF, 5, F5->as_VMReg()->next(3) ); ++ reg_def F5_L ( SOC, SOC, Op_RegF, 5, F5->as_VMReg()->next(4) ); ++ reg_def F5_M ( SOC, SOC, Op_RegF, 5, F5->as_VMReg()->next(5) ); ++ reg_def F5_N ( SOC, SOC, Op_RegF, 5, F5->as_VMReg()->next(6) ); ++ reg_def F5_O ( SOC, SOC, Op_RegF, 5, F5->as_VMReg()->next(7) ); ++ ++ reg_def F6 ( SOC, SOC, Op_RegF, 6, F6->as_VMReg() ); ++ reg_def F6_H ( SOC, SOC, Op_RegF, 6, F6->as_VMReg()->next() ); ++ reg_def F6_J ( SOC, SOC, Op_RegF, 6, F6->as_VMReg()->next(2) ); ++ reg_def F6_K ( SOC, SOC, Op_RegF, 6, F6->as_VMReg()->next(3) ); ++ reg_def F6_L ( SOC, SOC, Op_RegF, 6, F6->as_VMReg()->next(4) ); ++ reg_def F6_M ( SOC, SOC, Op_RegF, 6, F6->as_VMReg()->next(5) ); ++ reg_def F6_N ( SOC, SOC, Op_RegF, 6, F6->as_VMReg()->next(6) ); ++ reg_def F6_O ( SOC, SOC, Op_RegF, 6, F6->as_VMReg()->next(7) ); ++ ++ reg_def F7 ( SOC, SOC, Op_RegF, 7, F7->as_VMReg() ); ++ reg_def F7_H ( SOC, SOC, Op_RegF, 7, F7->as_VMReg()->next() ); ++ reg_def F7_J ( SOC, SOC, Op_RegF, 7, F7->as_VMReg()->next(2) ); ++ reg_def F7_K ( SOC, SOC, Op_RegF, 7, F7->as_VMReg()->next(3) ); ++ reg_def F7_L ( SOC, SOC, Op_RegF, 7, F7->as_VMReg()->next(4) ); ++ reg_def F7_M ( SOC, SOC, Op_RegF, 7, F7->as_VMReg()->next(5) ); ++ reg_def F7_N ( SOC, SOC, Op_RegF, 7, F7->as_VMReg()->next(6) ); ++ reg_def F7_O ( SOC, SOC, Op_RegF, 7, F7->as_VMReg()->next(7) ); ++ ++ reg_def F8 ( SOC, SOC, Op_RegF, 8, F8->as_VMReg() ); ++ reg_def F8_H ( SOC, SOC, Op_RegF, 8, F8->as_VMReg()->next() ); ++ reg_def F8_J ( SOC, SOC, Op_RegF, 8, F8->as_VMReg()->next(2) ); ++ reg_def F8_K ( SOC, SOC, Op_RegF, 8, F8->as_VMReg()->next(3) ); ++ reg_def F8_L ( SOC, SOC, Op_RegF, 8, F8->as_VMReg()->next(4) ); ++ reg_def F8_M ( SOC, SOC, Op_RegF, 8, F8->as_VMReg()->next(5) ); ++ reg_def F8_N ( SOC, SOC, Op_RegF, 8, F8->as_VMReg()->next(6) ); ++ reg_def F8_O ( SOC, SOC, Op_RegF, 8, F8->as_VMReg()->next(7) ); ++ ++ reg_def F9 ( SOC, SOC, Op_RegF, 9, F9->as_VMReg() ); ++ reg_def F9_H ( SOC, SOC, Op_RegF, 9, F9->as_VMReg()->next() ); ++ reg_def F9_J ( SOC, SOC, Op_RegF, 9, F9->as_VMReg()->next(2) ); ++ reg_def F9_K ( SOC, SOC, Op_RegF, 9, F9->as_VMReg()->next(3) ); ++ reg_def F9_L ( SOC, SOC, Op_RegF, 9, F9->as_VMReg()->next(4) ); ++ reg_def F9_M ( SOC, SOC, Op_RegF, 9, F9->as_VMReg()->next(5) ); ++ reg_def F9_N ( SOC, SOC, Op_RegF, 9, F9->as_VMReg()->next(6) ); ++ reg_def F9_O ( SOC, SOC, Op_RegF, 9, F9->as_VMReg()->next(7) ); ++ ++ reg_def F10 ( SOC, SOC, Op_RegF, 10, F10->as_VMReg() ); ++ reg_def F10_H ( SOC, SOC, Op_RegF, 10, F10->as_VMReg()->next() ); ++ reg_def F10_J ( SOC, SOC, Op_RegF, 10, F10->as_VMReg()->next(2) ); ++ reg_def F10_K ( SOC, SOC, Op_RegF, 10, F10->as_VMReg()->next(3) ); ++ reg_def F10_L ( SOC, SOC, Op_RegF, 10, F10->as_VMReg()->next(4) ); ++ reg_def F10_M ( SOC, SOC, Op_RegF, 10, F10->as_VMReg()->next(5) ); ++ reg_def F10_N ( SOC, SOC, Op_RegF, 10, F10->as_VMReg()->next(6) ); ++ reg_def F10_O ( SOC, SOC, Op_RegF, 10, F10->as_VMReg()->next(7) ); ++ ++ reg_def F11 ( SOC, SOC, Op_RegF, 11, F11->as_VMReg() ); ++ reg_def F11_H ( SOC, SOC, Op_RegF, 11, F11->as_VMReg()->next() ); ++ reg_def F11_J ( SOC, SOC, Op_RegF, 11, F11->as_VMReg()->next(2) ); ++ reg_def F11_K ( SOC, SOC, Op_RegF, 11, F11->as_VMReg()->next(3) ); ++ 
reg_def F11_L ( SOC, SOC, Op_RegF, 11, F11->as_VMReg()->next(4) ); ++ reg_def F11_M ( SOC, SOC, Op_RegF, 11, F11->as_VMReg()->next(5) ); ++ reg_def F11_N ( SOC, SOC, Op_RegF, 11, F11->as_VMReg()->next(6) ); ++ reg_def F11_O ( SOC, SOC, Op_RegF, 11, F11->as_VMReg()->next(7) ); ++ ++ reg_def F12 ( SOC, SOC, Op_RegF, 12, F12->as_VMReg() ); ++ reg_def F12_H ( SOC, SOC, Op_RegF, 12, F12->as_VMReg()->next() ); ++ reg_def F12_J ( SOC, SOC, Op_RegF, 12, F12->as_VMReg()->next(2) ); ++ reg_def F12_K ( SOC, SOC, Op_RegF, 12, F12->as_VMReg()->next(3) ); ++ reg_def F12_L ( SOC, SOC, Op_RegF, 12, F12->as_VMReg()->next(4) ); ++ reg_def F12_M ( SOC, SOC, Op_RegF, 12, F12->as_VMReg()->next(5) ); ++ reg_def F12_N ( SOC, SOC, Op_RegF, 12, F12->as_VMReg()->next(6) ); ++ reg_def F12_O ( SOC, SOC, Op_RegF, 12, F12->as_VMReg()->next(7) ); ++ ++ reg_def F13 ( SOC, SOC, Op_RegF, 13, F13->as_VMReg() ); ++ reg_def F13_H ( SOC, SOC, Op_RegF, 13, F13->as_VMReg()->next() ); ++ reg_def F13_J ( SOC, SOC, Op_RegF, 13, F13->as_VMReg()->next(2) ); ++ reg_def F13_K ( SOC, SOC, Op_RegF, 13, F13->as_VMReg()->next(3) ); ++ reg_def F13_L ( SOC, SOC, Op_RegF, 13, F13->as_VMReg()->next(4) ); ++ reg_def F13_M ( SOC, SOC, Op_RegF, 13, F13->as_VMReg()->next(5) ); ++ reg_def F13_N ( SOC, SOC, Op_RegF, 13, F13->as_VMReg()->next(6) ); ++ reg_def F13_O ( SOC, SOC, Op_RegF, 13, F13->as_VMReg()->next(7) ); ++ ++ reg_def F14 ( SOC, SOC, Op_RegF, 14, F14->as_VMReg() ); ++ reg_def F14_H ( SOC, SOC, Op_RegF, 14, F14->as_VMReg()->next() ); ++ reg_def F14_J ( SOC, SOC, Op_RegF, 14, F14->as_VMReg()->next(2) ); ++ reg_def F14_K ( SOC, SOC, Op_RegF, 14, F14->as_VMReg()->next(3) ); ++ reg_def F14_L ( SOC, SOC, Op_RegF, 14, F14->as_VMReg()->next(4) ); ++ reg_def F14_M ( SOC, SOC, Op_RegF, 14, F14->as_VMReg()->next(5) ); ++ reg_def F14_N ( SOC, SOC, Op_RegF, 14, F14->as_VMReg()->next(6) ); ++ reg_def F14_O ( SOC, SOC, Op_RegF, 14, F14->as_VMReg()->next(7) ); ++ ++ reg_def F15 ( SOC, SOC, Op_RegF, 15, F15->as_VMReg() ); ++ reg_def F15_H ( SOC, SOC, Op_RegF, 15, F15->as_VMReg()->next() ); ++ reg_def F15_J ( SOC, SOC, Op_RegF, 15, F15->as_VMReg()->next(2) ); ++ reg_def F15_K ( SOC, SOC, Op_RegF, 15, F15->as_VMReg()->next(3) ); ++ reg_def F15_L ( SOC, SOC, Op_RegF, 15, F15->as_VMReg()->next(4) ); ++ reg_def F15_M ( SOC, SOC, Op_RegF, 15, F15->as_VMReg()->next(5) ); ++ reg_def F15_N ( SOC, SOC, Op_RegF, 15, F15->as_VMReg()->next(6) ); ++ reg_def F15_O ( SOC, SOC, Op_RegF, 15, F15->as_VMReg()->next(7) ); ++ ++ reg_def F16 ( SOC, SOC, Op_RegF, 16, F16->as_VMReg() ); ++ reg_def F16_H ( SOC, SOC, Op_RegF, 16, F16->as_VMReg()->next() ); ++ reg_def F16_J ( SOC, SOC, Op_RegF, 16, F16->as_VMReg()->next(2) ); ++ reg_def F16_K ( SOC, SOC, Op_RegF, 16, F16->as_VMReg()->next(3) ); ++ reg_def F16_L ( SOC, SOC, Op_RegF, 16, F16->as_VMReg()->next(4) ); ++ reg_def F16_M ( SOC, SOC, Op_RegF, 16, F16->as_VMReg()->next(5) ); ++ reg_def F16_N ( SOC, SOC, Op_RegF, 16, F16->as_VMReg()->next(6) ); ++ reg_def F16_O ( SOC, SOC, Op_RegF, 16, F16->as_VMReg()->next(7) ); ++ ++ reg_def F17 ( SOC, SOC, Op_RegF, 17, F17->as_VMReg() ); ++ reg_def F17_H ( SOC, SOC, Op_RegF, 17, F17->as_VMReg()->next() ); ++ reg_def F17_J ( SOC, SOC, Op_RegF, 17, F17->as_VMReg()->next(2) ); ++ reg_def F17_K ( SOC, SOC, Op_RegF, 17, F17->as_VMReg()->next(3) ); ++ reg_def F17_L ( SOC, SOC, Op_RegF, 17, F17->as_VMReg()->next(4) ); ++ reg_def F17_M ( SOC, SOC, Op_RegF, 17, F17->as_VMReg()->next(5) ); ++ reg_def F17_N ( SOC, SOC, Op_RegF, 17, F17->as_VMReg()->next(6) ); ++ reg_def F17_O ( SOC, SOC, Op_RegF, 17, 
F17->as_VMReg()->next(7) ); ++ ++ reg_def F18 ( SOC, SOC, Op_RegF, 18, F18->as_VMReg() ); ++ reg_def F18_H ( SOC, SOC, Op_RegF, 18, F18->as_VMReg()->next() ); ++ reg_def F18_J ( SOC, SOC, Op_RegF, 18, F18->as_VMReg()->next(2) ); ++ reg_def F18_K ( SOC, SOC, Op_RegF, 18, F18->as_VMReg()->next(3) ); ++ reg_def F18_L ( SOC, SOC, Op_RegF, 18, F18->as_VMReg()->next(4) ); ++ reg_def F18_M ( SOC, SOC, Op_RegF, 18, F18->as_VMReg()->next(5) ); ++ reg_def F18_N ( SOC, SOC, Op_RegF, 18, F18->as_VMReg()->next(6) ); ++ reg_def F18_O ( SOC, SOC, Op_RegF, 18, F18->as_VMReg()->next(7) ); ++ ++ reg_def F19 ( SOC, SOC, Op_RegF, 19, F19->as_VMReg() ); ++ reg_def F19_H ( SOC, SOC, Op_RegF, 19, F19->as_VMReg()->next() ); ++ reg_def F19_J ( SOC, SOC, Op_RegF, 19, F19->as_VMReg()->next(2) ); ++ reg_def F19_K ( SOC, SOC, Op_RegF, 19, F19->as_VMReg()->next(3) ); ++ reg_def F19_L ( SOC, SOC, Op_RegF, 19, F19->as_VMReg()->next(4) ); ++ reg_def F19_M ( SOC, SOC, Op_RegF, 19, F19->as_VMReg()->next(5) ); ++ reg_def F19_N ( SOC, SOC, Op_RegF, 19, F19->as_VMReg()->next(6) ); ++ reg_def F19_O ( SOC, SOC, Op_RegF, 19, F19->as_VMReg()->next(7) ); ++ ++ reg_def F20 ( SOC, SOC, Op_RegF, 20, F20->as_VMReg() ); ++ reg_def F20_H ( SOC, SOC, Op_RegF, 20, F20->as_VMReg()->next() ); ++ reg_def F20_J ( SOC, SOC, Op_RegF, 20, F20->as_VMReg()->next(2) ); ++ reg_def F20_K ( SOC, SOC, Op_RegF, 20, F20->as_VMReg()->next(3) ); ++ reg_def F20_L ( SOC, SOC, Op_RegF, 20, F20->as_VMReg()->next(4) ); ++ reg_def F20_M ( SOC, SOC, Op_RegF, 20, F20->as_VMReg()->next(5) ); ++ reg_def F20_N ( SOC, SOC, Op_RegF, 20, F20->as_VMReg()->next(6) ); ++ reg_def F20_O ( SOC, SOC, Op_RegF, 20, F20->as_VMReg()->next(7) ); ++ ++ reg_def F21 ( SOC, SOC, Op_RegF, 21, F21->as_VMReg() ); ++ reg_def F21_H ( SOC, SOC, Op_RegF, 21, F21->as_VMReg()->next() ); ++ reg_def F21_J ( SOC, SOC, Op_RegF, 21, F21->as_VMReg()->next(2) ); ++ reg_def F21_K ( SOC, SOC, Op_RegF, 21, F21->as_VMReg()->next(3) ); ++ reg_def F21_L ( SOC, SOC, Op_RegF, 21, F21->as_VMReg()->next(4) ); ++ reg_def F21_M ( SOC, SOC, Op_RegF, 21, F21->as_VMReg()->next(5) ); ++ reg_def F21_N ( SOC, SOC, Op_RegF, 21, F21->as_VMReg()->next(6) ); ++ reg_def F21_O ( SOC, SOC, Op_RegF, 21, F21->as_VMReg()->next(7) ); ++ ++ reg_def F22 ( SOC, SOC, Op_RegF, 22, F22->as_VMReg() ); ++ reg_def F22_H ( SOC, SOC, Op_RegF, 22, F22->as_VMReg()->next() ); ++ reg_def F22_J ( SOC, SOC, Op_RegF, 22, F22->as_VMReg()->next(2) ); ++ reg_def F22_K ( SOC, SOC, Op_RegF, 22, F22->as_VMReg()->next(3) ); ++ reg_def F22_L ( SOC, SOC, Op_RegF, 22, F22->as_VMReg()->next(4) ); ++ reg_def F22_M ( SOC, SOC, Op_RegF, 22, F22->as_VMReg()->next(5) ); ++ reg_def F22_N ( SOC, SOC, Op_RegF, 22, F22->as_VMReg()->next(6) ); ++ reg_def F22_O ( SOC, SOC, Op_RegF, 22, F22->as_VMReg()->next(7) ); ++ ++ reg_def F23 ( SOC, SOC, Op_RegF, 23, F23->as_VMReg() ); ++ reg_def F23_H ( SOC, SOC, Op_RegF, 23, F23->as_VMReg()->next() ); ++ reg_def F23_J ( SOC, SOC, Op_RegF, 23, F23->as_VMReg()->next(2) ); ++ reg_def F23_K ( SOC, SOC, Op_RegF, 23, F23->as_VMReg()->next(3) ); ++ reg_def F23_L ( SOC, SOC, Op_RegF, 23, F23->as_VMReg()->next(4) ); ++ reg_def F23_M ( SOC, SOC, Op_RegF, 23, F23->as_VMReg()->next(5) ); ++ reg_def F23_N ( SOC, SOC, Op_RegF, 23, F23->as_VMReg()->next(6) ); ++ reg_def F23_O ( SOC, SOC, Op_RegF, 23, F23->as_VMReg()->next(7) ); ++ ++ reg_def F24 ( SOC, SOE, Op_RegF, 24, F24->as_VMReg() ); ++ reg_def F24_H ( SOC, SOE, Op_RegF, 24, F24->as_VMReg()->next() ); ++ reg_def F24_J ( SOC, SOC, Op_RegF, 24, F24->as_VMReg()->next(2) ); ++ reg_def F24_K ( 
SOC, SOC, Op_RegF, 24, F24->as_VMReg()->next(3) ); ++ reg_def F24_L ( SOC, SOC, Op_RegF, 24, F24->as_VMReg()->next(4) ); ++ reg_def F24_M ( SOC, SOC, Op_RegF, 24, F24->as_VMReg()->next(5) ); ++ reg_def F24_N ( SOC, SOC, Op_RegF, 24, F24->as_VMReg()->next(6) ); ++ reg_def F24_O ( SOC, SOC, Op_RegF, 24, F24->as_VMReg()->next(7) ); ++ ++ reg_def F25 ( SOC, SOE, Op_RegF, 25, F25->as_VMReg() ); ++ reg_def F25_H ( SOC, SOE, Op_RegF, 25, F25->as_VMReg()->next() ); ++ reg_def F25_J ( SOC, SOC, Op_RegF, 25, F25->as_VMReg()->next(2) ); ++ reg_def F25_K ( SOC, SOC, Op_RegF, 25, F25->as_VMReg()->next(3) ); ++ reg_def F25_L ( SOC, SOC, Op_RegF, 25, F25->as_VMReg()->next(4) ); ++ reg_def F25_M ( SOC, SOC, Op_RegF, 25, F25->as_VMReg()->next(5) ); ++ reg_def F25_N ( SOC, SOC, Op_RegF, 25, F25->as_VMReg()->next(6) ); ++ reg_def F25_O ( SOC, SOC, Op_RegF, 25, F25->as_VMReg()->next(7) ); ++ ++ reg_def F26 ( SOC, SOE, Op_RegF, 26, F26->as_VMReg() ); ++ reg_def F26_H ( SOC, SOE, Op_RegF, 26, F26->as_VMReg()->next() ); ++ reg_def F26_J ( SOC, SOC, Op_RegF, 26, F26->as_VMReg()->next(2) ); ++ reg_def F26_K ( SOC, SOC, Op_RegF, 26, F26->as_VMReg()->next(3) ); ++ reg_def F26_L ( SOC, SOC, Op_RegF, 26, F26->as_VMReg()->next(4) ); ++ reg_def F26_M ( SOC, SOC, Op_RegF, 26, F26->as_VMReg()->next(5) ); ++ reg_def F26_N ( SOC, SOC, Op_RegF, 26, F26->as_VMReg()->next(6) ); ++ reg_def F26_O ( SOC, SOC, Op_RegF, 26, F26->as_VMReg()->next(7) ); ++ ++ reg_def F27 ( SOC, SOE, Op_RegF, 27, F27->as_VMReg() ); ++ reg_def F27_H ( SOC, SOE, Op_RegF, 27, F27->as_VMReg()->next() ); ++ reg_def F27_J ( SOC, SOC, Op_RegF, 27, F27->as_VMReg()->next(2) ); ++ reg_def F27_K ( SOC, SOC, Op_RegF, 27, F27->as_VMReg()->next(3) ); ++ reg_def F27_L ( SOC, SOC, Op_RegF, 27, F27->as_VMReg()->next(4) ); ++ reg_def F27_M ( SOC, SOC, Op_RegF, 27, F27->as_VMReg()->next(5) ); ++ reg_def F27_N ( SOC, SOC, Op_RegF, 27, F27->as_VMReg()->next(6) ); ++ reg_def F27_O ( SOC, SOC, Op_RegF, 27, F27->as_VMReg()->next(7) ); ++ ++ reg_def F28 ( SOC, SOE, Op_RegF, 28, F28->as_VMReg() ); ++ reg_def F28_H ( SOC, SOE, Op_RegF, 28, F28->as_VMReg()->next() ); ++ reg_def F28_J ( SOC, SOC, Op_RegF, 28, F28->as_VMReg()->next(2) ); ++ reg_def F28_K ( SOC, SOC, Op_RegF, 28, F28->as_VMReg()->next(3) ); ++ reg_def F28_L ( SOC, SOC, Op_RegF, 28, F28->as_VMReg()->next(4) ); ++ reg_def F28_M ( SOC, SOC, Op_RegF, 28, F28->as_VMReg()->next(5) ); ++ reg_def F28_N ( SOC, SOC, Op_RegF, 28, F28->as_VMReg()->next(6) ); ++ reg_def F28_O ( SOC, SOC, Op_RegF, 28, F28->as_VMReg()->next(7) ); ++ ++ reg_def F29 ( SOC, SOE, Op_RegF, 29, F29->as_VMReg() ); ++ reg_def F29_H ( SOC, SOE, Op_RegF, 29, F29->as_VMReg()->next() ); ++ reg_def F29_J ( SOC, SOC, Op_RegF, 29, F29->as_VMReg()->next(2) ); ++ reg_def F29_K ( SOC, SOC, Op_RegF, 29, F29->as_VMReg()->next(3) ); ++ reg_def F29_L ( SOC, SOC, Op_RegF, 29, F29->as_VMReg()->next(4) ); ++ reg_def F29_M ( SOC, SOC, Op_RegF, 29, F29->as_VMReg()->next(5) ); ++ reg_def F29_N ( SOC, SOC, Op_RegF, 29, F29->as_VMReg()->next(6) ); ++ reg_def F29_O ( SOC, SOC, Op_RegF, 29, F29->as_VMReg()->next(7) ); ++ ++ reg_def F30 ( SOC, SOE, Op_RegF, 30, F30->as_VMReg() ); ++ reg_def F30_H ( SOC, SOE, Op_RegF, 30, F30->as_VMReg()->next() ); ++ reg_def F30_J ( SOC, SOC, Op_RegF, 30, F30->as_VMReg()->next(2) ); ++ reg_def F30_K ( SOC, SOC, Op_RegF, 30, F30->as_VMReg()->next(3) ); ++ reg_def F30_L ( SOC, SOC, Op_RegF, 30, F30->as_VMReg()->next(4) ); ++ reg_def F30_M ( SOC, SOC, Op_RegF, 30, F30->as_VMReg()->next(5) ); ++ reg_def F30_N ( SOC, SOC, Op_RegF, 30, 
F30->as_VMReg()->next(6) ); ++ reg_def F30_O ( SOC, SOC, Op_RegF, 30, F30->as_VMReg()->next(7) ); ++ ++ reg_def F31 ( SOC, SOE, Op_RegF, 31, F31->as_VMReg() ); ++ reg_def F31_H ( SOC, SOE, Op_RegF, 31, F31->as_VMReg()->next() ); ++ reg_def F31_J ( SOC, SOC, Op_RegF, 31, F31->as_VMReg()->next(2) ); ++ reg_def F31_K ( SOC, SOC, Op_RegF, 31, F31->as_VMReg()->next(3) ); ++ reg_def F31_L ( SOC, SOC, Op_RegF, 31, F31->as_VMReg()->next(4) ); ++ reg_def F31_M ( SOC, SOC, Op_RegF, 31, F31->as_VMReg()->next(5) ); ++ reg_def F31_N ( SOC, SOC, Op_RegF, 31, F31->as_VMReg()->next(6) ); ++ reg_def F31_O ( SOC, SOC, Op_RegF, 31, F31->as_VMReg()->next(7) ); ++ ++ ++// ---------------------------- ++// Special Registers ++//S6 is used for get_thread(S6) ++//S5 is used for heapbase of compressed oop ++alloc_class chunk0( ++ // volatiles ++ T0, T0_H, ++ T1, T1_H, ++ T2, T2_H, ++ T3, T3_H, ++ T4, T4_H, ++ T5, T5_H, ++ T6, T6_H, ++ T8, T8_H, ++ ++ // args ++ A7, A7_H, ++ A6, A6_H, ++ A5, A5_H, ++ A4, A4_H, ++ A3, A3_H, ++ A2, A2_H, ++ A1, A1_H, ++ A0, A0_H, ++ ++ // non-volatiles ++ S0, S0_H, ++ S1, S1_H, ++ S2, S2_H, ++ S3, S3_H, ++ S4, S4_H, ++ S5, S5_H, ++ S6, S6_H, ++ S7, S7_H, ++ S8, S8_H ++ ++ // non-allocatable registers ++ RA, RA_H, ++ SP, SP_H, // stack_pointer ++ FP, FP_H, // frame_pointer ++ T7, T7_H, // rscratch ++ TP, TP_H, ++ RX, RX_H, ++ R0, R0_H, ++ ); ++ ++// F23 is scratch reg ++alloc_class chunk1( ++ ++ // no save ++ F8, F8_H, F8_J, F8_K, F8_L, F8_M, F8_N, F8_O, ++ F9, F9_H, F9_J, F9_K, F9_L, F9_M, F9_N, F9_O, ++ F10, F10_H, F10_J, F10_K, F10_L, F10_M, F10_N, F10_O, ++ F11, F11_H, F11_J, F11_K, F11_L, F11_M, F11_N, F11_O, ++ F12, F12_H, F12_J, F12_K, F12_L, F12_M, F12_N, F12_O, ++ F13, F13_H, F13_J, F13_K, F13_L, F13_M, F13_N, F13_O, ++ F14, F14_H, F14_J, F14_K, F14_L, F14_M, F14_N, F14_O, ++ F15, F15_H, F15_J, F15_K, F15_L, F15_M, F15_N, F15_O, ++ F16, F16_H, F16_J, F16_K, F16_L, F16_M, F16_N, F16_O, ++ F17, F17_H, F17_J, F17_K, F17_L, F17_M, F17_N, F17_O, ++ F18, F18_H, F18_J, F18_K, F18_L, F18_M, F18_N, F18_O, ++ F19, F19_H, F19_J, F19_K, F19_L, F19_M, F19_N, F19_O, ++ F20, F20_H, F20_J, F20_K, F20_L, F20_M, F20_N, F20_O, ++ F21, F21_H, F21_J, F21_K, F21_L, F21_M, F21_N, F21_O, ++ F22, F22_H, F22_J, F22_K, F22_L, F22_M, F22_N, F22_O, ++ ++ // arg registers ++ F0, F0_H, F0_J, F0_K, F0_L, F0_M, F0_N, F0_O, ++ F1, F1_H, F1_J, F1_K, F1_L, F1_M, F1_N, F1_O, ++ F2, F2_H, F2_J, F2_K, F2_L, F2_M, F2_N, F2_O, ++ F3, F3_H, F3_J, F3_K, F3_L, F3_M, F3_N, F3_O, ++ F4, F4_H, F4_J, F4_K, F4_L, F4_M, F4_N, F4_O, ++ F5, F5_H, F5_J, F5_K, F5_L, F5_M, F5_N, F5_O, ++ F6, F6_H, F6_J, F6_K, F6_L, F6_M, F6_N, F6_O, ++ F7, F7_H, F7_J, F7_K, F7_L, F7_M, F7_N, F7_O, ++ ++ // non-volatiles ++ F24, F24_H, F24_J, F24_K, F24_L, F24_M, F24_N, F24_O, ++ F25, F25_H, F25_J, F25_K, F25_L, F25_M, F25_N, F25_O, ++ F26, F26_H, F26_J, F26_K, F26_L, F26_M, F26_N, F26_O, ++ F27, F27_H, F27_J, F27_K, F27_L, F27_M, F27_N, F27_O, ++ F28, F28_H, F28_J, F28_K, F28_L, F28_M, F28_N, F28_O, ++ F29, F29_H, F29_J, F29_K, F29_L, F29_M, F29_N, F29_O, ++ F30, F30_H, F30_J, F30_K, F30_L, F30_M, F30_N, F30_O, ++ F31, F31_H, F31_J, F31_K, F31_L, F31_M, F31_N, F31_O, ++ ++ // non-allocatable registers ++ F23, F23_H, F23_J, F23_K, F23_L, F23_M, F23_N, F23_O, ++ ); ++ ++reg_class s_reg( S0, S1, S2, S3, S4, S5, S6, S7 ); ++reg_class s0_reg( S0 ); ++reg_class s1_reg( S1 ); ++reg_class s2_reg( S2 ); ++reg_class s3_reg( S3 ); ++reg_class s4_reg( S4 ); ++reg_class s5_reg( S5 ); ++reg_class s6_reg( S6 ); ++reg_class s7_reg( S7 ); ++ ++reg_class t_reg( 
T0, T1, T2, T3, T8, T4 ); ++reg_class t0_reg( T0 ); ++reg_class t1_reg( T1 ); ++reg_class t2_reg( T2 ); ++reg_class t3_reg( T3 ); ++reg_class t8_reg( T8 ); ++reg_class t4_reg( T4 ); ++ ++reg_class a_reg( A0, A1, A2, A3, A4, A5, A6, A7 ); ++reg_class a0_reg( A0 ); ++reg_class a1_reg( A1 ); ++reg_class a2_reg( A2 ); ++reg_class a3_reg( A3 ); ++reg_class a4_reg( A4 ); ++reg_class a5_reg( A5 ); ++reg_class a6_reg( A6 ); ++reg_class a7_reg( A7 ); ++ ++// TODO: LA ++//reg_class v0_reg( A0 ); ++//reg_class v1_reg( A1 ); ++ ++reg_class sp_reg( SP, SP_H ); ++reg_class fp_reg( FP, FP_H ); ++ ++reg_class v0_long_reg( A0, A0_H ); ++reg_class v1_long_reg( A1, A1_H ); ++reg_class a0_long_reg( A0, A0_H ); ++reg_class a1_long_reg( A1, A1_H ); ++reg_class a2_long_reg( A2, A2_H ); ++reg_class a3_long_reg( A3, A3_H ); ++reg_class a4_long_reg( A4, A4_H ); ++reg_class a5_long_reg( A5, A5_H ); ++reg_class a6_long_reg( A6, A6_H ); ++reg_class a7_long_reg( A7, A7_H ); ++reg_class t0_long_reg( T0, T0_H ); ++reg_class t1_long_reg( T1, T1_H ); ++reg_class t2_long_reg( T2, T2_H ); ++reg_class t3_long_reg( T3, T3_H ); ++reg_class t8_long_reg( T8, T8_H ); ++reg_class t4_long_reg( T4, T4_H ); ++reg_class s0_long_reg( S0, S0_H ); ++reg_class s1_long_reg( S1, S1_H ); ++reg_class s2_long_reg( S2, S2_H ); ++reg_class s3_long_reg( S3, S3_H ); ++reg_class s4_long_reg( S4, S4_H ); ++reg_class s5_long_reg( S5, S5_H ); ++reg_class s6_long_reg( S6, S6_H ); ++reg_class s7_long_reg( S7, S7_H ); ++ ++reg_class all_reg32( ++ S8, ++ S7, ++ S5, /* S5_heapbase */ ++ /* S6, S6 TREG */ ++ S4, ++ S3, ++ S2, ++ S1, ++ S0, ++ T8, ++ /* T7, AT */ ++ T6, ++ T5, ++ T4, ++ T3, ++ T2, ++ T1, ++ T0, ++ A7, ++ A6, ++ A5, ++ A4, ++ A3, ++ A2, ++ A1, ++ A0, ++ FP ); ++ ++reg_class int_reg %{ ++ return _ANY_REG32_mask; ++%} ++ ++reg_class p_reg %{ ++ return _PTR_REG_mask; ++%} ++ ++reg_class no_CR_reg %{ ++ return _NO_CR_REG_mask; ++%} ++ ++reg_class p_has_s6_reg %{ ++ return _PTR_HAS_S6_REG_mask; ++%} ++ ++reg_class all_reg( ++ S8, S8_H, ++ S7, S7_H, ++ /* S6, S6_H, S6 TREG */ ++ S5, S5_H, /* S5_heapbase */ ++ S4, S4_H, ++ S3, S3_H, ++ S2, S2_H, ++ S1, S1_H, ++ S0, S0_H, ++ T8, T8_H, ++ /* T7, T7_H, AT */ ++ T6, T6_H, ++ T5, T5_H, ++ T4, T4_H, ++ T3, T3_H, ++ T2, T2_H, ++ T1, T1_H, ++ T0, T0_H, ++ A7, A7_H, ++ A6, A6_H, ++ A5, A5_H, ++ A4, A4_H, ++ A3, A3_H, ++ A2, A2_H, ++ A1, A1_H, ++ A0, A0_H, ++ FP, FP_H ++ ); ++ ++ ++reg_class long_reg %{ ++ return _ANY_REG_mask; ++%} ++ ++// Floating point registers. 
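// The integer alloc_class chunk0 above ends with a "non-allocatable registers"
// group (RA, SP, FP, T7, TP, RX, R0) that the allocator must never hand out.
// A standalone sketch of that bookkeeping, with a std::bitset standing in for
// the ADLC register mask; the bit positions are the reg_def encodings listed
// earlier, and allocatable_gprs is an illustrative name, not a HotSpot API.
#include <bitset>
inline std::bitset<32> allocatable_gprs() {
  std::bitset<32> mask;
  mask.set();        // start from all 32 integer registers
  mask.reset(0);     // R0 - hard-wired zero
  mask.reset(1);     // RA - return address
  mask.reset(2);     // TP - ABI thread pointer
  mask.reset(3);     // SP - stack pointer
  mask.reset(19);    // T7 - assembler scratch (AT)
  mask.reset(21);    // RX - reserved register
  mask.reset(22);    // FP - frame pointer
  return mask;
}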
++// F31 are not used as temporary registers in D2I ++reg_class flt_reg( F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F14, F15, F16, F17, F18, F19, F20, F21, F22, F24, F25, F26, F27, F28, F29, F30, F31); ++ ++reg_class dbl_reg( F0, F0_H, ++ F1, F1_H, ++ F2, F2_H, ++ F3, F3_H, ++ F4, F4_H, ++ F5, F5_H, ++ F6, F6_H, ++ F7, F7_H, ++ F8, F8_H, ++ F9, F9_H, ++ F10, F10_H, ++ F11, F11_H, ++ F12, F12_H, ++ F13, F13_H, ++ F14, F14_H, ++ F15, F15_H, ++ F16, F16_H, ++ F17, F17_H, ++ F18, F18_H, ++ F19, F19_H, ++ F20, F20_H, ++ F21, F21_H, ++ F22, F22_H, ++ F24, F24_H, ++ F25, F25_H, ++ F26, F26_H, ++ F27, F27_H, ++ F28, F28_H, ++ F29, F29_H, ++ F30, F30_H, ++ F31, F31_H); ++ ++// Class for all 128bit vector registers ++reg_class vectorx_reg( F0, F0_H, F0_J, F0_K, ++ F1, F1_H, F1_J, F1_K, ++ F2, F2_H, F2_J, F2_K, ++ F3, F3_H, F3_J, F3_K, ++ F4, F4_H, F4_J, F4_K, ++ F5, F5_H, F5_J, F5_K, ++ F6, F6_H, F6_J, F6_K, ++ F7, F7_H, F7_J, F7_K, ++ F8, F8_H, F8_J, F8_K, ++ F9, F9_H, F9_J, F9_K, ++ F10, F10_H, F10_J, F10_K, ++ F11, F11_H, F11_J, F11_K, ++ F12, F12_H, F12_J, F12_K, ++ F13, F13_H, F13_J, F13_K, ++ F14, F14_H, F14_J, F14_K, ++ F15, F15_H, F15_J, F15_K, ++ F16, F16_H, F16_J, F16_K, ++ F17, F17_H, F17_J, F17_K, ++ F18, F18_H, F18_J, F18_K, ++ F19, F19_H, F19_J, F19_K, ++ F20, F20_H, F20_J, F20_K, ++ F21, F21_H, F21_J, F21_K, ++ F22, F22_H, F22_J, F22_K, ++ F24, F24_H, F24_J, F24_K, ++ F25, F25_H, F25_J, F25_K, ++ F26, F26_H, F26_J, F26_K, ++ F27, F27_H, F27_J, F27_K, ++ F28, F28_H, F28_J, F28_K, ++ F29, F29_H, F29_J, F29_K, ++ F30, F30_H, F30_J, F30_K, ++ F31, F31_H, F31_J, F31_K); ++ ++// Class for all 256bit vector registers ++reg_class vectory_reg( F0, F0_H, F0_J, F0_K, F0_L, F0_M, F0_N, F0_O, ++ F1, F1_H, F1_J, F1_K, F1_L, F1_M, F1_N, F1_O, ++ F2, F2_H, F2_J, F2_K, F2_L, F2_M, F2_N, F2_O, ++ F3, F3_H, F3_J, F3_K, F3_L, F3_M, F3_N, F3_O, ++ F4, F4_H, F4_J, F4_K, F4_L, F4_M, F4_N, F4_O, ++ F5, F5_H, F5_J, F5_K, F5_L, F5_M, F5_N, F5_O, ++ F6, F6_H, F6_J, F6_K, F6_L, F6_M, F6_N, F6_O, ++ F7, F7_H, F7_J, F7_K, F7_L, F7_M, F7_N, F7_O, ++ F8, F8_H, F8_J, F8_K, F8_L, F8_M, F8_N, F8_O, ++ F9, F9_H, F9_J, F9_K, F9_L, F9_M, F9_N, F9_O, ++ F10, F10_H, F10_J, F10_K, F10_L, F10_M, F10_N, F10_O, ++ F11, F11_H, F11_J, F11_K, F11_L, F11_M, F11_N, F11_O, ++ F12, F12_H, F12_J, F12_K, F12_L, F12_M, F12_N, F12_O, ++ F13, F13_H, F13_J, F13_K, F13_L, F13_M, F13_N, F13_O, ++ F14, F14_H, F14_J, F14_K, F14_L, F14_M, F14_N, F14_O, ++ F15, F15_H, F15_J, F15_K, F15_L, F15_M, F15_N, F15_O, ++ F16, F16_H, F16_J, F16_K, F16_L, F16_M, F16_N, F16_O, ++ F17, F17_H, F17_J, F17_K, F17_L, F17_M, F17_N, F17_O, ++ F18, F18_H, F18_J, F18_K, F18_L, F18_M, F18_N, F18_O, ++ F19, F19_H, F19_J, F19_K, F19_L, F19_M, F19_N, F19_O, ++ F20, F20_H, F20_J, F20_K, F20_L, F20_M, F20_N, F20_O, ++ F21, F21_H, F21_J, F21_K, F21_L, F21_M, F21_N, F21_O, ++ F22, F22_H, F22_J, F22_K, F22_L, F22_M, F22_N, F22_O, ++ F24, F24_H, F24_J, F24_K, F24_L, F24_M, F24_N, F24_O, ++ F25, F25_H, F25_J, F25_K, F25_L, F25_M, F25_N, F25_O, ++ F26, F26_H, F26_J, F26_K, F26_L, F26_M, F26_N, F26_O, ++ F27, F27_H, F27_J, F27_K, F27_L, F27_M, F27_N, F27_O, ++ F28, F28_H, F28_J, F28_K, F28_L, F28_M, F28_N, F28_O, ++ F29, F29_H, F29_J, F29_K, F29_L, F29_M, F29_N, F29_O, ++ F30, F30_H, F30_J, F30_K, F30_L, F30_M, F30_N, F30_O, ++ F31, F31_H, F31_J, F31_K, F31_L, F31_M, F31_N, F31_O); ++ ++// TODO: LA ++//reg_class flt_arg0( F0 ); ++//reg_class dbl_arg0( F0, F0_H ); ++//reg_class dbl_arg1( F1, F1_H ); ++ ++%} ++ ++//----------DEFINITION 
BLOCK--------------------------------------------------- ++// Define name --> value mappings to inform the ADLC of an integer valued name ++// Current support includes integer values in the range [0, 0x7FFFFFFF] ++// Format: ++// int_def <name> ( <int_value>, <expression>); ++// Generated Code in ad_<arch>.hpp ++// #define <name> (<expression>) ++// // value == <int_value> ++// Generated code in ad_<arch>.cpp adlc_verification() ++// assert( <name> == <int_value>, "Expect (<expression>) to equal <int_value>"); ++// ++definitions %{ ++ int_def DEFAULT_COST ( 100, 100); ++ int_def HUGE_COST (1000000, 1000000); ++ ++ // Memory refs are twice as expensive as run-of-the-mill. ++ int_def MEMORY_REF_COST ( 200, DEFAULT_COST * 2); ++ ++ // Branches are even more expensive. ++ int_def BRANCH_COST ( 300, DEFAULT_COST * 3); ++ // we use jr instruction to construct call, so more expensive ++ int_def CALL_COST ( 500, DEFAULT_COST * 5); ++/* ++ int_def EQUAL ( 1, 1 ); ++ int_def NOT_EQUAL ( 2, 2 ); ++ int_def GREATER ( 3, 3 ); ++ int_def GREATER_EQUAL ( 4, 4 ); ++ int_def LESS ( 5, 5 ); ++ int_def LESS_EQUAL ( 6, 6 ); ++*/ ++%} ++ ++ ++ ++//----------SOURCE BLOCK------------------------------------------------------- ++// This is a block of C++ code which provides values, functions, and ++// definitions necessary in the rest of the architecture description ++ ++source_hpp %{ ++// Header information of the source block. ++// Method declarations/definitions which are used outside ++// the ad-scope can conveniently be defined here. ++// ++// To keep related declarations/definitions/uses close together, ++// we switch between source %{ }% and source_hpp %{ }% freely as needed. ++ ++#include "asm/macroAssembler.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "gc/shared/cardTable.hpp" ++#include "gc/shared/cardTableBarrierSet.hpp" ++#include "gc/shared/collectedHeap.hpp" ++#include "opto/addnode.hpp" ++#include "opto/convertnode.hpp" ++#include "runtime/objectMonitor.hpp" ++ ++extern RegMask _ANY_REG32_mask; ++extern RegMask _ANY_REG_mask; ++extern RegMask _PTR_REG_mask; ++extern RegMask _NO_CR_REG_mask; ++extern RegMask _PTR_HAS_S6_REG_mask; ++ ++class CallStubImpl { ++ ++ //-------------------------------------------------------------- ++ //---< Used for optimization in Compile::shorten_branches >--- ++ //-------------------------------------------------------------- ++ ++ public: ++ // Size of call trampoline stub. ++ static uint size_call_trampoline() { ++ return 0; // no call trampolines on this platform ++ } ++ ++ // number of relocations needed by a call trampoline stub ++ static uint reloc_call_trampoline() { ++ return 0; // no call trampolines on this platform ++ } ++}; ++ ++class HandlerImpl { ++ ++ public: ++ ++ static int emit_exception_handler(CodeBuffer &cbuf); ++ static int emit_deopt_handler(CodeBuffer& cbuf); ++ ++ static uint size_exception_handler() { ++ // NativeCall instruction size is the same as NativeJump. ++ // exception handler starts out as jump and can be patched to ++ // a call be deoptimization. (4932387) ++ // Note that this value is also credited (in output.cpp) to ++ // the size of the code section.
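// The size_exception_handler()/size_deopt_handler() bodies that follow round
// the handler size up to a 16-byte boundary with mask arithmetic. A minimal
// standalone sketch of that idiom (plain C++ only; round_up_pow2 is an
// illustrative name, not a HotSpot API, and mask_bits(x, m) is simply x & m):
#include <cstdint>
constexpr uint64_t round_up_pow2(uint64_t size, uint64_t align) {
  // add (align - 1), then clear the low bits: same as mask_bits(size + m, ~m)
  return (size + (align - 1)) & ~(align - 1);
}
static_assert(round_up_pow2(20, 16) == 32, "20 rounds up to 32");
static_assert(round_up_pow2(32, 16) == 32, "multiples of 16 are unchanged");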
++ int size = NativeFarCall::instruction_size; ++ const uintx m = 16 - 1; ++ return mask_bits(size + m, ~m); ++ //return round_to(size, 16); ++ } ++ ++ static uint size_deopt_handler() { ++ int size = NativeFarCall::instruction_size; ++ const uintx m = 16 - 1; ++ return mask_bits(size + m, ~m); ++ //return round_to(size, 16); ++ } ++}; ++ ++class Node::PD { ++public: ++ enum NodeFlags { ++ _last_flag = Node::_last_flag ++ }; ++}; ++ ++bool is_CAS(int opcode); ++bool use_AMO(int opcode); ++ ++bool unnecessary_acquire(const Node *barrier); ++bool unnecessary_release(const Node *barrier); ++bool unnecessary_volatile(const Node *barrier); ++bool needs_releasing_store(const Node *store); ++ ++%} // end source_hpp ++ ++source %{ ++ ++#define NO_INDEX 0 ++#define RELOC_IMM64 Assembler::imm_operand ++#define RELOC_DISP32 Assembler::disp32_operand ++ ++#define V0_num A0_num ++#define V0_H_num A0_H_num ++ ++#define __ _masm. ++ ++RegMask _ANY_REG32_mask; ++RegMask _ANY_REG_mask; ++RegMask _PTR_REG_mask; ++RegMask _NO_CR_REG_mask; ++RegMask _PTR_HAS_S6_REG_mask; ++ ++void reg_mask_init() { ++ _ANY_REG32_mask = _ALL_REG32_mask; ++ _ANY_REG_mask = _ALL_REG_mask; ++ _PTR_REG_mask = _ALL_REG_mask; ++ _PTR_HAS_S6_REG_mask = _ALL_REG_mask; ++ ++ if (UseCompressedOops && (CompressedOops::ptrs_base() != nullptr)) { ++ _ANY_REG32_mask.Remove(OptoReg::as_OptoReg(r28->as_VMReg())); ++ _ANY_REG_mask.SUBTRACT(_S5_LONG_REG_mask); ++ _PTR_REG_mask.SUBTRACT(_S5_LONG_REG_mask); ++ } ++ ++ // FP(r22) is not allocatable when PreserveFramePointer is on ++ if (PreserveFramePointer) { ++ _ANY_REG32_mask.Remove(OptoReg::as_OptoReg(r22->as_VMReg())); ++ _ANY_REG_mask.SUBTRACT(_FP_REG_mask); ++ _PTR_REG_mask.SUBTRACT(_FP_REG_mask); ++ } ++ ++#if INCLUDE_ZGC || INCLUDE_SHENANDOAHGC ++ if (UseZGC || UseShenandoahGC) { ++ _ANY_REG32_mask.Remove(OptoReg::as_OptoReg(r16->as_VMReg())); ++ _ANY_REG_mask.SUBTRACT(_T4_LONG_REG_mask); ++ _PTR_REG_mask.SUBTRACT(_T4_LONG_REG_mask); ++ } ++#endif ++ ++ _NO_CR_REG_mask = _PTR_REG_mask; ++ _NO_CR_REG_mask.SUBTRACT(_T0_LONG_REG_mask); ++ ++ _PTR_HAS_S6_REG_mask.OR(_S6_LONG_REG_mask); ++} ++ ++void PhaseOutput::pd_perform_mach_node_analysis() { ++} ++ ++int MachNode::pd_alignment_required() const { ++ return 1; ++} ++ ++int MachNode::compute_padding(int current_offset) const { ++ return 0; ++} ++ ++// Emit exception handler code. ++// Stuff framesize into a register and call a VM stub routine. ++int HandlerImpl::emit_exception_handler(CodeBuffer& cbuf) { ++ // Note that the code buffer's insts_mark is always relative to insts. ++ // That's why we must use the macroassembler to generate a handler. ++ C2_MacroAssembler _masm(&cbuf); ++ address base = __ start_a_stub(size_exception_handler()); ++ if (base == nullptr) { ++ ciEnv::current()->record_failure("CodeCache is full"); ++ return 0; // CodeBuffer::expand failed ++ } ++ ++ int offset = __ offset(); ++ ++ __ block_comment("; emit_exception_handler"); ++ ++ cbuf.set_insts_mark(); ++ __ relocate(relocInfo::runtime_call_type); ++ __ patchable_jump((address)OptoRuntime::exception_blob()->entry_point()); ++ assert(__ offset() - offset <= (int) size_exception_handler(), "overflow"); ++ __ end_a_stub(); ++ return offset; ++} ++ ++// Emit deopt handler code. ++int HandlerImpl::emit_deopt_handler(CodeBuffer& cbuf) { ++ // Note that the code buffer's insts_mark is always relative to insts. ++ // That's why we must use the macroassembler to generate a handler. 
++ C2_MacroAssembler _masm(&cbuf); ++ address base = __ start_a_stub(size_deopt_handler()); ++ if (base == nullptr) { ++ ciEnv::current()->record_failure("CodeCache is full"); ++ return 0; // CodeBuffer::expand failed ++ } ++ ++ int offset = __ offset(); ++ ++ __ block_comment("; emit_deopt_handler"); ++ ++ cbuf.set_insts_mark(); ++ __ relocate(relocInfo::runtime_call_type); ++ __ patchable_call(SharedRuntime::deopt_blob()->unpack()); ++ assert(__ offset() - offset <= (int) size_deopt_handler(), "overflow"); ++ __ end_a_stub(); ++ return offset; ++} ++ ++ ++const bool Matcher::match_rule_supported(int opcode) { ++ if (!has_match_rule(opcode)) ++ return false; ++ ++ switch (opcode) { ++ case Op_RoundDoubleMode: ++ case Op_ConvF2HF: ++ case Op_ConvHF2F: ++ case Op_StrInflatedCopy: ++ case Op_StrCompressedCopy: ++ case Op_EncodeISOArray: ++ if (!UseLSX) ++ return false; ++ case Op_PopCountI: ++ case Op_PopCountL: ++ return UsePopCountInstruction; ++ case Op_CompareAndSwapB: ++ case Op_CompareAndSwapS: ++ case Op_CompareAndExchangeB: ++ case Op_CompareAndExchangeS: ++ case Op_WeakCompareAndSwapB: ++ case Op_WeakCompareAndSwapS: ++ if (!UseAMCAS) ++ return false; ++ case Op_GetAndSetB: ++ case Op_GetAndSetS: ++ case Op_GetAndAddB: ++ case Op_GetAndAddS: ++ if (!UseAMBH) ++ return false; ++ default: ++ break; ++ } ++ ++ return true; // Per default match rules are supported. ++} ++ ++const bool Matcher::match_rule_supported_superword(int opcode, int vlen, BasicType bt) { ++ return match_rule_supported_vector(opcode, vlen, bt); ++} ++ ++const bool Matcher::match_rule_supported_vector(int opcode, int vlen, BasicType bt) { ++ if (!match_rule_supported(opcode) || !vector_size_supported(bt, vlen)) ++ return false; ++ ++ switch (opcode) { ++ case Op_RotateRightV: ++ case Op_RotateLeftV: ++ if (bt != T_INT && bt != T_LONG) { ++ return false; ++ } ++ break; ++ case Op_MaxReductionV: ++ case Op_MinReductionV: ++ if (bt == T_FLOAT || bt == T_DOUBLE) { ++ return false; ++ } ++ break; ++ case Op_VectorLoadShuffle: ++ if (vlen < 4 || !UseLASX) ++ return false; ++ break; ++ case Op_VectorCastB2X: ++ case Op_VectorCastS2X: ++ case Op_VectorCastI2X: ++ case Op_VectorCastL2X: ++ case Op_VectorCastF2X: ++ case Op_VectorCastD2X: ++ case Op_VectorLoadMask: ++ case Op_VectorMaskCast: ++ if (!UseLASX) ++ return false; ++ break; ++ case Op_VectorTest: ++ if (vlen * type2aelembytes(bt) < 16) ++ return false; ++ break; ++ default: ++ break; ++ } ++ ++ return true; ++} ++ ++const bool Matcher::match_rule_supported_vector_masked(int opcode, int vlen, BasicType bt) { ++ return false; ++} ++ ++// Vector calling convention not yet implemented. ++const bool Matcher::supports_vector_calling_convention(void) { ++ return false; ++} ++ ++OptoRegPair Matcher::vector_return_value(uint ideal_reg) { ++ Unimplemented(); ++ return OptoRegPair(0, 0); ++} ++ ++bool Matcher::is_short_branch_offset(int rule, int br_size, int offset) { ++ const int safety_zone = 3 * BytesPerInstWord; ++ int offs = offset - br_size + 4; ++ // To be conservative on LoongArch ++ // branch node should be end with: ++ // branch inst ++ offs = (offs < 0 ? 
offs - safety_zone : offs + safety_zone) >> 2; ++ switch (rule) { ++ case jmpDir_long_rule: ++ case jmpDir_short_rule: ++ return Assembler::is_simm(offs, 26); ++ case jmpCon_flags_long_rule: ++ case jmpCon_flags_short_rule: ++ case branchConP_0_long_rule: ++ case branchConP_0_short_rule: ++ case branchConN2P_0_long_rule: ++ case branchConN2P_0_short_rule: ++ case cmpN_null_branch_long_rule: ++ case cmpN_null_branch_short_rule: ++ case branchConF_reg_reg_long_rule: ++ case branchConF_reg_reg_short_rule: ++ case branchConD_reg_reg_long_rule: ++ case branchConD_reg_reg_short_rule: ++ case partialSubtypeCheckVsZero_long_rule: ++ case partialSubtypeCheckVsZero_short_rule: ++ return Assembler::is_simm(offs, 21); ++ default: ++ return Assembler::is_simm(offs, 16); ++ } ++ return false; ++} ++ ++MachOper* Matcher::pd_specialize_generic_vector_operand(MachOper* generic_opnd, uint ideal_reg, bool is_temp) { ++ assert(Matcher::is_generic_vector(generic_opnd), "not generic"); ++ switch (ideal_reg) { ++ case Op_VecS: return new vecSOper(); ++ case Op_VecD: return new vecDOper(); ++ case Op_VecX: return new vecXOper(); ++ case Op_VecY: return new vecYOper(); ++ } ++ ShouldNotReachHere(); ++ return nullptr; ++} ++ ++bool Matcher::is_reg2reg_move(MachNode* m) { ++ return false; ++} ++ ++bool Matcher::is_generic_vector(MachOper* opnd) { ++ return opnd->opcode() == VREG; ++} ++ ++const bool Matcher::vector_needs_partial_operations(Node* node, const TypeVect* vt) { ++ return false; ++} ++ ++const RegMask* Matcher::predicate_reg_mask(void) { ++ return nullptr; ++} ++ ++const TypeVectMask* Matcher::predicate_reg_type(const Type* elemTy, int length) { ++ return nullptr; ++} ++ ++const int Matcher::scalable_vector_reg_size(const BasicType bt) { ++ return -1; ++} ++ ++const int Matcher::superword_max_vector_size(const BasicType bt) { ++ return Matcher::max_vector_size(bt); ++} ++ ++// Vector ideal reg ++const uint Matcher::vector_ideal_reg(int size) { ++ switch(size) { ++ case 4: return Op_VecS; ++ case 8: return Op_VecD; ++ case 16: return Op_VecX; ++ case 32: return Op_VecY; ++ } ++ ShouldNotReachHere(); ++ return 0; ++} ++ ++// Should the matcher clone input 'm' of node 'n'? ++bool Matcher::pd_clone_node(Node* n, Node* m, Matcher::MStack& mstack) { ++ if (is_vshift_con_pattern(n, m)) { // ShiftV src (ShiftCntV con) ++ mstack.push(m, Visit); // m = ShiftCntV ++ return true; ++ } ++ return false; ++} ++ ++// Should the Matcher clone shifts on addressing modes, expecting them ++// to be subsumed into complex addressing expressions or compute them ++// into registers? ++bool Matcher::pd_clone_address_expressions(AddPNode* m, Matcher::MStack& mstack, VectorSet& address_visited) { ++ return clone_base_plus_offset_address(m, mstack, address_visited); ++} ++ ++// Max vector size in bytes. 0 if not supported. ++const int Matcher::vector_width_in_bytes(BasicType bt) { ++ int size = (int)MaxVectorSize; ++ if (size < 2*type2aelembytes(bt)) size = 0; ++ // But never < 4 ++ if (size < 4) size = 0; ++ return size; ++} ++ ++// Limits on vector size (number of elements) loaded into vector. 
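// A worked example of the sizing rules in vector_width_in_bytes() above and
// max_vector_size()/min_vector_size() below: with 256-bit LASX vectors
// (MaxVectorSize = 32) a T_INT lane is 4 bytes, giving at most 8 lanes per
// vector. Standalone sketch, illustrative names only:
constexpr int max_vector_elems(int vector_bytes, int elem_bytes) {
  // the width is reported as 0 (unsupported) unless the vector holds at least
  // two lanes and is at least 4 bytes wide, mirroring vector_width_in_bytes()
  return (vector_bytes < 2 * elem_bytes || vector_bytes < 4)
             ? 0
             : vector_bytes / elem_bytes;
}
static_assert(max_vector_elems(32, 4) == 8, "256-bit LASX, T_INT lanes");
static_assert(max_vector_elems(16, 8) == 2, "128-bit LSX, T_LONG lanes");
static_assert(max_vector_elems(8, 8) == 0, "cannot hold two T_LONG lanes, so unsupported");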
++const int Matcher::max_vector_size(const BasicType bt) { ++ assert(is_java_primitive(bt), "only primitive type vectors"); ++ return vector_width_in_bytes(bt)/type2aelembytes(bt); ++} ++ ++const int Matcher::min_vector_size(const BasicType bt) { ++ int max_size = max_vector_size(bt); ++ int size = 0; ++ ++ if (UseLSX) size = 4; ++ size = size / type2aelembytes(bt); ++ if (size < 2) size = 2; ++ return MIN2(size,max_size); ++} ++ ++// Register for DIVI projection of divmodI ++RegMask Matcher::divI_proj_mask() { ++ return T1_REG_mask(); ++} ++ ++// Register for MODI projection of divmodI ++RegMask Matcher::modI_proj_mask() { ++ return T2_REG_mask(); ++} ++ ++// Register for DIVL projection of divmodL ++RegMask Matcher::divL_proj_mask() { ++ return T1_LONG_REG_mask(); ++} ++ ++RegMask Matcher::modL_proj_mask() { ++ return T2_LONG_REG_mask(); ++} ++// Return whether or not this register is ever used as an argument. This ++// function is used on startup to build the trampoline stubs in generateOptoStub. ++// Registers not mentioned will be killed by the VM call in the trampoline, and ++// arguments in those registers not be available to the callee. ++bool Matcher::can_be_java_arg( int reg ) { ++ // Refer to: [sharedRuntime_loongarch_64.cpp] SharedRuntime::java_calling_convention() ++ if ( reg == T0_num || reg == T0_H_num ++ || reg == A0_num || reg == A0_H_num ++ || reg == A1_num || reg == A1_H_num ++ || reg == A2_num || reg == A2_H_num ++ || reg == A3_num || reg == A3_H_num ++ || reg == A4_num || reg == A4_H_num ++ || reg == A5_num || reg == A5_H_num ++ || reg == A6_num || reg == A6_H_num ++ || reg == A7_num || reg == A7_H_num ) ++ return true; ++ ++ if ( reg == F0_num || reg == F0_H_num ++ || reg == F1_num || reg == F1_H_num ++ || reg == F2_num || reg == F2_H_num ++ || reg == F3_num || reg == F3_H_num ++ || reg == F4_num || reg == F4_H_num ++ || reg == F5_num || reg == F5_H_num ++ || reg == F6_num || reg == F6_H_num ++ || reg == F7_num || reg == F7_H_num ) ++ return true; ++ ++ return false; ++} ++ ++bool Matcher::is_spillable_arg( int reg ) { ++ return can_be_java_arg(reg); ++} ++ ++uint Matcher::int_pressure_limit() ++{ ++ uint default_intpresssure = _ANY_REG32_mask.Size() - 1; ++ ++ if (!PreserveFramePointer) { ++ default_intpresssure--; ++ } ++ return (INTPRESSURE == -1) ? default_intpresssure : INTPRESSURE; ++} ++ ++uint Matcher::float_pressure_limit() ++{ ++ return (FLOATPRESSURE == -1) ? 
_FLT_REG_mask.Size() : FLOATPRESSURE; ++} ++ ++bool Matcher::use_asm_for_ldiv_by_con( jlong divisor ) { ++ return false; ++} ++ ++const RegMask Matcher::method_handle_invoke_SP_save_mask() { ++ return FP_REG_mask(); ++} ++ ++int CallStaticJavaDirectNode::compute_padding(int current_offset) const { ++ const uintx m = alignment_required() - 1; ++ return mask_bits(current_offset + m, ~m) - current_offset; ++} ++ ++int CallDynamicJavaDirectNode::compute_padding(int current_offset) const { ++ const uintx m = alignment_required() - 1; ++ return mask_bits(current_offset + m, ~m) - current_offset; ++} ++ ++int CallLeafNoFPDirectNode::compute_padding(int current_offset) const { ++ const uintx m = alignment_required() - 1; ++ return mask_bits(current_offset + m, ~m) - current_offset; ++} ++ ++int CallLeafDirectNode::compute_padding(int current_offset) const { ++ const uintx m = alignment_required() - 1; ++ return mask_bits(current_offset + m, ~m) - current_offset; ++} ++ ++int CallRuntimeDirectNode::compute_padding(int current_offset) const { ++ const uintx m = alignment_required() - 1; ++ return mask_bits(current_offset + m, ~m) - current_offset; ++} ++ ++#ifndef PRODUCT ++void MachBreakpointNode::format( PhaseRegAlloc *, outputStream* st ) const { ++ st->print("BRK"); ++} ++#endif ++ ++void MachBreakpointNode::emit(CodeBuffer &cbuf, PhaseRegAlloc* ra_) const { ++ C2_MacroAssembler _masm(&cbuf); ++ __ brk(5); ++} ++ ++uint MachBreakpointNode::size(PhaseRegAlloc* ra_) const { ++ return MachNode::size(ra_); ++} ++ ++ ++ ++// !!!!! Special hack to get all type of calls to specify the byte offset ++// from the start of the call to the point where the return address ++// will point. ++int MachCallStaticJavaNode::ret_addr_offset() { ++ // bl ++ return NativeCall::instruction_size; ++} ++ ++int MachCallDynamicJavaNode::ret_addr_offset() { ++ // lu12i_w IC_Klass, ++ // ori IC_Klass, ++ // lu32i_d IC_Klass ++ // lu52i_d IC_Klass ++ ++ // bl ++ return NativeMovConstReg::instruction_size + NativeCall::instruction_size; ++} ++ ++//============================================================================= ++ ++// Figure out which register class each belongs in: rc_int, rc_float, rc_stack ++enum RC { rc_bad, rc_int, rc_float, rc_stack }; ++static enum RC rc_class( OptoReg::Name reg ) { ++ if( !OptoReg::is_valid(reg) ) return rc_bad; ++ if (OptoReg::is_stack(reg)) return rc_stack; ++ VMReg r = OptoReg::as_VMReg(reg); ++ if (r->is_Register()) return rc_int; ++ assert(r->is_FloatRegister(), "must be"); ++ return rc_float; ++} ++ ++// Helper methods for MachSpillCopyNode::implementation(). 
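// MachCallDynamicJavaNode::ret_addr_offset() above is the byte distance from
// the start of the call sequence to the return address. Assuming the fixed
// 4-byte LoongArch encoding and the sequence listed in its comment (lu12i_w,
// ori, lu32i_d, lu52i_d to build the inline-cache constant, then bl), the
// arithmetic works out as below; the constant names are illustrative only:
constexpr int kBytesPerInstruction  = 4;
constexpr int kMovConstInstructions = 4;  // lu12i_w + ori + lu32i_d + lu52i_d
constexpr int kCallInstructions     = 1;  // bl
static_assert((kMovConstInstructions + kCallInstructions) * kBytesPerInstruction == 20,
              "the return address falls 20 bytes after the sequence starts");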
++static int vec_mov_helper(CodeBuffer *cbuf, bool do_size, int src_lo, int dst_lo, ++ int src_hi, int dst_hi, uint ireg, outputStream* st) { ++ int size = 0; ++ if (cbuf) { ++ MacroAssembler _masm(cbuf); ++ int offset = __ offset(); ++ switch (ireg) { ++ case Op_VecS: ++ __ fmov_s(as_FloatRegister(Matcher::_regEncode[dst_lo]), as_FloatRegister(Matcher::_regEncode[src_lo])); ++ break; ++ case Op_VecD: ++ __ fmov_d(as_FloatRegister(Matcher::_regEncode[dst_lo]), as_FloatRegister(Matcher::_regEncode[src_lo])); ++ break; ++ case Op_VecX: ++ __ vori_b(as_FloatRegister(Matcher::_regEncode[dst_lo]), as_FloatRegister(Matcher::_regEncode[src_lo]), 0); ++ break; ++ case Op_VecY: ++ __ xvori_b(as_FloatRegister(Matcher::_regEncode[dst_lo]), as_FloatRegister(Matcher::_regEncode[src_lo]), 0); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++#ifndef PRODUCT ++ } else if (!do_size) { ++ switch (ireg) { ++ case Op_VecS: ++ st->print("fmov.s %s, %s\t# spill", Matcher::regName[dst_lo], Matcher::regName[src_lo]); ++ break; ++ case Op_VecD: ++ st->print("fmov.d %s, %s\t# spill", Matcher::regName[dst_lo], Matcher::regName[src_lo]); ++ break; ++ case Op_VecX: ++ st->print("vori.b %s, %s, 0\t# spill", Matcher::regName[dst_lo], Matcher::regName[src_lo]); ++ break; ++ case Op_VecY: ++ st->print("xvori.b %s, %s, 0\t# spill", Matcher::regName[dst_lo], Matcher::regName[src_lo]); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++#endif ++ } ++ size += 4; ++ return size; ++} ++ ++static int vec_spill_helper(CodeBuffer *cbuf, bool do_size, bool is_load, ++ int stack_offset, int reg, uint ireg, outputStream* st) { ++ int size = 0; ++ if (cbuf) { ++ MacroAssembler _masm(cbuf); ++ int offset = __ offset(); ++ if (is_load) { ++ switch (ireg) { ++ case Op_VecS: ++ __ fld_s(as_FloatRegister(Matcher::_regEncode[reg]), SP, stack_offset); ++ break; ++ case Op_VecD: ++ __ fld_d(as_FloatRegister(Matcher::_regEncode[reg]), SP, stack_offset); ++ break; ++ case Op_VecX: ++ __ vld(as_FloatRegister(Matcher::_regEncode[reg]), SP, stack_offset); ++ break; ++ case Op_VecY: ++ __ xvld(as_FloatRegister(Matcher::_regEncode[reg]), SP, stack_offset); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { // store ++ switch (ireg) { ++ case Op_VecS: ++ __ fst_s(as_FloatRegister(Matcher::_regEncode[reg]), SP, stack_offset); ++ break; ++ case Op_VecD: ++ __ fst_d(as_FloatRegister(Matcher::_regEncode[reg]), SP, stack_offset); ++ break; ++ case Op_VecX: ++ __ vst(as_FloatRegister(Matcher::_regEncode[reg]), SP, stack_offset); ++ break; ++ case Op_VecY: ++ __ xvst(as_FloatRegister(Matcher::_regEncode[reg]), SP, stack_offset); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++#ifndef PRODUCT ++ } else if (!do_size) { ++ if (is_load) { ++ switch (ireg) { ++ case Op_VecS: ++ st->print("fld.s %s, [SP + %d]\t# spill", Matcher::regName[reg], stack_offset); ++ break; ++ case Op_VecD: ++ st->print("fld.d %s, [SP + %d]\t# spill", Matcher::regName[reg], stack_offset); ++ break; ++ case Op_VecX: ++ st->print("vld %s, [SP + %d]\t# spill", Matcher::regName[reg], stack_offset); ++ break; ++ case Op_VecY: ++ st->print("xvld %s, [SP + %d]\t# spill", Matcher::regName[reg], stack_offset); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { // store ++ switch (ireg) { ++ case Op_VecS: ++ st->print("fst.s %s, [SP + %d]\t# spill", Matcher::regName[reg], stack_offset); ++ break; ++ case Op_VecD: ++ st->print("fst.d %s, [SP + %d]\t# spill", Matcher::regName[reg], stack_offset); ++ break; ++ case Op_VecX: ++ st->print("vst %s, [SP + 
%d]\t# spill", Matcher::regName[reg], stack_offset); ++ break; ++ case Op_VecY: ++ st->print("xvst %s, [SP + %d]\t# spill", Matcher::regName[reg], stack_offset); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++#endif ++ } ++ size += 4; ++ return size; ++} ++ ++static int vec_stack_to_stack_helper(CodeBuffer *cbuf, int src_offset, ++ int dst_offset, uint ireg, outputStream* st) { ++ int size = 0; ++ if (cbuf) { ++ MacroAssembler _masm(cbuf); ++ switch (ireg) { ++ case Op_VecS: ++ __ fld_s(F23, SP, src_offset); ++ __ fst_s(F23, SP, dst_offset); ++ break; ++ case Op_VecD: ++ __ fld_d(F23, SP, src_offset); ++ __ fst_d(F23, SP, dst_offset); ++ break; ++ case Op_VecX: ++ __ vld(F23, SP, src_offset); ++ __ vst(F23, SP, dst_offset); ++ break; ++ case Op_VecY: ++ __ xvld(F23, SP, src_offset); ++ __ xvst(F23, SP, dst_offset); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++#ifndef PRODUCT ++ } else { ++ switch (ireg) { ++ case Op_VecS: ++ st->print("fld.s f23, %d(sp)\n\t" ++ "fst.s f23, %d(sp)\t# 32-bit mem-mem spill", ++ src_offset, dst_offset); ++ break; ++ case Op_VecD: ++ st->print("fld.d f23, %d(sp)\n\t" ++ "fst.d f23, %d(sp)\t# 64-bit mem-mem spill", ++ src_offset, dst_offset); ++ break; ++ case Op_VecX: ++ st->print("vld f23, %d(sp)\n\t" ++ "vst f23, %d(sp)\t# 128-bit mem-mem spill", ++ src_offset, dst_offset); ++ break; ++ case Op_VecY: ++ st->print("xvld f23, %d(sp)\n\t" ++ "xvst f23, %d(sp)\t# 256-bit mem-mem spill", ++ src_offset, dst_offset); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++#endif ++ } ++ size += 8; ++ return size; ++} ++ ++uint MachSpillCopyNode::implementation( CodeBuffer *cbuf, PhaseRegAlloc *ra_, bool do_size, outputStream* st ) const { ++ // Get registers to move ++ OptoReg::Name src_second = ra_->get_reg_second(in(1)); ++ OptoReg::Name src_first = ra_->get_reg_first(in(1)); ++ OptoReg::Name dst_second = ra_->get_reg_second(this ); ++ OptoReg::Name dst_first = ra_->get_reg_first(this ); ++ ++ enum RC src_second_rc = rc_class(src_second); ++ enum RC src_first_rc = rc_class(src_first); ++ enum RC dst_second_rc = rc_class(dst_second); ++ enum RC dst_first_rc = rc_class(dst_first); ++ ++ assert(OptoReg::is_valid(src_first) && OptoReg::is_valid(dst_first), "must move at least 1 register" ); ++ ++ // Generate spill code! 
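// The spill cases below repeatedly use the test
//   (x_first & 1) == 0 && x_first + 1 == x_second
// to decide whether two register-allocator slots form an even-aligned,
// adjacent pair, i.e. carry a single 64-bit value. A standalone sketch with
// plain ints in place of OptoReg::Name (helper name is illustrative):
constexpr bool is_even_adjacent_pair(int first, int second) {
  return (first & 1) == 0 && first + 1 == second;
}
static_assert( is_even_adjacent_pair(6, 7), "even slot plus its successor: one 64-bit value");
static_assert(!is_even_adjacent_pair(7, 8), "an odd first slot is treated as two 32-bit halves");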
++ ++ if( src_first == dst_first && src_second == dst_second ) ++ return 0; // Self copy, no move ++ ++ if (bottom_type()->isa_vect() != nullptr) { ++ uint ireg = ideal_reg(); ++ assert((src_first_rc != rc_int && dst_first_rc != rc_int), "sanity"); ++ if (src_first_rc == rc_stack && dst_first_rc == rc_stack) { ++ // mem -> mem ++ int src_offset = ra_->reg2offset(src_first); ++ int dst_offset = ra_->reg2offset(dst_first); ++ vec_stack_to_stack_helper(cbuf, src_offset, dst_offset, ireg, st); ++ } else if (src_first_rc == rc_float && dst_first_rc == rc_float) { ++ vec_mov_helper(cbuf, do_size, src_first, dst_first, src_second, dst_second, ireg, st); ++ } else if (src_first_rc == rc_float && dst_first_rc == rc_stack) { ++ int stack_offset = ra_->reg2offset(dst_first); ++ vec_spill_helper(cbuf, do_size, false, stack_offset, src_first, ireg, st); ++ } else if (src_first_rc == rc_stack && dst_first_rc == rc_float) { ++ int stack_offset = ra_->reg2offset(src_first); ++ vec_spill_helper(cbuf, do_size, true, stack_offset, dst_first, ireg, st); ++ } else { ++ ShouldNotReachHere(); ++ } ++ return 0; ++ } ++ ++ if (src_first_rc == rc_stack) { ++ // mem -> ++ if (dst_first_rc == rc_stack) { ++ // mem -> mem ++ assert(src_second != dst_first, "overlap"); ++ if ((src_first & 1) == 0 && src_first + 1 == src_second && ++ (dst_first & 1) == 0 && dst_first + 1 == dst_second) { ++ // 64-bit ++ int src_offset = ra_->reg2offset(src_first); ++ int dst_offset = ra_->reg2offset(dst_first); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ ld_d(AT, Address(SP, src_offset)); ++ __ st_d(AT, Address(SP, dst_offset)); ++#ifndef PRODUCT ++ } else { ++ st->print("\tld_d AT, [SP + #%d]\t# 64-bit mem-mem spill 1\n\t" ++ "st_d AT, [SP + #%d]", ++ src_offset, dst_offset); ++#endif ++ } ++ } else { ++ // 32-bit ++ assert(!((src_first & 1) == 0 && src_first + 1 == src_second), "no transform"); ++ assert(!((dst_first & 1) == 0 && dst_first + 1 == dst_second), "no transform"); ++ // No pushl/popl, so: ++ int src_offset = ra_->reg2offset(src_first); ++ int dst_offset = ra_->reg2offset(dst_first); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ ld_w(AT, Address(SP, src_offset)); ++ __ st_w(AT, Address(SP, dst_offset)); ++#ifndef PRODUCT ++ } else { ++ st->print("\tld_w AT, [SP + #%d] spill 2\n\t" ++ "st_w AT, [SP + #%d]\n\t", ++ src_offset, dst_offset); ++#endif ++ } ++ } ++ return 0; ++ } else if (dst_first_rc == rc_int) { ++ // mem -> gpr ++ if ((src_first & 1) == 0 && src_first + 1 == src_second && ++ (dst_first & 1) == 0 && dst_first + 1 == dst_second) { ++ // 64-bit ++ int offset = ra_->reg2offset(src_first); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ ld_d(as_Register(Matcher::_regEncode[dst_first]), Address(SP, offset)); ++#ifndef PRODUCT ++ } else { ++ st->print("\tld_d %s, [SP + #%d]\t# spill 3", ++ Matcher::regName[dst_first], ++ offset); ++#endif ++ } ++ } else { ++ // 32-bit ++ assert(!((src_first & 1) == 0 && src_first + 1 == src_second), "no transform"); ++ assert(!((dst_first & 1) == 0 && dst_first + 1 == dst_second), "no transform"); ++ int offset = ra_->reg2offset(src_first); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ if (this->ideal_reg() == Op_RegI) ++ __ ld_w(as_Register(Matcher::_regEncode[dst_first]), Address(SP, offset)); ++ else { ++ __ ld_wu(as_Register(Matcher::_regEncode[dst_first]), Address(SP, offset)); ++ } ++#ifndef PRODUCT ++ } else { ++ if (this->ideal_reg() == Op_RegI) ++ st->print("\tld_w %s, [SP + #%d]\t# spill 4", ++ Matcher::regName[dst_first], ++ offset); ++ 
else ++ st->print("\tld_wu %s, [SP + #%d]\t# spill 5", ++ Matcher::regName[dst_first], ++ offset); ++#endif ++ } ++ } ++ return 0; ++ } else if (dst_first_rc == rc_float) { ++ // mem-> xmm ++ if ((src_first & 1) == 0 && src_first + 1 == src_second && ++ (dst_first & 1) == 0 && dst_first + 1 == dst_second) { ++ // 64-bit ++ int offset = ra_->reg2offset(src_first); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ fld_d( as_FloatRegister(Matcher::_regEncode[dst_first]), Address(SP, offset)); ++#ifndef PRODUCT ++ } else { ++ st->print("\tfld_d %s, [SP + #%d]\t# spill 6", ++ Matcher::regName[dst_first], ++ offset); ++#endif ++ } ++ } else { ++ // 32-bit ++ assert(!((src_first & 1) == 0 && src_first + 1 == src_second), "no transform"); ++ assert(!((dst_first & 1) == 0 && dst_first + 1 == dst_second), "no transform"); ++ int offset = ra_->reg2offset(src_first); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ fld_s( as_FloatRegister(Matcher::_regEncode[dst_first]), Address(SP, offset)); ++#ifndef PRODUCT ++ } else { ++ st->print("\tfld_s %s, [SP + #%d]\t# spill 7", ++ Matcher::regName[dst_first], ++ offset); ++#endif ++ } ++ } ++ } ++ return 0; ++ } else if (src_first_rc == rc_int) { ++ // gpr -> ++ if (dst_first_rc == rc_stack) { ++ // gpr -> mem ++ if ((src_first & 1) == 0 && src_first + 1 == src_second && ++ (dst_first & 1) == 0 && dst_first + 1 == dst_second) { ++ // 64-bit ++ int offset = ra_->reg2offset(dst_first); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ st_d(as_Register(Matcher::_regEncode[src_first]), Address(SP, offset)); ++#ifndef PRODUCT ++ } else { ++ st->print("\tst_d %s, [SP + #%d] # spill 8", ++ Matcher::regName[src_first], ++ offset); ++#endif ++ } ++ } else { ++ // 32-bit ++ assert(!((src_first & 1) == 0 && src_first + 1 == src_second), "no transform"); ++ assert(!((dst_first & 1) == 0 && dst_first + 1 == dst_second), "no transform"); ++ int offset = ra_->reg2offset(dst_first); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ st_w(as_Register(Matcher::_regEncode[src_first]), Address(SP, offset)); ++#ifndef PRODUCT ++ } else { ++ st->print("\tst_w %s, [SP + #%d]\t# spill 9", ++ Matcher::regName[src_first], offset); ++#endif ++ } ++ } ++ return 0; ++ } else if (dst_first_rc == rc_int) { ++ // gpr -> gpr ++ if ((src_first & 1) == 0 && src_first + 1 == src_second && ++ (dst_first & 1) == 0 && dst_first + 1 == dst_second) { ++ // 64-bit ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ move(as_Register(Matcher::_regEncode[dst_first]), ++ as_Register(Matcher::_regEncode[src_first])); ++#ifndef PRODUCT ++ } else { ++ st->print("\tmove(64bit) %s <-- %s\t# spill 10", ++ Matcher::regName[dst_first], ++ Matcher::regName[src_first]); ++#endif ++ } ++ return 0; ++ } else { ++ // 32-bit ++ assert(!((src_first & 1) == 0 && src_first + 1 == src_second), "no transform"); ++ assert(!((dst_first & 1) == 0 && dst_first + 1 == dst_second), "no transform"); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ if (this->ideal_reg() == Op_RegI) ++ __ move_u32(as_Register(Matcher::_regEncode[dst_first]), as_Register(Matcher::_regEncode[src_first])); ++ else ++ __ add_d(as_Register(Matcher::_regEncode[dst_first]), as_Register(Matcher::_regEncode[src_first]), R0); ++#ifndef PRODUCT ++ } else { ++ st->print("\n\t"); ++ st->print("move(32-bit) %s <-- %s\t# spill 11", ++ Matcher::regName[dst_first], ++ Matcher::regName[src_first]); ++#endif ++ } ++ return 0; ++ } ++ } else if (dst_first_rc == rc_float) { ++ // gpr -> xmm ++ if ((src_first & 1) == 0 && src_first + 1 == 
src_second && ++ (dst_first & 1) == 0 && dst_first + 1 == dst_second) { ++ // 64-bit ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ movgr2fr_d(as_FloatRegister(Matcher::_regEncode[dst_first]), as_Register(Matcher::_regEncode[src_first])); ++#ifndef PRODUCT ++ } else { ++ st->print("\n\t"); ++ st->print("movgr2fr_d %s, %s\t# spill 12", ++ Matcher::regName[dst_first], ++ Matcher::regName[src_first]); ++#endif ++ } ++ } else { ++ // 32-bit ++ assert(!((src_first & 1) == 0 && src_first + 1 == src_second), "no transform"); ++ assert(!((dst_first & 1) == 0 && dst_first + 1 == dst_second), "no transform"); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ movgr2fr_w(as_FloatRegister(Matcher::_regEncode[dst_first]), as_Register(Matcher::_regEncode[src_first])); ++#ifndef PRODUCT ++ } else { ++ st->print("\n\t"); ++ st->print("movgr2fr_w %s, %s\t# spill 13", ++ Matcher::regName[dst_first], ++ Matcher::regName[src_first]); ++#endif ++ } ++ } ++ return 0; ++ } ++ } else if (src_first_rc == rc_float) { ++ // xmm -> ++ if (dst_first_rc == rc_stack) { ++ // xmm -> mem ++ if ((src_first & 1) == 0 && src_first + 1 == src_second && ++ (dst_first & 1) == 0 && dst_first + 1 == dst_second) { ++ // 64-bit ++ int offset = ra_->reg2offset(dst_first); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ fst_d( as_FloatRegister(Matcher::_regEncode[src_first]), Address(SP, offset) ); ++#ifndef PRODUCT ++ } else { ++ st->print("\n\t"); ++ st->print("fst_d %s, [SP + #%d]\t# spill 14", ++ Matcher::regName[src_first], ++ offset); ++#endif ++ } ++ } else { ++ // 32-bit ++ assert(!((src_first & 1) == 0 && src_first + 1 == src_second), "no transform"); ++ assert(!((dst_first & 1) == 0 && dst_first + 1 == dst_second), "no transform"); ++ int offset = ra_->reg2offset(dst_first); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ fst_s(as_FloatRegister(Matcher::_regEncode[src_first]), Address(SP, offset)); ++#ifndef PRODUCT ++ } else { ++ st->print("\n\t"); ++ st->print("fst_s %s, [SP + #%d]\t# spill 15", ++ Matcher::regName[src_first], ++ offset); ++#endif ++ } ++ } ++ return 0; ++ } else if (dst_first_rc == rc_int) { ++ // xmm -> gpr ++ if ((src_first & 1) == 0 && src_first + 1 == src_second && ++ (dst_first & 1) == 0 && dst_first + 1 == dst_second) { ++ // 64-bit ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ movfr2gr_d( as_Register(Matcher::_regEncode[dst_first]), as_FloatRegister(Matcher::_regEncode[src_first])); ++#ifndef PRODUCT ++ } else { ++ st->print("\n\t"); ++ st->print("movfr2gr_d %s, %s\t# spill 16", ++ Matcher::regName[dst_first], ++ Matcher::regName[src_first]); ++#endif ++ } ++ } else { ++ // 32-bit ++ assert(!((src_first & 1) == 0 && src_first + 1 == src_second), "no transform"); ++ assert(!((dst_first & 1) == 0 && dst_first + 1 == dst_second), "no transform"); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ movfr2gr_s( as_Register(Matcher::_regEncode[dst_first]), as_FloatRegister(Matcher::_regEncode[src_first])); ++#ifndef PRODUCT ++ } else { ++ st->print("\n\t"); ++ st->print("movfr2gr_s %s, %s\t# spill 17", ++ Matcher::regName[dst_first], ++ Matcher::regName[src_first]); ++#endif ++ } ++ } ++ return 0; ++ } else if (dst_first_rc == rc_float) { ++ // xmm -> xmm ++ if ((src_first & 1) == 0 && src_first + 1 == src_second && ++ (dst_first & 1) == 0 && dst_first + 1 == dst_second) { ++ // 64-bit ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ fmov_d( as_FloatRegister(Matcher::_regEncode[dst_first]), as_FloatRegister(Matcher::_regEncode[src_first])); ++#ifndef PRODUCT 
++ } else { ++ st->print("\n\t"); ++ st->print("fmov_d %s <-- %s\t# spill 18", ++ Matcher::regName[dst_first], ++ Matcher::regName[src_first]); ++#endif ++ } ++ } else { ++ // 32-bit ++ assert(!((src_first & 1) == 0 && src_first + 1 == src_second), "no transform"); ++ assert(!((dst_first & 1) == 0 && dst_first + 1 == dst_second), "no transform"); ++ if (cbuf) { ++ C2_MacroAssembler _masm(cbuf); ++ __ fmov_s( as_FloatRegister(Matcher::_regEncode[dst_first]), as_FloatRegister(Matcher::_regEncode[src_first])); ++#ifndef PRODUCT ++ } else { ++ st->print("\n\t"); ++ st->print("fmov_s %s <-- %s\t# spill 19", ++ Matcher::regName[dst_first], ++ Matcher::regName[src_first]); ++#endif ++ } ++ } ++ return 0; ++ } ++ } ++ ++ assert(0," foo "); ++ Unimplemented(); ++ return 0; ++} ++ ++#ifndef PRODUCT ++void MachSpillCopyNode::format( PhaseRegAlloc *ra_, outputStream* st ) const { ++ implementation( nullptr, ra_, false, st ); ++} ++#endif ++ ++void MachSpillCopyNode::emit(CodeBuffer &cbuf, PhaseRegAlloc *ra_) const { ++ implementation( &cbuf, ra_, false, nullptr ); ++} ++ ++uint MachSpillCopyNode::size(PhaseRegAlloc *ra_) const { ++ return MachNode::size(ra_); ++} ++ ++//============================================================================= ++#ifndef PRODUCT ++void MachEpilogNode::format( PhaseRegAlloc *ra_, outputStream* st ) const { ++ Compile *C = ra_->C; ++ int framesize = C->output()->frame_size_in_bytes(); ++ ++ assert((framesize & (StackAlignmentInBytes-1)) == 0, "frame size not aligned"); ++ ++ st->print_cr("ld_d RA, SP, %d # Restore RA @ MachEpilogNode", -wordSize); ++ st->print("\t"); ++ st->print_cr("ld_d FP, SP, %d # Restore FP @ MachEpilogNode", -wordSize*2); ++ st->print("\t"); ++ if (Assembler::is_simm(framesize, 12)) { ++ st->print_cr("addi_d SP, SP, %d # Rlease stack @ MachEpilogNode", framesize); ++ } else { ++ st->print_cr("li AT, %d # Rlease stack @ MachEpilogNode", framesize); ++ st->print_cr("add_d SP, SP, AT # Rlease stack @ MachEpilogNode"); ++ } ++ if( do_polling() && C->is_method_compilation() ) { ++ st->print("\t"); ++ st->print_cr("ld_d AT, poll_offset[thread] #polling_word_offset\n\t" ++ "ld_w AT, [AT]\t" ++ "# Safepoint: poll for GC"); ++ } ++} ++#endif ++ ++void MachEpilogNode::emit(CodeBuffer &cbuf, PhaseRegAlloc *ra_) const { ++ Compile *C = ra_->C; ++ C2_MacroAssembler _masm(&cbuf); ++ int framesize = C->output()->frame_size_in_bytes(); ++ ++ assert((framesize & (StackAlignmentInBytes-1)) == 0, "frame size not aligned"); ++ ++ __ remove_frame(framesize); ++ ++ if (StackReservedPages > 0 && C->has_reserved_stack_access()) { ++ __ reserved_stack_check(); ++ } ++ ++ if( do_polling() && C->is_method_compilation() ) { ++ Label dummy_label; ++ Label* code_stub = &dummy_label; ++ if (!C->output()->in_scratch_emit_size()) { ++ C2SafepointPollStub* stub = new (C->comp_arena()) C2SafepointPollStub(__ offset()); ++ C->output()->add_stub(stub); ++ code_stub = &stub->entry(); ++ } ++ __ relocate(relocInfo::poll_return_type); ++ __ safepoint_poll(*code_stub, TREG, true /* at_return */, false /* acquire */, true /* in_nmethod */); ++ } ++} ++ ++uint MachEpilogNode::size(PhaseRegAlloc *ra_) const { ++ return MachNode::size(ra_); // too many variables; just compute it the hard way ++} ++ ++int MachEpilogNode::reloc() const { ++ return 0; // a large enough number ++} ++ ++const Pipeline * MachEpilogNode::pipeline() const { ++ return MachNode::pipeline_class(); ++} ++ ++//============================================================================= ++ ++#ifndef PRODUCT ++void 
BoxLockNode::format( PhaseRegAlloc *ra_, outputStream* st ) const { ++ int offset = ra_->reg2offset(in_RegMask(0).find_first_elem()); ++ int reg = ra_->get_reg_first(this); ++ st->print("ADDI_D %s, SP, %d @BoxLockNode",Matcher::regName[reg],offset); ++} ++#endif ++ ++ ++uint BoxLockNode::size(PhaseRegAlloc *ra_) const { ++ int offset = ra_->reg2offset(in_RegMask(0).find_first_elem()); ++ ++ if (Assembler::is_simm(offset, 12)) ++ return 4; ++ else ++ return 3 * 4; ++} ++ ++void BoxLockNode::emit(CodeBuffer &cbuf, PhaseRegAlloc *ra_) const { ++ C2_MacroAssembler _masm(&cbuf); ++ int offset = ra_->reg2offset(in_RegMask(0).find_first_elem()); ++ int reg = ra_->get_encode(this); ++ ++ if (Assembler::is_simm(offset, 12)) { ++ __ addi_d(as_Register(reg), SP, offset); ++ } else { ++ __ lu12i_w(AT, Assembler::split_low20(offset >> 12)); ++ __ ori(AT, AT, Assembler::split_low12(offset)); ++ __ add_d(as_Register(reg), SP, AT); ++ } ++} ++ ++int MachCallRuntimeNode::ret_addr_offset() { ++ // pcaddu18i ++ // jirl ++ return NativeFarCall::instruction_size; ++} ++ ++//============================================================================= ++#ifndef PRODUCT ++void MachNopNode::format( PhaseRegAlloc *, outputStream* st ) const { ++ st->print("NOP \t# %d bytes pad for loops and calls", 4 * _count); ++} ++#endif ++ ++void MachNopNode::emit(CodeBuffer &cbuf, PhaseRegAlloc * ) const { ++ C2_MacroAssembler _masm(&cbuf); ++ int i = 0; ++ for(i = 0; i < _count; i++) ++ __ nop(); ++} ++ ++uint MachNopNode::size(PhaseRegAlloc *) const { ++ return 4 * _count; ++} ++const Pipeline* MachNopNode::pipeline() const { ++ return MachNode::pipeline_class(); ++} ++ ++//============================================================================= ++ ++//============================================================================= ++#ifndef PRODUCT ++void MachUEPNode::format( PhaseRegAlloc *ra_, outputStream* st ) const { ++ st->print_cr("load_klass(T4, T0)"); ++ st->print_cr("\tbeq(T4, iCache, L)"); ++ st->print_cr("\tjmp(SharedRuntime::get_ic_miss_stub(), relocInfo::runtime_call_type)"); ++ st->print_cr(" L:"); ++} ++#endif ++ ++ ++void MachUEPNode::emit(CodeBuffer &cbuf, PhaseRegAlloc *ra_) const { ++ C2_MacroAssembler _masm(&cbuf); ++ int ic_reg = Matcher::inline_cache_reg_encode(); ++ Label L; ++ Register receiver = T0; ++ Register iCache = as_Register(ic_reg); ++ ++ __ load_klass(T4, receiver); ++ __ beq(T4, iCache, L); ++ __ jmp((address)SharedRuntime::get_ic_miss_stub(), relocInfo::runtime_call_type); ++ __ bind(L); ++} ++ ++uint MachUEPNode::size(PhaseRegAlloc *ra_) const { ++ return MachNode::size(ra_); ++} ++ ++ ++ ++//============================================================================= ++ ++const RegMask& MachConstantBaseNode::_out_RegMask = P_REG_mask(); ++ ++int ConstantTable::calculate_table_base_offset() const { ++ return 0; // absolute addressing, no offset ++} ++ ++bool MachConstantBaseNode::requires_postalloc_expand() const { return false; } ++void MachConstantBaseNode::postalloc_expand(GrowableArray *nodes, PhaseRegAlloc *ra_) { ++ ShouldNotReachHere(); ++} ++ ++void MachConstantBaseNode::emit(CodeBuffer& cbuf, PhaseRegAlloc* ra_) const { ++ Compile* C = ra_->C; ++ ConstantTable& constant_table = C->output()->constant_table(); ++ C2_MacroAssembler _masm(&cbuf); ++ ++ Register Rtoc = as_Register(ra_->get_encode(this)); ++ CodeSection* consts_section = cbuf.consts(); ++ int consts_size = cbuf.insts()->align_at_start(consts_section->size()); ++ assert(constant_table.size() == consts_size, "must 
be equal"); ++ ++ if (consts_section->size()) { ++ assert((CodeBuffer::SECT_CONSTS + 1) == CodeBuffer::SECT_INSTS, ++ "insts must be immediately follow consts"); ++ // Materialize the constant table base. ++ address baseaddr = cbuf.insts()->start() - consts_size + -(constant_table.table_base_offset()); ++ jint offs = (baseaddr - __ pc()) >> 2; ++ guarantee(Assembler::is_simm(offs, 20), "Not signed 20-bit offset"); ++ __ pcaddi(Rtoc, offs); ++ } ++} ++ ++uint MachConstantBaseNode::size(PhaseRegAlloc* ra_) const { ++ // pcaddi ++ return 1 * BytesPerInstWord; ++} ++ ++#ifndef PRODUCT ++void MachConstantBaseNode::format(PhaseRegAlloc* ra_, outputStream* st) const { ++ Register r = as_Register(ra_->get_encode(this)); ++ st->print("pcaddi %s, &constanttable (constant table base) @ MachConstantBaseNode", r->name()); ++} ++#endif ++ ++ ++//============================================================================= ++#ifndef PRODUCT ++void MachPrologNode::format( PhaseRegAlloc *ra_, outputStream* st ) const { ++ Compile* C = ra_->C; ++ ++ int framesize = C->output()->frame_size_in_bytes(); ++ int bangsize = C->output()->bang_size_in_bytes(); ++ assert((framesize & (StackAlignmentInBytes-1)) == 0, "frame size not aligned"); ++ ++ // Calls to C2R adapters often do not accept exceptional returns. ++ // We require that their callers must bang for them. But be careful, because ++ // some VM calls (such as call site linkage) can use several kilobytes of ++ // stack. But the stack safety zone should account for that. ++ // See bugs 4446381, 4468289, 4497237. ++ if (C->output()->need_stack_bang(bangsize)) { ++ st->print_cr("# stack bang"); st->print("\t"); ++ } ++ st->print("st_d RA, %d(SP) @ MachPrologNode\n\t", -wordSize); ++ st->print("st_d FP, %d(SP) @ MachPrologNode\n\t", -wordSize*2); ++ if (PreserveFramePointer) { ++ if (Assembler::is_simm((framesize - wordSize * 2), 12)) { ++ st->print("addi_d FP, SP, %d \n\t", framesize); ++ } else { ++ st->print("li AT, %d \n\t", framesize); ++ st->print("add_d FP, AT \n\t"); ++ } ++ } ++ st->print("addi_d SP, SP, -%d \t",framesize); ++ if (C->stub_function() == nullptr && BarrierSet::barrier_set()->barrier_set_nmethod() != nullptr) { ++ st->print("\n\t"); ++ st->print("ld_d T1, guard, 0\n\t"); ++ st->print("membar LoadLoad\n\t"); ++ st->print("ld_d T2, TREG, thread_disarmed_guard_value_offset\n\t"); ++ st->print("beq T1, T2, skip\n\t"); ++ st->print("\n\t"); ++ st->print("jalr #nmethod_entry_barrier_stub\n\t"); ++ st->print("b skip\n\t"); ++ st->print("guard: int\n\t"); ++ st->print("\n\t"); ++ st->print("skip:\n\t"); ++ } ++} ++#endif ++ ++ ++void MachPrologNode::emit(CodeBuffer &cbuf, PhaseRegAlloc *ra_) const { ++ Compile* C = ra_->C; ++ C2_MacroAssembler _masm(&cbuf); ++ ++ int framesize = C->output()->frame_size_in_bytes(); ++ int bangsize = C->output()->bang_size_in_bytes(); ++ ++ assert((framesize & (StackAlignmentInBytes-1)) == 0, "frame size not aligned"); ++ ++#ifdef ASSERT ++ address start = __ pc(); ++#endif ++ ++ if (C->clinit_barrier_on_entry()) { ++ assert(!C->method()->holder()->is_not_initialized(), "initialization should have been started"); ++ ++ Label L_skip_barrier; ++ ++ __ mov_metadata(T4, C->method()->holder()->constant_encoding()); ++ __ clinit_barrier(T4, AT, &L_skip_barrier); ++ __ jmp((address)SharedRuntime::get_handle_wrong_method_stub(), relocInfo::runtime_call_type); ++ __ bind(L_skip_barrier); ++ } ++ ++ if (C->output()->need_stack_bang(bangsize)) { ++ __ generate_stack_overflow_check(bangsize); ++ } ++ ++ __ 
build_frame(framesize); ++ ++ assert((__ pc() - start) >= 1 * BytesPerInstWord, "No enough room for patch_verified_entry"); ++ ++ if (C->stub_function() == nullptr) { ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ if (BarrierSet::barrier_set()->barrier_set_nmethod() != nullptr) { ++ // Dummy labels for just measuring the code size ++ Label dummy_slow_path; ++ Label dummy_continuation; ++ Label dummy_guard; ++ Label* slow_path = &dummy_slow_path; ++ Label* continuation = &dummy_continuation; ++ Label* guard = &dummy_guard; ++ if (!Compile::current()->output()->in_scratch_emit_size()) { ++ // Use real labels from actual stub when not emitting code for purpose of measuring its size ++ C2EntryBarrierStub* stub = new (Compile::current()->comp_arena()) C2EntryBarrierStub(); ++ Compile::current()->output()->add_stub(stub); ++ slow_path = &stub->entry(); ++ continuation = &stub->continuation(); ++ guard = &stub->guard(); ++ } ++ // In the C2 code, we move the non-hot part of nmethod entry barriers out-of-line to a stub. ++ bs->nmethod_entry_barrier(&_masm, slow_path, continuation, guard); ++ } ++ } ++ ++ C->output()->set_frame_complete(cbuf.insts_size()); ++ if (C->has_mach_constant_base_node()) { ++ // NOTE: We set the table base offset here because users might be ++ // emitted before MachConstantBaseNode. ++ ConstantTable& constant_table = C->output()->constant_table(); ++ constant_table.set_table_base_offset(constant_table.calculate_table_base_offset()); ++ } ++} ++ ++ ++uint MachPrologNode::size(PhaseRegAlloc *ra_) const { ++ return MachNode::size(ra_); // too many variables; just compute it the hard way ++} ++ ++int MachPrologNode::reloc() const { ++ return 0; // a large enough number ++} ++ ++bool is_CAS(int opcode) ++{ ++ switch(opcode) { ++ // We handle these ++ case Op_CompareAndSwapB: ++ case Op_CompareAndSwapS: ++ case Op_CompareAndSwapI: ++ case Op_CompareAndSwapL: ++ case Op_CompareAndSwapP: ++ case Op_CompareAndSwapN: ++ case Op_ShenandoahCompareAndSwapP: ++ case Op_ShenandoahCompareAndSwapN: ++ case Op_ShenandoahWeakCompareAndSwapP: ++ case Op_ShenandoahWeakCompareAndSwapN: ++ case Op_ShenandoahCompareAndExchangeP: ++ case Op_ShenandoahCompareAndExchangeN: ++ case Op_GetAndSetB: ++ case Op_GetAndSetS: ++ case Op_GetAndSetI: ++ case Op_GetAndSetL: ++ case Op_GetAndSetP: ++ case Op_GetAndSetN: ++ case Op_GetAndAddB: ++ case Op_GetAndAddS: ++ case Op_GetAndAddI: ++ case Op_GetAndAddL: ++ return true; ++ default: ++ return false; ++ } ++} ++ ++bool use_AMO(int opcode) ++{ ++ switch(opcode) { ++ // We handle these ++ case Op_StoreI: ++ case Op_StoreL: ++ case Op_StoreP: ++ case Op_StoreN: ++ case Op_StoreNKlass: ++ return true; ++ default: ++ return false; ++ } ++} ++ ++bool unnecessary_acquire(const Node *barrier) ++{ ++ assert(barrier->is_MemBar(), "expecting a membar"); ++ ++ if (UseBarriersForVolatile) { ++ // we need to plant a dbar ++ return false; ++ } ++ ++ MemBarNode* mb = barrier->as_MemBar(); ++ ++ if (mb->trailing_load_store()) { ++ Node* load_store = mb->in(MemBarNode::Precedent); ++ assert(load_store->is_LoadStore(), "unexpected graph shape"); ++ return is_CAS(load_store->Opcode()); ++ } ++ ++ return false; ++} ++ ++bool unnecessary_release(const Node *n) ++{ ++ assert((n->is_MemBar() && n->Opcode() == Op_MemBarRelease), "expecting a release membar"); ++ ++ if (UseBarriersForVolatile) { ++ // we need to plant a dbar ++ return false; ++ } ++ ++ MemBarNode *barrier = n->as_MemBar(); ++ ++ if (!barrier->leading()) { ++ return false; ++ } 
else { ++ Node* trailing = barrier->trailing_membar(); ++ MemBarNode* trailing_mb = trailing->as_MemBar(); ++ assert(trailing_mb->trailing(), "Not a trailing membar?"); ++ assert(trailing_mb->leading_membar() == n, "inconsistent leading/trailing membars"); ++ ++ Node* mem = trailing_mb->in(MemBarNode::Precedent); ++ if (mem->is_Store()) { ++ assert(mem->as_Store()->is_release(), ""); ++ assert(trailing_mb->Opcode() == Op_MemBarVolatile, ""); ++ return use_AMO(mem->Opcode()); ++ } else { ++ assert(mem->is_LoadStore(), ""); ++ assert(trailing_mb->Opcode() == Op_MemBarAcquire, ""); ++ return is_CAS(mem->Opcode()); ++ } ++ } ++ ++ return false; ++} ++ ++bool unnecessary_volatile(const Node *n) ++{ ++ // assert n->is_MemBar(); ++ if (UseBarriersForVolatile) { ++ // we need to plant a dbar ++ return false; ++ } ++ ++ MemBarNode *mbvol = n->as_MemBar(); ++ ++ bool release = false; ++ if (mbvol->trailing_store()) { ++ Node* mem = mbvol->in(MemBarNode::Precedent); ++ release = use_AMO(mem->Opcode()); ++ } ++ ++ assert(!release || (mbvol->in(MemBarNode::Precedent)->is_Store() && mbvol->in(MemBarNode::Precedent)->as_Store()->is_release()), ""); ++#ifdef ASSERT ++ if (release) { ++ Node* leading = mbvol->leading_membar(); ++ assert(leading->Opcode() == Op_MemBarRelease, ""); ++ assert(leading->as_MemBar()->leading_store(), ""); ++ assert(leading->as_MemBar()->trailing_membar() == mbvol, ""); ++ } ++#endif ++ ++ return release; ++} ++ ++bool needs_releasing_store(const Node *n) ++{ ++ // assert n->is_Store(); ++ if (UseBarriersForVolatile) { ++ // we use a normal store and dbar combination ++ return false; ++ } ++ ++ StoreNode *st = n->as_Store(); ++ ++ return st->trailing_membar() != nullptr; ++} ++ ++%} ++ ++//----------ENCODING BLOCK----------------------------------------------------- ++// This block specifies the encoding classes used by the compiler to output ++// byte streams. Encoding classes generate functions which are called by ++// Machine Instruction Nodes in order to generate the bit encoding of the ++// instruction. Operands specify their base encoding interface with the ++// interface keyword. There are currently supported four interfaces, ++// REG_INTER, CONST_INTER, MEMORY_INTER, & COND_INTER. REG_INTER causes an ++// operand to generate a function which returns its register number when ++// queried. CONST_INTER causes an operand to generate a function which ++// returns the value of the constant when queried. MEMORY_INTER causes an ++// operand to generate four functions which return the Base Register, the ++// Index Register, the Scale Value, and the Offset Value of the operand when ++// queried. COND_INTER causes an operand to generate six functions which ++// return the encoding code (ie - encoding bits for the instruction) ++// associated with each basic boolean condition for a conditional instruction. ++// Instructions specify two basic values for encoding. They use the ++// ins_encode keyword to specify their encoding class (which must be one of ++// the class names specified in the encoding block), and they use the ++// opcode keyword to specify, in order, their primary, secondary, and ++// tertiary opcode. Only the opcode sections which a particular instruction ++// needs for encoding need to be specified. ++encode %{ ++ ++ enc_class Java_To_Runtime (method meth) %{ // CALL Java_To_Runtime, Java_To_Runtime_Leaf ++ C2_MacroAssembler _masm(&cbuf); ++ // This is the instruction starting address for relocation info. 
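++    // The call is expected to expand to a patchable pcaddu18i/jirl pair
++    // (see MachCallRuntimeNode::ret_addr_offset above).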
++ __ block_comment("Java_To_Runtime"); ++ cbuf.set_insts_mark(); ++ __ relocate(relocInfo::runtime_call_type); ++ __ patchable_call((address)$meth$$method); ++ _masm.clear_inst_mark(); ++ __ post_call_nop(); ++ %} ++ ++ enc_class Java_Static_Call (method meth) %{ // JAVA STATIC CALL ++ // CALL to fixup routine. Fixup routine uses ScopeDesc info to determine ++ // who we intended to call. ++ C2_MacroAssembler _masm(&cbuf); ++ cbuf.set_insts_mark(); ++ address addr = (address)$meth$$method; ++ address call; ++ __ block_comment("Java_Static_Call"); ++ ++ if ( !_method ) { ++ // A call to a runtime wrapper, e.g. new, new_typeArray_Java, uncommon_trap. ++ call = __ trampoline_call(AddressLiteral(addr, relocInfo::runtime_call_type), &cbuf); ++ if (call == nullptr) { ++ ciEnv::current()->record_failure("CodeCache is full"); ++ return; ++ } ++ } else { ++ int method_index = resolved_method_index(cbuf); ++ RelocationHolder rspec = _optimized_virtual ? opt_virtual_call_Relocation::spec(method_index) ++ : static_call_Relocation::spec(method_index); ++ call = __ trampoline_call(AddressLiteral(addr, rspec), &cbuf); ++ if (call == nullptr) { ++ ciEnv::current()->record_failure("CodeCache is full"); ++ return; ++ } ++ if (CodeBuffer::supports_shared_stubs() && _method->can_be_statically_bound()) { ++ // Calls of the same statically bound method can share ++ // a stub to the interpreter. ++ cbuf.shared_stub_to_interp_for(_method, cbuf.insts()->mark_off()); ++ } else { ++ // Emit stub for static call ++ address stub = CompiledStaticCall::emit_to_interp_stub(cbuf); ++ if (stub == nullptr) { ++ ciEnv::current()->record_failure("CodeCache is full"); ++ return; ++ } ++ } ++ } ++ _masm.clear_inst_mark(); ++ __ post_call_nop(); ++ %} ++ ++ ++ // ++ // [Ref: LIR_Assembler::ic_call() ] ++ // ++ enc_class Java_Dynamic_Call (method meth) %{ // JAVA DYNAMIC CALL ++ C2_MacroAssembler _masm(&cbuf); ++ __ block_comment("Java_Dynamic_Call"); ++ address call = __ ic_call((address)$meth$$method, resolved_method_index(cbuf)); ++ if (call == nullptr) { ++ ciEnv::current()->record_failure("CodeCache is full"); ++ return; ++ } ++ _masm.clear_inst_mark(); ++ __ post_call_nop(); ++ %} ++ ++ ++ enc_class enc_PartialSubtypeCheck(mRegP result, mRegP sub, mRegP super, mRegI tmp, mRegI tmp2) %{ ++ Register result = $result$$Register; ++ Register sub = $sub$$Register; ++ Register super = $super$$Register; ++ Register length = $tmp$$Register; ++ Register tmp = $tmp2$$Register; ++ Label miss; ++ ++ // result may be the same as sub ++ // 47c B40: # B21 B41 <- B20 Freq: 0.155379 ++ // 47c partialSubtypeCheck result=S1, sub=S1, super=S3, length=S0 ++ // 4bc mov S2, nullptr #@loadConP ++ // 4c0 beq S1, S2, B21 #@branchConP P=0.999999 C=-1.000000 ++ // ++ C2_MacroAssembler _masm(&cbuf); ++ Label done; ++ __ check_klass_subtype_slow_path(sub, super, length, tmp, ++ nullptr, &miss, ++ /*set_cond_codes:*/ true); ++ // Refer to X86_64's RDI ++ __ move(result, R0); ++ __ b(done); ++ ++ __ bind(miss); ++ __ li(result, 1); ++ __ bind(done); ++ %} ++ ++%} ++ ++ ++//---------LOONGARCH FRAME-------------------------------------------------------------- ++// Definition of frame structure and management information. 
++// ++// S T A C K L A Y O U T Allocators stack-slot number ++// | (to get allocators register number ++// G Owned by | | v add SharedInfo::stack0) ++// r CALLER | | ++// o | +--------+ pad to even-align allocators stack-slot ++// w V | pad0 | numbers; owned by CALLER ++// t -----------+--------+----> Matcher::_in_arg_limit, unaligned ++// h ^ | in | 5 ++// | | args | 4 Holes in incoming args owned by SELF ++// | | old | | 3 ++// | | SP-+--------+----> Matcher::_old_SP, even aligned ++// v | | ret | 3 return address ++// Owned by +--------+ ++// Self | pad2 | 2 pad to align old SP ++// | +--------+ 1 ++// | | locks | 0 ++// | +--------+----> SharedInfo::stack0, even aligned ++// | | pad1 | 11 pad to align new SP ++// | +--------+ ++// | | | 10 ++// | | spills | 9 spills ++// V | | 8 (pad0 slot for callee) ++// -----------+--------+----> Matcher::_out_arg_limit, unaligned ++// ^ | out | 7 ++// | | args | 6 Holes in outgoing args owned by CALLEE ++// Owned by new | | ++// Callee SP-+--------+----> Matcher::_new_SP, even aligned ++// | | ++// ++// Note 1: Only region 8-11 is determined by the allocator. Region 0-5 is ++// known from SELF's arguments and the Java calling convention. ++// Region 6-7 is determined per call site. ++// Note 2: If the calling convention leaves holes in the incoming argument ++// area, those holes are owned by SELF. Holes in the outgoing area ++// are owned by the CALLEE. Holes should not be necessary in the ++// incoming area, as the Java calling convention is completely under ++// the control of the AD file. Doubles can be sorted and packed to ++// avoid holes. Holes in the outgoing arguments may be necessary for ++// varargs C calling conventions. ++// Note 3: Region 0-3 is even aligned, with pad2 as needed. Region 3-5 is ++// even aligned with pad0 as needed. ++// Region 6 is even aligned. Region 6-7 is NOT even aligned; ++// region 6-11 is even aligned; it may be padded out more so that ++// the region from SP to FP meets the minimum stack alignment. ++// Note 4: For I2C adapters, the incoming FP may not meet the minimum stack ++// alignment. Region 11, pad1, may be dynamically extended so that ++// SP meets the minimum alignment. ++ ++ ++frame %{ ++ // These two registers define part of the calling convention ++ // between compiled code and the interpreter. ++ // SEE StartI2CNode::calling_convention & StartC2INode::calling_convention & StartOSRNode::calling_convention ++ // for more information. ++ ++ inline_cache_reg(T1); // Inline Cache Register ++ ++ // Optional: name the operand used by cisc-spilling to access [stack_pointer + offset] ++ cisc_spilling_operand_name(indOffset32); ++ ++ // Number of stack slots consumed by locking an object ++ // generate Compile::sync_stack_slots ++ sync_stack_slots(2); ++ ++ frame_pointer(SP); ++ ++ // Interpreter stores its frame pointer in a register which is ++ // stored to the stack by I2CAdaptors. ++ // I2CAdaptors convert from interpreted java to compiled java. ++ ++ interpreter_frame_pointer(FP); ++ ++ // generate Matcher::stack_alignment ++ stack_alignment(StackAlignmentInBytes); //wordSize = sizeof(char*); ++ ++ // Number of outgoing stack slots killed above the out_preserve_stack_slots ++ // for calls to C. Supports the var-args backing area for register parms. ++ varargs_C_out_slots_killed(0); ++ ++ // The after-PROLOG location of the return address. Location of ++ // return address specifies a type (REG or STACK) and a number ++ // representing the register number (i.e. 
- use a register name) or ++ // stack slot. ++ // Ret Addr is on stack in slot 0 if no locks or verification or alignment. ++ // Otherwise, it is above the locks and verification slot and alignment word ++ //return_addr(STACK -1+ round_to(1+VerifyStackAtCalls+Compile::current()->sync()*Compile::current()->sync_stack_slots(),WordsPerLong)); ++ return_addr(REG RA); ++ ++ // Location of C & interpreter return values ++ // register(s) contain(s) return value for Op_StartI2C and Op_StartOSR. ++ // SEE Matcher::match. ++ c_return_value %{ ++ assert( ideal_reg >= Op_RegI && ideal_reg <= Op_RegL, "only return normal values" ); ++ /* -- , -- , Op_RegN, Op_RegI, Op_RegP, Op_RegF, Op_RegD, Op_RegL */ ++ static int lo[Op_RegL+1] = { 0, 0, V0_num, V0_num, V0_num, F0_num, F0_num, V0_num }; ++ static int hi[Op_RegL+1] = { 0, 0, OptoReg::Bad, OptoReg::Bad, V0_H_num, OptoReg::Bad, F0_H_num, V0_H_num }; ++ return OptoRegPair(hi[ideal_reg],lo[ideal_reg]); ++ %} ++ ++ // Location of return values ++ // register(s) contain(s) return value for Op_StartC2I and Op_Start. ++ // SEE Matcher::match. ++ ++ return_value %{ ++ assert( ideal_reg >= Op_RegI && ideal_reg <= Op_RegL, "only return normal values" ); ++ /* -- , -- , Op_RegN, Op_RegI, Op_RegP, Op_RegF, Op_RegD, Op_RegL */ ++ static int lo[Op_RegL+1] = { 0, 0, V0_num, V0_num, V0_num, F0_num, F0_num, V0_num }; ++ static int hi[Op_RegL+1] = { 0, 0, OptoReg::Bad, OptoReg::Bad, V0_H_num, OptoReg::Bad, F0_H_num, V0_H_num}; ++ return OptoRegPair(hi[ideal_reg],lo[ideal_reg]); ++ %} ++ ++%} ++ ++//----------ATTRIBUTES--------------------------------------------------------- ++//----------Operand Attributes------------------------------------------------- ++op_attrib op_cost(0); // Required cost attribute ++ ++//----------Instruction Attributes--------------------------------------------- ++ins_attrib ins_cost(100); // Required cost attribute ++ins_attrib ins_size(32); // Required size attribute (in bits) ++ins_attrib ins_pc_relative(0); // Required PC Relative flag ++ins_attrib ins_short_branch(0); // Required flag: is this instruction a ++ // non-matching short branch variant of some ++ // long branch? ++ins_attrib ins_alignment(4); // Required alignment attribute (must be a power of 2) ++ // specifies the alignment that some part of the instruction (not ++ // necessarily the start) requires. If > 1, a compute_padding() ++ // function must be provided for the instruction ++ ++//----------OPERANDS----------------------------------------------------------- ++// Operand definitions must precede instruction definitions for correct parsing ++// in the ADLC because operands constitute user defined types which are used in ++// instruction definitions. 
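++
++// As an illustration of how operand types feed instruction definitions (a
++// minimal sketch for orientation only; the rule name, pipeline class and
++// format string are placeholders rather than definitions from this port),
++// an integer add rule would pair the mRegI operand defined later in this
++// section with an ideal AddI match:
++//
++//   instruct addI_Reg_Reg(mRegI dst, mRegI src1, mRegI src2) %{
++//     match(Set dst (AddI src1 src2));
++//     format %{ "add_w    $dst, $src1, $src2" %}
++//     ins_encode %{
++//       __ add_w($dst$$Register, $src1$$Register, $src2$$Register);
++//     %}
++//     ins_pipe(ialu_reg_reg);  // assumed pipeline class name
++//   %}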
++ ++// Vectors ++ ++operand vReg() %{ ++ constraint(ALLOC_IN_RC(dynamic)); ++ match(VecS); ++ match(VecD); ++ match(VecX); ++ match(VecY); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand vecS() %{ ++ constraint(ALLOC_IN_RC(flt_reg)); ++ match(VecS); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand vecD() %{ ++ constraint(ALLOC_IN_RC(dbl_reg)); ++ match(VecD); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand vecX() %{ ++ constraint(ALLOC_IN_RC(vectorx_reg)); ++ match(VecX); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand vecY() %{ ++ constraint(ALLOC_IN_RC(vectory_reg)); ++ match(VecY); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++// Flags register, used as output of compare instructions ++operand FlagsReg() %{ ++ constraint(ALLOC_IN_RC(t0_reg)); ++ match(RegFlags); ++ ++ format %{ "T0" %} ++ interface(REG_INTER); ++%} ++ ++//----------Simple Operands---------------------------------------------------- ++// TODO: Should we need to define some more special immediate number ? ++// Immediate Operands ++// Integer Immediate ++operand immI() %{ ++ match(ConI); ++ ++ op_cost(20); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immIU1() %{ ++ predicate((0 <= n->get_int()) && (n->get_int() <= 1)); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immIU2() %{ ++ predicate((0 <= n->get_int()) && (n->get_int() <= 3)); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immIU3() %{ ++ predicate((0 <= n->get_int()) && (n->get_int() <= 7)); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immIU4() %{ ++ predicate((0 <= n->get_int()) && (n->get_int() <= 15)); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immIU5() %{ ++ predicate((0 <= n->get_int()) && (n->get_int() <= 31)); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immIU6() %{ ++ predicate((0 <= n->get_int()) && (n->get_int() <= 63)); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immIU8() %{ ++ predicate((0 <= n->get_int()) && (n->get_int() <= 255)); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI10() %{ ++ predicate((-512 <= n->get_int()) && (n->get_int() <= 511)); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI12() %{ ++ predicate((-2048 <= n->get_int()) && (n->get_int() <= 2047)); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_M65536() %{ ++ predicate(n->get_int() == -65536); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Constant for decrement ++operand immI_M1() %{ ++ predicate(n->get_int() == -1); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Constant for zero ++operand immI_0() %{ ++ predicate(n->get_int() == 0); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_1() %{ ++ predicate(n->get_int() == 1); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_2() %{ ++ predicate(n->get_int() == 2); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_4() %{ ++ predicate(n->get_int() == 4); ++ match(ConI); ++ ++ 
op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_8() %{ ++ predicate(n->get_int() == 8); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_16() %{ ++ predicate(n->get_int() == 16); ++ match(ConI); ++ ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_24() %{ ++ predicate(n->get_int() == 24); ++ match(ConI); ++ ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Constant for long shifts ++operand immI_32() %{ ++ predicate(n->get_int() == 32); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Constant for byte-wide masking ++operand immI_255() %{ ++ predicate(n->get_int() == 255); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_65535() %{ ++ predicate(n->get_int() == 65535); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_MaxI() %{ ++ predicate(n->get_int() == 2147483647); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_M2047_2048() %{ ++ predicate((-2047 <= n->get_int()) && (n->get_int() <= 2048)); ++ match(ConI); ++ ++ op_cost(10); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Valid scale values for addressing modes ++operand immI_0_3() %{ ++ predicate(0 <= n->get_int() && (n->get_int() <= 3)); ++ match(ConI); ++ ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_0_31() %{ ++ predicate(n->get_int() >= 0 && n->get_int() <= 31); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_0_4095() %{ ++ predicate(n->get_int() >= 0 && n->get_int() <= 4095); ++ match(ConI); ++ op_cost(0); ++ ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_1_4() %{ ++ predicate(1 <= n->get_int() && (n->get_int() <= 4)); ++ match(ConI); ++ ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_32_63() %{ ++ predicate(n->get_int() >= 32 && n->get_int() <= 63); ++ match(ConI); ++ op_cost(0); ++ ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_M128_255() %{ ++ predicate((-128 <= n->get_int()) && (n->get_int() <= 255)); ++ match(ConI); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Operand for non-negtive integer mask ++operand immI_nonneg_mask() %{ ++ predicate((n->get_int() >= 0) && Assembler::is_nonneg_mask(n->get_int())); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immI_zeroins_mask() %{ ++ predicate(Assembler::is_zeroins_mask(n->get_int())); ++ match(ConI); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Long Immediate ++operand immL() %{ ++ match(ConL); ++ ++ op_cost(20); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immLU5() %{ ++ predicate((0 <= n->get_long()) && (n->get_long() <= 31)); ++ match(ConL); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immL10() %{ ++ predicate((-512 <= n->get_long()) && (n->get_long() <= 511)); ++ match(ConL); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immAlign16() %{ ++ predicate((-65536 <= n->get_long()) && (n->get_long() <= 65535) && ((n->get_long() & 0x3) == 0)); ++ match(ConL); ++ ++ op_cost(10); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immL12() %{ ++ predicate((-2048 <= n->get_long()) && (n->get_long() <= 2047)); ++ match(ConL); ++ ++ op_cost(10); ++ format %{ %} ++ 
interface(CONST_INTER); ++%} ++ ++// Long Immediate 32-bit signed ++operand immL32() ++%{ ++ predicate(n->get_long() == (int)n->get_long()); ++ match(ConL); ++ ++ op_cost(15); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Long Immediate zero ++operand immL_0() %{ ++ predicate(n->get_long() == 0L); ++ match(ConL); ++ op_cost(0); ++ ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immL_MaxUI() %{ ++ predicate(n->get_long() == 0xFFFFFFFFL); ++ match(ConL); ++ op_cost(20); ++ ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immL_M2047_2048() %{ ++ predicate((-2047 <= n->get_long()) && (n->get_long() <= 2048)); ++ match(ConL); ++ ++ op_cost(10); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immL_0_4095() %{ ++ predicate(n->get_long() >= 0 && n->get_long() <= 4095); ++ match(ConL); ++ op_cost(0); ++ ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Operand for non-negtive long mask ++operand immL_nonneg_mask() %{ ++ predicate((n->get_long() >= 0) && Assembler::is_nonneg_mask(n->get_long())); ++ match(ConL); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immL_zeroins_mask() %{ ++ predicate(Assembler::is_zeroins_mask(n->get_long())); ++ match(ConL); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immL_gt_7() ++%{ ++ predicate(n->get_long() > 7); ++ match(ConL); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immL_gt_15() ++%{ ++ predicate(n->get_long() > 15); ++ match(ConL); ++ ++ op_cost(0); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Pointer Immediate ++operand immP() %{ ++ match(ConP); ++ ++ op_cost(10); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// null Pointer Immediate ++operand immP_0() %{ ++ predicate(n->get_ptr() == 0); ++ match(ConP); ++ op_cost(0); ++ ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Pointer Immediate ++operand immP_no_oop_cheap() %{ ++ predicate(!n->bottom_type()->isa_oop_ptr()); ++ match(ConP); ++ ++ op_cost(5); ++ // formats are generated automatically for constants and base registers ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Pointer Immediate ++operand immN() %{ ++ match(ConN); ++ ++ op_cost(10); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// null Pointer Immediate ++operand immN_0() %{ ++ predicate(n->get_narrowcon() == 0); ++ match(ConN); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immNKlass() %{ ++ match(ConNKlass); ++ ++ op_cost(10); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Single-precision floating-point immediate ++operand immF() %{ ++ match(ConF); ++ ++ op_cost(20); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Single-precision floating-point zero ++operand immF_0() %{ ++ predicate(jint_cast(n->getf()) == 0); ++ match(ConF); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immFVec() %{ ++ predicate(UseLSX && Assembler::is_vec_imm(n->getf())); ++ match(ConF); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Double-precision floating-point immediate ++operand immD() %{ ++ match(ConD); ++ ++ op_cost(20); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Double-precision floating-point zero ++operand immD_0() %{ ++ predicate(jlong_cast(n->getd()) == 0); ++ match(ConD); ++ ++ op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++operand immDVec() %{ ++ predicate(UseLSX && Assembler::is_vec_imm(n->getd())); ++ match(ConD); ++ ++ 
op_cost(5); ++ format %{ %} ++ interface(CONST_INTER); ++%} ++ ++// Register Operands ++// Integer Register ++operand mRegI() %{ ++ constraint(ALLOC_IN_RC(int_reg)); ++ match(RegI); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand mS0RegI() %{ ++ constraint(ALLOC_IN_RC(s0_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "S0" %} ++ interface(REG_INTER); ++%} ++ ++operand mS1RegI() %{ ++ constraint(ALLOC_IN_RC(s1_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "S1" %} ++ interface(REG_INTER); ++%} ++ ++operand mS3RegI() %{ ++ constraint(ALLOC_IN_RC(s3_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "S3" %} ++ interface(REG_INTER); ++%} ++ ++operand mS4RegI() %{ ++ constraint(ALLOC_IN_RC(s4_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "S4" %} ++ interface(REG_INTER); ++%} ++ ++operand mS5RegI() %{ ++ constraint(ALLOC_IN_RC(s5_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "S5" %} ++ interface(REG_INTER); ++%} ++ ++operand mS6RegI() %{ ++ constraint(ALLOC_IN_RC(s6_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "S6" %} ++ interface(REG_INTER); ++%} ++ ++operand mS7RegI() %{ ++ constraint(ALLOC_IN_RC(s7_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "S7" %} ++ interface(REG_INTER); ++%} ++ ++ ++operand mT0RegI() %{ ++ constraint(ALLOC_IN_RC(t0_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "T0" %} ++ interface(REG_INTER); ++%} ++ ++operand mT1RegI() %{ ++ constraint(ALLOC_IN_RC(t1_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "T1" %} ++ interface(REG_INTER); ++%} ++ ++operand mT2RegI() %{ ++ constraint(ALLOC_IN_RC(t2_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "T2" %} ++ interface(REG_INTER); ++%} ++ ++operand mT3RegI() %{ ++ constraint(ALLOC_IN_RC(t3_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "T3" %} ++ interface(REG_INTER); ++%} ++ ++operand mT8RegI() %{ ++ constraint(ALLOC_IN_RC(t8_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "T8" %} ++ interface(REG_INTER); ++%} ++ ++operand mT4RegI() %{ ++ constraint(ALLOC_IN_RC(t4_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "T4" %} ++ interface(REG_INTER); ++%} ++ ++operand mA0RegI() %{ ++ constraint(ALLOC_IN_RC(a0_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "A0" %} ++ interface(REG_INTER); ++%} ++ ++operand mA1RegI() %{ ++ constraint(ALLOC_IN_RC(a1_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "A1" %} ++ interface(REG_INTER); ++%} ++ ++operand mA2RegI() %{ ++ constraint(ALLOC_IN_RC(a2_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "A2" %} ++ interface(REG_INTER); ++%} ++ ++operand mA3RegI() %{ ++ constraint(ALLOC_IN_RC(a3_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "A3" %} ++ interface(REG_INTER); ++%} ++ ++operand mA4RegI() %{ ++ constraint(ALLOC_IN_RC(a4_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "A4" %} ++ interface(REG_INTER); ++%} ++ ++operand mA5RegI() %{ ++ constraint(ALLOC_IN_RC(a5_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "A5" %} ++ interface(REG_INTER); ++%} ++ ++operand mA6RegI() %{ ++ constraint(ALLOC_IN_RC(a6_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "A6" %} ++ interface(REG_INTER); ++%} ++ ++operand mA7RegI() %{ ++ constraint(ALLOC_IN_RC(a7_reg)); ++ match(RegI); ++ match(mRegI); ++ ++ format %{ "A7" %} ++ interface(REG_INTER); ++%} ++ ++operand mRegN() %{ ++ constraint(ALLOC_IN_RC(int_reg)); ++ match(RegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t0_RegN() %{ ++ constraint(ALLOC_IN_RC(t0_reg)); ++ 
match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t1_RegN() %{ ++ constraint(ALLOC_IN_RC(t1_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t3_RegN() %{ ++ constraint(ALLOC_IN_RC(t3_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t8_RegN() %{ ++ constraint(ALLOC_IN_RC(t8_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a0_RegN() %{ ++ constraint(ALLOC_IN_RC(a0_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a1_RegN() %{ ++ constraint(ALLOC_IN_RC(a1_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a2_RegN() %{ ++ constraint(ALLOC_IN_RC(a2_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a3_RegN() %{ ++ constraint(ALLOC_IN_RC(a3_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a4_RegN() %{ ++ constraint(ALLOC_IN_RC(a4_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a5_RegN() %{ ++ constraint(ALLOC_IN_RC(a5_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a6_RegN() %{ ++ constraint(ALLOC_IN_RC(a6_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a7_RegN() %{ ++ constraint(ALLOC_IN_RC(a7_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s0_RegN() %{ ++ constraint(ALLOC_IN_RC(s0_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s1_RegN() %{ ++ constraint(ALLOC_IN_RC(s1_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s2_RegN() %{ ++ constraint(ALLOC_IN_RC(s2_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s3_RegN() %{ ++ constraint(ALLOC_IN_RC(s3_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s4_RegN() %{ ++ constraint(ALLOC_IN_RC(s4_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s5_RegN() %{ ++ constraint(ALLOC_IN_RC(s5_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s6_RegN() %{ ++ constraint(ALLOC_IN_RC(s6_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s7_RegN() %{ ++ constraint(ALLOC_IN_RC(s7_reg)); ++ match(RegN); ++ match(mRegN); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++// Pointer Register ++operand mRegP() %{ ++ constraint(ALLOC_IN_RC(p_reg)); ++ match(RegP); ++ match(a0_RegP); ++ match(javaThread_RegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand no_CR_mRegP() %{ ++ constraint(ALLOC_IN_RC(no_CR_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand p_has_s6_mRegP() %{ ++ constraint(ALLOC_IN_RC(p_has_s6_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s1_RegP() ++%{ ++ constraint(ALLOC_IN_RC(s1_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s3_RegP() ++%{ ++ constraint(ALLOC_IN_RC(s3_long_reg)); ++ match(RegP); ++ 
match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s4_RegP() ++%{ ++ constraint(ALLOC_IN_RC(s4_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s5_RegP() ++%{ ++ constraint(ALLOC_IN_RC(s5_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s6_RegP() ++%{ ++ constraint(ALLOC_IN_RC(s6_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++// Java Thread Register ++operand javaThread_RegP(mRegP reg) ++%{ ++ constraint(ALLOC_IN_RC(s6_long_reg)); // S6 denotes TREG (java_thread) ++ match(reg); ++ op_cost(0); ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s7_RegP() ++%{ ++ constraint(ALLOC_IN_RC(s7_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t0_RegP() ++%{ ++ constraint(ALLOC_IN_RC(t0_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t1_RegP() ++%{ ++ constraint(ALLOC_IN_RC(t1_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t2_RegP() ++%{ ++ constraint(ALLOC_IN_RC(t2_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t3_RegP() ++%{ ++ constraint(ALLOC_IN_RC(t3_long_reg)); ++ match(RegP); ++ match(mRegP); ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t8_RegP() ++%{ ++ constraint(ALLOC_IN_RC(t8_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a0_RegP() ++%{ ++ constraint(ALLOC_IN_RC(a0_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a1_RegP() ++%{ ++ constraint(ALLOC_IN_RC(a1_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a2_RegP() ++%{ ++ constraint(ALLOC_IN_RC(a2_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a3_RegP() ++%{ ++ constraint(ALLOC_IN_RC(a3_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a4_RegP() ++%{ ++ constraint(ALLOC_IN_RC(a4_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++ ++operand a5_RegP() ++%{ ++ constraint(ALLOC_IN_RC(a5_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a6_RegP() ++%{ ++ constraint(ALLOC_IN_RC(a6_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a7_RegP() ++%{ ++ constraint(ALLOC_IN_RC(a7_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand v0_RegP() ++%{ ++ constraint(ALLOC_IN_RC(v0_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand v1_RegP() ++%{ ++ constraint(ALLOC_IN_RC(v1_long_reg)); ++ match(RegP); ++ match(mRegP); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand mRegL() %{ ++ constraint(ALLOC_IN_RC(long_reg)); ++ match(RegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand mRegI2L(mRegI reg) %{ ++ match(ConvI2L reg); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand mRegL2I(mRegL reg) %{ ++ match(ConvL2I reg); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand v0RegL() %{ ++ 
constraint(ALLOC_IN_RC(v0_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand v1RegL() %{ ++ constraint(ALLOC_IN_RC(v1_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a0RegL() %{ ++ constraint(ALLOC_IN_RC(a0_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ "A0" %} ++ interface(REG_INTER); ++%} ++ ++operand a1RegL() %{ ++ constraint(ALLOC_IN_RC(a1_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a2RegL() %{ ++ constraint(ALLOC_IN_RC(a2_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a3RegL() %{ ++ constraint(ALLOC_IN_RC(a3_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t0RegL() %{ ++ constraint(ALLOC_IN_RC(t0_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t1RegL() %{ ++ constraint(ALLOC_IN_RC(t1_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t2RegL() %{ ++ constraint(ALLOC_IN_RC(t2_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t3RegL() %{ ++ constraint(ALLOC_IN_RC(t3_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand t8RegL() %{ ++ constraint(ALLOC_IN_RC(t8_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a4RegL() %{ ++ constraint(ALLOC_IN_RC(a4_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a5RegL() %{ ++ constraint(ALLOC_IN_RC(a5_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a6RegL() %{ ++ constraint(ALLOC_IN_RC(a6_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand a7RegL() %{ ++ constraint(ALLOC_IN_RC(a7_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s0RegL() %{ ++ constraint(ALLOC_IN_RC(s0_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s1RegL() %{ ++ constraint(ALLOC_IN_RC(s1_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s3RegL() %{ ++ constraint(ALLOC_IN_RC(s3_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s4RegL() %{ ++ constraint(ALLOC_IN_RC(s4_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++operand s7RegL() %{ ++ constraint(ALLOC_IN_RC(s7_long_reg)); ++ match(RegL); ++ match(mRegL); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++// Floating register operands ++operand regF() %{ ++ constraint(ALLOC_IN_RC(flt_reg)); ++ match(RegF); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++//Double Precision Floating register operands ++operand regD() %{ ++ constraint(ALLOC_IN_RC(dbl_reg)); ++ match(RegD); ++ ++ format %{ %} ++ interface(REG_INTER); ++%} ++ ++//----------Memory Operands---------------------------------------------------- ++// Indirect Memory Operand ++operand indirect(mRegP reg) %{ ++ constraint(ALLOC_IN_RC(p_reg)); ++ match(reg); ++ ++ format %{ "[$reg] @ indirect" %} ++ interface(MEMORY_INTER) %{ ++ base($reg); ++ 
index(0xffffffff); /* NO_INDEX */ ++ scale(0x0); ++ disp(0x0); ++ %} ++%} ++ ++// Indirect Memory Plus Short Offset Operand ++operand indOffset12(p_has_s6_mRegP reg, immL12 off) ++%{ ++ constraint(ALLOC_IN_RC(p_reg)); ++ match(AddP reg off); ++ ++ op_cost(10); ++ format %{ "[$reg + $off (12-bit)] @ indOffset12" %} ++ interface(MEMORY_INTER) %{ ++ base($reg); ++ index(0xffffffff); /* NO_INDEX */ ++ scale(0x0); ++ disp($off); ++ %} ++%} ++ ++operand indOffset12I2L(mRegP reg, immI12 off) ++%{ ++ constraint(ALLOC_IN_RC(p_reg)); ++ match(AddP reg (ConvI2L off)); ++ ++ op_cost(10); ++ format %{ "[$reg + $off (12-bit)] @ indOffset12I2L" %} ++ interface(MEMORY_INTER) %{ ++ base($reg); ++ index(0xffffffff); /* NO_INDEX */ ++ scale(0x0); ++ disp($off); ++ %} ++%} ++ ++operand indOffset16(p_has_s6_mRegP reg, immAlign16 off) ++%{ ++ constraint(ALLOC_IN_RC(p_reg)); ++ match(AddP reg off); ++ ++ op_cost(10); ++ format %{ "[$reg + $off (16-bit)] @ indOffset16" %} ++ interface(MEMORY_INTER) %{ ++ base($reg); ++ index(0xffffffff); /* NO_INDEX */ ++ scale(0x0); ++ disp($off); ++ %} ++%} ++ ++// Indirect Memory Plus Index Register ++operand indIndex(mRegP addr, mRegL index) %{ ++ constraint(ALLOC_IN_RC(p_reg)); ++ match(AddP addr index); ++ ++ op_cost(20); ++ format %{"[$addr + $index] @ indIndex" %} ++ interface(MEMORY_INTER) %{ ++ base($addr); ++ index($index); ++ scale(0x0); ++ disp(0x0); ++ %} ++%} ++ ++operand indIndexI2L(mRegP reg, mRegI ireg) ++%{ ++ constraint(ALLOC_IN_RC(ptr_reg)); ++ match(AddP reg (ConvI2L ireg)); ++ op_cost(10); ++ format %{ "[$reg + $ireg] @ indIndexI2L" %} ++ interface(MEMORY_INTER) %{ ++ base($reg); ++ index($ireg); ++ scale(0x0); ++ disp(0x0); ++ %} ++%} ++ ++// Indirect Memory Operand ++operand indirectNarrow(mRegN reg) ++%{ ++ predicate(CompressedOops::shift() == 0); ++ constraint(ALLOC_IN_RC(p_reg)); ++ op_cost(10); ++ match(DecodeN reg); ++ ++ format %{ "[$reg] @ indirectNarrow" %} ++ interface(MEMORY_INTER) %{ ++ base($reg); ++ index(0xffffffff); ++ scale(0x0); ++ disp(0x0); ++ %} ++%} ++ ++// Indirect Memory Plus Short Offset Operand ++operand indOffset12Narrow(mRegN reg, immL12 off) ++%{ ++ predicate(CompressedOops::shift() == 0); ++ constraint(ALLOC_IN_RC(p_reg)); ++ op_cost(10); ++ match(AddP (DecodeN reg) off); ++ ++ format %{ "[$reg + $off (12-bit)] @ indOffset12Narrow" %} ++ interface(MEMORY_INTER) %{ ++ base($reg); ++ index(0xffffffff); ++ scale(0x0); ++ disp($off); ++ %} ++%} ++ ++//----------Conditional Branch Operands---------------------------------------- ++// Comparison Op - This is the operation of the comparison, and is limited to ++// the following set of codes: ++// L (<), LE (<=), G (>), GE (>=), E (==), NE (!=) ++// ++// Other attributes of the comparison, such as unsignedness, are specified ++// by the comparison instruction that sets a condition code flags register. ++// That result is represented by a flags operand whose subtype is appropriate ++// to the unsignedness (etc.) of the comparison. ++// ++// Later, the instruction which matches both the Comparison Op (a Bool) and ++// the flags (produced by the Cmp) specifies the coding of the comparison op ++// by matching a specific subtype of Bool operand below, such as cmpOp. 
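Note on the comparison-operand comment above: the Bool operand only carries which relation is tested, while signedness comes from the Cmp node the branch instruct matches. In the branch instructs later in this patch, CmpI/CmpL forms pass true /* signed */ to cmp_branch_long/cmp_branch_short and CmpU/CmpUL/CmpP forms pass false (the branchConUL_* encodings pass false even where their inline comment still reads "signed"). As a self-contained C++ illustration of the COND_INTER codes that cmpOp defines next — branch_taken is a purely explanatory name, not a helper from this port:

#include <cstdint>

// Maps the cmpOp codes declared below to "is the branch taken?", with the
// same signed/unsigned split the branch instructs select via the boolean
// they pass to cmp_branch_long/cmp_branch_short.
static bool branch_taken(int cmpcode, std::int64_t a, std::int64_t b, bool is_signed) {
  std::uint64_t ua = (std::uint64_t)a, ub = (std::uint64_t)b;
  switch (cmpcode) {
    case 0x01: return a == b;                              // equal
    case 0x02: return a != b;                              // not_equal
    case 0x03: return is_signed ? (a >  b) : (ua >  ub);   // greater
    case 0x04: return is_signed ? (a >= b) : (ua >= ub);   // greater_equal
    case 0x05: return is_signed ? (a <  b) : (ua <  ub);   // less
    case 0x06: return is_signed ? (a <= b) : (ua <= ub);   // less_equal
    default:   return false;      // 0x7/0x8 (overflow) not modelled here
  }
}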
++ ++// Comparison Code ++operand cmpOp() %{ ++ match(Bool); ++ ++ format %{ "" %} ++ interface(COND_INTER) %{ ++ equal(0x01); ++ not_equal(0x02); ++ greater(0x03); ++ greater_equal(0x04); ++ less(0x05); ++ less_equal(0x06); ++ overflow(0x7); ++ no_overflow(0x8); ++ %} ++%} ++ ++operand cmpOpEqNe() %{ ++ match(Bool); ++ predicate(n->as_Bool()->_test._test == BoolTest::ne ++ || n->as_Bool()->_test._test == BoolTest::eq); ++ ++ format %{ "" %} ++ interface(COND_INTER) %{ ++ equal(0x01); ++ not_equal(0x02); ++ greater(0x03); ++ greater_equal(0x04); ++ less(0x05); ++ less_equal(0x06); ++ overflow(0x7); ++ no_overflow(0x8); ++ %} ++%} ++ ++//----------Special Memory Operands-------------------------------------------- ++// Stack Slot Operand - This operand is used for loading and storing temporary ++// values on the stack where a match requires a value to ++// flow through memory. ++operand stackSlotP(sRegP reg) %{ ++ constraint(ALLOC_IN_RC(stack_slots)); ++ // No match rule because this operand is only generated in matching ++ op_cost(50); ++ format %{ "[$reg]" %} ++ interface(MEMORY_INTER) %{ ++ base(0x1d); // SP ++ index(0xffffffff); // No Index ++ scale(0x0); // No Scale ++ disp($reg); // Stack Offset ++ %} ++%} ++ ++operand stackSlotI(sRegI reg) %{ ++ constraint(ALLOC_IN_RC(stack_slots)); ++ // No match rule because this operand is only generated in matching ++ op_cost(50); ++ format %{ "[$reg]" %} ++ interface(MEMORY_INTER) %{ ++ base(0x1d); // SP ++ index(0xffffffff); // No Index ++ scale(0x0); // No Scale ++ disp($reg); // Stack Offset ++ %} ++%} ++ ++operand stackSlotF(sRegF reg) %{ ++ constraint(ALLOC_IN_RC(stack_slots)); ++ // No match rule because this operand is only generated in matching ++ op_cost(50); ++ format %{ "[$reg]" %} ++ interface(MEMORY_INTER) %{ ++ base(0x1d); // SP ++ index(0xffffffff); // No Index ++ scale(0x0); // No Scale ++ disp($reg); // Stack Offset ++ %} ++%} ++ ++operand stackSlotD(sRegD reg) %{ ++ constraint(ALLOC_IN_RC(stack_slots)); ++ // No match rule because this operand is only generated in matching ++ op_cost(50); ++ format %{ "[$reg]" %} ++ interface(MEMORY_INTER) %{ ++ base(0x1d); // SP ++ index(0xffffffff); // No Index ++ scale(0x0); // No Scale ++ disp($reg); // Stack Offset ++ %} ++%} ++ ++operand stackSlotL(sRegL reg) %{ ++ constraint(ALLOC_IN_RC(stack_slots)); ++ // No match rule because this operand is only generated in matching ++ op_cost(50); ++ format %{ "[$reg]" %} ++ interface(MEMORY_INTER) %{ ++ base(0x1d); // SP ++ index(0xffffffff); // No Index ++ scale(0x0); // No Scale ++ disp($reg); // Stack Offset ++ %} ++%} ++ ++ ++//------------------------OPERAND CLASSES-------------------------------------- ++opclass memory( indirect, indOffset12, indOffset12I2L, indIndex, indIndexI2L, ++ indirectNarrow, indOffset12Narrow); ++opclass memory_loadRange(indOffset12, indirect); ++opclass memory_exclusive(indOffset16, indirect); ++ ++opclass mRegLorI2L(mRegI2L, mRegL); ++opclass mRegIorL2I( mRegI, mRegL2I); ++ ++//----------PIPELINE----------------------------------------------------------- ++// Rules which define the behavior of the target architectures pipeline. 
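The memory and stackSlot* operands above all describe an address through the same MEMORY_INTER quadruple (base, index, scale, disp); 0xffffffff is the NO_INDEX sentinel, and for the stack-slot forms base 0x1d is SP with disp holding the stack offset. A small self-contained illustration of how the four fields combine, treating base and index as the values held in those registers — effective_address is an explanatory helper, not code from this port:

#include <cstdint>

// base + (index << scale) + disp, skipping the index term when the operand
// uses the NO_INDEX sentinel, as indirect/indOffset12/stackSlot* above do.
static std::uint64_t effective_address(std::uint64_t base, std::uint64_t index,
                                       int scale, std::int64_t disp) {
  std::uint64_t ea = base + (std::uint64_t)disp;
  if (index != 0xffffffffULL)   // NO_INDEX
    ea += index << scale;
  return ea;
}

The opclass memory(...) list above simply groups these addressing forms so that a single load/store instruct can match any of them; memory_loadRange and memory_exclusive restrict the allowed forms for their particular users.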
++ ++pipeline %{ ++ ++ //----------ATTRIBUTES--------------------------------------------------------- ++ attributes %{ ++ fixed_size_instructions; // Fixed size instructions ++ max_instructions_per_bundle = 1; // 1 instruction per bundle ++ max_bundles_per_cycle = 4; // Up to 4 bundles per cycle ++ bundle_unit_size=4; ++ instruction_unit_size = 4; // An instruction is 4 bytes long ++ instruction_fetch_unit_size = 16; // The processor fetches one line ++ instruction_fetch_units = 1; // of 16 bytes ++ ++ // List of nop instructions ++ nops( MachNop ); ++ %} ++ ++ //----------RESOURCES---------------------------------------------------------- ++ // Resources are the functional units available to the machine ++ ++ resources(D1, D2, D3, D4, DECODE = D1 | D2 | D3 | D4, ++ ALU1, ALU2, ALU3, ALU4, ALU = ALU1 | ALU2 | ALU3 | ALU4, ++ FPU1, FPU2, FPU = FPU1 | FPU2, ++ MEM1, MEM2, MEM = MEM1 | MEM2); ++ ++ //----------PIPELINE DESCRIPTION----------------------------------------------- ++ // Pipeline Description specifies the stages in the machine's pipeline ++ ++ // PC: ++ // IF: fetch ++ // ID: decode ++ // ID1: decode 1 ++ // ID2: decode 2 ++ // RN: register rename ++ // SCHED: schedule ++ // EMIT: emit ++ // RD: read ++ // CA: calculate ++ // WB: write back ++ // CM: commit ++ ++ pipe_desc(PC, IF, ID, ID1, ID2, RN, SCHED, EMIT, RD, CA, WB, CM); ++ ++ //----------PIPELINE CLASSES--------------------------------------------------- ++ // Pipeline Classes describe the stages in which input and output are ++ // referenced by the hardware pipeline. ++ ++ // No.1 ALU reg-reg operation : dst <-- reg1 op reg2 ++ pipe_class ialu_reg_reg(mRegI dst, mRegI src1, mRegI src2) %{ ++ single_instruction; ++ fixed_latency(1); ++ src1 : RD(read); ++ src2 : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ ALU : CA; ++ %} ++ ++ // No.2 ALU reg-imm operation : dst <-- reg1 op imm ++ pipe_class ialu_reg_imm(mRegI dst, mRegI src) %{ ++ single_instruction; ++ fixed_latency(1); ++ src : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ ALU : CA; ++ %} ++ ++ // No.3 Integer mult operation : dst <-- reg1 mult reg2 ++ pipe_class ialu_mult(mRegI dst, mRegI src1, mRegI src2) %{ ++ single_instruction; ++ fixed_latency(3); ++ src1 : RD(read); ++ src2 : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ ALU : CA; ++ %} ++ ++ // No.4 Integer div operation : dst <-- reg1 div reg2 ++ pipe_class ialu_div(mRegI dst, mRegI src1, mRegI src2) %{ ++ single_instruction; ++ fixed_latency(10); ++ src1 : RD(read); ++ src2 : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ ALU : CA; ++ %} ++ ++ // No.5 load from memory : ++ pipe_class ialu_load(mRegL dst, memory mem) %{ ++ single_instruction; ++ fixed_latency(4); ++ mem : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ MEM : RD; ++ %} ++ ++ // No.6 Store to Memory : ++ pipe_class ialu_store(mRegL src, memory mem) %{ ++ single_instruction; ++ fixed_latency(0); ++ mem : RD(read); ++ src : RD(read); ++ DECODE : ID; ++ MEM : RD; ++ %} ++ ++ // No.7 No instructions : do nothing ++ pipe_class empty( ) %{ ++ instruction_count(0); ++ %} ++ ++ // No.8 prefetch data : ++ pipe_class pipe_prefetch( memory mem ) %{ ++ single_instruction; ++ fixed_latency(0); ++ mem : RD(read); ++ DECODE : ID; ++ MEM : RD; ++ %} ++ ++ // No.9 UnConditional branch : ++ pipe_class pipe_jump( label labl ) %{ ++ multiple_bundles; ++ DECODE : ID; ++ ALU : RD; ++ %} ++ ++ // No.10 ALU Conditional branch : ++ pipe_class pipe_alu_branch(mRegI src1, mRegI src2, label labl ) %{ ++ multiple_bundles; ++ src1 : RD(read); ++ src2 : 
RD(read); ++ DECODE : ID; ++ ALU : RD; ++ %} ++ ++ // No.11 Floating FPU reg-reg operation : dst <-- reg ++ //include f{abs/neg}.{s/d} fmov.{s/d} ++ pipe_class fpu_absnegmov(regF dst, regF src) %{ ++ single_instruction; ++ fixed_latency(1); ++ src : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ FPU : CA; ++ %} ++ ++ // No.12 Floating FPU reg-reg operation : dst <-- reg1 op reg2 ++ // include fsel ++ pipe_class fpu_sel(regF dst, regF src1, regF src2) %{ ++ single_instruction; ++ fixed_latency(1); ++ src1 : RD(read); ++ src2 : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ FPU : CA; ++ %} ++ ++ // No.13 Floating FPU reg-reg operation : dst <-- reg ++ // include fclass.s/d ++ pipe_class fpu_class(regF dst, regF src) %{ ++ single_instruction; ++ fixed_latency(2); ++ src : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ FPU : CA; ++ %} ++ ++ // No.14 Floating FPU reg-reg operation : dst <-- reg1 op reg2 ++ // include f{max/min}.s/d, f{maxa/mina}.s/d ++ pipe_class fpu_maxmin(regF dst, regF src1, regF src2) %{ ++ single_instruction; ++ fixed_latency(2); ++ src1 : RD(read); ++ src2 : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ FPU : CA; ++ %} ++ ++ // No.15 Floating FPU reg-reg operation : dst <-- reg ++ // include movgr2fr.{w/d}, movfr2gr.{w/d} ++ pipe_class fpu_movgrfr(regF dst, regF src) %{ ++ single_instruction; ++ fixed_latency(2); ++ src : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ FPU : CA; ++ %} ++ ++ // No.16 Floating FPU reg-reg operation : dst <-- reg ++ // include fcvt.s/d, ffint.{s/d}.{w/l}, ftint.{w/l}.{s.d}, ftint{rm/rp/rz/rne}.{w/l}.{s/d}, frint.{s/d} ++ pipe_class fpu_cvt(regF dst, regF src) %{ ++ single_instruction; ++ fixed_latency(4); ++ src : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ FPU : CA; ++ %} ++ ++ // No.17 Floating FPU reg-reg operation : dst <-- reg1 op reg2 ++ // include fadd.s/d, fsub.s/d, fmul.s/d, f{scaleb/copysign}.s/d ++ pipe_class fpu_arith(regF dst, regF src1, regF src2) %{ ++ single_instruction; ++ fixed_latency(5); ++ src1 : RD(read); ++ src2 : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ FPU : CA; ++ %} ++ ++ // No.18 Floating FPU reg-reg operation : dst <-- reg1 op reg2 op reg3 ++ // include f{madd/msub/nmadd/nmsub}.s/d ++ pipe_class fpu_arith3(regF dst, regF src1, regF src2, regF src3) %{ ++ single_instruction; ++ fixed_latency(5); ++ src1 : RD(read); ++ src2 : RD(read); ++ src3 : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ FPU : CA; ++ %} ++ ++ // No.19 Floating FPU reg-reg operation : dst <-- reg ++ // include flogb.s/d ++ pipe_class fpu_logb(regF dst, regF src) %{ ++ single_instruction; ++ fixed_latency(5); ++ src : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ FPU : CA; ++ %} ++ ++ // No.20 Floating div operation : dst <-- reg1 div reg2 ++ pipe_class fpu_div(regF dst, regF src1, regF src2) %{ ++ single_instruction; ++ fixed_latency(10); ++ src1 : RD(read); ++ src2 : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ FPU2 : CA; ++ %} ++ ++ // No.21 Load Floating from Memory : ++ pipe_class fpu_load(regF dst, memory mem) %{ ++ single_instruction; ++ fixed_latency(5); ++ mem : RD(read); ++ dst : WB(write); ++ DECODE : ID; ++ MEM : RD; ++ %} ++ ++ // No.22 Store Floating to Memory : ++ pipe_class fpu_store(regF src, memory mem) %{ ++ single_instruction; ++ fixed_latency(0); ++ mem : RD(read); ++ src : RD(read); ++ DECODE : ID; ++ MEM : RD; ++ %} ++ ++ // No.23 FPU Conditional branch : ++ pipe_class pipe_fpu_branch(regF src1, regF src2, label labl ) %{ ++ multiple_bundles; ++ src1 : RD(read); ++ src2 : RD(read); ++ DECODE 
: ID; ++ ALU : RD; ++ %} ++ ++ // No.24 ++ pipe_class long_memory_op() %{ ++ instruction_count(10); multiple_bundles; force_serialization; ++ fixed_latency(30); ++ %} ++ ++ // No.25 Any operation requiring serialization : ++ // EG. DBAR/Atomic ++ pipe_class pipe_serial() ++ %{ ++ single_instruction; ++ force_serialization; ++ fixed_latency(16); ++ DECODE : ID; ++ MEM : RD; ++ %} ++ ++ // No.26 Piple slow : for multi-instructions ++ pipe_class pipe_slow( ) %{ ++ instruction_count(20); ++ force_serialization; ++ multiple_bundles; ++ fixed_latency(50); ++ %} ++ ++%} ++ ++//----------INSTRUCTIONS------------------------------------------------------- ++// ++// match -- States which machine-independent subtree may be replaced ++// by this instruction. ++// ins_cost -- The estimated cost of this instruction is used by instruction ++// selection to identify a minimum cost tree of machine ++// instructions that matches a tree of machine-independent ++// instructions. ++// format -- A string providing the disassembly for this instruction. ++// The value of an instruction's operand may be inserted ++// by referring to it with a '$' prefix. ++// opcode -- Three instruction opcodes may be provided. These are referred ++// to within an encode class as $primary, $secondary, and $tertiary ++// respectively. The primary opcode is commonly used to ++// indicate the type of machine instruction, while secondary ++// and tertiary are often used for prefix options or addressing ++// modes. ++// ins_encode -- A list of encode classes with parameters. The encode class ++// name must have been defined in an 'enc_class' specification ++// in the encode section of the architecture description. ++ ++ ++// Load Integer ++instruct loadI(mRegI dst, memory mem) %{ ++ match(Set dst (LoadI mem)); ++ ++ ins_cost(125); ++ format %{ "ld_w $dst, $mem #@loadI" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_INT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++instruct loadI_convI2L(mRegL dst, memory mem) %{ ++ match(Set dst (ConvI2L (LoadI mem))); ++ ++ ins_cost(125); ++ format %{ "ld_w $dst, $mem #@loadI_convI2L" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_INT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Integer (32 bit signed) to Byte (8 bit signed) ++instruct loadI2B(mRegI dst, memory mem, immI_24 twentyfour) %{ ++ match(Set dst (RShiftI (LShiftI (LoadI mem) twentyfour) twentyfour)); ++ ++ ins_cost(125); ++ format %{ "ld_b $dst, $mem\t# int -> byte #@loadI2B" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_BYTE); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Integer (32 bit signed) to Unsigned Byte (8 bit UNsigned) ++instruct loadI2UB(mRegI dst, memory mem, immI_255 mask) %{ ++ match(Set dst (AndI (LoadI mem) mask)); ++ ++ ins_cost(125); ++ format %{ "ld_bu $dst, $mem\t# int -> ubyte #@loadI2UB" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_BYTE); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Integer (32 bit signed) to Short (16 bit signed) ++instruct loadI2S(mRegI dst, memory mem, immI_16 sixteen) %{ ++ match(Set dst (RShiftI (LShiftI (LoadI mem) sixteen) sixteen)); ++ ++ ins_cost(125); ++ format %{ "ld_h $dst, $mem\t# int -> short #@loadI2S" %} ++ ins_encode %{ ++ __ 
loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_SHORT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Integer (32 bit signed) to Unsigned Short/Char (16 bit UNsigned) ++instruct loadI2US(mRegI dst, memory mem, immI_65535 mask) %{ ++ match(Set dst (AndI (LoadI mem) mask)); ++ ++ ins_cost(125); ++ format %{ "ld_hu $dst, $mem\t# int -> ushort/char #@loadI2US" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_SHORT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Long. ++instruct loadL(mRegL dst, memory mem) %{ ++// predicate(!((LoadLNode*)n)->require_atomic_access()); ++ match(Set dst (LoadL mem)); ++ ++ ins_cost(250); ++ format %{ "ld_d $dst, $mem #@loadL" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_LONG); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Long - UNaligned ++instruct loadL_unaligned(mRegL dst, memory mem) %{ ++ match(Set dst (LoadL_unaligned mem)); ++ ++ // FIXME: Need more effective ldl/ldr ++ ins_cost(450); ++ format %{ "ld_d $dst, $mem #@loadL_unaligned\n\t" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_LONG); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Store Long ++instruct storeL_reg(memory mem, mRegL src) %{ ++ match(Set mem (StoreL mem src)); ++ predicate(!needs_releasing_store(n)); ++ ++ ins_cost(200); ++ format %{ "st_d $mem, $src #@storeL_reg\n" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_LONG); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeL_reg_volatile(indirect mem, mRegL src) %{ ++ match(Set mem (StoreL mem src)); ++ ++ ins_cost(205); ++ format %{ "amswap_db_d R0, $src, $mem #@storeL_reg\n" %} ++ ins_encode %{ ++ __ amswap_db_d(R0, $src$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct storeL_immL_0(memory mem, immL_0 zero) %{ ++ match(Set mem (StoreL mem zero)); ++ predicate(!needs_releasing_store(n)); ++ ++ ins_cost(180); ++ format %{ "st_d zero, $mem #@storeL_immL_0" %} ++ ins_encode %{ ++ __ loadstore_enc(R0, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_LONG); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeL_immL_0_volatile(indirect mem, immL_0 zero) %{ ++ match(Set mem (StoreL mem zero)); ++ ++ ins_cost(185); ++ format %{ "amswap_db_d AT, R0, $mem #@storeL_immL_0" %} ++ ins_encode %{ ++ __ amswap_db_d(AT, R0, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++// Load Compressed Pointer ++instruct loadN(mRegN dst, memory mem) ++%{ ++ match(Set dst (LoadN mem)); ++ ++ ins_cost(125); ++ format %{ "ld_wu $dst, $mem\t# compressed ptr @ loadN" %} ++ ins_encode %{ ++ relocInfo::relocType disp_reloc = $mem->disp_reloc(); ++ assert(disp_reloc == relocInfo::none, "cannot have disp"); ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_INT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++instruct loadN2P(mRegP dst, memory mem) ++%{ ++ match(Set dst (DecodeN (LoadN mem))); ++ predicate(CompressedOops::base() == nullptr && CompressedOops::shift() == 0); ++ ++ ins_cost(125); ++ format %{ "ld_wu $dst, $mem\t# @ loadN2P" %} ++ ins_encode %{ ++ relocInfo::relocType disp_reloc = $mem->disp_reloc(); ++ 
assert(disp_reloc == relocInfo::none, "cannot have disp"); ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_INT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Pointer ++instruct loadP(mRegP dst, memory mem) %{ ++ match(Set dst (LoadP mem)); ++ predicate(n->as_Load()->barrier_data() == 0); ++ ++ ins_cost(125); ++ format %{ "ld_d $dst, $mem #@loadP" %} ++ ins_encode %{ ++ relocInfo::relocType disp_reloc = $mem->disp_reloc(); ++ assert(disp_reloc == relocInfo::none, "cannot have disp"); ++ __ block_comment("loadP"); ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_LONG); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Klass Pointer ++instruct loadKlass(mRegP dst, memory mem) %{ ++ match(Set dst (LoadKlass mem)); ++ ++ ins_cost(125); ++ format %{ "MOV $dst,$mem @ loadKlass" %} ++ ins_encode %{ ++ relocInfo::relocType disp_reloc = $mem->disp_reloc(); ++ assert(disp_reloc == relocInfo::none, "cannot have disp"); ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_LONG); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load narrow Klass Pointer ++instruct loadNKlass(mRegN dst, memory mem) ++%{ ++ match(Set dst (LoadNKlass mem)); ++ ++ ins_cost(125); ++ format %{ "ld_wu $dst, $mem\t# compressed klass ptr @ loadNKlass" %} ++ ins_encode %{ ++ relocInfo::relocType disp_reloc = $mem->disp_reloc(); ++ assert(disp_reloc == relocInfo::none, "cannot have disp"); ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_INT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++instruct loadN2PKlass(mRegP dst, memory mem) ++%{ ++ match(Set dst (DecodeNKlass (LoadNKlass mem))); ++ predicate(CompressedKlassPointers::base() == nullptr && CompressedKlassPointers::shift() == 0); ++ ++ ins_cost(125); ++ format %{ "ld_wu $dst, $mem\t# compressed klass ptr @ loadN2PKlass" %} ++ ins_encode %{ ++ relocInfo::relocType disp_reloc = $mem->disp_reloc(); ++ assert(disp_reloc == relocInfo::none, "cannot have disp"); ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_INT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Constant ++instruct loadConI(mRegI dst, immI src) %{ ++ match(Set dst src); ++ ++ ins_cost(120); ++ format %{ "mov $dst, $src #@loadConI" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ int value = $src$$constant; ++ __ li(dst, value); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++ ++instruct loadConL(mRegL dst, immL src) %{ ++ match(Set dst src); ++ ins_cost(120); ++ format %{ "li $dst, $src @ loadConL" %} ++ ins_encode %{ ++ __ li($dst$$Register, $src$$constant); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Load Range ++instruct loadRange(mRegI dst, memory_loadRange mem) %{ ++ match(Set dst (LoadRange mem)); ++ ++ ins_cost(125); ++ format %{ "MOV $dst,$mem @ loadRange" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_INT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++ ++instruct storeP(memory mem, mRegP src ) %{ ++ match(Set mem (StoreP mem src)); ++ predicate(!needs_releasing_store(n)); ++ ++ ins_cost(125); ++ format %{ "st_d $src, $mem #@storeP" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_LONG); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ 
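The storeP form above and the storeP_volatile form below show the pattern used throughout this section for plain vs. releasing stores: the plain instruct only matches when !needs_releasing_store(n) and emits an ordinary st_d, while the volatile variant emits amswap_db_d with R0 as its destination, discarding the old memory value and using the _db (barrier) form of the atomic swap to provide the ordering a releasing store needs; its slightly higher ins_cost keeps it from being chosen when the plain form also applies. A trivial, purely explanatory C++ sketch of that selection (storeP_form is not a function in the port; the real choice is made by the C2 matcher through the predicate and costs):

#include <string>

// Which of the two storeP instructs applies, and what it emits.
static std::string storeP_form(bool needs_releasing_store) {
  return needs_releasing_store
      ? "amswap_db_d R0, src, [base]   // releasing store, ins_cost 130"
      : "st_d        src, [mem]        // plain store,     ins_cost 125";
}

The same pairing, plus an immP_0/immN_0 zero-store special case, repeats below for storeN, storeNKlass and storeI, and appeared earlier for storeL.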
++instruct storeP_volatile(indirect mem, mRegP src ) %{ ++ match(Set mem (StoreP mem src)); ++ ++ ins_cost(130); ++ format %{ "amswap_db_d R0, $src, $mem #@storeP" %} ++ ins_encode %{ ++ __ amswap_db_d(R0, $src$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++// Store null Pointer, mark word, or other simple pointer constant. ++instruct storeImmP_immP_0(memory mem, immP_0 zero) %{ ++ match(Set mem (StoreP mem zero)); ++ predicate(!needs_releasing_store(n)); ++ ++ ins_cost(125); ++ format %{ "mov $mem, $zero #@storeImmP_0" %} ++ ins_encode %{ ++ __ loadstore_enc(R0, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_LONG); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeImmP_immP_0_volatile(indirect mem, immP_0 zero) %{ ++ match(Set mem (StoreP mem zero)); ++ ++ ins_cost(130); ++ format %{ "amswap_db_d AT, R0, $mem #@storeImmP_0" %} ++ ins_encode %{ ++ __ amswap_db_d(AT, R0, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++// Store Compressed Pointer ++instruct storeN(memory mem, mRegN src) ++%{ ++ match(Set mem (StoreN mem src)); ++ predicate(!needs_releasing_store(n)); ++ ++ ins_cost(125); ++ format %{ "st_w $mem, $src\t# compressed ptr @ storeN" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_INT); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeN_volatile(indirect mem, mRegN src) ++%{ ++ match(Set mem (StoreN mem src)); ++ ++ ins_cost(130); ++ format %{ "amswap_db_w R0, $src, $mem # compressed ptr @ storeN" %} ++ ins_encode %{ ++ __ amswap_db_w(R0, $src$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct storeP2N(memory mem, mRegP src) ++%{ ++ match(Set mem (StoreN mem (EncodeP src))); ++ predicate(CompressedOops::base() == nullptr && CompressedOops::shift() == 0 && !needs_releasing_store(n)); ++ ++ ins_cost(125); ++ format %{ "st_w $mem, $src\t# @ storeP2N" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_INT); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeP2N_volatile(indirect mem, mRegP src) ++%{ ++ match(Set mem (StoreN mem (EncodeP src))); ++ predicate(CompressedOops::base() == nullptr && CompressedOops::shift() == 0); ++ ++ ins_cost(130); ++ format %{ "amswap_db_w R0, $src, $mem # @ storeP2N" %} ++ ins_encode %{ ++ __ amswap_db_w(R0, $src$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct storeNKlass(memory mem, mRegN src) ++%{ ++ match(Set mem (StoreNKlass mem src)); ++ predicate(!needs_releasing_store(n)); ++ ++ ins_cost(125); ++ format %{ "st_w $mem, $src\t# compressed klass ptr @ storeNKlass" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_INT); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeNKlass_volatile(indirect mem, mRegN src) ++%{ ++ match(Set mem (StoreNKlass mem src)); ++ ++ ins_cost(130); ++ format %{ "amswap_db_w R0, $src, $mem # compressed klass ptr @ storeNKlass" %} ++ ins_encode %{ ++ __ amswap_db_w(R0, $src$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct storeP2NKlass(memory mem, mRegP src) ++%{ ++ match(Set mem (StoreNKlass mem (EncodePKlass src))); ++ predicate(CompressedKlassPointers::base() == nullptr && CompressedKlassPointers::shift() == 0 && !needs_releasing_store(n)); ++ ++ 
ins_cost(125); ++ format %{ "st_w $mem, $src\t# @ storeP2NKlass" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_INT); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeP2NKlass_volatile(indirect mem, mRegP src) ++%{ ++ match(Set mem (StoreNKlass mem (EncodePKlass src))); ++ predicate(CompressedKlassPointers::base() == nullptr && CompressedKlassPointers::shift() == 0); ++ ++ ins_cost(130); ++ format %{ "amswap_db_w R0, $src, $mem # @ storeP2NKlass" %} ++ ins_encode %{ ++ __ amswap_db_w(R0, $src$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct storeImmN_immN_0(memory mem, immN_0 zero) ++%{ ++ match(Set mem (StoreN mem zero)); ++ predicate(!needs_releasing_store(n)); ++ ++ ins_cost(125); ++ format %{ "storeN0 zero, $mem\t# compressed ptr" %} ++ ins_encode %{ ++ __ loadstore_enc(R0, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_INT); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeImmN_immN_0_volatile(indirect mem, immN_0 zero) ++%{ ++ match(Set mem (StoreN mem zero)); ++ ++ ins_cost(130); ++ format %{ "amswap_db_w AT, R0, $mem # compressed ptr" %} ++ ins_encode %{ ++ __ amswap_db_w(AT, R0, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++// Store Byte ++instruct storeB_immB_0(memory mem, immI_0 zero) %{ ++ match(Set mem (StoreB mem zero)); ++ ++ format %{ "mov $mem, zero #@storeB_immB_0" %} ++ ins_encode %{ ++ __ loadstore_enc(R0, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_BYTE); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeB(memory mem, mRegIorL2I src) %{ ++ match(Set mem (StoreB mem src)); ++ ++ ins_cost(125); ++ format %{ "st_b $src, $mem #@storeB" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_BYTE); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++// Load Byte (8bit signed) ++instruct loadB(mRegI dst, memory mem) %{ ++ match(Set dst (LoadB mem)); ++ ++ ins_cost(125); ++ format %{ "ld_b $dst, $mem #@loadB" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_BYTE); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++instruct loadB_convI2L(mRegL dst, memory mem) %{ ++ match(Set dst (ConvI2L (LoadB mem))); ++ ++ ins_cost(125); ++ format %{ "ld_b $dst, $mem #@loadB_convI2L" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_BYTE); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Byte (8bit UNsigned) ++instruct loadUB(mRegI dst, memory mem) %{ ++ match(Set dst (LoadUB mem)); ++ ++ ins_cost(125); ++ format %{ "ld_bu $dst, $mem #@loadUB" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_BYTE); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++instruct loadUB_convI2L(mRegL dst, memory mem) %{ ++ match(Set dst (ConvI2L (LoadUB mem))); ++ ++ ins_cost(125); ++ format %{ "ld_bu $dst, $mem #@loadUB_convI2L" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_BYTE); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Short (16bit signed) ++instruct loadS(mRegI dst, memory mem) %{ ++ match(Set dst (LoadS mem)); ++ ++ ins_cost(125); ++ format %{ "ld_h $dst, $mem #@loadS" %} ++ ins_encode %{ ++ __ 
loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_SHORT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Load Short (16 bit signed) to Byte (8 bit signed) ++instruct loadS2B(mRegI dst, memory mem, immI_24 twentyfour) %{ ++ match(Set dst (RShiftI (LShiftI (LoadS mem) twentyfour) twentyfour)); ++ ++ ins_cost(125); ++ format %{ "ld_b $dst, $mem\t# short -> byte #@loadS2B" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_BYTE); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++instruct loadS_convI2L(mRegL dst, memory mem) %{ ++ match(Set dst (ConvI2L (LoadS mem))); ++ ++ ins_cost(125); ++ format %{ "ld_h $dst, $mem #@loadS_convI2L" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_SHORT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Store Integer Immediate ++instruct storeI_immI_0(memory mem, immI_0 zero) %{ ++ match(Set mem (StoreI mem zero)); ++ predicate(!needs_releasing_store(n)); ++ ++ ins_cost(120); ++ format %{ "mov $mem, zero #@storeI_immI_0" %} ++ ins_encode %{ ++ __ loadstore_enc(R0, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_INT); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeI_immI_0_volatile(indirect mem, immI_0 zero) %{ ++ match(Set mem (StoreI mem zero)); ++ ++ ins_cost(125); ++ format %{ "amswap_db_w AT, R0, $mem #@storeI_immI_0" %} ++ ins_encode %{ ++ __ amswap_db_w(AT, R0, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++// Store Integer ++instruct storeI(memory mem, mRegIorL2I src) %{ ++ match(Set mem (StoreI mem src)); ++ predicate(!needs_releasing_store(n)); ++ ++ ins_cost(125); ++ format %{ "st_w $mem, $src #@storeI" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_INT); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeI_volatile(indirect mem, mRegIorL2I src) %{ ++ match(Set mem (StoreI mem src)); ++ ++ ins_cost(130); ++ format %{ "amswap_db_w R0, $src, $mem #@storeI" %} ++ ins_encode %{ ++ __ amswap_db_w(R0, $src$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++// Load Float ++instruct loadF(regF dst, memory mem) %{ ++ match(Set dst (LoadF mem)); ++ ++ ins_cost(150); ++ format %{ "loadF $dst, $mem #@loadF" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_FLOAT); ++ %} ++ ins_pipe( fpu_load ); ++%} ++ ++instruct loadConP_general(mRegP dst, immP src) %{ ++ match(Set dst src); ++ ++ ins_cost(120); ++ format %{ "li $dst, $src #@loadConP_general" %} ++ ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ long* value = (long*)$src$$constant; ++ ++ if ($src->constant_reloc() == relocInfo::metadata_type){ ++ __ mov_metadata(dst, (Metadata*)value); ++ } else if($src->constant_reloc() == relocInfo::oop_type){ ++ __ movoop(dst, (jobject)value); ++ } else if ($src->constant_reloc() == relocInfo::none) { ++ __ li(dst, (long)value); ++ } ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct loadConP_no_oop_cheap(mRegP dst, immP_no_oop_cheap src) %{ ++ match(Set dst src); ++ ++ ins_cost(80); ++ format %{ "li $dst, $src @ loadConP_no_oop_cheap" %} ++ ++ ins_encode %{ ++ if ($src->constant_reloc() == relocInfo::metadata_type) { ++ __ mov_metadata($dst$$Register, (Metadata*)$src$$constant); ++ } else { ++ __ 
li($dst$$Register, $src$$constant); ++ } ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct loadConP_immP_0(mRegP dst, immP_0 src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(50); ++ format %{ "mov $dst, R0\t# ptr" %} ++ ins_encode %{ ++ Register dst_reg = $dst$$Register; ++ __ add_d(dst_reg, R0, R0); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct loadConN_immN_0(mRegN dst, immN_0 src) %{ ++ match(Set dst src); ++ format %{ "move $dst, R0\t# compressed null ptr" %} ++ ins_encode %{ ++ __ move($dst$$Register, R0); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct loadConN(mRegN dst, immN src) %{ ++ match(Set dst src); ++ ++ ins_cost(125); ++ format %{ "li $dst, $src\t# compressed ptr @ loadConN" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ __ set_narrow_oop(dst, (jobject)$src$$constant); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct loadConNKlass(mRegN dst, immNKlass src) %{ ++ match(Set dst src); ++ ++ ins_cost(125); ++ format %{ "li $dst, $src\t# compressed klass ptr @ loadConNKlass" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ __ set_narrow_klass(dst, (Klass*)$src$$constant); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// Tail Call; Jump from runtime stub to Java code. ++// Also known as an 'interprocedural jump'. ++// Target of jump will eventually return to caller. ++// TailJump below removes the return address. ++instruct TailCalljmpInd(mRegP jump_target, s3_RegP method_ptr) %{ ++ match(TailCall jump_target method_ptr); ++ ++ format %{ "JMP $jump_target \t# @TailCalljmpInd" %} ++ ++ ins_encode %{ ++ __ jr($jump_target$$Register); ++ %} ++ ++ ins_pipe( pipe_jump ); ++%} ++ ++// Create exception oop: created by stack-crawling runtime code. ++// Created exception is now available to this handler, and is setup ++// just prior to jumping to this handler. No code emitted. ++instruct CreateException( a0_RegP ex_oop ) ++%{ ++ match(Set ex_oop (CreateEx)); ++ ++ // use the following format syntax ++ format %{ "# exception oop is in A0; no code emitted @CreateException" %} ++ ins_encode %{ ++ // X86 leaves this function empty ++ __ block_comment("CreateException is empty in LA"); ++ %} ++ ins_pipe( empty ); ++%} ++ ++ ++/* The mechanism of exception handling is clear now. ++ ++- Common try/catch: ++ [stubGenerator_loongarch.cpp] generate_forward_exception() ++ |- V0, V1 are created ++ |- T4 <= SharedRuntime::exception_handler_for_return_address ++ `- jr T4 ++ `- the caller's exception_handler ++ `- jr OptoRuntime::exception_blob ++ `- here ++- Rethrow(e.g. 'unwind'): ++ * The callee: ++ |- an exception is triggered during execution ++ `- exits the callee method through RethrowException node ++ |- The callee pushes exception_oop(T0) and exception_pc(RA) ++ `- The callee jumps to OptoRuntime::rethrow_stub() ++ * In OptoRuntime::rethrow_stub: ++ |- The VM calls _rethrow_Java to determine the return address in the caller method ++ `- exits the stub with tailjmpInd ++ |- pops exception_oop(V0) and exception_pc(V1) ++ `- jumps to the return address(usually an exception_handler) ++ * The caller: ++ `- continues processing the exception_blob with V0/V1 ++*/ ++ ++// Rethrow exception: ++// The exception oop will come in the first argument position. ++// Then JUMP (not call) to the rethrow stub code. 
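A compact restatement of the register protocol described in the comment above, purely explanatory (the struct and variable names below are not from the port): on entry to OptoRuntime::rethrow_stub() the callee has placed the exception oop in T0 and the exception pc in RA; after the stub's tail jump, the caller's exception blob finds them in V0 and V1.

// Explanatory only: the two points in the rethrow path and the registers
// holding the exception state at each, as described in the comment above.
struct ExceptionRegs { const char* exception_oop; const char* exception_pc; };
static const ExceptionRegs rethrow_stub_entry  = { "T0", "RA" };
static const ExceptionRegs exception_blob_view = { "V0", "V1" };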
++instruct RethrowException() ++%{ ++ match(Rethrow); ++ ++ // use the following format syntax ++ format %{ "JMP rethrow_stub #@RethrowException" %} ++ ins_encode %{ ++ __ block_comment("@ RethrowException"); ++ ++ cbuf.set_insts_mark(); ++ cbuf.relocate(cbuf.insts_mark(), runtime_call_Relocation::spec()); ++ ++ // call OptoRuntime::rethrow_stub to get the exception handler in parent method ++ __ patchable_jump((address)OptoRuntime::rethrow_stub()); ++ %} ++ ins_pipe( pipe_jump ); ++%} ++ ++// ============================================================================ ++// Branch Instructions --- long offset versions ++ ++// Jump Direct ++instruct jmpDir_long(label labl) %{ ++ match(Goto); ++ effect(USE labl); ++ ++ ins_cost(300); ++ format %{ "JMP $labl #@jmpDir_long" %} ++ ++ ins_encode %{ ++ Label* L = $labl$$label; ++ __ jmp_far(*L); ++ %} ++ ++ ins_pipe( pipe_jump ); ++ //ins_pc_relative(1); ++%} ++ ++// Jump Direct Conditional - Label defines a relative address from Jcc+1 ++instruct CountedLoopEnd_reg_reg_long(cmpOp cop, mRegI src1, mRegI src2, label labl) %{ ++ match(CountedLoopEnd cop (CmpI src1 src2)); ++ effect(USE labl); ++ ++ ins_cost(300); ++ format %{ "J$cop $src1, $src2, $labl\t# Loop end @ CountedLoopEnd_reg_reg_long" %} ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Label* L = $labl$$label; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, op2, L, true /* signed */); ++ %} ++ ins_pipe( pipe_jump ); ++ ins_pc_relative(1); ++%} ++ ++// Note: LA does not have a branching instruction with an immediate number ++// The purpose of retaining CountedLoopEnd_reg_imm12_short/long and branchConIU_reg_imm12_short/long ++// is to reduce the long lifecycle of shared nodes. ++// see #29437 ++instruct CountedLoopEnd_reg_imm12_short(cmpOp cop, mRegI src1, immI12 imm, label labl) %{ ++ match(CountedLoopEnd cop (CmpI src1 imm)); ++ effect(USE labl); ++ ++ ins_cost(300); ++ format %{ "J$cop $src1, $imm, $labl\t# Loop end @ CountedLoopEnd_reg_imm12_short" %} ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cop$$cmpcode; ++ ++ __ addi_d(AT, R0, $imm$$constant); ++ __ cmp_branch_short(flag, op1, AT, L, true /* signed */); ++ %} ++ ins_pipe( pipe_jump ); ++ ins_pc_relative(1); ++ ins_short_branch(1); ++%} ++ ++instruct CountedLoopEnd_reg_imm12_long(cmpOp cop, mRegI src1, immI12 imm, label labl) %{ ++ match(CountedLoopEnd cop (CmpI src1 imm)); ++ effect(USE labl); ++ ++ ins_cost(300); ++ format %{ "J$cop $src1, $imm, $labl\t# Loop end @ CountedLoopEnd_reg_imm12_long" %} ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Label* L = $labl$$label; ++ int flag = $cop$$cmpcode; ++ ++ __ addi_d(AT, R0, $imm$$constant); ++ __ cmp_branch_long(flag, op1, AT, L, true /* signed */); ++ %} ++ ins_pipe( pipe_jump ); ++ ins_pc_relative(1); ++%} ++ ++instruct CountedLoopEnd_reg_zero_long(cmpOp cop, mRegI src1, immI_0 zero, label labl) %{ ++ match(CountedLoopEnd cop (CmpI src1 zero)); ++ effect(USE labl); ++ ++ ins_cost(300); ++ format %{ "J$cop $src1, $zero, $labl\t# Loop end @ CountedLoopEnd_reg_zero_long" %} ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Label* L = $labl$$label; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, R0, L, true /* signed */); ++ %} ++ ins_pipe( pipe_jump ); ++ ins_pc_relative(1); ++%} ++ ++ ++instruct jmpCon_flags_long(cmpOpEqNe cop, FlagsReg cr, label labl) %{ ++ match(If cop cr); ++ effect(USE labl); ++ ++ ins_cost(300); ++ format %{ "J$cop $labl 
#LoongArch uses T0 as equivalent to eflag @jmpCon_flags_long" %} ++ ++ ins_encode %{ ++ Label* L = $labl$$label; ++ Label not_taken; ++ switch($cop$$cmpcode) { ++ case 0x01: //equal ++ __ bne($cr$$Register, R0, not_taken); ++ break; ++ case 0x02: //not equal ++ __ beq($cr$$Register, R0, not_taken); ++ break; ++ default: ++ Unimplemented(); ++ } ++ __ jmp_far(*L); ++ __ bind(not_taken); ++ %} ++ ++ ins_pipe( pipe_jump ); ++ ins_pc_relative(1); ++%} ++ ++// Conditional jumps ++instruct branchConP_0_long(cmpOpEqNe cmp, mRegP op1, immP_0 zero, label labl) %{ ++ match(If cmp (CmpP op1 zero)); ++ effect(USE labl); ++ ++ ins_cost(180); ++ format %{ "b$cmp $op1, R0, $labl #@branchConP_0_long" %} ++ ++ ins_encode %{ ++ Register op1 = $op1$$Register; ++ Label* L = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, R0, L, true /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct branchConN2P_0_long(cmpOpEqNe cmp, mRegN op1, immP_0 zero, label labl) %{ ++ match(If cmp (CmpP (DecodeN op1) zero)); ++ predicate(CompressedOops::base() == nullptr); ++ effect(USE labl); ++ ++ ins_cost(180); ++ format %{ "b$cmp $op1, R0, $labl #@branchConN2P_0_long" %} ++ ++ ins_encode %{ ++ Register op1 = $op1$$Register; ++ Label* L = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, R0, L, true /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++ ++instruct branchConP_long(cmpOp cmp, mRegP op1, mRegP op2, label labl) %{ ++ match(If cmp (CmpP op1 op2)); ++// predicate(can_branch_register(_kids[0]->_leaf, _kids[1]->_leaf)); ++ effect(USE labl); ++ ++ ins_cost(200); ++ format %{ "b$cmp $op1, $op2, $labl #@branchConP_long" %} ++ ++ ins_encode %{ ++ Register op1 = $op1$$Register; ++ Register op2 = $op2$$Register; ++ Label* L = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, op2, L, false /* unsigned */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct cmpN_null_branch_long(cmpOpEqNe cmp, mRegN op1, immN_0 null, label labl) %{ ++ match(If cmp (CmpN op1 null)); ++ effect(USE labl); ++ ++ ins_cost(180); ++ format %{ "CMP $op1,0\t! compressed ptr\n\t" ++ "BP$cmp $labl @ cmpN_null_branch_long" %} ++ ins_encode %{ ++ Register op1 = $op1$$Register; ++ Label* L = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, R0, L, true /* signed */); ++ %} ++//TODO: pipe_branchP or create pipe_branchN LEE ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct cmpN_reg_branch_long(cmpOp cmp, mRegN op1, mRegN op2, label labl) %{ ++ match(If cmp (CmpN op1 op2)); ++ effect(USE labl); ++ ++ ins_cost(180); ++ format %{ "CMP $op1,$op2\t! 
compressed ptr\n\t" ++ "BP$cmp $labl @ cmpN_reg_branch_long" %} ++ ins_encode %{ ++ Register op1 = $op1$$Register; ++ Register op2 = $op2$$Register; ++ Label* L = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, op2, L, false /* unsigned */); ++ %} ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct branchConIU_reg_imm12_short(cmpOp cmp, mRegI src1, immI12 imm, label labl) %{ ++ match( If cmp (CmpU src1 imm) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $imm, $labl #@branchConIU_reg_imm12_short" %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ int imm = $imm$$constant; ++ __ addi_d(AT, R0, imm); ++ __ cmp_branch_short(flag, op1, AT, L, false /* unsigned*/); ++ %} ++ ++ ins_short_branch(1); ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct branchConIU_reg_imm12_long(cmpOp cmp, mRegI src1, immI12 src2, label labl) %{ ++ match( If cmp (CmpU src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConIU_reg_imm12_long" %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Label* L = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ int imm = $src2$$constant; ++ __ addi_d(AT, R0, imm); ++ __ cmp_branch_long(flag, op1, AT, L, false /* unsigned*/); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct branchConIU_reg_reg_long(cmpOp cmp, mRegI src1, mRegI src2, label labl) %{ ++ match( If cmp (CmpU src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConIU_reg_reg_long" %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Label* L = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, op2, L, false /* unsigned */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++ ++instruct branchConIU_reg_zero_long(cmpOp cmp, mRegI src1, immI_0 zero, label labl) %{ ++ match( If cmp (CmpU src1 zero) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $zero, $labl #@branchConIU_reg_zero_long" %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Label* L = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, R0, L, false /* unsigned */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct branchConI_reg_reg_long(cmpOp cmp, mRegI src1, mRegI src2, label labl) %{ ++ match( If cmp (CmpI src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConI_reg_reg_long" %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Label* L = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, op2, L, true /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct branchConI_reg_zero_long(cmpOp cmp, mRegI src1, immI_0 zero, label labl) %{ ++ match( If cmp (CmpI src1 zero) ); ++ effect(USE labl); ++ ins_cost(200); ++ format %{ "BR$cmp $src1, $zero, $labl #@branchConI_reg_zero_long" %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Label* L = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, R0, L, true /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct branchConL_regL_regL_long(cmpOp cmp, mRegLorI2L src1, mRegLorI2L src2, label labl) %{ ++ match( If cmp (CmpL src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, 
$src2, $labl #@branchConL_regL_regL_long" %} ++ ins_cost(250); ++ ++ ins_encode %{ ++ Register op1 = as_Register($src1$$reg); ++ Register op2 = as_Register($src2$$reg); ++ ++ Label* target = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, op2, target, true /* signed */); ++ %} ++ ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct branchConUL_regL_regL_long(cmpOp cmp, mRegLorI2L src1, mRegLorI2L src2, label labl) %{ ++ match(If cmp (CmpUL src1 src2)); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConUL_regL_regL_long" %} ++ ins_cost(250); ++ ++ ins_encode %{ ++ Register op1 = as_Register($src1$$reg); ++ Register op2 = as_Register($src2$$reg); ++ Label* target = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, op2, target, false /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct branchConL_regL_zero_long(cmpOp cmp, mRegL src1, immL_0 zero, label labl) %{ ++ match( If cmp (CmpL src1 zero) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $zero, $labl #@branchConL_regL_immL_long" %} ++ ins_cost(180); ++ ++ ins_encode %{ ++ Register op1 = as_Register($src1$$reg); ++ Label* target = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, R0, target, true /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++instruct branchConUL_regL_zero_long(cmpOp cmp, mRegL src1, immL_0 zero, label labl) %{ ++ match(If cmp (CmpUL src1 zero)); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $zero, $labl #@branchConUL_regL_immL_long" %} ++ ins_cost(180); ++ ++ ins_encode %{ ++ Register op1 = as_Register($src1$$reg); ++ Label* target = $labl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_long(flag, op1, R0, target, false /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++%} ++ ++//FIXME ++instruct branchConF_reg_reg_long(cmpOp cmp, regF src1, regF src2, label labl) %{ ++ match( If cmp (CmpF src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConF_reg_reg_long" %} ++ ++ ins_encode %{ ++ FloatRegister reg_op1 = $src1$$FloatRegister; ++ FloatRegister reg_op2 = $src2$$FloatRegister; ++ Label* L = $labl$$label; ++ Label not_taken; ++ int flag = $cmp$$cmpcode; ++ ++ switch(flag) { ++ case 0x01: //equal ++ __ fcmp_ceq_s(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, not_taken); ++ break; ++ case 0x02: //not_equal ++ __ fcmp_ceq_s(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, not_taken); ++ break; ++ case 0x03: //greater ++ __ fcmp_cule_s(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, not_taken); ++ break; ++ case 0x04: //greater_equal ++ __ fcmp_cult_s(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, not_taken); ++ break; ++ case 0x05: //less ++ __ fcmp_cult_s(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, not_taken); ++ break; ++ case 0x06: //less_equal ++ __ fcmp_cule_s(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, not_taken); ++ break; ++ default: ++ Unimplemented(); ++ } ++ __ jmp_far(*L); ++ __ bind(not_taken); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_fpu_branch ); ++%} ++ ++instruct branchConD_reg_reg_long(cmpOp cmp, regD src1, regD src2, label labl) %{ ++ match( If cmp (CmpD src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConD_reg_reg_long" %} ++ ++ ins_encode %{ ++ FloatRegister reg_op1 = $src1$$FloatRegister; ++ FloatRegister reg_op2 = $src2$$FloatRegister; ++ Label* L = $labl$$label; ++ Label not_taken; 
++ int flag = $cmp$$cmpcode; ++ ++ switch(flag) { ++ case 0x01: //equal ++ __ fcmp_ceq_d(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, not_taken); ++ break; ++ case 0x02: //not_equal ++ // c_ueq_d cannot distinguish NaN from equal. Double.isNaN(Double) is implemented by 'f != f', so the use of c_ueq_d causes bugs. ++ __ fcmp_ceq_d(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, not_taken); ++ break; ++ case 0x03: //greater ++ __ fcmp_cule_d(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, not_taken); ++ break; ++ case 0x04: //greater_equal ++ __ fcmp_cult_d(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, not_taken); ++ break; ++ case 0x05: //less ++ __ fcmp_cult_d(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, not_taken); ++ break; ++ case 0x06: //less_equal ++ __ fcmp_cule_d(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, not_taken); ++ break; ++ default: ++ Unimplemented(); ++ } ++ __ jmp_far(*L); ++ __ bind(not_taken); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_fpu_branch ); ++%} ++ ++ ++// ============================================================================ ++// Branch Instructions -- short offset versions ++ ++// Jump Direct ++instruct jmpDir_short(label labl) %{ ++ match(Goto); ++ effect(USE labl); ++ ++ ins_cost(300); ++ format %{ "JMP $labl #@jmpDir_short" %} ++ ++ ins_encode %{ ++ Label &L = *($labl$$label); ++ __ b(L); ++ %} ++ ++ ins_pipe( pipe_jump ); ++ ins_pc_relative(1); ++ ins_short_branch(1); ++%} ++ ++// Jump Direct Conditional - Label defines a relative address from Jcc+1 ++instruct CountedLoopEnd_reg_reg_short(cmpOp cop, mRegI src1, mRegI src2, label labl) %{ ++ match(CountedLoopEnd cop (CmpI src1 src2)); ++ effect(USE labl); ++ ++ ins_cost(300); ++ format %{ "J$cop $src1, $src2, $labl\t# Loop end @ CountedLoopEnd_reg_reg_short" %} ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, op2, L, true /* signed */); ++ %} ++ ins_pipe( pipe_jump ); ++ ins_pc_relative(1); ++ ins_short_branch(1); ++%} ++ ++instruct CountedLoopEnd_reg_zero_short(cmpOp cop, mRegI src1, immI_0 zero, label labl) %{ ++ match(CountedLoopEnd cop (CmpI src1 zero)); ++ effect(USE labl); ++ ++ ins_cost(300); ++ format %{ "J$cop $src1, $zero, $labl\t# Loop end @ CountedLoopEnd_reg_zero_short" %} ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, R0, L, true /* signed */); ++ %} ++ ins_pipe( pipe_jump ); ++ ins_pc_relative(1); ++ ins_short_branch(1); ++%} ++ ++ ++instruct jmpCon_flags_short(cmpOpEqNe cop, FlagsReg cr, label labl) %{ ++ match(If cop cr); ++ effect(USE labl); ++ ++ ins_cost(300); ++ format %{ "J$cop $labl #LoongArch uses T0 as equivalent to eflag @jmpCon_flags_short" %} ++ ++ ins_encode %{ ++ Label &L = *($labl$$label); ++ switch($cop$$cmpcode) { ++ case 0x01: //equal ++ __ bnez($cr$$Register, L); ++ break; ++ case 0x02: //not equal ++ __ beqz($cr$$Register, L); ++ break; ++ default: ++ Unimplemented(); ++ } ++ %} ++ ++ ins_pipe( pipe_jump ); ++ ins_pc_relative(1); ++ ins_short_branch(1); ++%} ++ ++// Conditional jumps ++instruct branchConP_0_short(cmpOpEqNe cmp, mRegP op1, immP_0 zero, label labl) %{ ++ match(If cmp (CmpP op1 zero)); ++ effect(USE labl); ++ ++ ins_cost(180); ++ format %{ "b$cmp $op1, R0, $labl #@branchConP_0_short" %} ++ ++ ins_encode %{ ++ Register op1 = $op1$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ 
cmp_branchEqNe_off21(flag, op1, L); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct branchConN2P_0_short(cmpOpEqNe cmp, mRegN op1, immP_0 zero, label labl) %{ ++ match(If cmp (CmpP (DecodeN op1) zero)); ++ predicate(CompressedOops::base() == nullptr); ++ effect(USE labl); ++ ++ ins_cost(180); ++ format %{ "b$cmp $op1, R0, $labl #@branchConN2P_0_short" %} ++ ++ ins_encode %{ ++ Register op1 = $op1$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branchEqNe_off21(flag, op1, L); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++ ++instruct branchConP_short(cmpOp cmp, mRegP op1, mRegP op2, label labl) %{ ++ match(If cmp (CmpP op1 op2)); ++// predicate(can_branch_register(_kids[0]->_leaf, _kids[1]->_leaf)); ++ effect(USE labl); ++ ++ ins_cost(200); ++ format %{ "b$cmp $op1, $op2, $labl #@branchConP_short" %} ++ ++ ins_encode %{ ++ Register op1 = $op1$$Register; ++ Register op2 = $op2$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, op2, L, false /* unsigned */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct cmpN_null_branch_short(cmpOp cmp, mRegN op1, immN_0 null, label labl) %{ ++ match(If cmp (CmpN op1 null)); ++ effect(USE labl); ++ ++ ins_cost(180); ++ format %{ "CMP $op1,0\t! compressed ptr\n\t" ++ "BP$cmp $labl @ cmpN_null_branch_short" %} ++ ins_encode %{ ++ Register op1 = $op1$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branchEqNe_off21(flag, op1, L); ++ %} ++//TODO: pipe_branchP or create pipe_branchN LEE ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct cmpN_reg_branch_short(cmpOp cmp, mRegN op1, mRegN op2, label labl) %{ ++ match(If cmp (CmpN op1 op2)); ++ effect(USE labl); ++ ++ ins_cost(180); ++ format %{ "CMP $op1,$op2\t! 
compressed ptr\n\t" ++ "BP$cmp $labl @ cmpN_reg_branch_short" %} ++ ins_encode %{ ++ Register op1 = $op1$$Register; ++ Register op2 = $op2$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, op2, L, false /* unsigned */); ++ %} ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct branchConIU_reg_reg_short(cmpOp cmp, mRegI src1, mRegI src2, label labl) %{ ++ match( If cmp (CmpU src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConIU_reg_reg_short" %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, op2, L, false /* unsigned */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++ ++instruct branchConIU_reg_zero_short(cmpOp cmp, mRegI src1, immI_0 zero, label labl) %{ ++ match( If cmp (CmpU src1 zero) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $zero, $labl #@branchConIU_reg_imm_short" %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, R0, L, false /* unsigned */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct branchConI_reg_reg_short(cmpOp cmp, mRegI src1, mRegI src2, label labl) %{ ++ match( If cmp (CmpI src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConI_reg_reg_short" %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, op2, L, true /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct branchConI_reg_zero_short(cmpOp cmp, mRegI src1, immI_0 zero, label labl) %{ ++ match( If cmp (CmpI src1 zero) ); ++ effect(USE labl); ++ ins_cost(200); ++ format %{ "BR$cmp $src1, $zero, $labl #@branchConI_reg_imm_short" %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, R0, L, true /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct branchConL_regL_regL_short(cmpOp cmp, mRegLorI2L src1, mRegLorI2L src2, label labl) %{ ++ match( If cmp (CmpL src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConL_regL_regL_short" %} ++ ins_cost(250); ++ ++ ins_encode %{ ++ Register op1 = as_Register($src1$$reg); ++ Register op2 = as_Register($src2$$reg); ++ ++ Label &target = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, op2, target, true /* signed */); ++ %} ++ ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct branchConUL_regL_regL_short(cmpOp cmp, mRegLorI2L src1, mRegLorI2L src2, label labl) %{ ++ match(If cmp (CmpUL src1 src2)); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConUL_regL_regL_short" %} ++ ins_cost(250); ++ ++ ins_encode %{ ++ Register op1 = as_Register($src1$$reg); ++ Register op2 = as_Register($src2$$reg); ++ Label& target = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, op2, target, false /* signed */); ++ %} ++ ++ 
ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct branchConL_regL_zero_short(cmpOp cmp, mRegL src1, immL_0 zero, label labl) %{ ++ match( If cmp (CmpL src1 zero) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $zero, $labl #@branchConL_regL_immL_short" %} ++ ins_cost(180); ++ ++ ins_encode %{ ++ Register op1 = as_Register($src1$$reg); ++ Label &target = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, R0, target, true /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct branchConUL_regL_zero_short(cmpOp cmp, mRegL src1, immL_0 zero, label labl) %{ ++ match(If cmp (CmpUL src1 zero)); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $zero, $labl #@branchConUL_regL_immL_short" %} ++ ins_cost(180); ++ ++ ins_encode %{ ++ Register op1 = as_Register($src1$$reg); ++ Label& target = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ __ cmp_branch_short(flag, op1, R0, target, false /* signed */); ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_alu_branch ); ++ ins_short_branch(1); ++%} ++ ++//FIXME ++instruct branchConF_reg_reg_short(cmpOp cmp, regF src1, regF src2, label labl) %{ ++ match( If cmp (CmpF src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConF_reg_reg_short" %} ++ ++ ins_encode %{ ++ FloatRegister reg_op1 = $src1$$FloatRegister; ++ FloatRegister reg_op2 = $src2$$FloatRegister; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ switch(flag) { ++ case 0x01: //equal ++ __ fcmp_ceq_s(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, L); ++ break; ++ case 0x02: //not_equal ++ __ fcmp_ceq_s(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, L); ++ break; ++ case 0x03: //greater ++ __ fcmp_cule_s(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, L); ++ break; ++ case 0x04: //greater_equal ++ __ fcmp_cult_s(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, L); ++ break; ++ case 0x05: //less ++ __ fcmp_cult_s(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, L); ++ break; ++ case 0x06: //less_equal ++ __ fcmp_cule_s(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, L); ++ break; ++ default: ++ Unimplemented(); ++ } ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_fpu_branch ); ++ ins_short_branch(1); ++%} ++ ++instruct branchConD_reg_reg_short(cmpOp cmp, regD src1, regD src2, label labl) %{ ++ match( If cmp (CmpD src1 src2) ); ++ effect(USE labl); ++ format %{ "BR$cmp $src1, $src2, $labl #@branchConD_reg_reg_short" %} ++ ++ ins_encode %{ ++ FloatRegister reg_op1 = $src1$$FloatRegister; ++ FloatRegister reg_op2 = $src2$$FloatRegister; ++ Label &L = *($labl$$label); ++ int flag = $cmp$$cmpcode; ++ ++ switch(flag) { ++ case 0x01: //equal ++ __ fcmp_ceq_d(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, L); ++ break; ++ case 0x02: //not_equal ++ // c_ueq_d cannot distinguish NaN from equal. Double.isNaN(Double) is implemented by 'f != f', so the use of c_ueq_d causes bugs. 
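++      // Example: for x = NaN, 'x != x' must evaluate to true. c_ueq_d is true
++      // for unordered operands, so it would treat NaN as "equal" here; c_eq_d
++      // is false for NaN, so bceqz on the c_eq_d result takes the not-equal
++      // branch as required.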
++ __ fcmp_ceq_d(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, L); ++ break; ++ case 0x03: //greater ++ __ fcmp_cule_d(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, L); ++ break; ++ case 0x04: //greater_equal ++ __ fcmp_cult_d(FCC0, reg_op1, reg_op2); ++ __ bceqz(FCC0, L); ++ break; ++ case 0x05: //less ++ __ fcmp_cult_d(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, L); ++ break; ++ case 0x06: //less_equal ++ __ fcmp_cule_d(FCC0, reg_op1, reg_op2); ++ __ bcnez(FCC0, L); ++ break; ++ default: ++ Unimplemented(); ++ } ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_fpu_branch ); ++ ins_short_branch(1); ++%} ++ ++// =================== End of branch instructions ========================== ++ ++// Call Runtime Instruction ++instruct CallRuntimeDirect(method meth) %{ ++ match(CallRuntime ); ++ effect(USE meth); ++ ++ ins_cost(300); ++ format %{ "CALL,runtime #@CallRuntimeDirect" %} ++ ins_encode( Java_To_Runtime( meth ) ); ++ ins_pipe( pipe_slow ); ++ ins_alignment(4); ++%} ++ ++ ++ ++//------------------------MemBar Instructions------------------------------- ++//Memory barrier flavors ++ ++instruct unnecessary_membar_acquire() %{ ++ predicate(unnecessary_acquire(n)); ++ match(MemBarAcquire); ++ ins_cost(0); ++ ++ format %{ "membar_acquire (elided)" %} ++ ++ ins_encode %{ ++ __ block_comment("membar_acquire (elided)"); ++ %} ++ ++ ins_pipe( empty ); ++%} ++ ++instruct membar_acquire() %{ ++ match(MemBarAcquire); ++ ins_cost(400); ++ ++ format %{ "MEMBAR-acquire @ membar_acquire" %} ++ ins_encode %{ ++ __ membar(Assembler::Membar_mask_bits(__ LoadLoad|__ LoadStore)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct load_fence() %{ ++ match(LoadFence); ++ ins_cost(400); ++ ++ format %{ "MEMBAR @ load_fence" %} ++ ins_encode %{ ++ __ membar(Assembler::Membar_mask_bits(__ LoadLoad|__ LoadStore)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct membar_acquire_lock() ++%{ ++ match(MemBarAcquireLock); ++ ins_cost(0); ++ ++ size(0); ++ format %{ "MEMBAR-acquire (acquire as part of CAS in prior FastLock so empty encoding) @ membar_acquire_lock" %} ++ ins_encode(); ++ ins_pipe( empty ); ++%} ++ ++instruct unnecessary_membar_release() %{ ++ predicate(unnecessary_release(n)); ++ match(MemBarRelease); ++ ins_cost(0); ++ ++ format %{ "membar_release (elided)" %} ++ ++ ins_encode %{ ++ __ block_comment("membar_release (elided)"); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct membar_release() %{ ++ match(MemBarRelease); ++ ins_cost(400); ++ ++ format %{ "MEMBAR-release @ membar_release" %} ++ ++ ins_encode %{ ++ // Attention: DO NOT DELETE THIS GUY! 
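++    // Release ordering: earlier loads and stores must not be reordered past
++    // the following store, hence the LoadStore|StoreStore mask below.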
++ __ membar(Assembler::Membar_mask_bits(__ LoadStore|__ StoreStore)); ++ %} ++ ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct store_fence() %{ ++ match(StoreFence); ++ ins_cost(400); ++ ++ format %{ "MEMBAR @ store_fence" %} ++ ++ ins_encode %{ ++ __ membar(Assembler::Membar_mask_bits(__ LoadStore|__ StoreStore)); ++ %} ++ ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct membar_release_lock() ++%{ ++ match(MemBarReleaseLock); ++ ins_cost(0); ++ ++ size(0); ++ format %{ "MEMBAR-release-lock (release in FastUnlock so empty) @ membar_release_lock" %} ++ ins_encode(); ++ ins_pipe( empty ); ++%} ++ ++instruct unnecessary_membar_volatile() %{ ++ predicate(unnecessary_volatile(n)); ++ match(MemBarVolatile); ++ ins_cost(0); ++ ++ format %{ "membar_volatile (elided)" %} ++ ++ ins_encode %{ ++ __ block_comment("membar_volatile (elided)"); ++ %} ++ ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct membar_volatile() %{ ++ match(MemBarVolatile); ++ ins_cost(400); ++ ++ format %{ "MEMBAR-volatile" %} ++ ins_encode %{ ++ if( !os::is_MP() ) return; // Not needed on single CPU ++ __ membar(__ StoreLoad); ++ ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct membar_storestore() %{ ++ match(MemBarStoreStore); ++ match(StoreStoreFence); ++ ++ ins_cost(400); ++ format %{ "MEMBAR-storestore @ membar_storestore" %} ++ ins_encode %{ ++ __ membar(__ StoreStore); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct same_addr_load_fence() %{ ++ match(SameAddrLoadFence); ++ ins_cost(400); ++ ++ format %{ "MEMBAR @ same_addr_load_fence" %} ++ ins_encode %{ ++ __ dbar(0x700); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++//----------Move Instructions-------------------------------------------------- ++instruct castX2P(mRegP dst, mRegL src) %{ ++ match(Set dst (CastX2P src)); ++ format %{ "castX2P $dst, $src @ castX2P" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ ++ if(src != dst) ++ __ move(dst, src); ++ %} ++ ins_cost(10); ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct castP2X(mRegL dst, mRegP src ) %{ ++ match(Set dst (CastP2X src)); ++ ++ format %{ "mov $dst, $src\t #@castP2X" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ ++ if(src != dst) ++ __ move(dst, src); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct MoveF2I_reg_reg(mRegI dst, regF src) %{ ++ match(Set dst (MoveF2I src)); ++ effect(DEF dst, USE src); ++ ins_cost(85); ++ format %{ "MoveF2I $dst, $src @ MoveF2I_reg_reg" %} ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ FloatRegister src = as_FloatRegister($src$$reg); ++ ++ __ movfr2gr_s(dst, src); ++ %} ++ ins_pipe( fpu_movgrfr ); ++%} ++ ++instruct MoveI2F_reg_reg(regF dst, mRegI src) %{ ++ match(Set dst (MoveI2F src)); ++ effect(DEF dst, USE src); ++ ins_cost(85); ++ format %{ "MoveI2F $dst, $src @ MoveI2F_reg_reg" %} ++ ins_encode %{ ++ Register src = as_Register($src$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ movgr2fr_w(dst, src); ++ %} ++ ins_pipe( fpu_movgrfr ); ++%} ++ ++instruct MoveD2L_reg_reg(mRegL dst, regD src) %{ ++ match(Set dst (MoveD2L src)); ++ effect(DEF dst, USE src); ++ ins_cost(85); ++ format %{ "MoveD2L $dst, $src @ MoveD2L_reg_reg" %} ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ FloatRegister src = as_FloatRegister($src$$reg); ++ ++ __ movfr2gr_d(dst, src); ++ %} ++ ins_pipe( fpu_movgrfr ); ++%} ++ ++instruct MoveL2D_reg_reg(regD dst, mRegL src) %{ ++ match(Set dst (MoveL2D src)); ++ effect(DEF dst, USE src); ++ ins_cost(85); ++ format %{ 
"MoveL2D $dst, $src @ MoveL2D_reg_reg" %} ++ ins_encode %{ ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ Register src = as_Register($src$$reg); ++ ++ __ movgr2fr_d(dst, src); ++ %} ++ ins_pipe( fpu_movgrfr ); ++%} ++ ++//----------Conditional Move--------------------------------------------------- ++// Conditional move ++instruct cmovI_cmpI_reg_reg(mRegI dst, mRegI src1, mRegI src2, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpI src1 src2)) (Binary src1 src2))); ++ ins_cost(50); ++ format %{ ++ "CMP$cop $src1, $src2\t @cmovI_cmpI_reg_reg\n" ++ "\tCMOV $dst,$src1, $src2 \t @cmovI_cmpI_reg_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, op1, op2, (MacroAssembler::CMCompare) flag, true); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpI_reg_zero(mRegI dst, mRegI src1, immI_0 zero, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpI tmp1 tmp2)) (Binary src1 zero))); ++ ins_cost(20); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovI_cmpI_reg_zero\n" ++ "\tCMOV $dst,$src1, $zero \t @cmovI_cmpI_reg_zero" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, src1, (MacroAssembler::CMCompare) flag, true); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpI_reg_reg2(mRegI dst, mRegI src1, mRegI src2, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpI src1 src2)) (Binary src2 src1))); ++ ins_cost(50); ++ format %{ ++ "CMP$cop $src1, $src2\t @cmovI_cmpI_reg_reg2\n" ++ "\tCMOV $dst,$src2, $src1 \t @cmovI_cmpI_reg_reg2" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, op2, op1, (MacroAssembler::CMCompare) flag, true); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpI_zero_reg(mRegI dst, mRegI src1, mRegI src2, mRegI tmp1, immI_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpI tmp1 zero)) (Binary src1 src2))); ++ ins_cost(20); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovI_cmpI_zero_reg\n" ++ "\tCMOV $dst,$src1, $src2 \t @cmovI_cmpI_zero_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, true); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpI_zero_zero(mRegI dst, mRegI src1, immI_0 zero, mRegI tmp1, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpI tmp1 zero)) (Binary src1 zero))); ++ ins_cost(20); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovI_cmpI_zero_zero\n" ++ "\tCMOV $dst,$src1, $zero \t @cmovI_cmpI_zero_zero" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, true); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpI_dst_reg(mRegI dst, mRegI src, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpI tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ 
format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovI_cmpI_dst_reg\n" ++ "\tCMOV $dst,$src \t @cmovI_cmpI_dst_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpP_zero_reg(mRegI dst, mRegI src1, mRegI src2, mRegP tmp1, immP_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpP tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zero\t @cmovI_cmpP_zero_reg\n\t" ++ "CMOV $dst,$src1, $src2\t @cmovI_cmpP_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpP_zero_zero(mRegI dst, mRegI src1, immI_0 zeroI, mRegP tmp1, immP_0 zeroP, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpP tmp1 zeroP)) (Binary src1 zeroI))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zeroP\t @cmovI_cmpP_zero_zero\n\t" ++ "CMOV $dst,$zeroI\t @cmovI_cmpP_zero_zero" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register src1 = $src1$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpP_reg_reg(mRegI dst, mRegI src, mRegP tmp1, mRegP tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpP tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovI_cmpP_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovI_cmpP_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpN_zero_reg(mRegI dst, mRegI src1, mRegI src2, mRegN tmp1, immN_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpN tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zero\t @cmovI_cmpN_zero_reg\n\t" ++ "CMOV $dst,$src1, $src2\t @cmovI_cmpN_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpN_zero_zero(mRegI dst, mRegI src1, immI_0 zeroI, mRegN tmp1, immN_0 zeroN, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpN tmp1 zeroN)) (Binary src1 zeroI))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zeroN\t @cmovI_cmpN_zero_zero\n\t" ++ "CMOV $dst,$zeroI\t @cmovI_cmpN_zero_zero" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register src1 = $src1$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ 
%} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpN_reg_reg(mRegI dst, mRegI src, mRegN tmp1, mRegN tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpN tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovI_cmpN_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovI_cmpN_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpU_zero_reg(mRegP dst, mRegP src1, mRegP src2, mRegI tmp1, immI_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpU tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zero\t @cmovP_cmpU_zero_reg\n\t" ++ "CMOV $dst,$src1, $src2\t @cmovP_cmpU_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpU_zero_zero(mRegP dst, mRegP src1, immP_0 zeroP, mRegI tmp1, immI_0 zeroI, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpU tmp1 zeroI)) (Binary src1 zeroP))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zeroI\t @cmovP_cmpU_zero_zero\n\t" ++ "CMOV $dst,$zeroP\t @cmovP_cmpU_zero_zero" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register src1 = $src1$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpU_reg_reg(mRegP dst, mRegP src, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpU tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovP_cmpU_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovP_cmpU_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpF_reg_reg(mRegP dst, mRegP src, regF tmp1, regF tmp2, cmpOp cop, regD tmp3, regD tmp4) %{ ++ match(Set dst (CMoveP (Binary cop (CmpF tmp1 tmp2)) (Binary dst src))); ++ effect(TEMP tmp3, TEMP tmp4); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovP_cmpF_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovP_cmpF_reg_reg" ++ %} ++ ++ ins_encode %{ ++ FloatRegister reg_op1 = $tmp1$$FloatRegister; ++ FloatRegister reg_op2 = $tmp2$$FloatRegister; ++ FloatRegister tmp1 = $tmp3$$FloatRegister; ++ FloatRegister tmp2 = $tmp4$$FloatRegister; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(reg_op1, reg_op2, dst, src, tmp1, tmp2, (MacroAssembler::CMCompare) flag, true /* is_float */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpN_reg_reg(mRegP dst, mRegP src, mRegN tmp1, mRegN tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpN tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ 
++ "CMPU$cop $tmp1,$tmp2\t @cmovP_cmpN_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovP_cmpN_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovN_cmpP_reg_reg(mRegN dst, mRegN src, mRegP tmp1, mRegP tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveN (Binary cop (CmpP tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovN_cmpP_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovN_cmpP_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpD_reg_reg(mRegP dst, mRegP src, regD tmp1, regD tmp2, cmpOp cop, regD tmp3, regD tmp4) %{ ++ match(Set dst (CMoveP (Binary cop (CmpD tmp1 tmp2)) (Binary dst src))); ++ effect(TEMP tmp3, TEMP tmp4); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovP_cmpD_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovP_cmpD_reg_reg" ++ %} ++ ins_encode %{ ++ FloatRegister reg_op1 = as_FloatRegister($tmp1$$reg); ++ FloatRegister reg_op2 = as_FloatRegister($tmp2$$reg); ++ FloatRegister tmp1 = $tmp3$$FloatRegister; ++ FloatRegister tmp2 = $tmp4$$FloatRegister; ++ Register dst = as_Register($dst$$reg); ++ Register src = as_Register($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(reg_op1, reg_op2, dst, src, tmp1, tmp2, (MacroAssembler::CMCompare) flag, false /* is_float */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovN_cmpN_reg_reg(mRegN dst, mRegN src, mRegN tmp1, mRegN tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveN (Binary cop (CmpN tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovN_cmpN_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovN_cmpN_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpU_zero_reg(mRegI dst, mRegI src1, mRegI src2, mRegI tmp1, immI_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpU tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zero\t @cmovI_cmpU_zero_reg\n\t" ++ "CMOV $dst,$src1, $src2\t @cmovI_cmpU_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpU_zero_zero(mRegI dst, mRegI src1, mRegI tmp1, immI_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpU tmp1 zero)) (Binary src1 zero))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zero\t @cmovI_cmpU_zero_zero\n\t" ++ "CMOV $dst,$zero\t @cmovI_cmpU_zero_zero" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register src1 = $src1$$Register; 
++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpU_reg_reg(mRegI dst, mRegI src, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpU tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovI_cmpU_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovI_cmpU_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpL_zero_reg(mRegI dst, mRegI src1, mRegI src2, mRegLorI2L tmp1, immL_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpL tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovI_cmpL_zero_reg\n" ++ "\tCMOV $dst,$src1, $src2 \t @cmovI_cmpL_zero_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpUL_zero_reg(mRegI dst, mRegI src1, mRegI src2, mRegLorI2L tmp1, immL_0 zero, cmpOp cop) %{ ++ match(Set dst (CMoveI (Binary cop (CmpUL tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovI_cmpUL_zero_reg\n" ++ "\tCMOV $dst,$src1, $src2 \t @cmovI_cmpUL_zero_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpL_reg_zero(mRegI dst, mRegI src1, immI_0 zeroI, mRegL tmp1, mRegL tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpL tmp1 tmp2)) (Binary src1 zeroI))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovI_cmpL_reg_zero\n" ++ "\tCMOV $dst, $src1, $zeroI \t @cmovI_cmpL_reg_zero" ++ %} ++ ins_encode %{ ++ Register op1 = as_Register($tmp1$$reg); ++ Register op2 = as_Register($tmp2$$reg); ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, src1, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpUL_reg_zero(mRegI dst, mRegI src1, immI_0 zeroI, mRegL tmp1, mRegL tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveI (Binary cop (CmpUL tmp1 tmp2)) (Binary src1 zeroI))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovI_cmpUL_reg_zero\n" ++ "\tCMOV $dst, $src1, $zeroI \t @cmovI_cmpUL_reg_zero" ++ %} ++ ins_encode %{ ++ Register op1 = as_Register($tmp1$$reg); ++ Register op2 = as_Register($tmp2$$reg); ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, src1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct 
cmovI_cmpL_zero_zero(mRegI dst, mRegI src1, immI_0 zeroI, mRegL tmp1, immL_0 zeroL, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpL tmp1 zeroL)) (Binary src1 zeroI))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zeroL\t @cmovI_cmpL_zero_zero\n" ++ "\tCMOV $dst, $src1, $zeroI \t @cmovI_cmpL_zero_zero" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(opr1, R0, dst, src1, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpUL_zero_zero(mRegI dst, mRegI src1, immI_0 zeroI, mRegL tmp1, immL_0 zeroL, cmpOp cop) %{ ++ match(Set dst (CMoveI (Binary cop (CmpUL tmp1 zeroL)) (Binary src1 zeroI))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zeroL\t @cmovI_cmpUL_zero_zero\n" ++ "\tCMOV $dst, $src1, $zeroI \t @cmovI_cmpUL_zero_zero" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(opr1, R0, dst, src1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpL_reg_reg(mRegI dst, mRegIorL2I src, mRegLorI2L tmp1, mRegLorI2L tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveI (Binary cop (CmpL tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovI_cmpL_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovI_cmpL_reg_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register opr2 = as_Register($tmp2$$reg); ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, opr2, dst, src, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpUL_reg_reg(mRegI dst, mRegIorL2I src, mRegLorI2L tmp1, mRegLorI2L tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveI (Binary cop (CmpUL tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovI_cmpUL_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovI_cmpUL_reg_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register opr2 = as_Register($tmp2$$reg); ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, opr2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpL_zero_reg(mRegP dst, mRegP src1, mRegP src2, mRegLorI2L tmp1, immL_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpL tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovP_cmpL_zero_reg\n" ++ "\tCMOV $dst,$src1, $src2 \t @cmovP_cmpL_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = as_Register($tmp1$$reg); ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpUL_zero_reg(mRegP dst, mRegP src1, mRegP src2, mRegLorI2L tmp1, immL_0 zero, cmpOp cop) %{ ++ match(Set dst (CMoveP (Binary cop (CmpUL tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovP_cmpUL_zero_reg\n" ++ "\tCMOV $dst,$src1, $src2 \t 
@cmovP_cmpUL_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = as_Register($tmp1$$reg); ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpL_reg_zero(mRegP dst, immP_0 zero, mRegL tmp1, mRegL tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpL tmp1 tmp2)) (Binary dst zero))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovP_cmpL_reg_zero\n" ++ "\tCMOV $dst,$zero \t @cmovP_cmpL_reg_zero" ++ %} ++ ins_encode %{ ++ Register op1 = as_Register($tmp1$$reg); ++ Register op2 = as_Register($tmp2$$reg); ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, dst, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpUL_reg_zero(mRegP dst, immP_0 zero, mRegL tmp1, mRegL tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveP (Binary cop (CmpUL tmp1 tmp2)) (Binary dst zero))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovP_cmpUL_reg_zero\n" ++ "\tCMOV $dst,$zero \t @cmovP_cmpUL_reg_zero" ++ %} ++ ins_encode %{ ++ Register op1 = as_Register($tmp1$$reg); ++ Register op2 = as_Register($tmp2$$reg); ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, dst, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpL_zero_zero(mRegP dst, mRegP src1, immP_0 zeroP, mRegL tmp1, immL_0 zeroL, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpL tmp1 zeroL)) (Binary src1 zeroP))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zeroL\t @cmovP_cmpL_zero_zero\n" ++ "\tCMOV $dst,$zeroP \t @cmovP_cmpL_zero_zero" ++ %} ++ ins_encode %{ ++ Register op1 = as_Register($tmp1$$reg); ++ Register src1 = $src1$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpUL_zero_zero(mRegP dst, mRegP src1, immP_0 zeroP, mRegL tmp1, immL_0 zeroL, cmpOp cop) %{ ++ match(Set dst (CMoveP (Binary cop (CmpUL tmp1 zeroL)) (Binary src1 zeroP))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zeroL\t @cmovP_cmpUL_zero_zero\n" ++ "\tCMOV $dst,$zeroP \t @cmovP_cmpUL_zero_zero" ++ %} ++ ins_encode %{ ++ Register op1 = as_Register($tmp1$$reg); ++ Register src1 = $src1$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpL_reg_reg(mRegP dst, mRegP src, mRegLorI2L tmp1, mRegLorI2L tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpL tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovP_cmpL_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovP_cmpL_reg_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register opr2 = as_Register($tmp2$$reg); ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, opr2, dst, src, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpUL_reg_reg(mRegP dst, 
mRegP src, mRegLorI2L tmp1, mRegLorI2L tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveP (Binary cop (CmpUL tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovP_cmpUL_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovP_cmpUL_reg_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register opr2 = as_Register($tmp2$$reg); ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, opr2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovI_cmpD_reg_reg(mRegI dst, mRegI src, regD tmp1, regD tmp2, cmpOp cop, regD tmp3, regD tmp4) %{ ++ match(Set dst (CMoveI (Binary cop (CmpD tmp1 tmp2)) (Binary dst src))); ++ effect(TEMP tmp3, TEMP tmp4); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovI_cmpD_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovI_cmpD_reg_reg" ++ %} ++ ins_encode %{ ++ FloatRegister reg_op1 = as_FloatRegister($tmp1$$reg); ++ FloatRegister reg_op2 = as_FloatRegister($tmp2$$reg); ++ FloatRegister tmp1 = $tmp3$$FloatRegister; ++ FloatRegister tmp2 = $tmp4$$FloatRegister; ++ Register dst = as_Register($dst$$reg); ++ Register src = as_Register($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(reg_op1, reg_op2, dst, src, tmp1, tmp2, (MacroAssembler::CMCompare) flag, false /* is_float */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpP_zero_reg(mRegP dst, mRegP src1, mRegP src2, mRegP tmp1, immP_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpP tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zero\t @cmovP_cmpP_zero_reg\n\t" ++ "CMOV $dst,$src1, $src2\t @cmovP_cmpP_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpP_reg_zero(mRegP dst, immP_0 zero, mRegP tmp1, mRegP tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpP tmp1 tmp2)) (Binary dst zero))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovP_cmpP_reg_zero\n\t" ++ "CMOV $dst,$zero\t @cmovP_cmpP_reg_zero" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, dst, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpP_reg_reg(mRegP dst, mRegP src, mRegP tmp1, mRegP tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpP tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovP_cmpP_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovP_cmpP_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpI_zero_reg(mRegP dst, mRegP src1, mRegP src2, mRegI tmp1, immI_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpI tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop 
$tmp1,$zero\t @cmovP_cmpI_zero_reg\n\t" ++ "CMOV $dst,$src1, $src2\t @cmovP_cmpI_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpI_reg_zero(mRegP dst, immP_0 zero, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpI tmp1 tmp2)) (Binary dst zero))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1,$tmp2\t @cmovP_cmpI_reg_zero\n\t" ++ "CMOV $dst,$zero\t @cmovP_cmpI_reg_zero" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, dst, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpI_zero_zero(mRegP dst, mRegP src1, immP_0 zeroP, mRegI tmp1, immI_0 zeroI, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpI tmp1 zeroI)) (Binary src1 zeroP))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1,$zeroI\t @cmovP_cmpI_zero_zero\n\t" ++ "CMOV $dst,$zeroP\t @cmovP_cmpI_zero_zero" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register src1 = $src1$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovP_cmpI_reg_reg(mRegP dst, mRegP src, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveP (Binary cop (CmpI tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1,$tmp2\t @cmovP_cmpI_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovP_cmpI_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpP_zero_reg(mRegL dst, mRegL src1, mRegL src2, mRegP tmp1, immP_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpP tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zero\t @cmovL_cmpP_zero_reg\n\t" ++ "CMOV $dst,$src1, $src2\t @cmovL_cmpP_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ Label L; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpP_reg_reg(mRegL dst, mRegL src, mRegP tmp1, mRegP tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpP tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovL_cmpP_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovL_cmpP_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ Label L; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} 
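++
++// Note: throughout these conditional-move rules, compares on pointers, compressed
++// oops and unsigned values (CmpP/CmpN/CmpU/CmpUL) pass is_signed == false to the
++// cmp_cmov helpers, while CmpI/CmpL compares pass is_signed == true.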
++ ++instruct cmovN_cmpU_zero_reg(mRegN dst, mRegN src1, mRegN src2, mRegI tmp1, immI_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveN (Binary cop (CmpU tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zero\t @cmovN_cmpU_zero_reg\n\t" ++ "CMOV $dst,$src1, $src2\t @cmovN_cmpU_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovN_cmpU_reg_reg(mRegN dst, mRegN src, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveN (Binary cop (CmpU tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovN_cmpU_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovN_cmpU_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovN_cmpL_zero_reg(mRegN dst, mRegN src1, mRegN src2, mRegL tmp1, immL_0 zero, cmpOp cop) %{ ++ match(Set dst (CMoveN (Binary cop (CmpL tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovN_cmpL_zero_reg\n" ++ "\tCMOV $dst, $src1, $src2 \t @cmovN_cmpL_zero_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovN_cmpUL_zero_reg(mRegN dst, mRegN src1, mRegN src2, mRegL tmp1, immL_0 zero, cmpOp cop) %{ ++ match(Set dst (CMoveN (Binary cop (CmpUL tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovN_cmpUL_zero_reg\n" ++ "\tCMOV $dst, $src1, $src2 \t @cmovN_cmpUL_zero_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovN_cmpL_reg_reg(mRegN dst, mRegN src, mRegL tmp1, mRegL tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveN (Binary cop (CmpL tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovN_cmpL_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovN_cmpL_reg_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register opr2 = as_Register($tmp2$$reg); ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, opr2, dst, src, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovN_cmpUL_reg_reg(mRegN dst, mRegN src, mRegL tmp1, mRegL tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveN (Binary cop (CmpUL tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovN_cmpUL_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovN_cmpUL_reg_reg" ++ %} ++ 
ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register opr2 = as_Register($tmp2$$reg); ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, opr2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovN_cmpI_reg_reg(mRegN dst, mRegN src, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveN (Binary cop (CmpI tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1,$tmp2\t @cmovN_cmpI_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovN_cmpI_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovN_cmpI_zero_reg(mRegN dst, mRegN src1, mRegN src2, mRegI tmp1, immI_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveN (Binary cop (CmpI tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1,$zero\t @cmovN_cmpI_zero_reg\n\t" ++ "CMOV $dst,$src1,$src2\t @cmovN_cmpI_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpU_zero_reg(mRegL dst, mRegL src1, mRegL src2, mRegI tmp1, immI_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpU tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zero\t @cmovL_cmpU_zero_reg\n\t" ++ "CMOV $dst,$src1,$src2\t @cmovL_cmpU_zero_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpU_reg_zero(mRegL dst, immL_0 zero, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpU tmp1 tmp2)) (Binary dst zero))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovL_cmpU_reg_zero\n\t" ++ "CMOV $dst,$zero\t @cmovL_cmpU_reg_zero" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, dst, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpU_zero_zero(mRegL dst, mRegL src1, immL_0 zeroL, mRegI tmp1, immI_0 zeroI, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpU tmp1 zeroI)) (Binary src1 zeroL))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$zeroI\t @cmovL_cmpU_zero_zero\n\t" ++ "CMOV $dst,$zeroL\t @cmovL_cmpU_zero_zero" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register src1 = $src1$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpU_reg_reg(mRegL dst, mRegL src, mRegI tmp1, mRegI 
tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpU tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovL_cmpU_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovL_cmpU_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpF_reg_reg(mRegL dst, mRegL src, regF tmp1, regF tmp2, cmpOp cop, regD tmp3, regD tmp4) %{ ++ match(Set dst (CMoveL (Binary cop (CmpF tmp1 tmp2)) (Binary dst src))); ++ effect(TEMP tmp3, TEMP tmp4); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovL_cmpF_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovL_cmpF_reg_reg" ++ %} ++ ++ ins_encode %{ ++ FloatRegister reg_op1 = $tmp1$$FloatRegister; ++ FloatRegister reg_op2 = $tmp2$$FloatRegister; ++ FloatRegister tmp1 = $tmp3$$FloatRegister; ++ FloatRegister tmp2 = $tmp4$$FloatRegister; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(reg_op1, reg_op2, dst, src, tmp1, tmp2, (MacroAssembler::CMCompare) flag, true /* is_float */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpI_zero_reg(mRegL dst, mRegL src1, mRegL src2, mRegI tmp1, immI_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpI tmp1 zero)) (Binary src1 src2))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovL_cmpI_zero_reg\n" ++ "\tCMOV $dst, $src1, $src2 \t @cmovL_cmpI_zero_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = as_Register($dst$$reg); ++ Register src1 = as_Register($src1$$reg); ++ Register src2 = as_Register($src2$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpI_reg_zero(mRegL dst, immL_0 zero, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpI tmp1 tmp2)) (Binary dst zero))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovL_cmpI_reg_zero\n" ++ "\tCMOV $dst,$zero \t @cmovL_cmpI_reg_zero" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = as_Register($dst$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, dst, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpI_zero_zero(mRegL dst, mRegL src1, immL_0 zeroL, mRegI tmp1, immI_0 zeroI, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpI tmp1 zeroI)) (Binary src1 zeroL))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $zeroI\t @cmovL_cmpI_zero_zero\n" ++ "\tCMOV $dst,$zeroL \t @cmovL_cmpI_zero_zero" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register src1 = $src1$$Register; ++ Register dst = as_Register($dst$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpI_reg_reg(mRegL dst, mRegL src, mRegI tmp1, mRegI tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpI tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovL_cmpI_reg_reg\n" ++ "\tCMOV $dst,$src \t 
@cmovL_cmpI_reg_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = as_Register($dst$$reg); ++ Register src = as_Register($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpL_reg_reg(mRegL dst, mRegL src1, mRegL src2, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpL src1 src2)) (Binary src1 src2))); ++ ins_cost(50); ++ format %{ ++ "CMP$cop $src1, $src2\t @cmovL_cmpL_reg_reg\n" ++ "\tCMOV $dst,$src1, $src2 \t @cmovL_cmpL_reg_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, op1, op2, (MacroAssembler::CMCompare) flag, true); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpUL_reg_reg(mRegL dst, mRegL src1, mRegL src2, cmpOp cop) %{ ++ match(Set dst (CMoveL (Binary cop (CmpUL src1 src2)) (Binary src1 src2))); ++ ins_cost(50); ++ format %{ ++ "CMP$cop $src1, $src2\t @cmovL_cmpUL_reg_reg\n" ++ "\tCMOV $dst,$src1, $src2 \t @cmovL_cmpUL_reg_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, op1, op2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpL_reg_zero(mRegL dst, mRegL src1, immL_0 zero, mRegL tmp1, mRegL tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpL tmp1 tmp2)) (Binary src1 zero))); ++ ins_cost(20); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovL_cmpL_reg_zero\n" ++ "\tCMOV $dst,$src1, $zero \t @cmovL_cmpL_reg_zero" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, src1, (MacroAssembler::CMCompare) flag, true); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpUL_reg_zero(mRegL dst, mRegL src1, immL_0 zero, mRegL tmp1, mRegL tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveL (Binary cop (CmpUL tmp1 tmp2)) (Binary src1 zero))); ++ ins_cost(20); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovL_cmpUL_reg_zero\n" ++ "\tCMOV $dst,$src1, $zero \t @cmovL_cmpUL_reg_zero" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, op2, dst, src1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpL_reg_reg2(mRegL dst, mRegL src1, mRegL src2, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpL src1 src2)) (Binary src2 src1))); ++ ins_cost(50); ++ format %{ ++ "CMP$cop $src1, $src2\t @cmovL_cmpL_reg_reg2\n" ++ "\tCMOV $dst,$src2, $src1 \t @cmovL_cmpL_reg_reg2" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, op2, op1, (MacroAssembler::CMCompare) flag, true); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpUL_reg_reg2(mRegL dst, mRegL src1, mRegL src2, cmpOp cop) %{ ++ match(Set dst (CMoveL (Binary cop (CmpUL src1 
src2)) (Binary src2 src1))); ++ ins_cost(50); ++ format %{ ++ "CMP$cop $src1, $src2\t @cmovL_cmpUL_reg_reg2\n" ++ "\tCMOV $dst,$src2, $src1 \t @cmovL_cmpUL_reg_reg2" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $src1$$Register; ++ Register op2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, op2, op1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpL_zero_reg(mRegL dst, mRegL src1, mRegL src2, mRegL tmp1, immL_0 zero, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpL tmp1 zero)) (Binary src1 src2))); ++ ins_cost(20); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovL_cmpL_zero_reg\n" ++ "\tCMOV $dst,$src1, $src2 \t @cmovL_cmpL_zero_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag,true); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpUL_zero_reg(mRegL dst, mRegL src1, mRegL src2, mRegL tmp1, immL_0 zero, cmpOp cop) %{ ++ match(Set dst (CMoveL (Binary cop (CmpUL tmp1 zero)) (Binary src1 src2))); ++ ins_cost(20); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovL_cmpUL_zero_reg\n" ++ "\tCMOV $dst,$src1, $src2 \t @cmovL_cmpUL_zero_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, R0, dst, src1, src2, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpL_zero_zero(mRegL dst, mRegL src1, immL_0 zero, mRegL tmp1, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpL tmp1 zero)) (Binary src1 zero))); ++ ins_cost(20); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovL_cmp_zero_zero\n" ++ "\tCMOV $dst,$src1, $zero \t @cmovL_cmpL_zero_zero" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, true); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpUL_zero_zero(mRegL dst, mRegL src1, immL_0 zero, mRegL tmp1, cmpOp cop) %{ ++ match(Set dst (CMoveL (Binary cop (CmpUL tmp1 zero)) (Binary src1 zero))); ++ ins_cost(20); ++ format %{ ++ "CMP$cop $tmp1, $zero\t @cmovL_cmpUL_zero_zero\n" ++ "\tCMOV $dst,$src1, $zero \t @cmovL_cmpUL_zero_zero" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov_zero(op1, R0, dst, src1, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpL_dst_reg(mRegL dst, mRegL src, mRegL tmp1, mRegL tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpL tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovL_cmpL_dst_reg\n" ++ "\tCMOV $dst,$src \t @cmovL_cmpL_dst_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register opr2 = as_Register($tmp2$$reg); ++ Register dst = as_Register($dst$$reg); ++ Register src = as_Register($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, opr2, dst, src, 
(MacroAssembler::CMCompare) flag, true /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpUL_dst_reg(mRegL dst, mRegL src, mRegL tmp1, mRegL tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveL (Binary cop (CmpUL tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovL_cmpUL_dst_reg\n" ++ "\tCMOV $dst,$src \t @cmovL_cmpUL_dst_reg" ++ %} ++ ins_encode %{ ++ Register opr1 = as_Register($tmp1$$reg); ++ Register opr2 = as_Register($tmp2$$reg); ++ Register dst = as_Register($dst$$reg); ++ Register src = as_Register($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(opr1, opr2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovL_cmpN_reg_reg(mRegL dst, mRegL src, mRegN tmp1, mRegN tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveL (Binary cop (CmpN tmp1 tmp2)) (Binary dst src))); ++ ins_cost(80); ++ format %{ ++ "CMPU$cop $tmp1,$tmp2\t @cmovL_cmpN_reg_reg\n\t" ++ "CMOV $dst,$src\t @cmovL_cmpN_reg_reg" ++ %} ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++ ++instruct cmovL_cmpD_reg_reg(mRegL dst, mRegL src, regD tmp1, regD tmp2, cmpOp cop, regD tmp3, regD tmp4) %{ ++ match(Set dst (CMoveL (Binary cop (CmpD tmp1 tmp2)) (Binary dst src))); ++ effect(TEMP tmp3, TEMP tmp4); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovL_cmpD_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovL_cmpD_reg_reg" ++ %} ++ ins_encode %{ ++ FloatRegister reg_op1 = as_FloatRegister($tmp1$$reg); ++ FloatRegister reg_op2 = as_FloatRegister($tmp2$$reg); ++ FloatRegister tmp1 = $tmp3$$FloatRegister; ++ FloatRegister tmp2 = $tmp4$$FloatRegister; ++ Register dst = as_Register($dst$$reg); ++ Register src = as_Register($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(reg_op1, reg_op2, dst, src, tmp1, tmp2, (MacroAssembler::CMCompare) flag, false /* is_float */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovD_cmpD_reg_reg(regD dst, regD src, regD tmp1, regD tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveD (Binary cop (CmpD tmp1 tmp2)) (Binary dst src))); ++ ins_cost(200); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovD_cmpD_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovD_cmpD_reg_reg" ++ %} ++ ins_encode %{ ++ FloatRegister reg_op1 = as_FloatRegister($tmp1$$reg); ++ FloatRegister reg_op2 = as_FloatRegister($tmp2$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ FloatRegister src = as_FloatRegister($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(reg_op1, reg_op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_float */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovF_cmpI_reg_reg(regF dst, regF src, mRegI tmp1, mRegI tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveF (Binary cop (CmpI tmp1 tmp2)) (Binary dst src))); ++ ins_cost(200); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovF_cmpI_reg_reg\n" ++ "\tCMOV $dst, $src \t @cmovF_cmpI_reg_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ FloatRegister src = as_FloatRegister($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag); ++ %} ++ ++ ins_pipe( pipe_slow ); 
++%} ++ ++instruct cmovD_cmpN_reg_reg(regD dst, regD src, mRegN tmp1, mRegN tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveD (Binary cop (CmpN tmp1 tmp2)) (Binary dst src))); ++ ins_cost(200); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovD_cmpN_reg_reg\n" ++ "\tCMOV $dst, $src \t @cmovD_cmpN_reg_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ FloatRegister src = as_FloatRegister($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag, false /* is_signed */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovD_cmpI_reg_reg(regD dst, regD src, mRegI tmp1, mRegI tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveD (Binary cop (CmpI tmp1 tmp2)) (Binary dst src))); ++ ins_cost(200); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovD_cmpI_reg_reg\n" ++ "\tCMOV $dst, $src \t @cmovD_cmpI_reg_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ FloatRegister src = as_FloatRegister($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovD_cmpP_reg_reg(regD dst, regD src, mRegP tmp1, mRegP tmp2, cmpOp cop) %{ ++ match(Set dst (CMoveD (Binary cop (CmpP tmp1 tmp2)) (Binary dst src))); ++ ins_cost(200); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovD_cmpP_reg_reg\n" ++ "\tCMOV $dst, $src \t @cmovD_cmpP_reg_reg" ++ %} ++ ++ ins_encode %{ ++ Register op1 = $tmp1$$Register; ++ Register op2 = $tmp2$$Register; ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ FloatRegister src = as_FloatRegister($src$$reg); ++ int flag = $cop$$cmpcode; ++ ++ // Use signed comparison here, because the most significant bit of the ++ // user-space virtual address must be 0. 
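++    // (With the sign bit of both operands known to be zero, signed and
++    // unsigned orderings coincide, so the signed cmp_cmov variant is safe
++    // for these pointer compares.)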
++ __ cmp_cmov(op1, op2, dst, src, (MacroAssembler::CMCompare) flag); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++//FIXME ++instruct cmovI_cmpF_reg_reg(mRegI dst, mRegI src, regF tmp1, regF tmp2, cmpOp cop, regD tmp3, regD tmp4) %{ ++ match(Set dst (CMoveI (Binary cop (CmpF tmp1 tmp2)) (Binary dst src))); ++ effect(TEMP tmp3, TEMP tmp4); ++ ins_cost(80); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovI_cmpF_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovI_cmpF_reg_reg" ++ %} ++ ++ ins_encode %{ ++ FloatRegister reg_op1 = $tmp1$$FloatRegister; ++ FloatRegister reg_op2 = $tmp2$$FloatRegister; ++ FloatRegister tmp1 = $tmp3$$FloatRegister; ++ FloatRegister tmp2 = $tmp4$$FloatRegister; ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(reg_op1, reg_op2, dst, src, tmp1, tmp2, (MacroAssembler::CMCompare) flag, true /* is_float */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmovF_cmpF_reg_reg(regF dst, regF src, regF tmp1, regF tmp2, cmpOp cop ) %{ ++ match(Set dst (CMoveF (Binary cop (CmpF tmp1 tmp2)) (Binary dst src))); ++ ins_cost(200); ++ format %{ ++ "CMP$cop $tmp1, $tmp2\t @cmovF_cmpF_reg_reg\n" ++ "\tCMOV $dst,$src \t @cmovF_cmpF_reg_reg" ++ %} ++ ++ ins_encode %{ ++ FloatRegister reg_op1 = $tmp1$$FloatRegister; ++ FloatRegister reg_op2 = $tmp2$$FloatRegister; ++ FloatRegister dst = $dst$$FloatRegister; ++ FloatRegister src = $src$$FloatRegister; ++ int flag = $cop$$cmpcode; ++ ++ __ cmp_cmov(reg_op1, reg_op2, dst, src, (MacroAssembler::CMCompare) flag, true /* is_float */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// Manifest a CmpL result in an integer register. Very painful. ++// This is the test to avoid. ++instruct cmpL3_reg_zero(mRegI dst, mRegL src1, immL_0 zero) %{ ++ match(Set dst (CmpL3 src1 zero)); ++ match(Set dst (CmpL3 (CastLL src1) zero)); ++ ins_cost(1000); ++ format %{ "cmpL3 $dst, $src1, zero @ cmpL3_reg_zero" %} ++ ins_encode %{ ++ Register opr1 = as_Register($src1$$reg); ++ Register dst = as_Register($dst$$reg); ++ __ slt(AT, opr1, R0); ++ __ slt(dst, R0, opr1); ++ __ sub_d(dst, dst, AT); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// Manifest a CmpU result in an integer register. Very painful. ++// This is the test to avoid. ++instruct cmpU3_reg_reg(mRegI dst, mRegI src1, mRegI src2) %{ ++ match(Set dst (CmpU3 src1 src2)); ++ format %{ "cmpU3 $dst, $src1, $src2 @ cmpU3_reg_reg" %} ++ ins_encode %{ ++ Register opr1 = as_Register($src1$$reg); ++ Register opr2 = as_Register($src2$$reg); ++ Register dst = as_Register($dst$$reg); ++ ++ __ sltu(AT, opr1, opr2); ++ __ sltu(dst, opr2, opr1); ++ __ sub_d(dst, dst, AT); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmpL3_reg_reg(mRegI dst, mRegL src1, mRegL src2) %{ ++ match(Set dst (CmpL3 src1 src2)); ++ ins_cost(1000); ++ format %{ "cmpL3 $dst, $src1, $src2 @ cmpL3_reg_reg" %} ++ ins_encode %{ ++ Register opr1 = as_Register($src1$$reg); ++ Register opr2 = as_Register($src2$$reg); ++ Register dst = as_Register($dst$$reg); ++ ++ __ slt(AT, opr1, opr2); ++ __ slt(dst, opr2, opr1); ++ __ sub_d(dst, dst, AT); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// Manifest a CmpUL result in an integer register. Very painful. ++// This is the test to avoid. 
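++//
++// The two sltu + sub_d sequence below computes a conventional three-way
++// result; as a C sketch (illustrative only):
++//
++//   int cmpUL3(unsigned long a, unsigned long b) {
++//     return (a > b) - (a < b);   // yields 1, 0 or -1
++//   }
++//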
++instruct cmpUL3_reg_reg(mRegI dst, mRegLorI2L src1, mRegLorI2L src2) %{ ++ match(Set dst (CmpUL3 src1 src2)); ++ format %{ "cmpUL3 $dst, $src1, $src2 @ cmpUL3_reg_reg" %} ++ ins_encode %{ ++ Register opr1 = as_Register($src1$$reg); ++ Register opr2 = as_Register($src2$$reg); ++ Register dst = as_Register($dst$$reg); ++ ++ __ sltu(AT, opr1, opr2); ++ __ sltu(dst, opr2, opr1); ++ __ sub_d(dst, dst, AT); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ++// less_rsult = -1 ++// greater_result = 1 ++// equal_result = 0 ++// nan_result = -1 ++// ++instruct cmpF3_reg_reg(mRegI dst, regF src1, regF src2) %{ ++ match(Set dst (CmpF3 src1 src2)); ++ ins_cost(1000); ++ format %{ "cmpF3 $dst, $src1, $src2 @ cmpF3_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = as_FloatRegister($src1$$reg); ++ FloatRegister src2 = as_FloatRegister($src2$$reg); ++ Register dst = as_Register($dst$$reg); ++ ++ if (src1 == src2) { ++ __ fcmp_cun_s(FCC0, src1, src2); ++ if (UseCF2GR) { ++ __ movcf2gr(dst, FCC0); ++ } else { ++ __ movcf2fr(fscratch, FCC0); ++ __ movfr2gr_s(dst, fscratch); ++ } ++ __ sub_w(dst, R0, dst); ++ } else { ++ __ fcmp_clt_s(FCC0, src2, src1); ++ __ fcmp_cult_s(FCC1, src1, src2); ++ if (UseCF2GR) { ++ __ movcf2gr(dst, FCC0); ++ __ movcf2gr(AT, FCC1); ++ } else { ++ __ movcf2fr(fscratch, FCC0); ++ __ movfr2gr_s(dst, fscratch); ++ __ movcf2fr(fscratch, FCC1); ++ __ movfr2gr_s(AT, fscratch); ++ } ++ __ sub_d(dst, dst, AT); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmpD3_reg_reg(mRegI dst, regD src1, regD src2) %{ ++ match(Set dst (CmpD3 src1 src2)); ++ ins_cost(1000); ++ format %{ "cmpD3 $dst, $src1, $src2 @ cmpD3_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = as_FloatRegister($src1$$reg); ++ FloatRegister src2 = as_FloatRegister($src2$$reg); ++ Register dst = as_Register($dst$$reg); ++ ++ if (src1 == src2) { ++ __ fcmp_cun_d(FCC0, src1, src2); ++ if (UseCF2GR) { ++ __ movcf2gr(dst, FCC0); ++ } else { ++ __ movcf2fr(fscratch, FCC0); ++ __ movfr2gr_s(dst, fscratch); ++ } ++ __ sub_d(dst, R0, dst); ++ } else { ++ __ fcmp_clt_d(FCC0, src2, src1); ++ __ fcmp_cult_d(FCC1, src1, src2); ++ if (UseCF2GR) { ++ __ movcf2gr(dst, FCC0); ++ __ movcf2gr(AT, FCC1); ++ } else { ++ __ movcf2fr(fscratch, FCC0); ++ __ movfr2gr_s(dst, fscratch); ++ __ movcf2fr(fscratch, FCC1); ++ __ movfr2gr_s(AT, fscratch); ++ } ++ __ sub_d(dst, dst, AT); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct clear_array(a2RegL cnt, a0_RegP base, Universe dummy, a1RegL value) %{ ++ match(Set dummy (ClearArray cnt base)); ++ effect(TEMP value, USE_KILL cnt, USE_KILL base); ++ ++ format %{ "CLEAR_ARRAY base = $base, cnt = $cnt @ clear_array" %} ++ ins_encode %{ ++ // Assume cnt is the number of bytes in an array to be cleared, ++ // and base points to the starting address of the array. 
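++    // The fill value is zeroed and the shared arrayof_jlong_fill stub does
++    // the actual clearing; the a0/a1/a2 operand classes above pin base,
++    // value and cnt to fixed argument registers for the stub call.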
++ __ move($value$$Register, R0); ++ __ trampoline_call(RuntimeAddress(StubRoutines::la::arrayof_jlong_fill())); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_compareL(a4_RegP str1, mA5RegI cnt1, a6_RegP str2, mA7RegI cnt2, mRegI result, mRegL tmp1, mRegL tmp2, regF vtmp1, regF vtmp2) %{ ++ predicate(((StrCompNode*)n)->encoding() == StrIntrinsicNode::LL); ++ match(Set result (StrComp (Binary str1 cnt1) (Binary str2 cnt2))); ++ effect(TEMP_DEF result, TEMP tmp1, TEMP tmp2, TEMP vtmp1, TEMP vtmp2, USE_KILL str1, USE_KILL str2, USE_KILL cnt1, USE_KILL cnt2); ++ ++ format %{ "String Compare byte[] $str1[len: $cnt1], $str2[len: $cnt2] tmp1:$tmp1, tmp2:$tmp2, vtmp1:$vtmp1, vtmp2:$vtmp2 -> $result @ string_compareL" %} ++ ins_encode %{ ++ __ string_compare($str1$$Register, $str2$$Register, ++ $cnt1$$Register, $cnt2$$Register, $result$$Register, ++ StrIntrinsicNode::LL, $tmp1$$Register, $tmp2$$Register, ++ $vtmp1$$FloatRegister, $vtmp2$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_compareU(a4_RegP str1, mA5RegI cnt1, a6_RegP str2, mA7RegI cnt2, mRegI result, mRegL tmp1, mRegL tmp2, regF vtmp1, regF vtmp2) %{ ++ predicate(((StrCompNode*)n)->encoding() == StrIntrinsicNode::UU); ++ match(Set result (StrComp (Binary str1 cnt1) (Binary str2 cnt2))); ++ effect(TEMP_DEF result, TEMP tmp1, TEMP tmp2, TEMP vtmp1, TEMP vtmp2, USE_KILL str1, USE_KILL str2, USE_KILL cnt1, USE_KILL cnt2); ++ ++ format %{ "String Compare char[] $str1[len: $cnt1], $str2[len: $cnt2] tmp1:$tmp1, tmp2:$tmp2, vtmp1:$vtmp1, vtmp2:$vtmp2 -> $result @ string_compareU" %} ++ ins_encode %{ ++ __ string_compare($str1$$Register, $str2$$Register, ++ $cnt1$$Register, $cnt2$$Register, $result$$Register, ++ StrIntrinsicNode::UU, $tmp1$$Register, $tmp2$$Register, ++ $vtmp1$$FloatRegister, $vtmp2$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_compareLU(a4_RegP str1, mA5RegI cnt1, a6_RegP str2, mA7RegI cnt2, mRegI result, mRegL tmp1, mRegL tmp2, regF vtmp1, regF vtmp2) %{ ++ predicate(((StrCompNode*)n)->encoding() == StrIntrinsicNode::LU); ++ match(Set result (StrComp (Binary str1 cnt1) (Binary str2 cnt2))); ++ effect(TEMP_DEF result, TEMP tmp1, TEMP tmp2, TEMP vtmp1, TEMP vtmp2, USE_KILL str1, USE_KILL str2, USE_KILL cnt1, USE_KILL cnt2); ++ ++ format %{ "String Compare byte[] $str1[len: $cnt1], $str2[len: $cnt2] tmp1:$tmp1, tmp2:$tmp2, vtmp1:$vtmp1, vtmp2:$vtmp2 -> $result @ string_compareLU" %} ++ ins_encode %{ ++ __ string_compare($str1$$Register, $str2$$Register, ++ $cnt1$$Register, $cnt2$$Register, $result$$Register, ++ StrIntrinsicNode::LU, $tmp1$$Register, $tmp2$$Register, ++ $vtmp1$$FloatRegister, $vtmp2$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_compareUL(a4_RegP str1, mA5RegI cnt1, a6_RegP str2, mA7RegI cnt2, mRegI result, mRegL tmp1, mRegL tmp2, regF vtmp1, regF vtmp2) %{ ++ predicate(((StrCompNode*)n)->encoding() == StrIntrinsicNode::UL); ++ match(Set result (StrComp (Binary str1 cnt1) (Binary str2 cnt2))); ++ effect(TEMP_DEF result, TEMP tmp1, TEMP tmp2, TEMP vtmp1, TEMP vtmp2, USE_KILL str1, USE_KILL str2, USE_KILL cnt1, USE_KILL cnt2); ++ ++ format %{ "String Compare byte[] $str1[len: $cnt1], $str2[len: $cnt2] tmp1:$tmp1, tmp2:$tmp2, vtmp1:$vtmp1, vtmp2:$vtmp2 -> $result @ string_compareUL" %} ++ ins_encode %{ ++ __ string_compare($str1$$Register, $str2$$Register, ++ $cnt1$$Register, $cnt2$$Register, $result$$Register, ++ StrIntrinsicNode::UL, $tmp1$$Register, $tmp2$$Register, ++ $vtmp1$$FloatRegister, 
$vtmp2$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_indexofUU(a4_RegP str1, mA5RegI cnt1, a6_RegP str2, mA7RegI cnt2, ++ mT8RegI result) ++%{ ++ predicate(((StrIndexOfNode*)n)->encoding() == StrIntrinsicNode::UU); ++ match(Set result (StrIndexOf (Binary str1 cnt1) (Binary str2 cnt2))); ++ effect(USE_KILL str1, USE_KILL str2, USE_KILL cnt1, USE_KILL cnt2); ++ ++ format %{ "String IndexOf $str1,$cnt1,$str2,$cnt2 -> $result (UU)" %} ++ ins_encode %{ ++ __ string_indexof($str1$$Register, $str2$$Register, ++ $cnt1$$Register, $cnt2$$Register, ++ $result$$Register, StrIntrinsicNode::UU); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_indexofLL(a4_RegP str1, mA5RegI cnt1, a6_RegP str2, mA7RegI cnt2, ++ mT8RegI result) ++%{ ++ predicate(((StrIndexOfNode*)n)->encoding() == StrIntrinsicNode::LL); ++ match(Set result (StrIndexOf (Binary str1 cnt1) (Binary str2 cnt2))); ++ effect(USE_KILL str1, USE_KILL str2, USE_KILL cnt1, USE_KILL cnt2); ++ ++ format %{ "String IndexOf $str1,$cnt1,$str2,$cnt2 -> $result (LL)" %} ++ ins_encode %{ ++ __ string_indexof($str1$$Register, $str2$$Register, ++ $cnt1$$Register, $cnt2$$Register, ++ $result$$Register, StrIntrinsicNode::LL); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_indexofUL(a4_RegP str1, mA5RegI cnt1, a6_RegP str2, mA7RegI cnt2, ++ mT8RegI result) ++%{ ++ predicate(((StrIndexOfNode*)n)->encoding() == StrIntrinsicNode::UL); ++ match(Set result (StrIndexOf (Binary str1 cnt1) (Binary str2 cnt2))); ++ effect(USE_KILL str1, USE_KILL str2, USE_KILL cnt1, USE_KILL cnt2); ++ format %{ "String IndexOf $str1,$cnt1,$str2,$cnt2 -> $result (UL)" %} ++ ++ ins_encode %{ ++ __ string_indexof($str1$$Register, $str2$$Register, ++ $cnt1$$Register, $cnt2$$Register, ++ $result$$Register, StrIntrinsicNode::UL); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_indexof_conUU(a4_RegP str1, mA5RegI cnt1, a6_RegP str2, immI_1_4 int_cnt2, ++ mT8RegI result) ++%{ ++ predicate(((StrIndexOfNode*)n)->encoding() == StrIntrinsicNode::UU); ++ match(Set result (StrIndexOf (Binary str1 cnt1) (Binary str2 int_cnt2))); ++ effect(USE_KILL str1, USE_KILL str2, USE_KILL cnt1); ++ ++ format %{ "String IndexOf $str1,$cnt1,$str2,$int_cnt2 -> $result (UU)" %} ++ ++ ins_encode %{ ++ int icnt2 = (int)$int_cnt2$$constant; ++ __ string_indexof_linearscan($str1$$Register, $str2$$Register, ++ $cnt1$$Register, noreg, ++ icnt2, $result$$Register, StrIntrinsicNode::UU); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_indexof_conLL(a4_RegP str1, mA5RegI cnt1, a6_RegP str2, immI_1_4 int_cnt2, ++ mT8RegI result) ++%{ ++ predicate(((StrIndexOfNode*)n)->encoding() == StrIntrinsicNode::LL); ++ match(Set result (StrIndexOf (Binary str1 cnt1) (Binary str2 int_cnt2))); ++ effect(USE_KILL str1, USE_KILL str2, USE_KILL cnt1); ++ ++ format %{ "String IndexOf $str1,$cnt1,$str2,$int_cnt2 -> $result (LL)" %} ++ ins_encode %{ ++ int icnt2 = (int)$int_cnt2$$constant; ++ __ string_indexof_linearscan($str1$$Register, $str2$$Register, ++ $cnt1$$Register, noreg, ++ icnt2, $result$$Register, StrIntrinsicNode::LL); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_indexof_conUL(a4_RegP str1, mA5RegI cnt1, a6_RegP str2, immI_1 int_cnt2, ++ mT8RegI result) ++%{ ++ predicate(((StrIndexOfNode*)n)->encoding() == StrIntrinsicNode::UL); ++ match(Set result (StrIndexOf (Binary str1 cnt1) (Binary str2 int_cnt2))); ++ effect(USE_KILL str1, USE_KILL str2, USE_KILL cnt1); ++ ++ format %{ "String IndexOf $str1,$cnt1,$str2,$int_cnt2 -> $result (UL)" %} ++ 
ins_encode %{ ++ int icnt2 = (int)$int_cnt2$$constant; ++ __ string_indexof_linearscan($str1$$Register, $str2$$Register, ++ $cnt1$$Register, noreg, ++ icnt2, $result$$Register, StrIntrinsicNode::UL); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct string_indexof_char(a4_RegP str1, mA5RegI cnt1, mA6RegI ch, mRegI result, mRegL tmp1, mRegL tmp2, mRegL tmp3) %{ ++ predicate(((StrIndexOfCharNode*)n)->encoding() == StrIntrinsicNode::U); ++ match(Set result (StrIndexOfChar (Binary str1 cnt1) ch)); ++ effect(USE_KILL str1, USE_KILL cnt1, USE_KILL ch, TEMP_DEF result, TEMP tmp1, TEMP tmp2, TEMP tmp3); ++ ++ format %{ "StringUTF16 IndexOf char[] $str1, len:$cnt1, char:$ch, res:$result, tmp1:$tmp1, tmp2:$tmp2, tmp3:$tmp3 -> $result @ string_indexof_char" %} ++ ++ ins_encode %{ ++ __ string_indexof_char($str1$$Register, $cnt1$$Register, $ch$$Register, ++ $result$$Register, $tmp1$$Register, $tmp2$$Register, ++ $tmp3$$Register); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct stringL_indexof_char(a4_RegP str1, mA5RegI cnt1, mA6RegI ch, mRegI result, mRegL tmp1, mRegL tmp2, mRegL tmp3) %{ ++ predicate(((StrIndexOfCharNode*)n)->encoding() == StrIntrinsicNode::L); ++ match(Set result (StrIndexOfChar (Binary str1 cnt1) ch)); ++ effect(USE_KILL str1, USE_KILL cnt1, USE_KILL ch, TEMP_DEF result, TEMP tmp1, TEMP tmp2, TEMP tmp3); ++ ++ format %{ "StringLatin1 IndexOf char[] $str1, len:$cnt1, char:$ch, res:$result, tmp1:$tmp1, tmp2:$tmp2, tmp3:$tmp3 -> $result @ stringL_indexof_char" %} ++ ++ ins_encode %{ ++ __ stringL_indexof_char($str1$$Register, $cnt1$$Register, $ch$$Register, ++ $result$$Register, $tmp1$$Register, $tmp2$$Register, ++ $tmp3$$Register); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct count_positives(mRegP src, mRegI len, mRegI result, ++ mRegL tmp1, mRegL tmp2) %{ ++ match(Set result (CountPositives src len)); ++ effect(TEMP_DEF result, TEMP tmp1, TEMP tmp2); ++ ++ format %{ "count positives byte[] src:$src, len:$len -> $result TEMP($tmp1, $tmp2) @ count_positives" %} ++ ++ ins_encode %{ ++ __ count_positives($src$$Register, $len$$Register, $result$$Register, ++ $tmp1$$Register, $tmp2$$Register); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// fast char[] to byte[] compression ++instruct string_compress(a2_RegP src, mRegP dst, mRegI len, mRegI result, ++ mRegL tmp1, mRegL tmp2, mRegL tmp3, ++ regF vtemp1, regF vtemp2, regF vtemp3, regF vtemp4) ++%{ ++ predicate(UseLSX); ++ match(Set result (StrCompressedCopy src (Binary dst len))); ++ effect(TEMP_DEF result, TEMP tmp1, TEMP tmp2, TEMP tmp3, ++ TEMP vtemp1, TEMP vtemp2, TEMP vtemp3, TEMP vtemp4, USE_KILL src); ++ ++ format %{ "String Compress $src,$dst -> $result @ string_compress " %} ++ ++ ins_encode %{ ++ __ char_array_compress($src$$Register, $dst$$Register, $len$$Register, ++ $result$$Register, $tmp1$$Register, ++ $tmp2$$Register, $tmp3$$Register, ++ $vtemp1$$FloatRegister, $vtemp2$$FloatRegister, ++ $vtemp3$$FloatRegister, $vtemp4$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// byte[] to char[] inflation ++instruct string_inflate(Universe dummy, a4_RegP src, a5_RegP dst, mA6RegI len, ++ mRegL tmp1, mRegL tmp2, regF vtemp1, regF vtemp2) ++%{ ++ predicate(UseLSX); ++ match(Set dummy (StrInflatedCopy src (Binary dst len))); ++ effect(TEMP tmp1, TEMP tmp2, TEMP vtemp1, TEMP vtemp2, ++ USE_KILL src, USE_KILL dst, USE_KILL len); ++ ++ format %{ "String Inflate $src, $dst, len:$len " ++ "TEMP($tmp1, $tmp2, $vtemp1, $vtemp2) @ string_inflate " %} ++ ++ ins_encode %{ ++ __ byte_array_inflate($src$$Register, 
$dst$$Register, $len$$Register, ++ $tmp1$$Register, $tmp2$$Register, ++ $vtemp1$$FloatRegister, $vtemp2$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// intrinsic optimization ++instruct string_equals(a4_RegP str1, a5_RegP str2, mA6RegI cnt, mRegI result, mRegL tmp1, mRegL tmp2) %{ ++ match(Set result (StrEquals (Binary str1 str2) cnt)); ++ effect(USE_KILL str1, USE_KILL str2, USE_KILL cnt, TEMP_DEF result, TEMP tmp1, TEMP tmp2); ++ ++ format %{ "String Equal $str1, $str2, len:$cnt -> $result TEMP($tmp1, $tmp2) @ string_equals" %} ++ ins_encode %{ ++ __ arrays_equals($str1$$Register, $str2$$Register, ++ $cnt$$Register, $tmp1$$Register, $tmp2$$Register, $result$$Register, ++ false/* is_char */, false/* is_array */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct array_equalsB(a4_RegP ary1, a5_RegP ary2, mRegI result, mRegL tmp0, mRegL tmp1, mRegL tmp2) %{ ++ predicate(((AryEqNode*)n)->encoding() == StrIntrinsicNode::LL); ++ match(Set result (AryEq ary1 ary2)); ++ effect(USE_KILL ary1, USE_KILL ary2, TEMP_DEF result, TEMP tmp0, TEMP tmp1, TEMP tmp2); ++ ++ format %{ "Array Equals byte[] $ary1,$ary2 -> $result TEMP($tmp0, $tmp1, $tmp2) @ array_equalsB" %} ++ ins_encode %{ ++ __ arrays_equals($ary1$$Register, $ary2$$Register, ++ $tmp0$$Register, $tmp1$$Register, $tmp2$$Register, $result$$Register, ++ false/* is_char */, true/* is_array */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct array_equalsC(a4_RegP ary1, a5_RegP ary2, mRegI result, mRegL tmp0, mRegL tmp1, mRegL tmp2) %{ ++ predicate(((AryEqNode*)n)->encoding() == StrIntrinsicNode::UU); ++ match(Set result (AryEq ary1 ary2)); ++ effect(USE_KILL ary1, USE_KILL ary2, TEMP_DEF result, TEMP tmp0, TEMP tmp1, TEMP tmp2); ++ ++ format %{ "Array Equals char[] $ary1,$ary2 -> $result TEMP($tmp0, $tmp1, $tmp2) @ array_equalsC" %} ++ ins_encode %{ ++ __ arrays_equals($ary1$$Register, $ary2$$Register, ++ $tmp0$$Register, $tmp1$$Register, $tmp2$$Register, $result$$Register, ++ true/* is_char */, true/* is_array */); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// encode char[] to byte[] in ISO_8859_1 ++instruct encode_iso_array(a2_RegP src, mRegP dst, mRegI len, mRegI result, ++ mRegL tmp1, mRegL tmp2, mRegL tmp3, ++ regF vtemp1, regF vtemp2, regF vtemp3, regF vtemp4) ++%{ ++ predicate(UseLSX && !((EncodeISOArrayNode*)n)->is_ascii()); ++ match(Set result (EncodeISOArray src (Binary dst len))); ++ effect(TEMP_DEF result, TEMP tmp1, TEMP tmp2, TEMP tmp3, ++ TEMP vtemp1, TEMP vtemp2, TEMP vtemp3, TEMP vtemp4, USE_KILL src); ++ ++ format %{ "Encode ISO array $src,$dst,$len -> $result @ encode_iso_array" %} ++ ++ ins_encode %{ ++ __ encode_iso_array($src$$Register, $dst$$Register, $len$$Register, ++ $result$$Register, $tmp1$$Register, ++ $tmp2$$Register, $tmp3$$Register, false, ++ $vtemp1$$FloatRegister, $vtemp2$$FloatRegister, ++ $vtemp3$$FloatRegister, $vtemp4$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// encode char[] to byte[] in ASCII ++instruct encode_ascii_array(a2_RegP src, mRegP dst, mRegI len, mRegI result, ++ mRegL tmp1, mRegL tmp2, mRegL tmp3, ++ regF vtemp1, regF vtemp2, regF vtemp3, regF vtemp4) ++%{ ++ predicate(UseLSX && ((EncodeISOArrayNode*)n)->is_ascii()); ++ match(Set result (EncodeISOArray src (Binary dst len))); ++ effect(TEMP_DEF result, TEMP tmp1, TEMP tmp2, TEMP tmp3, ++ TEMP vtemp1, TEMP vtemp2, TEMP vtemp3, TEMP vtemp4, USE_KILL src); ++ ++ format %{ "Encode ASCII array $src,$dst,$len -> $result @ encode_ascii_array" %} ++ ++ ins_encode %{ ++ __ encode_iso_array($src$$Register, 
$dst$$Register, $len$$Register, ++ $result$$Register, $tmp1$$Register, ++ $tmp2$$Register, $tmp3$$Register, true, ++ $vtemp1$$FloatRegister, $vtemp2$$FloatRegister, ++ $vtemp3$$FloatRegister, $vtemp4$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++//----------Arithmetic Instructions------------------------------------------- ++//----------Addition Instructions--------------------------------------------- ++instruct addI_Reg_Reg(mRegI dst, mRegIorL2I src1, mRegIorL2I src2) %{ ++ match(Set dst (AddI src1 src2)); ++ ++ format %{ "add $dst, $src1, $src2 #@addI_Reg_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ __ add_w(dst, src1, src2); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct addI_Reg_imm(mRegI dst, mRegIorL2I src1, immI12 src2) %{ ++ match(Set dst (AddI src1 src2)); ++ match(Set dst (AddI (CastII src1) src2)); ++ ++ format %{ "add $dst, $src1, $src2 #@addI_Reg_imm12" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ int imm = $src2$$constant; ++ ++ __ addi_w(dst, src1, imm); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct addI_salI_Reg_Reg_immI_1_4(mRegI dst, mRegI src1, mRegI src2, immI_1_4 shift) %{ ++ match(Set dst (AddI src1 (LShiftI src2 shift))); ++ ++ format %{ "alsl $dst, $src1, $src2, $shift #@addI_salI_Reg_Reg_immI_1_4" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int sh = $shift$$constant; ++ __ alsl_w(dst, src2, src1, sh - 1); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct addP_reg_reg(mRegP dst, mRegP src1, mRegLorI2L src2) %{ ++ match(Set dst (AddP src1 src2)); ++ ++ format %{ "ADD $dst, $src1, $src2 #@addP_reg_reg" %} ++ ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ __ add_d(dst, src1, src2); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct addP_reg_imm12(mRegP dst, mRegP src1, immL12 src2) %{ ++ match(Set dst (AddP src1 src2)); ++ ++ format %{ "ADD $dst, $src1, $src2 #@addP_reg_imm12" %} ++ ins_encode %{ ++ Register src1 = $src1$$Register; ++ long src2 = $src2$$constant; ++ Register dst = $dst$$Register; ++ ++ __ addi_d(dst, src1, src2); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct addP_salL_Reg_RegI2L_immI_1_4(mRegP dst, mRegP src1, mRegI src2, immI_1_4 shift) %{ ++ match(Set dst (AddP src1 (LShiftL (ConvI2L src2) shift))); ++ ++ format %{ "alsl $dst, $src1, $src2, $shift #@addP_salL_Reg_RegI2L_immI_1_4" %} ++ ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ int sh = $shift$$constant; ++ __ alsl_d(dst, src2, src1, sh - 1); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Add Long Register with Register ++instruct addL_Reg_Reg(mRegL dst, mRegLorI2L src1, mRegLorI2L src2) %{ ++ match(Set dst (AddL src1 src2)); ++ ins_cost(200); ++ format %{ "ADD $dst, $src1, $src2 #@addL_Reg_Reg\t" %} ++ ++ ins_encode %{ ++ Register dst_reg = as_Register($dst$$reg); ++ Register src1_reg = as_Register($src1$$reg); ++ Register src2_reg = as_Register($src2$$reg); ++ ++ __ add_d(dst_reg, src1_reg, src2_reg); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct addL_Reg_imm(mRegL dst, mRegLorI2L src1, immL12 src2) ++%{ ++ match(Set dst (AddL src1 src2)); ++ ++ format %{ "ADD $dst, $src1, $src2 #@addL_Reg_imm " %} ++ ins_encode %{ ++ Register dst_reg = 
as_Register($dst$$reg); ++ Register src1_reg = as_Register($src1$$reg); ++ int src2_imm = $src2$$constant; ++ ++ __ addi_d(dst_reg, src1_reg, src2_imm); ++ %} ++ ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++//----------Abs Instructions------------------------------------------- ++ ++// Integer Absolute Instructions ++instruct absI_rReg(mRegI dst, mRegI src) ++%{ ++ match(Set dst (AbsI src)); ++ effect(TEMP dst); ++ format %{ "AbsI $dst, $src" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ ++ __ srai_w(AT, src, 31); ++ __ xorr(dst, src, AT); ++ __ sub_w(dst, dst, AT); ++ %} ++ ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++// Long Absolute Instructions ++instruct absL_rReg(mRegL dst, mRegLorI2L src) ++%{ ++ match(Set dst (AbsL src)); ++ effect(TEMP dst); ++ format %{ "AbsL $dst, $src" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ ++ __ srai_d(AT, src, 63); ++ __ xorr(dst, src, AT); ++ __ sub_d(dst, dst, AT); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++//----------Subtraction Instructions------------------------------------------- ++// Integer Subtraction Instructions ++instruct subI_Reg_Reg(mRegI dst, mRegIorL2I src1, mRegIorL2I src2) %{ ++ match(Set dst (SubI src1 src2)); ++ ins_cost(100); ++ ++ format %{ "sub $dst, $src1, $src2 #@subI_Reg_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ __ sub_w(dst, src1, src2); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct subI_Reg_immI_M2047_2048(mRegI dst, mRegIorL2I src1, immI_M2047_2048 src2) %{ ++ match(Set dst (SubI src1 src2)); ++ ins_cost(80); ++ ++ format %{ "sub $dst, $src1, $src2 #@subI_Reg_immI_M2047_2048" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ __ addi_w(dst, src1, -1 * $src2$$constant); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct negI_Reg(mRegI dst, immI_0 zero, mRegIorL2I src) %{ ++ match(Set dst (SubI zero src)); ++ ins_cost(80); ++ ++ format %{ "neg $dst, $src #@negI_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ __ sub_w(dst, R0, src); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct negL_Reg(mRegL dst, immL_0 zero, mRegLorI2L src) %{ ++ match(Set dst (SubL zero src)); ++ ins_cost(80); ++ ++ format %{ "neg $dst, $src #@negL_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ __ sub_d(dst, R0, src); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct subL_Reg_immL_M2047_2048(mRegL dst, mRegL src1, immL_M2047_2048 src2) %{ ++ match(Set dst (SubL src1 src2)); ++ ins_cost(80); ++ ++ format %{ "sub $dst, $src1, $src2 #@subL_Reg_immL_M2047_2048" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ __ addi_d(dst, src1, -1 * $src2$$constant); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++// Subtract Long Register with Register. 
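++// (The immediate-subtract forms above are encoded as addi with the negated
++// constant; the M2047_2048 immediate ranges are exactly those whose negation
++// still fits addi's signed 12-bit field, -2048..2047.)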
++instruct subL_Reg_Reg(mRegL dst, mRegLorI2L src1, mRegLorI2L src2) %{ ++ match(Set dst (SubL src1 src2)); ++ ins_cost(100); ++ format %{ "SubL $dst, $src1, $src2 @ subL_Reg_Reg" %} ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ Register src1 = as_Register($src1$$reg); ++ Register src2 = as_Register($src2$$reg); ++ ++ __ sub_d(dst, src1, src2); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Integer MOD with Register ++instruct modI_Reg_Reg(mRegI dst, mRegIorL2I src1, mRegIorL2I src2) %{ ++ match(Set dst (ModI src1 src2)); ++ ins_cost(300); ++ format %{ "modi $dst, $src1, $src2 @ modI_Reg_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ ++ __ mod_w(dst, src1, src2); ++ %} ++ ++ ins_pipe( ialu_div ); ++%} ++ ++instruct umodI_Reg_Reg(mRegI dst, mRegIorL2I src1, mRegIorL2I src2) %{ ++ match(Set dst (UModI src1 src2)); ++ format %{ "mod.wu $dst, $src1, $src2 @ umodI_Reg_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ ++ __ mod_wu(dst, src1, src2); ++ %} ++ ++ //ins_pipe( ialu_mod ); ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct umodL_Reg_Reg(mRegL dst, mRegLorI2L src1, mRegLorI2L src2) %{ ++ match(Set dst (UModL src1 src2)); ++ format %{ "mod.du $dst, $src1, $src2 @ umodL_Reg_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ ++ __ mod_du(dst, src1, src2); ++ %} ++ ++ //ins_pipe( ialu_mod ); ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct modL_reg_reg(mRegL dst, mRegLorI2L src1, mRegLorI2L src2) %{ ++ match(Set dst (ModL src1 src2)); ++ format %{ "modL $dst, $src1, $src2 @modL_reg_reg" %} ++ ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ Register op1 = as_Register($src1$$reg); ++ Register op2 = as_Register($src2$$reg); ++ ++ __ mod_d(dst, op1, op2); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct mulI_Reg_Reg(mRegI dst, mRegI src1, mRegI src2) %{ ++ match(Set dst (MulI src1 src2)); ++ ++ ins_cost(300); ++ format %{ "mul $dst, $src1, $src2 @ mulI_Reg_Reg" %} ++ ins_encode %{ ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ ++ __ mul_w(dst, src1, src2); ++ %} ++ ins_pipe( ialu_mult ); ++%} ++ ++instruct divI_Reg_Reg(mRegI dst, mRegI src1, mRegI src2) %{ ++ match(Set dst (DivI src1 src2)); ++ ++ ins_cost(300); ++ format %{ "div $dst, $src1, $src2 @ divI_Reg_Reg" %} ++ ins_encode %{ ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ ++ __ div_w(dst, src1, src2); ++ ++ %} ++ ins_pipe( ialu_div ); ++%} ++ ++// =================== DivMod nodes ========================== ++// ++// Since we already have the `div` result here, ++// combining the `mul` and the `sub` to calculate ++// the remainder is more efficient than ++// applying the `mod` instruction directly. 
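++//
++// In other words, each DivMod pair below emits
++//   div = src1 / src2
++//   mod = src1 - div * src2
++// using the truncated-division identity (for example 7 / -2 = -3 and
++// 7 - (-3 * -2) = 1), so a single division feeds both results.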
++// ++instruct divmodI_Reg_Reg(mT1RegI div, mT2RegI mod, mRegI src1, mRegI src2) %{ ++ match(DivModI src1 src2); ++ effect(TEMP_DEF div, TEMP_DEF mod); ++ ++ format %{ "divmodI $div..$mod, $src1, $src2 @ divmodI_Reg_Reg" %} ++ ++ ins_encode %{ ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ Register div = $div$$Register; ++ Register mod = $mod$$Register; ++ ++ __ div_w(div, src1, src2); ++ __ mul_w(mod, div, src2); ++ __ sub_w(mod, src1, mod); ++ %} ++ ++ ins_pipe( ialu_div ); ++%} ++ ++instruct udivmodI_Reg_Reg(mT1RegI div, mT2RegI mod, mRegI src1, mRegI src2) %{ ++ match(UDivModI src1 src2); ++ effect(TEMP_DEF div, TEMP_DEF mod); ++ ++ format %{ "udivmodI $div..$mod, $src1, $src2 @ udivmodI_Reg_Reg" %} ++ ++ ins_encode %{ ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ Register div = $div$$Register; ++ Register mod = $mod$$Register; ++ ++ __ div_wu(div, src1, src2); ++ __ mul_w(mod, div, src2); ++ __ sub_w(mod, src1, mod); ++ %} ++ ++ ins_pipe( ialu_div ); ++%} ++ ++instruct divmodL_Reg_Reg(t1RegL div, t2RegL mod, mRegL src1, mRegL src2) %{ ++ match(DivModL src1 src2); ++ effect(TEMP_DEF div, TEMP_DEF mod); ++ ++ format %{ "divmodL $div..$mod, $src1, $src2 @ divmodL_Reg_Reg" %} ++ ++ ins_encode %{ ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ Register div = $div$$Register; ++ Register mod = $mod$$Register; ++ ++ __ div_d(div, src1, src2); ++ __ mul_d(mod, div, src2); ++ __ sub_d(mod, src1, mod); ++ %} ++ ++ ins_pipe( ialu_div ); ++%} ++ ++instruct udivmodL_Reg_Reg(t1RegL div, t2RegL mod, mRegL src1, mRegL src2) %{ ++ match(UDivModL src1 src2); ++ effect(TEMP_DEF div, TEMP_DEF mod); ++ ++ format %{ "udivmodL $div..$mod, $src1, $src2 @ udivmodL_Reg_Reg" %} ++ ++ ins_encode %{ ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ Register div = $div$$Register; ++ Register mod = $mod$$Register; ++ ++ __ div_du(div, src1, src2); ++ __ mul_d(mod, div, src2); ++ __ sub_d(mod, src1, mod); ++ %} ++ ++ ins_pipe( ialu_div ); ++%} ++ ++// =================== End of DivMod nodes ========================== ++ ++instruct udivI_Reg_Reg(mRegI dst, mRegI src1, mRegI src2) %{ ++ match(Set dst (UDivI src1 src2)); ++ ++ format %{ "udivI $dst, $src1, $src2 @ udivI_Reg_Reg" %} ++ ins_encode %{ ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ ++ __ div_wu(dst, src1, src2); ++ ++ %} ++ ins_pipe( ialu_div ); ++%} ++ ++instruct divF_Reg_Reg(regF dst, regF src1, regF src2) %{ ++ match(Set dst (DivF src1 src2)); ++ ++ ins_cost(300); ++ format %{ "divF $dst, $src1, $src2 @ divF_Reg_Reg" %} ++ ins_encode %{ ++ FloatRegister src1 = $src1$$FloatRegister; ++ FloatRegister src2 = $src2$$FloatRegister; ++ FloatRegister dst = $dst$$FloatRegister; ++ ++ __ fdiv_s(dst, src1, src2); ++ %} ++ ins_pipe( fpu_div ); ++%} ++ ++instruct divD_Reg_Reg(regD dst, regD src1, regD src2) %{ ++ match(Set dst (DivD src1 src2)); ++ ++ ins_cost(300); ++ format %{ "divD $dst, $src1, $src2 @ divD_Reg_Reg" %} ++ ins_encode %{ ++ FloatRegister src1 = $src1$$FloatRegister; ++ FloatRegister src2 = $src2$$FloatRegister; ++ FloatRegister dst = $dst$$FloatRegister; ++ ++ __ fdiv_d(dst, src1, src2); ++ %} ++ ins_pipe( fpu_div ); ++%} ++ ++instruct mulL_reg_reg(mRegL dst, mRegLorI2L src1, mRegLorI2L src2) %{ ++ match(Set dst (MulL src1 src2)); ++ format %{ "mulL $dst, $src1, $src2 @mulL_reg_reg" %} ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ Register op1 = 
as_Register($src1$$reg); ++ Register op2 = as_Register($src2$$reg); ++ ++ __ mul_d(dst, op1, op2); ++ %} ++ ins_pipe( fpu_arith ); ++%} ++ ++instruct mulHiL_reg_reg(mRegL dst, mRegL src1, mRegL src2) %{ ++ match(Set dst (MulHiL src1 src2)); ++ format %{ "mulHiL $dst, $src1, $src2 @mulL_reg_reg" %} ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ Register op1 = as_Register($src1$$reg); ++ Register op2 = as_Register($src2$$reg); ++ ++ __ mulh_d(dst, op1, op2); ++ %} ++ ins_pipe( fpu_arith); ++%} ++ ++instruct umulHiL_reg_reg(mRegL dst, mRegL src1, mRegL src2) %{ ++ match(Set dst (UMulHiL src1 src2)); ++ format %{ "mulh.du $dst, $src1, $src2 @umulHiL_reg_reg" %} ++ ++ ins_encode %{ ++ __ mulh_du($dst$$Register, $src1$$Register, $src2$$Register); ++ %} ++ ins_pipe( ialu_mult ); ++%} ++ ++instruct divL_reg_reg(mRegL dst, mRegL src1, mRegL src2) %{ ++ match(Set dst (DivL src1 src2)); ++ format %{ "divL $dst, $src1, $src2 @divL_reg_reg" %} ++ ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ Register op1 = as_Register($src1$$reg); ++ Register op2 = as_Register($src2$$reg); ++ ++ __ div_d(dst, op1, op2); ++ %} ++ ins_pipe( ialu_div ); ++%} ++ ++instruct udivL_reg_reg(mRegL dst, mRegLorI2L src1, mRegLorI2L src2) %{ ++ match(Set dst (UDivL src1 src2)); ++ format %{ "udivL $dst, $src1, $src2 @udivL_reg_reg" %} ++ ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ Register op1 = as_Register($src1$$reg); ++ Register op2 = as_Register($src2$$reg); ++ ++ __ div_du(dst, op1, op2); ++ %} ++ ins_pipe( ialu_div ); ++%} ++ ++instruct addF_reg_reg(regF dst, regF src1, regF src2) %{ ++ match(Set dst (AddF src1 src2)); ++ format %{ "AddF $dst, $src1, $src2 @addF_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = as_FloatRegister($src1$$reg); ++ FloatRegister src2 = as_FloatRegister($src2$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fadd_s(dst, src1, src2); ++ %} ++ ins_pipe( fpu_arith); ++%} ++ ++instruct subF_reg_reg(regF dst, regF src1, regF src2) %{ ++ match(Set dst (SubF src1 src2)); ++ format %{ "SubF $dst, $src1, $src2 @subF_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = as_FloatRegister($src1$$reg); ++ FloatRegister src2 = as_FloatRegister($src2$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fsub_s(dst, src1, src2); ++ %} ++ ins_pipe( fpu_arith ); ++%} ++instruct addD_reg_reg(regD dst, regD src1, regD src2) %{ ++ match(Set dst (AddD src1 src2)); ++ format %{ "AddD $dst, $src1, $src2 @addD_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = as_FloatRegister($src1$$reg); ++ FloatRegister src2 = as_FloatRegister($src2$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fadd_d(dst, src1, src2); ++ %} ++ ins_pipe( fpu_arith ); ++%} ++ ++instruct subD_reg_reg(regD dst, regD src1, regD src2) %{ ++ match(Set dst (SubD src1 src2)); ++ format %{ "SubD $dst, $src1, $src2 @subD_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = as_FloatRegister($src1$$reg); ++ FloatRegister src2 = as_FloatRegister($src2$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fsub_d(dst, src1, src2); ++ %} ++ ins_pipe( fpu_arith ); ++%} ++ ++instruct negF_reg(regF dst, regF src) %{ ++ match(Set dst (NegF src)); ++ format %{ "negF $dst, $src @negF_reg" %} ++ ins_encode %{ ++ FloatRegister src = as_FloatRegister($src$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fneg_s(dst, src); ++ %} ++ ins_pipe( fpu_absnegmov ); ++%} ++ ++instruct negD_reg(regD dst, regD src) %{ ++ match(Set dst (NegD src)); ++ 
format %{ "negD $dst, $src @negD_reg" %} ++ ins_encode %{ ++ FloatRegister src = as_FloatRegister($src$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fneg_d(dst, src); ++ %} ++ ins_pipe( fpu_absnegmov ); ++%} ++ ++ ++instruct mulF_reg_reg(regF dst, regF src1, regF src2) %{ ++ match(Set dst (MulF src1 src2)); ++ format %{ "MULF $dst, $src1, $src2 @mulF_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = $src1$$FloatRegister; ++ FloatRegister src2 = $src2$$FloatRegister; ++ FloatRegister dst = $dst$$FloatRegister; ++ ++ __ fmul_s(dst, src1, src2); ++ %} ++ ins_pipe( fpu_arith ); ++%} ++ ++// Mul two double precision floating point number ++instruct mulD_reg_reg(regD dst, regD src1, regD src2) %{ ++ match(Set dst (MulD src1 src2)); ++ format %{ "MULD $dst, $src1, $src2 @mulD_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = $src1$$FloatRegister; ++ FloatRegister src2 = $src2$$FloatRegister; ++ FloatRegister dst = $dst$$FloatRegister; ++ ++ __ fmul_d(dst, src1, src2); ++ %} ++ ins_pipe( fpu_arith ); ++%} ++ ++instruct absF_reg(regF dst, regF src) %{ ++ match(Set dst (AbsF src)); ++ ins_cost(100); ++ format %{ "absF $dst, $src @absF_reg" %} ++ ins_encode %{ ++ FloatRegister src = as_FloatRegister($src$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fabs_s(dst, src); ++ %} ++ ins_pipe( fpu_absnegmov ); ++%} ++ ++ ++// intrinsics for math_native. ++// AbsD SqrtD CosD SinD TanD LogD Log10D ++ ++instruct absD_reg(regD dst, regD src) %{ ++ match(Set dst (AbsD src)); ++ ins_cost(100); ++ format %{ "absD $dst, $src @absD_reg" %} ++ ins_encode %{ ++ FloatRegister src = as_FloatRegister($src$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fabs_d(dst, src); ++ %} ++ ins_pipe( fpu_absnegmov ); ++%} ++ ++instruct sqrtD_reg(regD dst, regD src) %{ ++ match(Set dst (SqrtD src)); ++ ins_cost(100); ++ format %{ "SqrtD $dst, $src @sqrtD_reg" %} ++ ins_encode %{ ++ FloatRegister src = as_FloatRegister($src$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fsqrt_d(dst, src); ++ %} ++ ins_pipe( fpu_div ); ++%} ++ ++instruct sqrtF_reg(regF dst, regF src) %{ ++ match(Set dst (ConvD2F (SqrtD (ConvF2D src)))); ++ ins_cost(100); ++ format %{ "SqrtF $dst, $src @sqrtF_reg" %} ++ ins_encode %{ ++ FloatRegister src = as_FloatRegister($src$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fsqrt_s(dst, src); ++ %} ++ ins_pipe( fpu_div ); ++%} ++ ++// src1 * src2 + src3 ++instruct maddF_reg_reg(regF dst, regF src1, regF src2, regF src3) %{ ++ predicate(UseFMA); ++ match(Set dst (FmaF src3 (Binary src1 src2))); ++ ++ format %{ "fmadd_s $dst, $src1, $src2, $src3" %} ++ ++ ins_encode %{ ++ __ fmadd_s(as_FloatRegister($dst$$reg), as_FloatRegister($src1$$reg), ++ as_FloatRegister($src2$$reg), as_FloatRegister($src3$$reg)); ++ %} ++ ++ ins_pipe( fpu_arith3 ); ++%} ++ ++// src1 * src2 + src3 ++instruct maddD_reg_reg(regD dst, regD src1, regD src2, regD src3) %{ ++ predicate(UseFMA); ++ match(Set dst (FmaD src3 (Binary src1 src2))); ++ ++ format %{ "fmadd_d $dst, $src1, $src2, $src3" %} ++ ++ ins_encode %{ ++ __ fmadd_d(as_FloatRegister($dst$$reg), as_FloatRegister($src1$$reg), ++ as_FloatRegister($src2$$reg), as_FloatRegister($src3$$reg)); ++ %} ++ ++ ins_pipe( fpu_arith3 ); ++%} ++ ++// src1 * src2 - src3 ++instruct msubF_reg_reg(regF dst, regF src1, regF src2, regF src3, immF_0 zero) %{ ++ predicate(UseFMA); ++ match(Set dst (FmaF (NegF src3) (Binary src1 src2))); ++ ++ format %{ "fmsub_s $dst, $src1, $src2, $src3" %} ++ ++ 
ins_encode %{ ++ __ fmsub_s(as_FloatRegister($dst$$reg), as_FloatRegister($src1$$reg), ++ as_FloatRegister($src2$$reg), as_FloatRegister($src3$$reg)); ++ %} ++ ++ ins_pipe( fpu_arith3 ); ++%} ++ ++// src1 * src2 - src3 ++instruct msubD_reg_reg(regD dst, regD src1, regD src2, regD src3, immD_0 zero) %{ ++ predicate(UseFMA); ++ match(Set dst (FmaD (NegD src3) (Binary src1 src2))); ++ ++ format %{ "fmsub_d $dst, $src1, $src2, $src3" %} ++ ++ ins_encode %{ ++ __ fmsub_d(as_FloatRegister($dst$$reg), as_FloatRegister($src1$$reg), ++ as_FloatRegister($src2$$reg), as_FloatRegister($src3$$reg)); ++ %} ++ ++ ins_pipe( fpu_arith3 ); ++%} ++ ++// -src1 * src2 - src3 ++instruct mnaddF_reg_reg(regF dst, regF src1, regF src2, regF src3, immF_0 zero) %{ ++ predicate(UseFMA); ++ match(Set dst (FmaF (NegF src3) (Binary (NegF src1) src2))); ++ match(Set dst (FmaF (NegF src3) (Binary src1 (NegF src2)))); ++ ++ format %{ "fnmadds $dst, $src1, $src2, $src3" %} ++ ++ ins_encode %{ ++ __ fnmadd_s(as_FloatRegister($dst$$reg), as_FloatRegister($src1$$reg), ++ as_FloatRegister($src2$$reg), as_FloatRegister($src3$$reg)); ++ %} ++ ++ ins_pipe( fpu_arith3 ); ++%} ++ ++// -src1 * src2 - src3 ++instruct mnaddD_reg_reg(regD dst, regD src1, regD src2, regD src3, immD_0 zero) %{ ++ predicate(UseFMA); ++ match(Set dst (FmaD (NegD src3) (Binary (NegD src1) src2))); ++ match(Set dst (FmaD (NegD src3) (Binary src1 (NegD src2)))); ++ ++ format %{ "fnmaddd $dst, $src1, $src2, $src3" %} ++ ++ ins_encode %{ ++ __ fnmadd_d(as_FloatRegister($dst$$reg), as_FloatRegister($src1$$reg), ++ as_FloatRegister($src2$$reg), as_FloatRegister($src3$$reg)); ++ %} ++ ++ ins_pipe( fpu_arith3 ); ++%} ++ ++// -src1 * src2 + src3 ++instruct mnsubF_reg_reg(regF dst, regF src1, regF src2, regF src3) %{ ++ predicate(UseFMA); ++ match(Set dst (FmaF src3 (Binary (NegF src1) src2))); ++ match(Set dst (FmaF src3 (Binary src1 (NegF src2)))); ++ ++ format %{ "fnmsubs $dst, $src1, $src2, $src3" %} ++ ++ ins_encode %{ ++ __ fnmsub_s(as_FloatRegister($dst$$reg), as_FloatRegister($src1$$reg), ++ as_FloatRegister($src2$$reg), as_FloatRegister($src3$$reg)); ++ %} ++ ++ ins_pipe( fpu_arith3 ); ++%} ++ ++// -src1 * src2 + src3 ++instruct mnsubD_reg_reg(regD dst, regD src1, regD src2, regD src3) %{ ++ predicate(UseFMA); ++ match(Set dst (FmaD src3 (Binary (NegD src1) src2))); ++ match(Set dst (FmaD src3 (Binary src1 (NegD src2)))); ++ ++ format %{ "fnmsubd $dst, $src1, $src2, $src3" %} ++ ++ ins_encode %{ ++ __ fnmsub_d(as_FloatRegister($dst$$reg), as_FloatRegister($src1$$reg), ++ as_FloatRegister($src2$$reg), as_FloatRegister($src3$$reg)); ++ %} ++ ++ ins_pipe( fpu_arith3 ); ++%} ++ ++instruct copySignF_reg(regF dst, regF src1, regF src2) %{ ++ match(Set dst (CopySignF src1 src2)); ++ effect(TEMP_DEF dst, USE src1, USE src2); ++ ++ format %{ "fcopysign_s $dst $src1 $src2 @ copySignF_reg" %} ++ ++ ins_encode %{ ++ __ fcopysign_s($dst$$FloatRegister, ++ $src1$$FloatRegister, ++ $src2$$FloatRegister); ++ %} ++ ++ ins_pipe( fpu_arith ); ++%} ++ ++instruct copySignD_reg(regD dst, regD src1, regD src2, immD_0 zero) %{ ++ match(Set dst (CopySignD src1 (Binary src2 zero))); ++ effect(TEMP_DEF dst, USE src1, USE src2); ++ ++ format %{ "fcopysign_d $dst $src1 $src2 @ copySignD_reg" %} ++ ++ ins_encode %{ ++ __ fcopysign_d($dst$$FloatRegister, ++ $src1$$FloatRegister, ++ $src2$$FloatRegister); ++ %} ++ ++ ins_pipe( fpu_arith ); ++%} ++ ++instruct signumF_reg(regF dst, regF src, regF zero, regF one, regF tmp) %{ ++ match(Set dst (SignumF src (Binary zero one))); ++ 
effect(TEMP_DEF dst, TEMP tmp); ++ format %{ "signumF $dst, $src, $zero, $one\t# TEMP($tmp) @signumF_reg" %} ++ ins_encode %{ ++ __ fcmp_clt_s(FCC0, $zero$$FloatRegister, $src$$FloatRegister); ++ __ fsel($dst$$FloatRegister, $src$$FloatRegister, $one$$FloatRegister, FCC0); ++ __ fcmp_clt_s(FCC0, $src$$FloatRegister, $zero$$FloatRegister); ++ __ fneg_s($tmp$$FloatRegister, $one$$FloatRegister); ++ __ fsel($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister, FCC0); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct signumD_reg(regD dst, regD src, regD zero, regD one, regD tmp) %{ ++ match(Set dst (SignumD src (Binary zero one))); ++ effect(TEMP_DEF dst, TEMP tmp); ++ format %{ "signumF $dst, $src, $zero, $one\t# TEMP($tmp) @signumD_reg" %} ++ ins_encode %{ ++ __ fcmp_clt_d(FCC0, $zero$$FloatRegister, $src$$FloatRegister); ++ __ fsel($dst$$FloatRegister, $src$$FloatRegister, $one$$FloatRegister, FCC0); ++ __ fcmp_clt_d(FCC0, $src$$FloatRegister, $zero$$FloatRegister); ++ __ fneg_d($tmp$$FloatRegister, $one$$FloatRegister); ++ __ fsel($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister, FCC0); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++//----------------------------------Logical Instructions---------------------- ++//__________________________________Integer Logical Instructions------------- ++ ++//And Instuctions ++// And Register with Immediate ++instruct andI_Reg_imm_0_4095(mRegI dst, mRegI src1, immI_0_4095 src2) %{ ++ match(Set dst (AndI src1 src2)); ++ ins_cost(60); ++ ++ format %{ "and $dst, $src1, $src2 #@andI_Reg_imm_0_4095" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src1$$Register; ++ int val = $src2$$constant; ++ ++ __ andi(dst, src, val); ++ ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct andI_Reg_immI_nonneg_mask(mRegI dst, mRegI src1, immI_nonneg_mask mask) %{ ++ match(Set dst (AndI src1 mask)); ++ ins_cost(60); ++ ++ format %{ "and $dst, $src1, $mask #@andI_Reg_immI_nonneg_mask" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src1$$Register; ++ int val = $mask$$constant; ++ int size = Assembler::count_trailing_ones(val); ++ ++ __ bstrpick_w(dst, src, size-1, 0); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct andL_Reg_immL_nonneg_mask(mRegL dst, mRegL src1, immL_nonneg_mask mask) %{ ++ match(Set dst (AndL src1 mask)); ++ ins_cost(60); ++ ++ format %{ "and $dst, $src1, $mask #@andL_Reg_immL_nonneg_mask" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src1$$Register; ++ long val = $mask$$constant; ++ int size = Assembler::count_trailing_ones(val); ++ ++ __ bstrpick_d(dst, src, size-1, 0); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct xorI_Reg_imm_0_4095(mRegI dst, mRegI src1, immI_0_4095 src2) %{ ++ match(Set dst (XorI src1 src2)); ++ ins_cost(60); ++ ++ format %{ "xori $dst, $src1, $src2 #@xorI_Reg_imm_0_4095" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src1$$Register; ++ int val = $src2$$constant; ++ ++ __ xori(dst, src, val); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct xorI_Reg_immI_M1(mRegI dst, mRegIorL2I src1, immI_M1 M1) %{ ++ match(Set dst (XorI src1 M1)); ++ ins_cost(60); ++ ++ format %{ "xor $dst, $src1, $M1 #@xorI_Reg_immI_M1" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src1$$Register; ++ ++ __ orn(dst, R0, src); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct xorL_Reg_imm_0_4095(mRegL dst, mRegL src1, immL_0_4095 src2) %{ ++ match(Set dst (XorL src1 src2)); ++ ins_cost(60); ++ ++ 
format %{ "xori $dst, $src1, $src2 #@xorL_Reg_imm_0_4095" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src1$$Register; ++ int val = $src2$$constant; ++ ++ __ xori(dst, src, val); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct andI_Reg_Reg(mRegI dst, mRegI src1, mRegI src2) %{ ++ match(Set dst (AndI src1 src2)); ++ ++ format %{ "and $dst, $src1, $src2 #@andI_Reg_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ ++ __ andr(dst, src1, src2); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct andnI_Reg_nReg(mRegI dst, mRegI src1, mRegI src2, immI_M1 M1) %{ ++ match(Set dst (AndI src1 (XorI src2 M1))); ++ ++ format %{ "andn $dst, $src1, $src2 #@andnI_Reg_nReg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ ++ __ andn(dst, src1, src2); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct ornI_Reg_nReg(mRegI dst, mRegI src1, mRegI src2, immI_M1 M1) %{ ++ match(Set dst (OrI src1 (XorI src2 M1))); ++ ++ format %{ "orn $dst, $src1, $src2 #@ornI_Reg_nReg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ ++ __ orn(dst, src1, src2); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct ornI_nReg_Reg(mRegI dst, mRegI src1, mRegI src2, immI_M1 M1) %{ ++ match(Set dst (OrI (XorI src1 M1) src2)); ++ ++ format %{ "orn $dst, $src2, $src1 #@ornI_nReg_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ ++ __ orn(dst, src2, src1); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// And Long Register with Register ++instruct andL_Reg_Reg(mRegL dst, mRegL src1, mRegLorI2L src2) %{ ++ match(Set dst (AndL src1 src2)); ++ format %{ "AND $dst, $src1, $src2 @ andL_Reg_Reg\n\t" %} ++ ins_encode %{ ++ Register dst_reg = as_Register($dst$$reg); ++ Register src1_reg = as_Register($src1$$reg); ++ Register src2_reg = as_Register($src2$$reg); ++ ++ __ andr(dst_reg, src1_reg, src2_reg); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct andL_Reg_imm_0_4095(mRegL dst, mRegL src1, immL_0_4095 src2) %{ ++ match(Set dst (AndL src1 src2)); ++ ins_cost(60); ++ ++ format %{ "and $dst, $src1, $src2 #@andL_Reg_imm_0_4095" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src1$$Register; ++ long val = $src2$$constant; ++ ++ __ andi(dst, src, val); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct andL2I_Reg_imm_0_4095(mRegI dst, mRegL src1, immL_0_4095 src2) %{ ++ match(Set dst (ConvL2I (AndL src1 src2))); ++ ins_cost(60); ++ ++ format %{ "and $dst, $src1, $src2 #@andL2I_Reg_imm_0_4095" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src1$$Register; ++ long val = $src2$$constant; ++ ++ __ andi(dst, src, val); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct andI_Reg_immI_zeroins_mask(mRegI dst, immI_zeroins_mask mask) %{ ++ match(Set dst (AndI dst mask)); ++ ins_cost(60); ++ ++ format %{ "and $dst, $dst, $mask #@andI_Reg_immI_zeroins_mask" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ int val = $mask$$constant; ++ int msb = 31 - Assembler::count_leading_ones(val); ++ int lsb = Assembler::count_trailing_ones(val); ++ ++ __ bstrins_w(dst, R0, msb, lsb); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct andL_Reg_immL_zeroins_mask(mRegL dst, immL_zeroins_mask mask) %{ ++ match(Set dst (AndL dst mask)); ++ 
ins_cost(60); ++ ++ format %{ "and $dst, $dst, $mask #@andL_Reg_immL_zeroins_mask" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ long val = $mask$$constant; ++ int msb = 63 - Assembler::count_leading_ones(val); ++ int lsb = Assembler::count_trailing_ones(val); ++ ++ __ bstrins_d(dst, R0, msb, lsb); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++// Or Long Register with Register ++instruct orL_Reg_Reg(mRegL dst, mRegLorI2L src1, mRegLorI2L src2) %{ ++ match(Set dst (OrL src1 src2)); ++ format %{ "OR $dst, $src1, $src2 @ orL_Reg_Reg\t" %} ++ ins_encode %{ ++ Register dst_reg = $dst$$Register; ++ Register src1_reg = $src1$$Register; ++ Register src2_reg = $src2$$Register; ++ ++ __ orr(dst_reg, src1_reg, src2_reg); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct orL_Reg_P2XReg(mRegL dst, mRegP src1, mRegLorI2L src2) %{ ++ match(Set dst (OrL (CastP2X src1) src2)); ++ format %{ "OR $dst, $src1, $src2 @ orL_Reg_P2XReg\t" %} ++ ins_encode %{ ++ Register dst_reg = $dst$$Register; ++ Register src1_reg = $src1$$Register; ++ Register src2_reg = $src2$$Register; ++ ++ __ orr(dst_reg, src1_reg, src2_reg); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Xor Long Register with Register ++ ++instruct xorL_Reg_Reg(mRegL dst, mRegLorI2L src1, mRegLorI2L src2) %{ ++ match(Set dst (XorL src1 src2)); ++ format %{ "XOR $dst, $src1, $src2 @ xorL_Reg_Reg\t" %} ++ ins_encode %{ ++ Register dst_reg = as_Register($dst$$reg); ++ Register src1_reg = as_Register($src1$$reg); ++ Register src2_reg = as_Register($src2$$reg); ++ ++ __ xorr(dst_reg, src1_reg, src2_reg); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct xorL_Reg_P2XReg(mRegL dst, mRegP src1, mRegLorI2L src2) %{ ++ match(Set dst (XorL (CastP2X src1) src2)); ++ format %{ "XOR $dst, $src1, $src2 @ xorL_Reg_P2XReg\t" %} ++ ins_encode %{ ++ __ xorr($dst$$Register, $src1$$Register, $src2$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Shift Left by 5-bit immediate ++instruct salI_Reg_imm(mRegI dst, mRegIorL2I src, immIU5 shift) %{ ++ match(Set dst (LShiftI src shift)); ++ ++ format %{ "SHL $dst, $src, $shift #@salI_Reg_imm" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ int shamt = $shift$$constant; ++ ++ __ slli_w(dst, src, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct salI_Reg_imm_and_M65536(mRegI dst, mRegI src, immI_16 shift, immI_M65536 mask) %{ ++ match(Set dst (AndI (LShiftI src shift) mask)); ++ ++ format %{ "SHL $dst, $src, $shift #@salI_Reg_imm_and_M65536" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ ++ __ slli_w(dst, src, 16); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++// Shift Left by 16, followed by Arithmetic Shift Right by 16. ++// This idiom is used by the compiler for the i2s bytecode. ++instruct i2s(mRegI dst, mRegI src, immI_16 sixteen) ++%{ ++ match(Set dst (RShiftI (LShiftI src sixteen) sixteen)); ++ ++ format %{ "i2s $dst, $src\t# @i2s" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ ++ __ ext_w_h(dst, src); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Shift Left by 24, followed by Arithmetic Shift Right by 24. ++// This idiom is used by the compiler for the i2b bytecode. 
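++// Roughly, this is what javac/C2 produce for a narrowing (byte) cast, e.g.
++// (illustrative sketch only):
++//   static int toByte(int x) { return (byte) x; }  // becomes (x << 24) >> 24, matched to ext_w_b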
++instruct i2b(mRegI dst, mRegI src, immI_24 twentyfour) ++%{ ++ match(Set dst (RShiftI (LShiftI src twentyfour) twentyfour)); ++ ++ format %{ "i2b $dst, $src\t# @i2b" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ ++ __ ext_w_b(dst, src); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++ ++instruct salI_RegL2I_imm(mRegI dst, mRegL src, immIU5 shift) %{ ++ match(Set dst (LShiftI (ConvL2I src) shift)); ++ ++ format %{ "SHL $dst, $src, $shift #@salI_RegL2I_imm" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ int shamt = $shift$$constant; ++ ++ __ slli_w(dst, src, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++// Shift Left by 8-bit immediate ++instruct salI_Reg_Reg(mRegI dst, mRegIorL2I src, mRegI shift) %{ ++ match(Set dst (LShiftI src shift)); ++ ++ format %{ "SHL $dst, $src, $shift #@salI_Reg_Reg" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ Register shamt = $shift$$Register; ++ __ sll_w(dst, src, shamt); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++ ++// Shift Left Long 6-bit immI ++instruct salL_Reg_imm(mRegL dst, mRegLorI2L src, immIU6 shift) %{ ++ match(Set dst (LShiftL src shift)); ++ ins_cost(100); ++ format %{ "salL $dst, $src, $shift @ salL_Reg_imm" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ int shamt = $shift$$constant; ++ ++ __ slli_d(dst_reg, src_reg, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++// Shift Left Long ++instruct salL_Reg_Reg(mRegL dst, mRegLorI2L src, mRegI shift) %{ ++ match(Set dst (LShiftL src shift)); ++ ins_cost(100); ++ format %{ "salL $dst, $src, $shift @ salL_Reg_Reg" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ ++ __ sll_d(dst_reg, src_reg, $shift$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Shift Right Long 6-bit ++instruct sarL_Reg_imm(mRegL dst, mRegLorI2L src, immIU6 shift) %{ ++ match(Set dst (RShiftL src shift)); ++ ins_cost(100); ++ format %{ "sarL $dst, $src, $shift @ sarL_Reg_imm" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ int shamt = $shift$$constant; ++ ++ __ srai_d(dst_reg, src_reg, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct sarL2I_Reg_immI_32_63(mRegI dst, mRegLorI2L src, immI_32_63 shift) %{ ++ match(Set dst (ConvL2I (RShiftL src shift))); ++ ins_cost(100); ++ format %{ "sarL $dst, $src, $shift @ sarL2I_Reg_immI_32_63" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ int shamt = $shift$$constant; ++ ++ __ srai_d(dst_reg, src_reg, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++// Shift Right Long arithmetically ++instruct sarL_Reg_Reg(mRegL dst, mRegLorI2L src, mRegI shift) %{ ++ match(Set dst (RShiftL src shift)); ++ ins_cost(100); ++ format %{ "sarL $dst, $src, $shift @ sarL_Reg_Reg" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ ++ __ sra_d(dst_reg, src_reg, $shift$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Shift Right Long logically ++instruct slrL_Reg_Reg(mRegL dst, mRegL src, mRegI shift) %{ ++ match(Set dst (URShiftL src shift)); ++ ins_cost(100); ++ format %{ "slrL $dst, $src, $shift @ slrL_Reg_Reg" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = 
as_Register($dst$$reg); ++ ++ __ srl_d(dst_reg, src_reg, $shift$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct slrL_Reg_immI_0_31(mRegL dst, mRegLorI2L src, immI_0_31 shift) %{ ++ match(Set dst (URShiftL src shift)); ++ ins_cost(80); ++ format %{ "slrL $dst, $src, $shift @ slrL_Reg_immI_0_31" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ int shamt = $shift$$constant; ++ ++ __ srli_d(dst_reg, src_reg, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct slrL_Reg_immI_0_31_and_max_int(mRegI dst, mRegLorI2L src, immI_0_31 shift, immI_MaxI max_int) %{ ++ match(Set dst (AndI (ConvL2I (URShiftL src shift)) max_int)); ++ ins_cost(80); ++ format %{ "bstrpick_d $dst, $src, $shift+30, shift @ slrL_Reg_immI_0_31_and_max_int" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ int shamt = $shift$$constant; ++ ++ __ bstrpick_d(dst_reg, src_reg, shamt+30, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct slrL_P2XReg_immI_0_31(mRegL dst, mRegP src, immI_0_31 shift) %{ ++ match(Set dst (URShiftL (CastP2X src) shift)); ++ ins_cost(80); ++ format %{ "slrL $dst, $src, $shift @ slrL_P2XReg_immI_0_31" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ int shamt = $shift$$constant; ++ ++ __ srli_d(dst_reg, src_reg, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct slrL_Reg_immI_32_63(mRegL dst, mRegLorI2L src, immI_32_63 shift) %{ ++ match(Set dst (URShiftL src shift)); ++ ins_cost(80); ++ format %{ "slrL $dst, $src, $shift @ slrL_Reg_immI_32_63" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ int shamt = $shift$$constant; ++ ++ __ srli_d(dst_reg, src_reg, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct slrL_Reg_immI_convL2I(mRegI dst, mRegLorI2L src, immI_32_63 shift) %{ ++ match(Set dst (ConvL2I (URShiftL src shift))); ++ predicate(n->in(1)->in(2)->get_int() > 32); ++ ins_cost(80); ++ format %{ "slrL $dst, $src, $shift @ slrL_Reg_immI_convL2I" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ int shamt = $shift$$constant; ++ ++ __ srli_d(dst_reg, src_reg, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct slrL_P2XReg_immI_32_63(mRegL dst, mRegP src, immI_32_63 shift) %{ ++ match(Set dst (URShiftL (CastP2X src) shift)); ++ ins_cost(80); ++ format %{ "slrL $dst, $src, $shift @ slrL_P2XReg_immI_32_63" %} ++ ins_encode %{ ++ Register src_reg = as_Register($src$$reg); ++ Register dst_reg = as_Register($dst$$reg); ++ int shamt = $shift$$constant; ++ ++ __ srli_d(dst_reg, src_reg, shamt); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++// Xor Instructions ++// Xor Register with Register ++instruct xorI_Reg_Reg(mRegI dst, mRegI src1, mRegI src2) %{ ++ match(Set dst (XorI src1 src2)); ++ ++ format %{ "XOR $dst, $src1, $src2 #@xorI_Reg_Reg" %} ++ ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ __ xorr(dst, src1, src2); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Or Instructions ++instruct orI_Reg_imm(mRegI dst, mRegI src1, immI_0_4095 src2) %{ ++ match(Set dst (OrI src1 src2)); ++ ++ format %{ "OR $dst, $src1, $src2 #@orI_Reg_imm" %} ++ ins_encode %{ ++ __ ori($dst$$Register, $src1$$Register, $src2$$constant); ++ %} ++ ++ ins_pipe( 
ialu_reg_imm ); ++%} ++ ++// Or Register with Register ++instruct orI_Reg_Reg(mRegI dst, mRegI src1, mRegI src2) %{ ++ match(Set dst (OrI src1 src2)); ++ ++ format %{ "OR $dst, $src1, $src2 #@orI_Reg_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ __ orr(dst, src1, src2); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct rotI_shr_logical_Reg(mRegI dst, mRegI src, immI_0_31 rshift, immI_0_31 lshift, immI_1 one) %{ ++ match(Set dst (OrI (URShiftI src rshift) (LShiftI (AndI src one) lshift))); ++ predicate(32 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()))); ++ ++ format %{ "rotri_w $dst, $src, 1 ...\n\t" ++ "srli_w $dst, $dst, ($rshift-1) @ rotI_shr_logical_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int rshift = $rshift$$constant; ++ ++ __ rotri_w(dst, src, 1); ++ if (rshift - 1) { ++ __ srli_w(dst, dst, rshift - 1); ++ } ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct orI_Reg_castP2X(mRegL dst, mRegL src1, mRegP src2) %{ ++ match(Set dst (OrI src1 (CastP2X src2))); ++ ++ format %{ "OR $dst, $src1, $src2 #@orI_Reg_castP2X" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ __ orr(dst, src1, src2); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Logical Shift Right by 5-bit immediate ++instruct shr_logical_Reg_imm(mRegI dst, mRegI src, immIU5 shift) %{ ++ match(Set dst (URShiftI src shift)); ++ //effect(KILL cr); ++ ++ format %{ "SRLI_W $dst, $src, $shift #@shr_logical_Reg_imm" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ int shift = $shift$$constant; ++ ++ __ srli_w(dst, src, shift); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct shr_logical_Reg_imm_nonneg_mask(mRegI dst, mRegI src, immI_0_31 shift, immI_nonneg_mask mask) %{ ++ match(Set dst (AndI (URShiftI src shift) mask)); ++ ++ format %{ "bstrpick_w $dst, $src, $shift+one-bits($mask)-1, shift #@shr_logical_Reg_imm_nonneg_mask" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ int pos = $shift$$constant; ++ int val = $mask$$constant; ++ int size = Assembler::count_trailing_ones(val); ++ ++ __ bstrpick_w(dst, src, pos+size-1, pos); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct rolI_Reg_immI_0_31(mRegI dst, mRegI src, immI_0_31 lshift, immI_0_31 rshift) ++%{ ++ predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & 0x1f)); ++ match(Set dst (OrI (LShiftI src lshift) (URShiftI src rshift))); ++ ++ ins_cost(100); ++ format %{ "rotri_w $dst, $src, $rshift #@rolI_Reg_immI_0_31" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int sa = $rshift$$constant; ++ ++ __ rotri_w(dst, src, sa); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct rolL_Reg_immI_0_31(mRegL dst, mRegLorI2L src, immI_32_63 lshift, immI_0_31 rshift) ++%{ ++ predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & 0x3f)); ++ match(Set dst (OrL (LShiftL src lshift) (URShiftL src rshift))); ++ ++ ins_cost(100); ++ format %{ "rotri_d $dst, $src, $rshift #@rolL_Reg_immI_0_31" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int sa = $rshift$$constant; ++ ++ __ rotri_d(dst, src, sa); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct rolL_Reg_immI_32_63(mRegL dst, mRegLorI2L src, immI_0_31 lshift, immI_32_63 
rshift) ++%{ ++ predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & 0x3f)); ++ match(Set dst (OrL (LShiftL src lshift) (URShiftL src rshift))); ++ ++ ins_cost(100); ++ format %{ "rotri_d $dst, $src, $rshift #@rolL_Reg_immI_32_63" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int sa = $rshift$$constant; ++ ++ __ rotri_d(dst, src, sa); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct rorI_Reg_immI_0_31(mRegI dst, mRegI src, immI_0_31 rshift, immI_0_31 lshift) ++%{ ++ predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & 0x1f)); ++ match(Set dst (OrI (URShiftI src rshift) (LShiftI src lshift))); ++ ++ ins_cost(100); ++ format %{ "rotri_w $dst, $src, $rshift #@rorI_Reg_immI_0_31" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int sa = $rshift$$constant; ++ ++ __ rotri_w(dst, src, sa); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct rorL_Reg_immI_0_31(mRegL dst, mRegLorI2L src, immI_0_31 rshift, immI_32_63 lshift) ++%{ ++ predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & 0x3f)); ++ match(Set dst (OrL (URShiftL src rshift) (LShiftL src lshift))); ++ ++ ins_cost(100); ++ format %{ "rotri_d $dst, $src, $rshift #@rorL_Reg_immI_0_31" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int sa = $rshift$$constant; ++ ++ __ rotri_d(dst, src, sa); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct rorL_Reg_immI_32_63(mRegL dst, mRegLorI2L src, immI_32_63 rshift, immI_0_31 lshift) ++%{ ++ predicate(0 == ((n->in(1)->in(2)->get_int() + n->in(2)->in(2)->get_int()) & 0x3f)); ++ match(Set dst (OrL (URShiftL src rshift) (LShiftL src lshift))); ++ ++ ins_cost(100); ++ format %{ "rotri_d $dst, $src, $rshift #@rorL_Reg_immI_32_63" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ int sa = $rshift$$constant; ++ ++ __ rotri_d(dst, src, sa); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++// Rotate Shift Left ++instruct rolI_reg(mRegI dst, mRegI src, mRegI shift) ++%{ ++ match(Set dst (RotateLeft src shift)); ++ ++ format %{ "rotl_w $dst, $src, $shift @ rolI_reg" %} ++ ++ ins_encode %{ ++ __ sub_w(AT, R0, $shift$$Register); ++ __ rotr_w($dst$$Register, $src$$Register, AT); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct rolL_reg(mRegL dst, mRegL src, mRegI shift) ++%{ ++ match(Set dst (RotateLeft src shift)); ++ ++ format %{ "rotl_d $dst, $src, $shift @ rolL_reg" %} ++ ++ ins_encode %{ ++ __ sub_d(AT, R0, $shift$$Register); ++ __ rotr_d($dst$$Register, $src$$Register, AT); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Rotate Shift Right ++instruct rorI_imm(mRegI dst, mRegI src, immI shift) ++%{ ++ match(Set dst (RotateRight src shift)); ++ ++ format %{ "rotri_w $dst, $src, $shift @ rorI_imm" %} ++ ++ ins_encode %{ ++ __ rotri_w($dst$$Register, $src$$Register, $shift$$constant/* & 0x1f*/); ++ %} ++ ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct rorI_reg(mRegI dst, mRegI src, mRegI shift) ++%{ ++ match(Set dst (RotateRight src shift)); ++ ++ format %{ "rotr_w $dst, $src, $shift @ rorI_reg" %} ++ ++ ins_encode %{ ++ __ rotr_w($dst$$Register, $src$$Register, $shift$$Register); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct rorL_imm(mRegL dst, mRegL src, immI shift) ++%{ ++ match(Set dst (RotateRight src shift)); ++ ++ format %{ "rotri_d $dst, $src, $shift @ rorL_imm" %} ++ ++ ins_encode %{ ++ __ rotri_d($dst$$Register, $src$$Register, 
$shift$$constant/* & 0x3f*/); ++ %} ++ ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct rorL_reg(mRegL dst, mRegL src, mRegI shift) ++%{ ++ match(Set dst (RotateRight src shift)); ++ ++ format %{ "rotr_d $dst, $src, $shift @ rorL_reg" %} ++ ++ ins_encode %{ ++ __ rotr_d($dst$$Register, $src$$Register, $shift$$Register); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Logical Shift Right ++instruct shr_logical_Reg_Reg(mRegI dst, mRegI src, mRegI shift) %{ ++ match(Set dst (URShiftI src shift)); ++ ++ format %{ "SRL_W $dst, $src, $shift #@shr_logical_Reg_Reg" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ Register shift = $shift$$Register; ++ __ srl_w(dst, src, shift); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++ ++instruct shr_arith_Reg_imm(mRegI dst, mRegI src, immIU5 shift) %{ ++ match(Set dst (RShiftI src shift)); ++ // effect(KILL cr); ++ ++ format %{ "SRAI_W $dst, $src, $shift #@shr_arith_Reg_imm" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ int shift = $shift$$constant; ++ __ srai_w(dst, src, shift); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct shr_arith_Reg_Reg(mRegI dst, mRegI src, mRegI shift) %{ ++ match(Set dst (RShiftI src shift)); ++ // effect(KILL cr); ++ ++ format %{ "SRA_W $dst, $src, $shift #@shr_arith_Reg_Reg" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ Register shift = $shift$$Register; ++ __ sra_w(dst, src, shift); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++//----------Convert Int to Boolean--------------------------------------------- ++ ++instruct convI2B(mRegI dst, mRegI src) %{ ++ match(Set dst (Conv2B src)); ++ ++ ins_cost(100); ++ format %{ "convI2B $dst, $src @ convI2B" %} ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ Register src = as_Register($src$$reg); ++ ++ __ sltu(dst, R0, src); ++ %} ++ ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct convI2L_reg( mRegL dst, mRegI src) %{ ++ match(Set dst (ConvI2L src)); ++ ++ ins_cost(100); ++ format %{ "SLLI_W $dst, $src @ convI2L_reg\t" %} ++ ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ Register src = as_Register($src$$reg); ++ ++ if(dst != src) __ slli_w(dst, src, 0); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct convL2I_reg( mRegI dst, mRegLorI2L src ) %{ ++ match(Set dst (ConvL2I src)); ++ ++ format %{ "MOV $dst, $src @ convL2I_reg" %} ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ Register src = as_Register($src$$reg); ++ ++ __ slli_w(dst, src, 0); ++ %} ++ ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct convL2D_reg( regD dst, mRegL src ) %{ ++ match(Set dst (ConvL2D src)); ++ format %{ "convL2D $dst, $src @ convL2D_reg" %} ++ ins_encode %{ ++ Register src = as_Register($src$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ movgr2fr_d(dst, src); ++ __ ffint_d_l(dst, dst); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++ ++// Convert double to int. ++// If the double is NaN, stuff a zero in instead. 
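++// For reference, the Java (JLS d2i) semantics this node must provide:
++//   (int) Double.NaN               == 0
++//   (int) Double.POSITIVE_INFINITY == Integer.MAX_VALUE
++//   (int) Double.NEGATIVE_INFINITY == Integer.MIN_VALUE
++//   (int) -2.9                     == -2   // truncation toward zero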
++instruct convD2I_reg_reg(mRegI dst, regD src, regD tmp) %{ ++ match(Set dst (ConvD2I src)); ++ effect(USE src, TEMP tmp); ++ ++ format %{ "convd2i $dst, $src, using $tmp as TEMP @ convD2I_reg_reg" %} ++ ++ ins_encode %{ ++ __ ftintrz_w_d($tmp$$FloatRegister, $src$$FloatRegister); ++ __ movfr2gr_s($dst$$Register, $tmp$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct convD2L_reg_reg(mRegL dst, regD src, regD tmp) %{ ++ match(Set dst (ConvD2L src)); ++ effect(USE src, TEMP tmp); ++ ++ format %{ "convd2l $dst, $src, using $tmp as TEMP @ convD2L_reg_reg" %} ++ ++ ins_encode %{ ++ __ ftintrz_l_d($tmp$$FloatRegister, $src$$FloatRegister); ++ __ movfr2gr_d($dst$$Register, $tmp$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// Convert float to int. ++// If the float is NaN, stuff a zero in instead. ++instruct convF2I_reg_reg(mRegI dst, regF src, regF tmp) %{ ++ match(Set dst (ConvF2I src)); ++ effect(USE src, TEMP tmp); ++ ++ format %{ "convf2i $dst, $src, using $tmp as TEMP @ convF2I_reg_reg" %} ++ ++ ins_encode %{ ++ __ ftintrz_w_s($tmp$$FloatRegister, $src$$FloatRegister); ++ __ movfr2gr_s($dst$$Register, $tmp$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct convF2L_reg_reg(mRegL dst, regF src, regF tmp) %{ ++ match(Set dst (ConvF2L src)); ++ effect(USE src, TEMP tmp); ++ ++ format %{ "convf2l $dst, $src, using $tmp as TEMP @ convF2L_reg_reg" %} ++ ++ ins_encode %{ ++ __ ftintrz_l_s($tmp$$FloatRegister, $src$$FloatRegister); ++ __ movfr2gr_d($dst$$Register, $tmp$$FloatRegister); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++ ++instruct convL2F_reg( regF dst, mRegL src ) %{ ++ match(Set dst (ConvL2F src)); ++ format %{ "convl2f $dst, $src @ convL2F_reg" %} ++ ins_encode %{ ++ FloatRegister dst = $dst$$FloatRegister; ++ Register src = as_Register($src$$reg); ++ Label L; ++ ++ __ movgr2fr_d(dst, src); ++ __ ffint_s_l(dst, dst); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct convI2F_reg( regF dst, mRegI src ) %{ ++ match(Set dst (ConvI2F src)); ++ format %{ "convi2f $dst, $src @ convI2F_reg" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ FloatRegister dst = $dst$$FloatRegister; ++ ++ __ movgr2fr_w(dst, src); ++ __ ffint_s_w(dst, dst); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct convF2HF_reg_reg(mRegI dst, regF src, regF tmp) %{ ++ predicate(UseLSX); ++ match(Set dst (ConvF2HF src)); ++ format %{ "fcvt_f2hf $dst, $src\t# TEMP($tmp) @convF2HF_reg_reg" %} ++ effect(TEMP tmp); ++ ins_encode %{ ++ __ flt_to_flt16($dst$$Register, $src$$FloatRegister, $tmp$$FloatRegister); ++ %} ++ ins_pipe(pipe_slow); ++%} ++ ++instruct convHF2F_reg_reg(regF dst, mRegI src, regF tmp) %{ ++ predicate(UseLSX); ++ match(Set dst (ConvHF2F src)); ++ format %{ "fcvt_hf2f $dst, $src\t# TEMP($tmp) @convHF2F_reg_reg" %} ++ effect(TEMP tmp); ++ ins_encode %{ ++ __ flt16_to_flt($dst$$FloatRegister, $src$$Register, $tmp$$FloatRegister); ++ %} ++ ins_pipe(pipe_slow); ++%} ++ ++instruct round_float_reg(mRegI dst, regF src, regF vtemp1) ++%{ ++ match(Set dst (RoundF src)); ++ effect(TEMP_DEF dst, TEMP vtemp1); ++ format %{ "round_float $dst, $src\t# " ++ "TEMP($vtemp1) @round_float_reg" %} ++ ins_encode %{ ++ __ java_round_float($dst$$Register, ++ $src$$FloatRegister, ++ $vtemp1$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct round_double_reg(mRegL dst, regD src, regD vtemp1) ++%{ ++ match(Set dst (RoundD src)); ++ effect(TEMP_DEF dst, TEMP vtemp1); ++ format %{ "round_double $dst, $src\t# " ++ "TEMP($vtemp1) @round_double_reg" %} 
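++ // RoundD implements java.lang.Math.round(double): nearest long, ties toward positive
++ // infinity, NaN maps to 0, and out-of-range values clamp to Long.MIN_VALUE/MAX_VALUE.
++ // Illustrative values:
++ //   Math.round(2.5d)  ==  3L
++ //   Math.round(-2.5d) == -2L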
++ ins_encode %{ ++ __ java_round_double($dst$$Register, ++ $src$$FloatRegister, ++ $vtemp1$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct roundD(regD dst, regD src, immI rmode) %{ ++ predicate(UseLSX); ++ match(Set dst (RoundDoubleMode src rmode)); ++ format %{ "frint $dst, $src, $rmode\t# @roundD" %} ++ ins_encode %{ ++ switch ($rmode$$constant) { ++ case RoundDoubleModeNode::rmode_rint: __ vfrintrne_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ case RoundDoubleModeNode::rmode_floor: __ vfrintrm_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ case RoundDoubleModeNode::rmode_ceil: __ vfrintrp_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmpLTMask_immI_0( mRegI dst, mRegI p, immI_0 zero ) %{ ++ match(Set dst (CmpLTMask p zero)); ++ ins_cost(100); ++ ++ format %{ "srai_w $dst, $p, 31 @ cmpLTMask_immI_0" %} ++ ins_encode %{ ++ Register src = $p$$Register; ++ Register dst = $dst$$Register; ++ ++ __ srai_w(dst, src, 31); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++ ++instruct cmpLTMask( mRegI dst, mRegI p, mRegI q ) %{ ++ match(Set dst (CmpLTMask p q)); ++ ins_cost(400); ++ ++ format %{ "cmpLTMask $dst, $p, $q @ cmpLTMask" %} ++ ins_encode %{ ++ Register p = $p$$Register; ++ Register q = $q$$Register; ++ Register dst = $dst$$Register; ++ ++ __ slt(dst, p, q); ++ __ sub_d(dst, R0, dst); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct convP2B(mRegI dst, mRegP src) %{ ++ match(Set dst (Conv2B src)); ++ ++ ins_cost(100); ++ format %{ "convP2B $dst, $src @ convP2B" %} ++ ins_encode %{ ++ Register dst = as_Register($dst$$reg); ++ Register src = as_Register($src$$reg); ++ ++ __ sltu(dst, R0, src); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++ ++instruct convI2D_reg_reg(regD dst, mRegI src) %{ ++ match(Set dst (ConvI2D src)); ++ format %{ "conI2D $dst, $src @convI2D_reg" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ FloatRegister dst = $dst$$FloatRegister; ++ __ movgr2fr_w(dst ,src); ++ __ ffint_d_w(dst, dst); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct convF2D_reg_reg(regD dst, regF src) %{ ++ match(Set dst (ConvF2D src)); ++ format %{ "convF2D $dst, $src\t# @convF2D_reg_reg" %} ++ ins_encode %{ ++ FloatRegister dst = $dst$$FloatRegister; ++ FloatRegister src = $src$$FloatRegister; ++ ++ __ fcvt_d_s(dst, src); ++ %} ++ ins_pipe( fpu_cvt ); ++%} ++ ++instruct convD2F_reg_reg(regF dst, regD src) %{ ++ match(Set dst (ConvD2F src)); ++ format %{ "convD2F $dst, $src\t# @convD2F_reg_reg" %} ++ ins_encode %{ ++ FloatRegister dst = $dst$$FloatRegister; ++ FloatRegister src = $src$$FloatRegister; ++ ++ __ fcvt_s_d(dst, src); ++ %} ++ ins_pipe( fpu_cvt ); ++%} ++ ++ ++// Convert oop pointer into compressed form ++instruct encodeHeapOop(mRegN dst, mRegP src) %{ ++ predicate(n->bottom_type()->make_ptr()->ptr() != TypePtr::NotNull); ++ match(Set dst (EncodeP src)); ++ format %{ "encode_heap_oop $dst,$src" %} ++ ins_encode %{ ++ Register src = $src$$Register; ++ Register dst = $dst$$Register; ++ ++ __ encode_heap_oop(dst, src); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct encodeHeapOop_not_null(mRegN dst, mRegP src) %{ ++ predicate(n->bottom_type()->make_ptr()->ptr() == TypePtr::NotNull); ++ match(Set dst (EncodeP src)); ++ format %{ "encode_heap_oop_not_null $dst,$src @ encodeHeapOop_not_null" %} ++ ins_encode %{ ++ __ encode_heap_oop_not_null($dst$$Register, $src$$Register); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct decodeHeapOop(mRegP dst, mRegN src) %{ ++ 
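++ // With compressed oops a narrow oop is ((address - heap_base) >> shift), so DecodeN
++ // roughly computes: narrow == 0 ? null : heap_base + ((long) narrow << shift).
++ // This rule keeps the null check; the _not_null variant below can omit it.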
predicate(n->bottom_type()->is_ptr()->ptr() != TypePtr::NotNull && ++ n->bottom_type()->is_ptr()->ptr() != TypePtr::Constant); ++ match(Set dst (DecodeN src)); ++ format %{ "decode_heap_oop $dst,$src @ decodeHeapOop" %} ++ ins_encode %{ ++ Register s = $src$$Register; ++ Register d = $dst$$Register; ++ ++ __ decode_heap_oop(d, s); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct decodeHeapOop_not_null(mRegP dst, mRegN src) %{ ++ predicate(n->bottom_type()->is_ptr()->ptr() == TypePtr::NotNull || ++ n->bottom_type()->is_ptr()->ptr() == TypePtr::Constant); ++ match(Set dst (DecodeN src)); ++ format %{ "decode_heap_oop_not_null $dst,$src @ decodeHeapOop_not_null" %} ++ ins_encode %{ ++ Register s = $src$$Register; ++ Register d = $dst$$Register; ++ if (s != d) { ++ __ decode_heap_oop_not_null(d, s); ++ } else { ++ __ decode_heap_oop_not_null(d); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct encodeKlass_not_null(mRegN dst, mRegP src) %{ ++ match(Set dst (EncodePKlass src)); ++ format %{ "encode_heap_oop_not_null $dst,$src @ encodeKlass_not_null" %} ++ ins_encode %{ ++ __ encode_klass_not_null($dst$$Register, $src$$Register); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct decodeKlass_not_null(mRegP dst, mRegN src) %{ ++ match(Set dst (DecodeNKlass src)); ++ format %{ "decode_heap_klass_not_null $dst,$src" %} ++ ins_encode %{ ++ Register s = $src$$Register; ++ Register d = $dst$$Register; ++ if (s != d) { ++ __ decode_klass_not_null(d, s); ++ } else { ++ __ decode_klass_not_null(d); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ============================================================================ ++// This name is KNOWN by the ADLC and cannot be changed. ++// The ADLC forces a 'TypeRawPtr::BOTTOM' output type ++// for this guy. ++instruct tlsLoadP(javaThread_RegP dst) ++%{ ++ match(Set dst (ThreadLocal)); ++ ++ ins_cost(0); ++ ++ format %{ " -- \t// $dst=Thread::current(), empty encoding, #@tlsLoadP" %} ++ ++ size(0); ++ ++ ins_encode( /*empty*/ ); ++ ++ ins_pipe( empty ); ++%} ++ ++ ++instruct checkCastPP( mRegP dst ) %{ ++ match(Set dst (CheckCastPP dst)); ++ ++ format %{ "#checkcastPP of $dst (empty encoding) #@chekCastPP" %} ++ ins_encode( /*empty encoding*/ ); ++ ins_pipe( empty ); ++%} ++ ++instruct castPP(mRegP dst) ++%{ ++ match(Set dst (CastPP dst)); ++ ++ size(0); ++ format %{ "# castPP of $dst" %} ++ ins_encode(/* empty encoding */); ++ ins_pipe( empty ); ++%} ++ ++instruct castII( mRegI dst ) %{ ++ match(Set dst (CastII dst)); ++ format %{ "#castII of $dst empty encoding" %} ++ ins_encode( /*empty encoding*/ ); ++ ins_cost(0); ++ ins_pipe( empty ); ++%} ++ ++instruct castLL(mRegL dst) ++%{ ++ match(Set dst (CastLL dst)); ++ ++ size(0); ++ format %{ "# castLL of $dst" %} ++ ins_encode(/* empty encoding */); ++ ins_cost(0); ++ ins_pipe( empty ); ++%} ++ ++instruct castFF(regF dst) %{ ++ match(Set dst (CastFF dst)); ++ size(0); ++ format %{ "# castFF of $dst" %} ++ ins_encode(/*empty*/); ++ ins_pipe( empty ); ++%} ++ ++instruct castDD(regD dst) %{ ++ match(Set dst (CastDD dst)); ++ size(0); ++ format %{ "# castDD of $dst" %} ++ ins_encode(/*empty*/); ++ ins_pipe( empty ); ++%} ++ ++instruct castVV(vReg dst) %{ ++ match(Set dst (CastVV dst)); ++ size(0); ++ format %{ "# castVV of $dst" %} ++ ins_encode(/*empty*/); ++ ins_pipe( empty ); ++%} ++ ++// Return Instruction ++// Remove the return address & jump to it. 
++instruct Ret() %{ ++ match(Return); ++ format %{ "RET #@Ret" %} ++ ++ ins_encode %{ ++ __ jr(RA); ++ %} ++ ++ ins_pipe( pipe_jump ); ++%} ++ ++ ++ ++// Tail Jump; remove the return address; jump to target. ++// TailCall above leaves the return address around. ++// TailJump is used in only one place, the rethrow_Java stub (fancy_jump=2). ++// ex_oop (Exception Oop) is needed in %o0 at the jump. As there would be a ++// "restore" before this instruction (in Epilogue), we need to materialize it ++// in %i0. ++//FIXME ++instruct tailjmpInd(mRegP jump_target, a0_RegP ex_oop, mA1RegI exception_pc) %{ ++ match( TailJump jump_target ex_oop ); ++ ins_cost(200); ++ format %{ "Jmp $jump_target ; ex_oop = $ex_oop #@tailjmpInd" %} ++ ins_encode %{ ++ Register target = $jump_target$$Register; ++ ++ // A0, A1 are indicated in: ++ // [stubGenerator_loongarch.cpp] generate_forward_exception() ++ // [runtime_loongarch.cpp] OptoRuntime::generate_exception_blob() ++ __ move($exception_pc$$Register, RA); ++ __ jr(target); ++ %} ++ ins_pipe( pipe_jump ); ++%} ++ ++// ============================================================================ ++// Procedure Call/Return Instructions ++// Call Java Static Instruction ++// Note: If this code changes, the corresponding ret_addr_offset() and ++// compute_padding() functions will have to be adjusted. ++instruct CallStaticJavaDirect(method meth) %{ ++ match(CallStaticJava); ++ effect(USE meth); ++ ++ ins_cost(300); ++ format %{ "CALL,static #@CallStaticJavaDirect " %} ++ ins_encode( Java_Static_Call( meth ) ); ++ ins_pipe( pipe_slow ); ++ ins_pc_relative(1); ++ ins_alignment(4); ++%} ++ ++// Call Java Dynamic Instruction ++// Note: If this code changes, the corresponding ret_addr_offset() and ++// compute_padding() functions will have to be adjusted. ++instruct CallDynamicJavaDirect(method meth) %{ ++ match(CallDynamicJava); ++ effect(USE meth); ++ ++ ins_cost(300); ++ format %{"MOV IC_Klass, #Universe::non_oop_word()\n\t" ++ "CallDynamic @ CallDynamicJavaDirect" %} ++ ins_encode( Java_Dynamic_Call( meth ) ); ++ ins_pipe( pipe_slow ); ++ ins_pc_relative(1); ++ ins_alignment(4); ++%} ++ ++instruct CallLeafNoFPDirect(method meth) %{ ++ match(CallLeafNoFP); ++ effect(USE meth); ++ ++ ins_cost(300); ++ format %{ "CALL_LEAF_NOFP,runtime " %} ++ ins_encode(Java_To_Runtime(meth)); ++ ins_pipe( pipe_slow ); ++ ins_pc_relative(1); ++ ins_alignment(4); ++%} ++ ++// Prefetch instructions for allocation. 
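++// PrefetchAllocation nodes prefetch the cache lines that the next TLAB allocation will
++// touch, some distance past the current allocation pointer. How many lines and how far
++// ahead is controlled by the usual HotSpot flags, e.g. (illustrative invocation):
++//   java -XX:AllocatePrefetchDistance=256 -XX:AllocatePrefetchLines=3 ...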
++ ++instruct prefetchAlloc(memory mem) %{ ++ match(PrefetchAllocation mem); ++ ins_cost(125); ++ format %{ "preld $mem\t# Prefetch allocation @ prefetchAlloc" %} ++ ins_encode %{ ++ int base = $mem$$base; ++ int index = $mem$$index; ++ int disp = $mem$$disp; ++ ++ if (index != -1) { ++ __ add_d(AT, as_Register(base), as_Register(index)); ++ __ preld(8, AT, disp); ++ } else { ++ __ preld(8, as_Register(base), disp); ++ } ++ %} ++ ins_pipe( pipe_prefetch ); ++%} ++ ++// Call runtime without safepoint ++instruct CallLeafDirect(method meth) %{ ++ match(CallLeaf); ++ effect(USE meth); ++ ++ ins_cost(300); ++ format %{ "CALL_LEAF,runtime #@CallLeafDirect " %} ++ ins_encode(Java_To_Runtime(meth)); ++ ins_pipe( pipe_slow ); ++ ins_pc_relative(1); ++ ins_alignment(4); ++%} ++ ++// Load Char (16bit unsigned) ++instruct loadUS(mRegI dst, memory mem) %{ ++ match(Set dst (LoadUS mem)); ++ ++ ins_cost(125); ++ format %{ "loadUS $dst,$mem @ loadC" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_SHORT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++instruct loadUS_convI2L(mRegL dst, memory mem) %{ ++ match(Set dst (ConvI2L (LoadUS mem))); ++ ++ ins_cost(125); ++ format %{ "loadUS $dst,$mem @ loadUS_convI2L" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_SHORT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// Store Char (16bit unsigned) ++instruct storeC(memory mem, mRegIorL2I src) %{ ++ match(Set mem (StoreC mem src)); ++ ++ ins_cost(125); ++ format %{ "storeC $src, $mem @ storeC" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_CHAR); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct storeC_0(memory mem, immI_0 zero) %{ ++ match(Set mem (StoreC mem zero)); ++ ++ ins_cost(125); ++ format %{ "storeC $zero, $mem @ storeC_0" %} ++ ins_encode %{ ++ __ loadstore_enc(R0, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_SHORT); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++ ++instruct loadConF_immF_0(regF dst, immF_0 zero) %{ ++ match(Set dst zero); ++ ins_cost(100); ++ ++ format %{ "mov $dst, zero @ loadConF_immF_0\n"%} ++ ins_encode %{ ++ FloatRegister dst = $dst$$FloatRegister; ++ ++ __ movgr2fr_w(dst, R0); ++ %} ++ ins_pipe( fpu_movgrfr ); ++%} ++ ++ ++instruct loadConF(regF dst, immF src) %{ ++ match(Set dst src); ++ ins_cost(125); ++ ++ format %{ "fld_s $dst, $constantoffset[$constanttablebase] # load FLOAT $src from table @ loadConF" %} ++ ins_encode %{ ++ int con_offset = $constantoffset($src); ++ ++ if (Assembler::is_simm(con_offset, 12)) { ++ __ fld_s($dst$$FloatRegister, $constanttablebase, con_offset); ++ } else { ++ __ li(AT, con_offset); ++ __ fldx_s($dst$$FloatRegister, $constanttablebase, AT); ++ } ++ %} ++ ins_pipe( fpu_load ); ++%} ++ ++instruct loadConFVec(regF dst, immFVec src) %{ ++ match(Set dst src); ++ ins_cost(50); ++ ++ format %{ "vldi $dst, $src # load FLOAT $src @ loadConFVec" %} ++ ins_encode %{ ++ int val = Assembler::get_vec_imm($src$$constant); ++ ++ __ vldi($dst$$FloatRegister, val); ++ %} ++ ins_pipe( fpu_load ); ++%} ++ ++ ++instruct loadConD_immD_0(regD dst, immD_0 zero) %{ ++ match(Set dst zero); ++ ins_cost(100); ++ ++ format %{ "mov $dst, zero @ loadConD_immD_0"%} ++ ins_encode %{ ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ movgr2fr_d(dst, R0); ++ %} ++ ins_pipe( fpu_movgrfr ); ++%} ++ 
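++// The constant-table loads here use a signed 12-bit displacement when it fits and fall
++// back to an indexed fldx_s/fldx_d through AT otherwise. A rough sketch of the range
++// check (helper name illustrative):
++//   static boolean isSimm12(long x) { return x >= -2048 && x <= 2047; }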
++instruct loadConD(regD dst, immD src) %{ ++ match(Set dst src); ++ ins_cost(125); ++ ++ format %{ "fld_d $dst, $constantoffset[$constanttablebase] # load DOUBLE $src from table @ loadConD" %} ++ ins_encode %{ ++ int con_offset = $constantoffset($src); ++ ++ if (Assembler::is_simm(con_offset, 12)) { ++ __ fld_d($dst$$FloatRegister, $constanttablebase, con_offset); ++ } else { ++ __ li(AT, con_offset); ++ __ fldx_d($dst$$FloatRegister, $constanttablebase, AT); ++ } ++ %} ++ ins_pipe( fpu_load ); ++%} ++ ++instruct loadConDVec(regD dst, immDVec src) %{ ++ match(Set dst src); ++ ins_cost(50); ++ ++ format %{ "vldi $dst, $src # load DOUBLE $src @ loadConDVec" %} ++ ins_encode %{ ++ int val = Assembler::get_vec_imm($src$$constant); ++ ++ __ vldi($dst$$FloatRegister, val); ++ %} ++ ins_pipe( fpu_load ); ++%} ++ ++// Store register Float value (it is faster than store from FPU register) ++instruct storeF_reg( memory mem, regF src) %{ ++ match(Set mem (StoreF mem src)); ++ ++ ins_cost(50); ++ format %{ "store $mem, $src\t# store float @ storeF_reg" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_FLOAT); ++ %} ++ ins_pipe( fpu_store ); ++%} ++ ++instruct storeF_immF_0( memory mem, immF_0 zero) %{ ++ match(Set mem (StoreF mem zero)); ++ ++ ins_cost(40); ++ format %{ "store $mem, zero\t# store float @ storeF_immF_0" %} ++ ins_encode %{ ++ __ loadstore_enc(R0, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_INT); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++// Load Double ++instruct loadD(regD dst, memory mem) %{ ++ match(Set dst (LoadD mem)); ++ ++ ins_cost(150); ++ format %{ "loadD $dst, $mem #@loadD" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_DOUBLE); ++ %} ++ ins_pipe( fpu_load ); ++%} ++ ++// Load Double - UNaligned ++instruct loadD_unaligned(regD dst, memory mem ) %{ ++ match(Set dst (LoadD_unaligned mem)); ++ ins_cost(250); ++ // FIXME: Need more effective ldl/ldr ++ format %{ "loadD_unaligned $dst, $mem #@loadD_unaligned" %} ++ ins_encode %{ ++ __ loadstore_enc($dst$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_DOUBLE); ++ %} ++ ins_pipe( fpu_load ); ++%} ++ ++instruct storeD_reg( memory mem, regD src) %{ ++ match(Set mem (StoreD mem src)); ++ ++ ins_cost(50); ++ format %{ "store $mem, $src\t# store float @ storeD_reg" %} ++ ins_encode %{ ++ __ loadstore_enc($src$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_DOUBLE); ++ %} ++ ins_pipe( fpu_store ); ++%} ++ ++instruct storeD_immD_0( memory mem, immD_0 zero) %{ ++ match(Set mem (StoreD mem zero)); ++ ++ ins_cost(40); ++ format %{ "store $mem, zero\t# store float @ storeD_immD_0" %} ++ ins_encode %{ ++ __ loadstore_enc(R0, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_LONG); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct loadSSI(mRegI dst, stackSlotI src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(125); ++ format %{ "ld_w $dst, $src\t# int stk @ loadSSI" %} ++ ins_encode %{ ++ guarantee( Assembler::is_simm($src$$disp, 12), "disp too long (loadSSI) !"); ++ __ ld_w($dst$$Register, SP, $src$$disp); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++instruct storeSSI(stackSlotI dst, mRegI src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(100); ++ format %{ "st_w $dst, $src\t# int stk @ storeSSI" %} ++ ins_encode %{ ++ guarantee( 
Assembler::is_simm($dst$$disp, 12), "disp too long (storeSSI) !"); ++ __ st_w($src$$Register, SP, $dst$$disp); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct loadSSL(mRegL dst, stackSlotL src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(125); ++ format %{ "ld_d $dst, $src\t# long stk @ loadSSL" %} ++ ins_encode %{ ++ guarantee( Assembler::is_simm($src$$disp, 12), "disp too long (loadSSL) !"); ++ __ ld_d($dst$$Register, SP, $src$$disp); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++instruct storeSSL(stackSlotL dst, mRegL src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(100); ++ format %{ "st_d $dst, $src\t# long stk @ storeSSL" %} ++ ins_encode %{ ++ guarantee( Assembler::is_simm($dst$$disp, 12), "disp too long (storeSSL) !"); ++ __ st_d($src$$Register, SP, $dst$$disp); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct loadSSP(mRegP dst, stackSlotP src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(125); ++ format %{ "ld_d $dst, $src\t# ptr stk @ loadSSP" %} ++ ins_encode %{ ++ guarantee( Assembler::is_simm($src$$disp, 12), "disp too long (loadSSP) !"); ++ __ ld_d($dst$$Register, SP, $src$$disp); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++instruct storeSSP(stackSlotP dst, mRegP src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(100); ++ format %{ "st_d $dst, $src\t# ptr stk @ storeSSP" %} ++ ins_encode %{ ++ guarantee( Assembler::is_simm($dst$$disp, 12), "disp too long (storeSSP) !"); ++ __ st_d($src$$Register, SP, $dst$$disp); ++ %} ++ ins_pipe( ialu_store ); ++%} ++ ++instruct loadSSF(regF dst, stackSlotF src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(125); ++ format %{ "fld_s $dst, $src\t# float stk @ loadSSF" %} ++ ins_encode %{ ++ guarantee( Assembler::is_simm($src$$disp, 12), "disp too long (loadSSF) !"); ++ __ fld_s($dst$$FloatRegister, SP, $src$$disp); ++ %} ++ ins_pipe( fpu_load ); ++%} ++ ++instruct storeSSF(stackSlotF dst, regF src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(100); ++ format %{ "fst_s $dst, $src\t# float stk @ storeSSF" %} ++ ins_encode %{ ++ guarantee( Assembler::is_simm($dst$$disp, 12), "disp too long (storeSSF) !"); ++ __ fst_s($src$$FloatRegister, SP, $dst$$disp); ++ %} ++ ins_pipe( fpu_store ); ++%} ++ ++// Use the same format since predicate() can not be used here. 
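++// The stackSlot rules in this block implement C2 register spilling: slots are addressed
++// as SP-relative displacements, and the guarantee() calls assert that the frame offsets
++// stay within the signed 12-bit immediate of the load/store, e.g. a slot at SP+16 is
++// simply "fld_d fd, SP, 16" (register name illustrative).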
++instruct loadSSD(regD dst, stackSlotD src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(125); ++ format %{ "fld_d $dst, $src\t# double stk @ loadSSD" %} ++ ins_encode %{ ++ guarantee( Assembler::is_simm($src$$disp, 12), "disp too long (loadSSD) !"); ++ __ fld_d($dst$$FloatRegister, SP, $src$$disp); ++ %} ++ ins_pipe( fpu_load ); ++%} ++ ++instruct storeSSD(stackSlotD dst, regD src) ++%{ ++ match(Set dst src); ++ ++ ins_cost(100); ++ format %{ "fst_d $dst, $src\t# double stk @ storeSSD" %} ++ ins_encode %{ ++ guarantee( Assembler::is_simm($dst$$disp, 12), "disp too long (storeSSD) !"); ++ __ fst_d($src$$FloatRegister, SP, $dst$$disp); ++ %} ++ ins_pipe( fpu_store ); ++%} ++ ++instruct cmpFastLock(FlagsReg cr, no_CR_mRegP object, no_CR_mRegP box, no_CR_mRegP tmp1, no_CR_mRegP tmp2) %{ ++ match(Set cr (FastLock object box)); ++ effect(TEMP tmp1, TEMP tmp2); ++ ++ format %{ "FASTLOCK $cr <-- $object, $box, $tmp1, $tmp2 #@ cmpFastLock" %} ++ ++ ins_encode %{ ++ __ fast_lock_c2($object$$Register, $box$$Register, $cr$$Register, $tmp1$$Register, $tmp2$$Register); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct cmpFastUnlock(FlagsReg cr, no_CR_mRegP object, no_CR_mRegP box, no_CR_mRegP tmp1, no_CR_mRegP tmp2) %{ ++ match(Set cr (FastUnlock object box)); ++ effect(TEMP tmp1, TEMP tmp2); ++ ++ format %{ "FASTUNLOCK $cr <-- $object, $box, $tmp1, $tmp2 #@ cmpFastUnlock" %} ++ ++ ins_encode %{ ++ __ fast_unlock_c2($object$$Register, $box$$Register, $cr$$Register, $tmp1$$Register, $tmp2$$Register); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// Store card-mark Immediate 0 ++instruct storeImmCM(memory mem, immI_0 zero) %{ ++ match(Set mem (StoreCM mem zero)); ++ ++ ins_cost(150); ++ format %{ "st_b $mem, zero\t! card-mark imm0" %} ++ ins_encode %{ ++ __ loadstore_enc(R0, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_BYTE); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++// Die now ++instruct ShouldNotReachHere( ) ++%{ ++ match(Halt); ++ ins_cost(300); ++ ++ // Use the following format syntax ++ format %{ "stop; #@ShouldNotReachHere" %} ++ ins_encode %{ ++ if (is_reachable()) { ++ __ stop(_halt_reason); ++ } ++ %} ++ ++ ins_pipe( pipe_jump ); ++%} ++ ++instruct leaP12Narrow(mRegP dst, indOffset12Narrow mem) ++%{ ++ predicate(CompressedOops::shift() == 0); ++ match(Set dst mem); ++ ++ ins_cost(110); ++ format %{ "leaq $dst, $mem\t# ptr off12narrow @ leaP12Narrow" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register base = as_Register($mem$$base); ++ int disp = $mem$$disp; ++ ++ __ addi_d(dst, base, disp); ++ %} ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++instruct leaPIdxScale(mRegP dst, mRegP reg, mRegLorI2L lreg, immI_0_3 scale) ++%{ ++ match(Set dst (AddP reg (LShiftL lreg scale))); ++ ++ ins_cost(110); ++ format %{ "leaq $dst, [$reg + $lreg << $scale]\t# @ leaPIdxScale" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register base = $reg$$Register; ++ Register index = $lreg$$Register; ++ int scale = $scale$$constant; ++ ++ if (scale == 0) { ++ __ add_d($dst$$Register, $reg$$Register, index); ++ } else { ++ __ alsl_d(dst, index, base, scale - 1); ++ } ++ %} ++ ++ ins_pipe( ialu_reg_imm ); ++%} ++ ++ ++// ============================================================================ ++// The 2nd slow-half of a subtype check. Scan the subklass's 2ndary superklass ++// array for an instance of the superklass. Set a hidden internal cache on a ++// hit (cache is checked with exposed code in gen_subtype_check()). Return ++// NZ for a miss or zero for a hit. 
The encoding ALSO sets flags. ++instruct partialSubtypeCheck( mRegP result, mRegP sub, mRegP super, mRegI tmp2, mRegI tmp ) %{ ++ match(Set result (PartialSubtypeCheck sub super)); ++ effect(TEMP tmp, TEMP tmp2); ++ ins_cost(1100); // slightly larger than the next version ++ format %{ "partialSubtypeCheck result=$result, sub=$sub, super=$super, tmp=$tmp, tmp2=$tmp2" %} ++ ++ ins_encode( enc_PartialSubtypeCheck(result, sub, super, tmp, tmp2) ); ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct partialSubtypeCheckVsZero_long( mRegP sub, mRegP super, mRegP tmp1, mRegP tmp2, immP_0 zero, cmpOpEqNe cmp, label lbl) %{ ++ match(If cmp (CmpP (PartialSubtypeCheck sub super) zero)); ++ effect(USE lbl, TEMP tmp1, TEMP tmp2); ++ format %{ "partialSubtypeCheckVsZero_long b$cmp (sub=$sub, super=$super) R0, $lbl using $tmp1,$tmp2 as TEMP" %} ++ ++ ins_encode %{ ++ Label miss; ++ Label* success = $lbl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ if (flag == 0x01) { //equal ++ __ check_klass_subtype_slow_path($sub$$Register, $super$$Register, $tmp1$$Register, $tmp2$$Register, success, &miss); ++ } else { //no_equal ++ __ check_klass_subtype_slow_path($sub$$Register, $super$$Register, $tmp1$$Register, $tmp2$$Register, &miss, success); ++ } ++ __ bind(miss); ++ %} ++ ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct partialSubtypeCheckVsZero_short( mRegP sub, mRegP super, mRegP tmp1, mRegP tmp2, immP_0 zero, cmpOpEqNe cmp, label lbl) %{ ++ match(If cmp (CmpP (PartialSubtypeCheck sub super) zero)); ++ effect(USE lbl, TEMP tmp1, TEMP tmp2); ++ format %{ "partialSubtypeCheckVsZero_short b$cmp (sub=$sub, super=$super) R0, $lbl using $tmp1,$tmp2 as TEMP" %} ++ ++ ins_encode %{ ++ Label miss; ++ Label* success = $lbl$$label; ++ int flag = $cmp$$cmpcode; ++ ++ if (flag == 0x01) { //equal ++ __ check_klass_subtype_slow_path($sub$$Register, $super$$Register, $tmp1$$Register, $tmp2$$Register, success, &miss, true); ++ } else { //no_equal ++ __ check_klass_subtype_slow_path($sub$$Register, $super$$Register, $tmp1$$Register, $tmp2$$Register, &miss, success, true); ++ } ++ __ bind(miss); ++ %} ++ ++ ins_pipe( pipe_serial ); ++ ins_short_branch(1); ++%} ++ ++instruct compareAndSwapB(mRegI res, memory_exclusive mem_ptr, mRegI oldval, mRegI newval) %{ ++ predicate(UseAMCAS); ++ match(Set res (CompareAndSwapB mem_ptr (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(3 * MEMORY_REF_COST); ++ format %{ "CMPXCHG $newval, [$mem_ptr], $oldval @ compareAndSwapB" %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem_ptr$$base), $mem_ptr$$disp); ++ ++ __ cmpxchg8(addr, oldval, newval, res, true /* sign */, false /* retold */, true /* acquire */); ++ %} ++ ins_pipe( long_memory_op ); ++%} ++ ++instruct compareAndSwapS(mRegI res, memory_exclusive mem_ptr, mRegI oldval, mRegI newval) %{ ++ predicate(UseAMCAS); ++ match(Set res (CompareAndSwapS mem_ptr (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(3 * MEMORY_REF_COST); ++ format %{ "CMPXCHG $newval, [$mem_ptr], $oldval @ compareAndSwapS" %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem_ptr$$base), $mem_ptr$$disp); ++ ++ __ cmpxchg16(addr, oldval, newval, res, true /* sign */, false /* retold */, true /* acquire */); ++ %} ++ ins_pipe( long_memory_op ); ++%} ++ ++instruct compareAndSwapI(mRegI res, memory_exclusive mem_ptr, mRegI oldval, 
mRegI newval) %{ ++ match(Set res (CompareAndSwapI mem_ptr (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(3 * MEMORY_REF_COST); ++ format %{ "CMPXCHG $newval, [$mem_ptr], $oldval @ compareAndSwapI" %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem_ptr$$base), $mem_ptr$$disp); ++ ++ __ cmpxchg32(addr, oldval, newval, res, true /* sign */, false /* retold*/, true /* acquire */); ++ %} ++ ins_pipe( long_memory_op ); ++%} ++ ++instruct compareAndSwapL(mRegI res, memory_exclusive mem_ptr, mRegL oldval, mRegL newval) %{ ++ predicate(VM_Version::supports_cx8()); ++ match(Set res (CompareAndSwapL mem_ptr (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(3 * MEMORY_REF_COST); ++ format %{ "CMPXCHG $newval, [$mem_ptr], $oldval @ compareAndSwapL" %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem_ptr$$base), $mem_ptr$$disp); ++ ++ __ cmpxchg(addr, oldval, newval, res, false, true /* acquire */); ++ %} ++ ins_pipe( long_memory_op ); ++%} ++ ++instruct compareAndSwapP(mRegI res, memory_exclusive mem_ptr, mRegP oldval, mRegP newval) %{ ++ match(Set res (CompareAndSwapP mem_ptr (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ predicate(n->as_LoadStore()->barrier_data() == 0); ++ ins_cost(3 * MEMORY_REF_COST); ++ format %{ "CMPXCHG $newval, [$mem_ptr], $oldval @ compareAndSwapP" %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem_ptr$$base), $mem_ptr$$disp); ++ ++ __ cmpxchg(addr, oldval, newval, res, false, true /* acquire */); ++ %} ++ ins_pipe( long_memory_op ); ++%} ++ ++instruct compareAndSwapN(mRegI res, indirect mem_ptr, mRegN oldval, mRegN newval) %{ ++ match(Set res (CompareAndSwapN mem_ptr (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(3 * MEMORY_REF_COST); ++ format %{ "CMPXCHG $newval, [$mem_ptr], $oldval @ compareAndSwapN" %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem_ptr$$base), $mem_ptr$$disp); ++ ++ __ cmpxchg32(addr, oldval, newval, res, false, false, true /* acquire */); ++ %} ++ ins_pipe( long_memory_op ); ++%} ++ ++instruct get_and_setB(indirect mem, mRegI newv, mRegI prev) %{ ++ predicate(UseAMBH); ++ match(Set prev (GetAndSetB mem newv)); ++ effect(TEMP_DEF prev); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ "amswap_db_b $prev, $newv, [$mem] @get_and_setB" %} ++ ins_encode %{ ++ Register prev = $prev$$Register; ++ Register newv = $newv$$Register; ++ Register addr = as_Register($mem$$base); ++ ++ __ amswap_db_b(prev, newv, addr); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_setS(indirect mem, mRegI newv, mRegI prev) %{ ++ predicate(UseAMBH); ++ match(Set prev (GetAndSetS mem newv)); ++ effect(TEMP_DEF prev); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ "amswap_db_s $prev, $newv, [$mem] @get_and_setS" %} ++ ins_encode %{ ++ Register prev = $prev$$Register; ++ Register newv = $newv$$Register; ++ Register addr = as_Register($mem$$base); ++ ++ __ amswap_db_h(prev, newv, addr); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_setI(indirect mem, mRegI newv, mRegI prev) %{ ++ match(Set prev (GetAndSetI mem newv)); ++ effect(TEMP_DEF prev); ++ 
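++ // GetAndSetI comes from the atomic exchange intrinsics; amswap_db_w swaps the value in
++ // memory and returns the old one, with the _db form supplying the ordering that the
++ // java.util.concurrent atomics expect. Rough Java-level equivalent (illustrative):
++ //   AtomicInteger counter = new AtomicInteger();
++ //   int old = counter.getAndSet(42);   // lowered to a single amswap_db_w here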
ins_cost(2 * MEMORY_REF_COST); ++ format %{ "amswap_db_w $prev, $newv, [$mem] @get_and_setI" %} ++ ins_encode %{ ++ Register prev = $prev$$Register; ++ Register newv = $newv$$Register; ++ Register addr = as_Register($mem$$base); ++ ++ __ amswap_db_w(prev, newv, addr); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_setL(indirect mem, mRegL newv, mRegL prev) %{ ++ match(Set prev (GetAndSetL mem newv)); ++ effect(TEMP_DEF prev); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ "amswap_db_d $prev, $newv, [$mem] @get_and_setL" %} ++ ins_encode %{ ++ Register prev = $prev$$Register; ++ Register newv = $newv$$Register; ++ Register addr = as_Register($mem$$base); ++ ++ __ amswap_db_d(prev, newv, addr); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_setN(indirect mem, mRegN newv, mRegN prev) %{ ++ match(Set prev (GetAndSetN mem newv)); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ "amswap_db_w $prev, $newv, [$mem] @get_and_setN" %} ++ ins_encode %{ ++ Register prev = $prev$$Register; ++ Register newv = $newv$$Register; ++ Register addr = as_Register($mem$$base); ++ ++ __ amswap_db_w(AT, newv, addr); ++ __ bstrpick_d(prev, AT, 31, 0); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_setP(indirect mem, mRegP newv, mRegP prev) %{ ++ match(Set prev (GetAndSetP mem newv)); ++ effect(TEMP_DEF prev); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ "amswap_db_d $prev, $newv, [$mem] @get_and_setP" %} ++ ins_encode %{ ++ Register prev = $prev$$Register; ++ Register newv = $newv$$Register; ++ Register addr = as_Register($mem$$base); ++ ++ __ amswap_db_d(prev, newv, addr); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_addL(indirect mem, mRegL newval, mRegL incr) %{ ++ match(Set newval (GetAndAddL mem incr)); ++ effect(TEMP_DEF newval); ++ ins_cost(2 * MEMORY_REF_COST + 1); ++ format %{ "amadd_db_d $newval, [$mem], $incr @get_and_addL" %} ++ ins_encode %{ ++ Register newv = $newval$$Register; ++ Register incr = $incr$$Register; ++ Register addr = as_Register($mem$$base); ++ ++ __ amadd_db_d(newv, incr, addr); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_addL_no_res(indirect mem, Universe dummy, mRegL incr) %{ ++ predicate(n->as_LoadStore()->result_not_used()); ++ match(Set dummy (GetAndAddL mem incr)); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ "amadd_db_d [$mem], $incr @get_and_addL_no_res" %} ++ ins_encode %{ ++ __ amadd_db_d(R0, $incr$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_addI(indirect mem, mRegI newval, mRegIorL2I incr) %{ ++ match(Set newval (GetAndAddI mem incr)); ++ effect(TEMP_DEF newval); ++ ins_cost(2 * MEMORY_REF_COST + 1); ++ format %{ "amadd_db_w $newval, [$mem], $incr @get_and_addI" %} ++ ins_encode %{ ++ Register newv = $newval$$Register; ++ Register incr = $incr$$Register; ++ Register addr = as_Register($mem$$base); ++ ++ __ amadd_db_w(newv, incr, addr); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_addI_no_res(indirect mem, Universe dummy, mRegIorL2I incr) %{ ++ predicate(n->as_LoadStore()->result_not_used()); ++ match(Set dummy (GetAndAddI mem incr)); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ "amadd_db_w [$mem], $incr @get_and_addI_no_res" %} ++ ins_encode %{ ++ __ amadd_db_w(R0, $incr$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_addB(indirect mem, mRegI newval, mRegIorL2I incr) %{ ++ predicate(UseAMBH); ++ match(Set newval (GetAndAddB mem incr)); ++ effect(TEMP_DEF newval); ++ ins_cost(2 * 
MEMORY_REF_COST + 1); ++ format %{ "amadd_db_b $newval, [$mem], $incr @get_and_addB" %} ++ ins_encode %{ ++ Register newv = $newval$$Register; ++ Register incr = $incr$$Register; ++ Register addr = as_Register($mem$$base); ++ ++ __ amadd_db_b(newv, incr, addr); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_addB_no_res(indirect mem, Universe dummy, mRegIorL2I incr) %{ ++ predicate(UseAMBH); ++ predicate(n->as_LoadStore()->result_not_used()); ++ match(Set dummy (GetAndAddB mem incr)); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ "amadd_db_b [$mem], $incr @get_and_addB_no_res" %} ++ ins_encode %{ ++ __ amadd_db_b(R0, $incr$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_addS(indirect mem, mRegI newval, mRegIorL2I incr) %{ ++ predicate(UseAMBH); ++ match(Set newval (GetAndAddS mem incr)); ++ effect(TEMP_DEF newval); ++ ins_cost(2 * MEMORY_REF_COST + 1); ++ format %{ "amadd_db_s $newval, [$mem], $incr @get_and_addS" %} ++ ins_encode %{ ++ Register newv = $newval$$Register; ++ Register incr = $incr$$Register; ++ Register addr = as_Register($mem$$base); ++ ++ __ amadd_db_h(newv, incr, addr); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct get_and_addS_no_res(indirect mem, Universe dummy, mRegIorL2I incr) %{ ++ predicate(UseAMBH); ++ predicate(n->as_LoadStore()->result_not_used()); ++ match(Set dummy (GetAndAddS mem incr)); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ "amadd_db_s [$mem], $incr @get_and_addS_no_res" %} ++ ins_encode %{ ++ __ amadd_db_h(R0, $incr$$Register, as_Register($mem$$base)); ++ %} ++ ins_pipe( pipe_serial ); ++%} ++ ++instruct compareAndExchangeB(mRegI res, memory_exclusive mem, mRegI oldval, mRegI newval) %{ ++ predicate(UseAMCAS); ++ match(Set res (CompareAndExchangeB mem (Binary oldval newval))); ++ ins_cost(2 * MEMORY_REF_COST); ++ effect(TEMP_DEF res); ++ format %{ ++ "cmpxchg8 $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @compareAndExchangeB" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg8(addr, oldval, newval, res, true /* sign */, false /* retold */, true /* acquire */, false /* weak */, true /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct compareAndExchangeS(mRegI res, memory_exclusive mem, mRegI oldval, mRegI newval) %{ ++ predicate(UseAMCAS); ++ match(Set res (CompareAndExchangeS mem (Binary oldval newval))); ++ ins_cost(2 * MEMORY_REF_COST); ++ effect(TEMP_DEF res); ++ format %{ ++ "cmpxchg16 $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @compareAndExchangeS" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg16(addr, oldval, newval, res, true /* sign */, false /* retold */, true /* acquire */, false /* weak */, true /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct compareAndExchangeI(mRegI res, memory_exclusive mem, mRegI oldval, mRegI newval) %{ ++ ++ match(Set res (CompareAndExchangeI mem (Binary oldval newval))); ++ ins_cost(2 * MEMORY_REF_COST); ++ effect(TEMP_DEF res); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @compareAndExchangeI" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = 
$oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg32(addr, oldval, newval, res, true /* sign */, false /* retold */, true /* acquire */, false /* weak */, true /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct compareAndExchangeL(mRegL res, memory_exclusive mem, mRegL oldval, mRegL newval) %{ ++ ++ match(Set res (CompareAndExchangeL mem (Binary oldval newval))); ++ ins_cost(2 * MEMORY_REF_COST); ++ effect(TEMP_DEF res); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @compareAndExchangeL" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg(addr, oldval, newval, res, false /* retold */, true /* acquire */, false /* weak */, true /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct compareAndExchangeP(mRegP res, memory_exclusive mem, mRegP oldval, mRegP newval) %{ ++ predicate(n->as_LoadStore()->barrier_data() == 0); ++ match(Set res (CompareAndExchangeP mem (Binary oldval newval))); ++ ins_cost(2 * MEMORY_REF_COST); ++ effect(TEMP_DEF res); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @compareAndExchangeP" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg(addr, oldval, newval, res, false /* retold */, true /* acquire */, false /* weak */, true /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct compareAndExchangeN(mRegN res, memory_exclusive mem, mRegN oldval, mRegN newval) %{ ++ ++ match(Set res (CompareAndExchangeN mem (Binary oldval newval))); ++ ins_cost(2 * MEMORY_REF_COST); ++ effect(TEMP_DEF res); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @compareAndExchangeN" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg32(addr, oldval, newval, res, false /* sign */, false /* retold */, true /* acquire */, false /* weak */, true /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct weakCompareAndSwapB(mRegI res, memory_exclusive mem, mRegI oldval, mRegI newval) %{ ++ predicate(UseAMCAS); ++ match(Set res (WeakCompareAndSwapB mem (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @weakCompareAndSwapB" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg8(addr, oldval, newval, res, true /* sign */, false /* retold */, true /* acquire */, true /* weak */, false /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct weakCompareAndSwapS(mRegI res, memory_exclusive mem, mRegI oldval, mRegI newval) %{ ++ predicate(UseAMCAS); ++ match(Set res (WeakCompareAndSwapS mem (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval 
@weakCompareAndSwapS" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg16(addr, oldval, newval, res, true /* sign */, false /* retold */, true /* acquire */, true /* weak */, false /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct weakCompareAndSwapI(mRegI res, memory_exclusive mem, mRegI oldval, mRegI newval) %{ ++ ++ match(Set res (WeakCompareAndSwapI mem (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @weakCompareAndSwapI" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg32(addr, oldval, newval, res, true /* sign */, false /* retold */, true /* acquire */, true /* weak */, false /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct weakCompareAndSwapL(mRegI res, memory_exclusive mem, mRegL oldval, mRegL newval) %{ ++ ++ match(Set res (WeakCompareAndSwapL mem (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @WeakCompareAndSwapL" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg(addr, oldval, newval, res, false /* retold */, true /* acquire */, true /* weak */, false /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct weakCompareAndSwapP(mRegI res, memory_exclusive mem, mRegP oldval, mRegP newval) %{ ++ predicate((((CompareAndSwapNode*)n)->order() != MemNode::acquire && ((CompareAndSwapNode*)n)->order() != MemNode::seqcst) && n->as_LoadStore()->barrier_data() == 0); ++ match(Set res (WeakCompareAndSwapP mem (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(MEMORY_REF_COST); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @weakCompareAndSwapP" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg(addr, oldval, newval, res, false /* retold */, false /* acquire */, true /* weak */, false /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct weakCompareAndSwapP_acq(mRegI res, memory_exclusive mem, mRegP oldval, mRegP newval) %{ ++ predicate(n->as_LoadStore()->barrier_data() == 0); ++ match(Set res (WeakCompareAndSwapP mem (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @weakCompareAndSwapP" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg(addr, oldval, newval, res, false /* retold */, true /* acquire */, true /* weak */, false /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct weakCompareAndSwapN(mRegI res, memory_exclusive mem, mRegN oldval, mRegN newval) %{ ++ ++ match(Set res 
(WeakCompareAndSwapN mem (Binary oldval newval))); ++ effect(TEMP_DEF res); ++ ins_cost(2 * MEMORY_REF_COST); ++ format %{ ++ "CMPXCHG $res = $mem, $oldval, $newval\t# if $mem == $oldval then $mem <-- $newval @weakCompareAndSwapN" ++ %} ++ ins_encode %{ ++ Register newval = $newval$$Register; ++ Register oldval = $oldval$$Register; ++ Register res = $res$$Register; ++ Address addr(as_Register($mem$$base), $mem$$disp); ++ ++ __ cmpxchg32(addr, oldval, newval, res, false /* sign */, false /* retold */, true /* acquire */, true /* weak */, false /* exchange */); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++//----------Max and Min-------------------------------------------------------- ++ ++// Min Register with Register (generic version) ++instruct minI_Reg_Reg(mRegI dst, mRegI src) %{ ++ match(Set dst (MinI dst src)); ++ //effect(KILL flags); ++ ins_cost(80); ++ ++ format %{ "MIN $dst, $src @minI_Reg_Reg" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ ++ __ slt(AT, src, dst); ++ __ masknez(dst, dst, AT); ++ __ maskeqz(AT, src, AT); ++ __ OR(dst, dst, AT); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// Max Register with Register (generic version) ++instruct maxI_Reg_Reg(mRegI dst, mRegI src) %{ ++ match(Set dst (MaxI dst src)); ++ ins_cost(80); ++ ++ format %{ "MAX $dst, $src @maxI_Reg_Reg" %} ++ ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ ++ __ slt(AT, dst, src); ++ __ masknez(dst, dst, AT); ++ __ maskeqz(AT, src, AT); ++ __ OR(dst, dst, AT); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct maxI_Reg_zero(mRegI dst, immI_0 zero) %{ ++ match(Set dst (MaxI dst zero)); ++ ins_cost(50); ++ ++ format %{ "MAX $dst, 0 @maxI_Reg_zero" %} ++ ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ ++ __ slt(AT, dst, R0); ++ __ masknez(dst, dst, AT); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// Math.max(FF)F ++instruct maxF_reg_reg(regF dst, regF src1, regF src2) %{ ++ match(Set dst (MaxF src1 src2)); ++ effect(TEMP_DEF dst); ++ ++ format %{ "fmaxs $dst, $src1, $src2 @maxF_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = as_FloatRegister($src1$$reg); ++ FloatRegister src2 = as_FloatRegister($src2$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fmax_s(dst, src1, src2); ++ __ fcmp_cun_s(fcc0, src1, src1); ++ __ fsel(dst, dst, src1, fcc0); ++ __ fcmp_cun_s(fcc0, src2, src2); ++ __ fsel(dst, dst, src2, fcc0); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// Math.min(FF)F ++instruct minF_reg_reg(regF dst, regF src1, regF src2) %{ ++ match(Set dst (MinF src1 src2)); ++ effect(TEMP_DEF dst); ++ ++ format %{ "fmins $dst, $src1, $src2 @minF_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = as_FloatRegister($src1$$reg); ++ FloatRegister src2 = as_FloatRegister($src2$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fmin_s(dst, src1, src2); ++ __ fcmp_cun_s(fcc0, src1, src1); ++ __ fsel(dst, dst, src1, fcc0); ++ __ fcmp_cun_s(fcc0, src2, src2); ++ __ fsel(dst, dst, src2, fcc0); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// Math.max(DD)D ++instruct maxD_reg_reg(regD dst, regD src1, regD src2) %{ ++ match(Set dst (MaxD src1 src2)); ++ effect(TEMP_DEF dst); ++ ++ format %{ "fmaxd $dst, $src1, $src2 @maxD_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = as_FloatRegister($src1$$reg); ++ FloatRegister src2 = as_FloatRegister($src2$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fmax_d(dst, src1, src2); ++ __ fcmp_cun_d(fcc0, src1, src1); ++ __ fsel(dst, dst, 
src1, fcc0); ++ __ fcmp_cun_d(fcc0, src2, src2); ++ __ fsel(dst, dst, src2, fcc0); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// Math.min(DD)D ++instruct minD_reg_reg(regD dst, regD src1, regD src2) %{ ++ match(Set dst (MinD src1 src2)); ++ effect(TEMP_DEF dst); ++ ++ format %{ "fmind $dst, $src1, $src2 @minD_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src1 = as_FloatRegister($src1$$reg); ++ FloatRegister src2 = as_FloatRegister($src2$$reg); ++ FloatRegister dst = as_FloatRegister($dst$$reg); ++ ++ __ fmin_d(dst, src1, src2); ++ __ fcmp_cun_d(fcc0, src1, src1); ++ __ fsel(dst, dst, src1, fcc0); ++ __ fcmp_cun_d(fcc0, src2, src2); ++ __ fsel(dst, dst, src2, fcc0); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// Float.isInfinite ++instruct isInfiniteF_reg_reg(mRegI dst, regF src) ++%{ ++ match(Set dst (IsInfiniteF src)); ++ format %{ "isInfinite $dst, $src @isInfiniteF_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src = $src$$FloatRegister; ++ Register dst = $dst$$Register; ++ ++ __ fclass_s(fscratch, src); ++ __ movfr2gr_s(dst, fscratch); ++ __ andi(dst, dst, 0b0001000100); ++ __ slt(dst, R0, dst); ++ %} ++ size(16); ++ ins_pipe( pipe_slow ); ++%} ++ ++// Double.isInfinite ++instruct isInfiniteD_reg_reg(mRegI dst, regD src) ++%{ ++ match(Set dst (IsInfiniteD src)); ++ format %{ "isInfinite $dst, $src @isInfiniteD_reg_reg" %} ++ ins_encode %{ ++ FloatRegister src = $src$$FloatRegister; ++ Register dst = $dst$$Register; ++ ++ __ fclass_d(fscratch, src); ++ __ movfr2gr_d(dst, fscratch); ++ __ andi(dst, dst, 0b0001000100); ++ __ slt(dst, R0, dst); ++ %} ++ size(16); ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct isInfiniteF_cmovI(mRegI dst, mRegI src1, mRegI src2, regF op, immI_0 zero, cmpOp cop) ++%{ ++ match(Set dst (CMoveI (Binary cop (CmpI (IsInfiniteF op) zero)) (Binary src1 src2))); ++ format %{ "isInfinite_cmovI $dst, $src1, $src2, $op, $cop @isInfiniteF_cmovI" %} ++ ins_encode %{ ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ FloatRegister op = $op$$FloatRegister; ++ int flag = $cop$$cmpcode; ++ ++ __ fclass_s(fscratch, op); ++ __ movfr2gr_s(AT, fscratch); ++ __ andi(AT, AT, 0b0001000100); ++ switch(flag) { ++ case 0x01: // EQ ++ __ maskeqz(dst, src1, AT); ++ __ masknez(AT, src2, AT); ++ break; ++ case 0x02: // NE ++ __ masknez(dst, src1, AT); ++ __ maskeqz(AT, src2, AT); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ __ orr(dst, dst, AT); ++ %} ++ size(24); ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct isInfiniteD_cmovI(mRegI dst, mRegI src1, mRegI src2, regD op, immI_0 zero, cmpOp cop) ++%{ ++ match(Set dst (CMoveI (Binary cop (CmpI (IsInfiniteD op) zero)) (Binary src1 src2))); ++ format %{ "isInfinite_cmovI $dst, $src1, $src2, $op, $cop @isInfiniteD_cmovI" %} ++ ins_encode %{ ++ Register src1 = $src1$$Register; ++ Register src2 = $src2$$Register; ++ Register dst = $dst$$Register; ++ FloatRegister op = $op$$FloatRegister; ++ int flag = $cop$$cmpcode; ++ ++ __ fclass_d(fscratch, op); ++ __ movfr2gr_d(AT, fscratch); ++ __ andi(AT, AT, 0b0001000100); ++ switch(flag) { ++ case 0x01: // EQ ++ __ maskeqz(dst, src1, AT); ++ __ masknez(AT, src2, AT); ++ break; ++ case 0x02: // NE ++ __ masknez(dst, src1, AT); ++ __ maskeqz(AT, src2, AT); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ __ orr(dst, dst, AT); ++ %} ++ size(24); ++ ins_pipe( pipe_slow ); ++%} ++ ++// Float.isFinite ++instruct isFiniteF_reg_reg(mRegI dst, regF src) ++%{ ++ match(Set dst (IsFiniteF src)); ++ format %{ "isFinite $dst, $src 
@isFiniteF_reg_reg" %}
++  ins_encode %{
++    FloatRegister src = $src$$FloatRegister;
++    Register dst = $dst$$Register;
++
++    __ fclass_s(fscratch, src);
++    __ movfr2gr_s(dst, fscratch);
++    __ andi(dst, dst, 0b1110111000);
++    __ slt(dst, R0, dst);
++  %}
++  size(16);
++  ins_pipe( pipe_slow );
++%}
++
++// Double.isFinite
++instruct isFiniteD_reg_reg(mRegI dst, regD src)
++%{
++  match(Set dst (IsFiniteD src));
++  format %{ "isFinite $dst, $src @isFiniteD_reg_reg" %}
++  ins_encode %{
++    FloatRegister src = $src$$FloatRegister;
++    Register dst = $dst$$Register;
++
++    __ fclass_d(fscratch, src);
++    __ movfr2gr_d(dst, fscratch);
++    __ andi(dst, dst, 0b1110111000);
++    __ slt(dst, R0, dst);
++  %}
++  size(16);
++  ins_pipe( pipe_slow );
++%}
++
++instruct isFiniteF_cmovI(mRegI dst, mRegI src1, mRegI src2, regF op, immI_0 zero, cmpOp cop)
++%{
++  match(Set dst (CMoveI (Binary cop (CmpI (IsFiniteF op) zero)) (Binary src1 src2)));
++  format %{ "isFinite_cmovI $dst, $src1, $src2, $op, $cop @isFiniteF_cmovI" %}
++  ins_encode %{
++    Register src1 = $src1$$Register;
++    Register src2 = $src2$$Register;
++    Register dst = $dst$$Register;
++    FloatRegister op = $op$$FloatRegister;
++    int flag = $cop$$cmpcode;
++
++    __ fclass_s(fscratch, op);
++    __ movfr2gr_s(AT, fscratch);
++    __ andi(AT, AT, 0b1110111000);
++    switch(flag) {
++      case 0x01: // EQ
++        __ maskeqz(dst, src1, AT);
++        __ masknez(AT, src2, AT);
++        break;
++      case 0x02: // NE
++        __ masknez(dst, src1, AT);
++        __ maskeqz(AT, src2, AT);
++        break;
++      default:
++        ShouldNotReachHere();
++    }
++    __ orr(dst, dst, AT);
++  %}
++  size(24);
++  ins_pipe( pipe_slow );
++%}
++
++instruct isFiniteD_cmovI(mRegI dst, mRegI src1, mRegI src2, regD op, immI_0 zero, cmpOp cop)
++%{
++  match(Set dst (CMoveI (Binary cop (CmpI (IsFiniteD op) zero)) (Binary src1 src2)));
++  format %{ "isFinite_cmovI $dst, $src1, $src2, $op, $cop @isFiniteD_cmovI" %}
++  ins_encode %{
++    Register src1 = $src1$$Register;
++    Register src2 = $src2$$Register;
++    Register dst = $dst$$Register;
++    FloatRegister op = $op$$FloatRegister;
++    int flag = $cop$$cmpcode;
++
++    __ fclass_d(fscratch, op);
++    __ movfr2gr_d(AT, fscratch);
++    __ andi(AT, AT, 0b1110111000);
++    switch(flag) {
++      case 0x01: // EQ
++        __ maskeqz(dst, src1, AT);
++        __ masknez(AT, src2, AT);
++        break;
++      case 0x02: // NE
++        __ masknez(dst, src1, AT);
++        __ maskeqz(AT, src2, AT);
++        break;
++      default:
++        ShouldNotReachHere();
++    }
++    __ orr(dst, dst, AT);
++  %}
++  size(24);
++  ins_pipe( pipe_slow );
++%}
++
++instruct combine_i2l(mRegL dst, mRegI src1, immL_MaxUI mask, mRegI src2, immI_32 shift32)
++%{
++  match(Set dst (OrL (AndL (ConvI2L src1) mask) (LShiftL (ConvI2L src2) shift32)));
++
++  format %{ "combine_i2l $dst, $src2(H), $src1(L) @ combine_i2l" %}
++  ins_encode %{
++    Register dst = $dst$$Register;
++    Register src1 = $src1$$Register;
++    Register src2 = $src2$$Register;
++
++    if (src1 == dst) {
++      __ bstrins_d(dst, src2, 63, 32);
++    } else if (src2 == dst) {
++      __ slli_d(dst, dst, 32);
++      __ bstrins_d(dst, src1, 31, 0);
++    } else {
++      __ bstrpick_d(dst, src1, 31, 0);
++      __ bstrins_d(dst, src2, 63, 32);
++    }
++  %}
++  ins_pipe( pipe_slow );
++%}
++
++// Zero-extend convert int to long
++instruct convI2L_reg_reg_zex(mRegL dst, mRegI src, immL_MaxUI mask)
++%{
++  match(Set dst (AndL (ConvI2L src) mask));
++
++  format %{ "movl $dst, $src\t# i2l zero-extend @ convI2L_reg_reg_zex" %}
++  ins_encode %{
++    Register dst = $dst$$Register;
++    Register src = $src$$Register;
++
++    __ bstrpick_d(dst, src, 31, 0);
++  %}
++  ins_pipe( ialu_reg_reg );
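++  // Note: this rule and the two that follow (convL2I2L_reg_reg_zex,
++  // loadUI2L_mask) all implement the AndL-with-0xFFFFFFFF pattern as a
++  // zero-extending move (bstrpick_d / ld_wu), i.e. roughly
++  // dst = src & 0xffffffffL in a single instruction, with no explicit mask.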
++%} ++ ++instruct convL2I2L_reg_reg_zex(mRegL dst, mRegL src, immL_MaxUI mask) ++%{ ++ match(Set dst (AndL (ConvI2L (ConvL2I src)) mask)); ++ ++ format %{ "movl $dst, $src\t# i2l zero-extend @ convL2I2L_reg_reg_zex" %} ++ ins_encode %{ ++ Register dst = $dst$$Register; ++ Register src = $src$$Register; ++ ++ __ bstrpick_d(dst, src, 31, 0); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// Match loading integer and casting it to unsigned int in long register. ++// LoadI + ConvI2L + AndL 0xffffffff. ++instruct loadUI2L_mask(mRegL dst, memory mem, immL_MaxUI mask) %{ ++ match(Set dst (AndL (ConvI2L (LoadI mem)) mask)); ++ ++ format %{ "ld_wu $dst, $mem \t// zero-extend to long @ loadUI2L_mask" %} ++ ins_encode %{ ++ relocInfo::relocType disp_reloc = $mem->disp_reloc(); ++ assert(disp_reloc == relocInfo::none, "cannot have disp"); ++ __ loadstore_enc($dst$$Register, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_U_INT); ++ %} ++ ins_pipe( ialu_load ); ++%} ++ ++// ============================================================================ ++// Safepoint Instruction ++instruct safePoint_poll_tls(mRegP poll) %{ ++ match(SafePoint poll); ++ effect(USE poll); ++ ++ ins_cost(125); ++ format %{ "ld_w AT, [$poll]\t" ++ "Safepoint @ [$poll] : poll for GC" %} ++ size(4); ++ ins_encode %{ ++ Register poll_reg = $poll$$Register; ++ ++ __ block_comment("Safepoint:"); ++ __ relocate(relocInfo::poll_type); ++ address pre_pc = __ pc(); ++ __ ld_w(AT, poll_reg, 0); ++ assert(nativeInstruction_at(pre_pc)->is_safepoint_poll(), "must emit ld_w AT, [$poll]"); ++ %} ++ ++ ins_pipe( pipe_serial ); ++%} ++ ++//----------BSWAP Instructions------------------------------------------------- ++instruct bytes_reverse_int(mRegI dst, mRegIorL2I src) %{ ++ match(Set dst (ReverseBytesI src)); ++ ++ format %{ "RevB_I $dst, $src" %} ++ ins_encode %{ ++ __ bswap_w($dst$$Register, $src$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct bytes_reverse_long(mRegL dst, mRegL src) %{ ++ match(Set dst (ReverseBytesL src)); ++ ++ format %{ "RevB_L $dst, $src" %} ++ ins_encode %{ ++ __ revb_d($dst$$Register, $src$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct bytes_reverse_unsigned_short(mRegI dst, mRegIorL2I src) %{ ++ match(Set dst (ReverseBytesUS src)); ++ ++ format %{ "RevB_US $dst, $src" %} ++ ins_encode %{ ++ __ bswap_hu($dst$$Register, $src$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct bytes_reverse_short(mRegI dst, mRegIorL2I src) %{ ++ match(Set dst (ReverseBytesS src)); ++ ++ format %{ "RevB_S $dst, $src" %} ++ ins_encode %{ ++ __ bswap_h($dst$$Register, $src$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++//---------- Zeros Count Instructions ------------------------------------------ ++// CountLeadingZerosINode CountTrailingZerosINode ++instruct countLeadingZerosI(mRegI dst, mRegIorL2I src) %{ ++ match(Set dst (CountLeadingZerosI src)); ++ ++ format %{ "clz_w $dst, $src\t# count leading zeros (int)" %} ++ ins_encode %{ ++ __ clz_w($dst$$Register, $src$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct countLeadingZerosL(mRegI dst, mRegL src) %{ ++ match(Set dst (CountLeadingZerosL src)); ++ ++ format %{ "clz_d $dst, $src\t# count leading zeros (long)" %} ++ ins_encode %{ ++ __ clz_d($dst$$Register, $src$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct countTrailingZerosI(mRegI dst, mRegIorL2I src) %{ ++ match(Set dst (CountTrailingZerosI src)); ++ ++ format %{ "ctz_w $dst, $src\t# count trailing zeros 
(int)" %} ++ ins_encode %{ ++ __ ctz_w($dst$$Register, $src$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++instruct countTrailingZerosL(mRegI dst, mRegL src) %{ ++ match(Set dst (CountTrailingZerosL src)); ++ ++ format %{ "ctz_d $dst, $src\t# count trailing zeros (long)" %} ++ ins_encode %{ ++ __ ctz_d($dst$$Register, $src$$Register); ++ %} ++ ins_pipe( ialu_reg_reg ); ++%} ++ ++// --------------- Population Count Instructions ------------------------------ ++// ++instruct popCountI(mRegI dst, mRegIorL2I src) %{ ++ predicate(UsePopCountInstruction); ++ match(Set dst (PopCountI src)); ++ ++ format %{ "vinsgr2vr_w fscratch, $src, 0\n\t" ++ "vpcnt_w fscratch, fscratch\n\t" ++ "vpickve2gr_wu $dst, fscratch, 0\n\t# @popCountI" %} ++ ++ ins_encode %{ ++ __ vinsgr2vr_w(fscratch, $src$$Register, 0); ++ __ vpcnt_w(fscratch, fscratch); ++ __ vpickve2gr_wu($dst$$Register, fscratch, 0); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct popCountI_mem(mRegI dst, memory mem) %{ ++ predicate(UsePopCountInstruction); ++ match(Set dst (PopCountI (LoadI mem))); ++ ++ format %{ "fld_s fscratch, $mem, 0\n\t" ++ "vpcnt_w fscratch, fscratch\n\t" ++ "vpickve2gr_wu $dst, fscratch, 0\n\t# @popCountI_mem" %} ++ ++ ins_encode %{ ++ __ loadstore_enc(fscratch, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_FLOAT); ++ __ vpcnt_w(fscratch, fscratch); ++ __ vpickve2gr_wu($dst$$Register, fscratch, 0); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// Note: Long.bitCount(long) returns an int. ++instruct popCountL(mRegI dst, mRegL src) %{ ++ predicate(UsePopCountInstruction); ++ match(Set dst (PopCountL src)); ++ ++ format %{ "vinsgr2vr_d fscratch, $src, 0\n\t" ++ "vpcnt_d fscratch, fscratch\n\t" ++ "vpickve2gr_wu $dst, fscratch, 0\n\t# @popCountL" %} ++ ++ ins_encode %{ ++ __ vinsgr2vr_d(fscratch, $src$$Register, 0); ++ __ vpcnt_d(fscratch, fscratch); ++ __ vpickve2gr_wu($dst$$Register, fscratch, 0); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct popCountL_mem(mRegI dst, memory mem) %{ ++ predicate(UsePopCountInstruction); ++ match(Set dst (PopCountL (LoadL mem))); ++ ++ format %{ "fld_d fscratch, $mem, 0\n\t" ++ "vpcnt_d fscratch, fscratch\n\t" ++ "vpickve2gr_wu $dst, fscratch, 0\n\t# @popCountL_mem" %} ++ ++ ins_encode %{ ++ __ loadstore_enc(fscratch, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_DOUBLE); ++ __ vpcnt_d(fscratch, fscratch); ++ __ vpickve2gr_wu($dst$$Register, fscratch, 0); ++ %} ++ ++ ins_pipe( pipe_slow ); ++%} ++ ++// ====================VECTOR INSTRUCTIONS===================================== ++ ++// --------------------------------- Load ------------------------------------- ++ ++instruct loadV(vReg dst, memory mem) %{ ++ match(Set dst (LoadVector mem)); ++ format %{ "(x)vload $dst, $mem\t# @loadV" %} ++ ins_encode %{ ++ switch (Matcher::vector_length_in_bytes(this)) { ++ case 4: __ loadstore_enc($dst$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_FLOAT); break; ++ case 8: __ loadstore_enc($dst$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_DOUBLE); break; ++ case 16: __ loadstore_enc($dst$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_VECTORX); break; ++ case 32: __ loadstore_enc($dst$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::LOAD_VECTORY); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// 
--------------------------------- Store ------------------------------------
++
++instruct storeV(memory mem, vReg src) %{
++  match(Set mem (StoreVector mem src));
++  format %{ "(x)vstore $src, $mem\t# @storeV" %}
++  ins_encode %{
++    switch (Matcher::vector_length_in_bytes(this, $src)) {
++      case 4: __ loadstore_enc($src$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_FLOAT); break;
++      case 8: __ loadstore_enc($src$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_DOUBLE); break;
++      case 16: __ loadstore_enc($src$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_VECTORX); break;
++      case 32: __ loadstore_enc($src$$FloatRegister, $mem$$base, $mem$$index, $mem$$scale, $mem$$disp, C2_MacroAssembler::STORE_VECTORY); break;
++      default:
++        ShouldNotReachHere();
++    }
++  %}
++  ins_pipe( pipe_slow );
++%}
++
++// ------------------------------- Replicate ----------------------------------
++
++instruct replV(vReg dst, mRegI src) %{
++  match(Set dst (ReplicateB src));
++  match(Set dst (ReplicateS src));
++  match(Set dst (ReplicateI src));
++  format %{ "(x)vreplgr2vr $dst, $src\t# @replV" %}
++  ins_encode %{
++    if (Matcher::vector_length_in_bytes(this) > 16) {
++      switch(Matcher::vector_element_basic_type(this)) {
++        case T_BYTE : __ xvreplgr2vr_b($dst$$FloatRegister, $src$$Register); break;
++        case T_SHORT: __ xvreplgr2vr_h($dst$$FloatRegister, $src$$Register); break;
++        case T_INT : __ xvreplgr2vr_w($dst$$FloatRegister, $src$$Register); break;
++        default:
++          ShouldNotReachHere();
++      }
++    } else {
++      switch(Matcher::vector_element_basic_type(this)) {
++        case T_BYTE : __ vreplgr2vr_b($dst$$FloatRegister, $src$$Register); break;
++        case T_SHORT: __ vreplgr2vr_h($dst$$FloatRegister, $src$$Register); break;
++        case T_INT : __ vreplgr2vr_w($dst$$FloatRegister, $src$$Register); break;
++        default:
++          ShouldNotReachHere();
++      }
++    }
++  %}
++  ins_pipe( pipe_slow );
++%}
++
++instruct replVL(vReg dst, mRegL src) %{
++  match(Set dst (ReplicateL src));
++  format %{ "(x)vreplgr2vr.d $dst, $src\t# @replVL" %}
++  ins_encode %{
++    switch (Matcher::vector_length(this)) {
++      case 2: __ vreplgr2vr_d($dst$$FloatRegister, $src$$Register); break;
++      case 4: __ xvreplgr2vr_d($dst$$FloatRegister, $src$$Register); break;
++      default:
++        ShouldNotReachHere();
++    }
++  %}
++  ins_pipe( pipe_slow );
++%}
++
++instruct replVF(vReg dst, regF src) %{
++  match(Set dst (ReplicateF src));
++  format %{ "(x)vreplve0.w $dst, $src\t# @replVF" %}
++  ins_encode %{
++    switch (Matcher::vector_length(this)) {
++      case 2:
++      case 4: __ vreplvei_w($dst$$FloatRegister, $src$$FloatRegister, 0); break;
++      case 8: __ xvreplve0_w($dst$$FloatRegister, $src$$FloatRegister); break;
++      default:
++        ShouldNotReachHere();
++    }
++  %}
++  ins_pipe( pipe_slow );
++%}
++
++instruct replVD(vReg dst, regD src) %{
++  match(Set dst (ReplicateD src));
++  format %{ "(x)vreplve0.d $dst, $src\t# @replVD" %}
++  ins_encode %{
++    switch (Matcher::vector_length(this)) {
++      case 2: __ vreplvei_d($dst$$FloatRegister, $src$$FloatRegister, 0); break;
++      case 4: __ xvreplve0_d($dst$$FloatRegister, $src$$FloatRegister); break;
++      default:
++        ShouldNotReachHere();
++    }
++  %}
++  ins_pipe( pipe_slow );
++%}
++
++instruct replVB_imm(vReg dst, immI_M128_255 imm) %{
++  match(Set dst (ReplicateB imm));
++  format %{ "(x)vldi $dst, $imm\t# @replVB_imm" %}
++  ins_encode %{
++    switch
(Matcher::vector_length(this)) { ++ case 4: ++ case 8: ++ case 16: __ vldi($dst$$FloatRegister, ($imm$$constant & 0xff)); break; ++ case 32: __ xvldi($dst$$FloatRegister, ($imm$$constant & 0xff)); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct replV_imm(vReg dst, immI10 imm) %{ ++ match(Set dst (ReplicateS imm)); ++ match(Set dst (ReplicateI imm)); ++ format %{ "(x)vldi $dst, $imm\t# @replV_imm" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_SHORT: __ xvldi($dst$$FloatRegister, (0b001 << 10 ) | ($imm$$constant & 0x3ff)); break; ++ case T_INT : __ xvldi($dst$$FloatRegister, (0b010 << 10 ) | ($imm$$constant & 0x3ff)); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_SHORT: __ vldi($dst$$FloatRegister, (0b001 << 10 ) | ($imm$$constant & 0x3ff)); break; ++ case T_INT : __ vldi($dst$$FloatRegister, (0b010 << 10 ) | ($imm$$constant & 0x3ff)); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct replVL_imm(vReg dst, immL10 imm) %{ ++ match(Set dst (ReplicateL imm)); ++ format %{ "(x)vldi $dst, $imm\t# @replVL_imm" %} ++ ins_encode %{ ++ switch (Matcher::vector_length(this)) { ++ case 2: __ vldi($dst$$FloatRegister, (0b011 << 10 ) | ($imm$$constant & 0x3ff)); break; ++ case 4: __ xvldi($dst$$FloatRegister, (0b011 << 10 ) | ($imm$$constant & 0x3ff)); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- ADD -------------------------------------- ++ ++instruct addV(vReg dst, vReg src1, vReg src2) %{ ++ match(Set dst (AddVB src1 src2)); ++ match(Set dst (AddVS src1 src2)); ++ match(Set dst (AddVI src1 src2)); ++ match(Set dst (AddVL src1 src2)); ++ match(Set dst (AddVF src1 src2)); ++ match(Set dst (AddVD src1 src2)); ++ format %{ "(x)vadd $dst, $src1, $src2\t# @addV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvadd_b ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT : __ xvadd_h ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ xvadd_w ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ xvadd_d ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_FLOAT : __ xvfadd_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_DOUBLE: __ xvfadd_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vadd_b ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT : __ vadd_h ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ vadd_w ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ vadd_d ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_FLOAT : __ vfadd_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_DOUBLE: __ vfadd_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( 
pipe_slow ); ++%} ++ ++instruct addV_imm(vReg dst, vReg src, immIU5 imm) %{ ++ match(Set dst (AddVB src (ReplicateB imm))); ++ match(Set dst (AddVS src (ReplicateS imm))); ++ match(Set dst (AddVI src (ReplicateI imm))); ++ format %{ "(x)vaddi $dst, $src, $imm\t# @addV_imm" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvaddi_bu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ case T_SHORT: __ xvaddi_hu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ case T_INT : __ xvaddi_wu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vaddi_bu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ case T_SHORT: __ vaddi_hu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ case T_INT : __ vaddi_wu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct addVL_imm(vReg dst, vReg src, immLU5 imm) %{ ++ match(Set dst (AddVL src (ReplicateL imm))); ++ format %{ "(x)vaddi.du $dst, $src, $imm\t# @addVL_imm" %} ++ ins_encode %{ ++ switch (Matcher::vector_length(this)) { ++ case 2: __ vaddi_du($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ case 4: __ xvaddi_du($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- SUB -------------------------------------- ++ ++instruct subV(vReg dst, vReg src1, vReg src2) %{ ++ match(Set dst (SubVB src1 src2)); ++ match(Set dst (SubVS src1 src2)); ++ match(Set dst (SubVI src1 src2)); ++ match(Set dst (SubVL src1 src2)); ++ match(Set dst (SubVF src1 src2)); ++ match(Set dst (SubVD src1 src2)); ++ format %{ "(x)vsub $dst, $src1, $src2\t# @subV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvsub_b ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT : __ xvsub_h ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ xvsub_w ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ xvsub_d ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_FLOAT : __ xvfsub_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_DOUBLE: __ xvfsub_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vsub_b ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT : __ vsub_h ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ vsub_w ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ vsub_d ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_FLOAT : __ vfsub_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_DOUBLE: __ vfsub_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ 
ShouldNotReachHere();
++      }
++    }
++  %}
++  ins_pipe( pipe_slow );
++%}
++
++instruct subV_imm(vReg dst, vReg src, immIU5 imm) %{
++  match(Set dst (SubVB src (ReplicateB imm)));
++  match(Set dst (SubVS src (ReplicateS imm)));
++  match(Set dst (SubVI src (ReplicateI imm)));
++  format %{ "(x)vsubi $dst, $src, $imm\t# @subV_imm" %}
++  ins_encode %{
++    if (Matcher::vector_length_in_bytes(this) > 16) {
++      switch(Matcher::vector_element_basic_type(this)) {
++        case T_BYTE : __ xvsubi_bu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break;
++        case T_SHORT: __ xvsubi_hu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break;
++        case T_INT : __ xvsubi_wu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break;
++        default:
++          ShouldNotReachHere();
++      }
++    } else {
++      switch(Matcher::vector_element_basic_type(this)) {
++        case T_BYTE : __ vsubi_bu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break;
++        case T_SHORT: __ vsubi_hu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break;
++        case T_INT : __ vsubi_wu($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break;
++        default:
++          ShouldNotReachHere();
++      }
++    }
++  %}
++  ins_pipe( pipe_slow );
++%}
++
++instruct subVL_imm(vReg dst, vReg src, immLU5 imm) %{
++  match(Set dst (SubVL src (ReplicateL imm)));
++  format %{ "(x)vsubi.du $dst, $src, $imm\t# @subVL_imm" %}
++  ins_encode %{
++    switch (Matcher::vector_length(this)) {
++      case 2: __ vsubi_du($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break;
++      case 4: __ xvsubi_du($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break;
++      default:
++        ShouldNotReachHere();
++    }
++  %}
++  ins_pipe( pipe_slow );
++%}
++
++// --------------------------------- MUL --------------------------------------
++
++instruct mulV(vReg dst, vReg src1, vReg src2) %{
++  match(Set dst (MulVB src1 src2));
++  match(Set dst (MulVS src1 src2));
++  match(Set dst (MulVI src1 src2));
++  match(Set dst (MulVL src1 src2));
++  match(Set dst (MulVF src1 src2));
++  match(Set dst (MulVD src1 src2));
++  format %{ "(x)vmul $dst, $src1, $src2\t# @mulV" %}
++  ins_encode %{
++    if (Matcher::vector_length_in_bytes(this) > 16) {
++      switch(Matcher::vector_element_basic_type(this)) {
++        case T_BYTE : __ xvmul_b ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        case T_SHORT : __ xvmul_h ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        case T_INT : __ xvmul_w ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        case T_LONG : __ xvmul_d ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        case T_FLOAT : __ xvfmul_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        case T_DOUBLE: __ xvfmul_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        default:
++          ShouldNotReachHere();
++      }
++    } else {
++      switch(Matcher::vector_element_basic_type(this)) {
++        case T_BYTE : __ vmul_b ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        case T_SHORT : __ vmul_h ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        case T_INT : __ vmul_w ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        case T_LONG : __ vmul_d ($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        case T_FLOAT : __ vfmul_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break;
++        case T_DOUBLE: __ vfmul_d($dst$$FloatRegister, $src1$$FloatRegister,
$src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- DIV -------------------------------------- ++ ++instruct divV(vReg dst, vReg src1, vReg src2) %{ ++ match(Set dst (DivVF src1 src2)); ++ match(Set dst (DivVD src1 src2)); ++ format %{ "(x)vfdiv $dst, $src1, $src2\t# @divV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ xvfdiv_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_DOUBLE: __ xvfdiv_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ vfdiv_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_DOUBLE: __ vfdiv_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- ABS -------------------------------------- ++ ++instruct absV(vReg dst, vReg src) %{ ++ match(Set dst (AbsVB src)); ++ match(Set dst (AbsVS src)); ++ match(Set dst (AbsVI src)); ++ match(Set dst (AbsVL src)); ++ match(Set dst (AbsVF src)); ++ match(Set dst (AbsVD src)); ++ format %{ "(x)vabs $dst, $src\t# @absV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ if (!is_floating_point_type(Matcher::vector_element_basic_type(this))) ++ __ xvxor_v(fscratch, fscratch, fscratch); ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvabsd_b($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_SHORT : __ xvabsd_h($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_INT : __ xvabsd_w($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_LONG : __ xvabsd_d($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_FLOAT : __ xvbitclri_w($dst$$FloatRegister, $src$$FloatRegister, 0x1f); break; ++ case T_DOUBLE: __ xvbitclri_d($dst$$FloatRegister, $src$$FloatRegister, 0x3f); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ if (!is_floating_point_type(Matcher::vector_element_basic_type(this))) ++ __ vxor_v(fscratch, fscratch, fscratch); ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vabsd_b($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_SHORT : __ vabsd_h($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_INT : __ vabsd_w($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_LONG : __ vabsd_d($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_FLOAT : __ vbitclri_w($dst$$FloatRegister, $src$$FloatRegister, 0x1f); break; ++ case T_DOUBLE: __ vbitclri_d($dst$$FloatRegister, $src$$FloatRegister, 0x3f); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- ABS DIFF --------------------------------- ++ ++instruct absdV(vReg dst, vReg src1, vReg src2) %{ ++ match(Set dst (AbsVB (SubVI src1 src2))); ++ match(Set dst (AbsVS (SubVI src1 src2))); ++ match(Set dst (AbsVI (SubVI src1 src2))); ++ match(Set dst (AbsVL (SubVI src1 src2))); ++ format %{ "(x)vabsd $dst, $src1, $src2\t# @absdV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ 
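++      // As elsewhere in this file, vectors wider than 16 bytes take the
++      // 256-bit LASX (xv*) forms, while the else branch below uses the
++      // 128-bit LSX (v*) forms; [x]vabsd computes |src1 - src2| per element.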
switch(Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvabsd_b($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT: __ xvabsd_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ xvabsd_w($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ xvabsd_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vabsd_b($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT: __ vabsd_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ vabsd_w($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ vabsd_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- MAX -------------------------------------- ++ ++instruct maxV(vReg dst, vReg src1, vReg src2) %{ ++ predicate(!(is_floating_point_type(Matcher::vector_element_basic_type(n)))); ++ match(Set dst (MaxV src1 src2)); ++ format %{ "(x)vmax $dst, $src1, $src2\t# @maxV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvmax_b($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT: __ xvmax_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ xvmax_w($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ xvmax_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vmax_b($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT: __ vmax_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ vmax_w($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ vmax_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct maxVF(vReg dst, vReg src1, vReg src2, vReg tmp) %{ ++ predicate(Matcher::vector_element_basic_type(n) == T_FLOAT); ++ match(Set dst (MaxV src1 src2)); ++ effect(TEMP_DEF dst, TEMP tmp); ++ format %{ "(x)vfmax $dst, $src1, $src2\t# TEMP($tmp) @maxVF" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ __ xvfmax_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ xvxor_v($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ xvfdiv_s($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ xvfcmp_cun_s(fscratch, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ xvbitsel_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister, fscratch); ++ } else { ++ __ vfmax_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ vxor_v($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ vfdiv_s($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ vfcmp_cun_s(fscratch, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ 
vbitsel_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister, fscratch); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct maxVD(vReg dst, vReg src1, vReg src2, vReg tmp) %{ ++ predicate(Matcher::vector_element_basic_type(n) == T_DOUBLE); ++ match(Set dst (MaxV src1 src2)); ++ effect(TEMP_DEF dst, TEMP tmp); ++ format %{ "(x)vfmax $dst, $src1, $src2\t# TEMP($tmp) @maxVD" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ __ xvfmax_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ xvxor_v($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ xvfdiv_d($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ xvfcmp_cun_d(fscratch, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ xvbitsel_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister, fscratch); ++ } else { ++ __ vfmax_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ vxor_v($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ vfdiv_d($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ vfcmp_cun_d(fscratch, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ vbitsel_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister, fscratch); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- MIN -------------------------------------- ++ ++instruct minV(vReg dst, vReg src1, vReg src2) %{ ++ predicate(!(is_floating_point_type(Matcher::vector_element_basic_type(n)))); ++ match(Set dst (MinV src1 src2)); ++ format %{ "(x)vmin $dst, $src1, $src2\t# @minV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvmin_b($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT: __ xvmin_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ xvmin_w($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ xvmin_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vmin_b($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT: __ vmin_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ vmin_w($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ vmin_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct minVF(vReg dst, vReg src1, vReg src2, vReg tmp) %{ ++ predicate(Matcher::vector_element_basic_type(n) == T_FLOAT); ++ match(Set dst (MinV src1 src2)); ++ effect(TEMP_DEF dst, TEMP tmp); ++ format %{ "(x)vfmin $dst, $src1, $src2\t# TEMP($tmp) @minVF" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ __ xvfmin_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ xvxor_v($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ xvfdiv_s($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ xvfcmp_cun_s(fscratch, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ xvbitsel_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister, fscratch); ++ } else { ++ __ vfmin_s($dst$$FloatRegister, 
$src1$$FloatRegister, $src2$$FloatRegister); ++ __ vxor_v($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ vfdiv_s($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ vfcmp_cun_s(fscratch, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ vbitsel_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister, fscratch); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct minVD(vReg dst, vReg src1, vReg src2, vReg tmp) %{ ++ predicate(Matcher::vector_element_basic_type(n) == T_DOUBLE); ++ match(Set dst (MinV src1 src2)); ++ effect(TEMP_DEF dst, TEMP tmp); ++ format %{ "(x)vfmin $dst, $src1, $src2\t# TEMP($tmp) @minVD" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ __ xvfmin_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ xvxor_v($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ xvfdiv_d($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ xvfcmp_cun_d(fscratch, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ xvbitsel_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister, fscratch); ++ } else { ++ __ vfmin_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ vxor_v($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ vfdiv_d($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ vfcmp_cun_d(fscratch, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ vbitsel_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister, fscratch); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- NEG -------------------------------------- ++ ++instruct negV(vReg dst, vReg src) %{ ++ match(Set dst (NegVI src)); ++ match(Set dst (NegVL src)); ++ match(Set dst (NegVF src)); ++ match(Set dst (NegVD src)); ++ format %{ "(x)vneg $dst, $src\t# @negV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvneg_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_SHORT : __ xvneg_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_INT : __ xvneg_w($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_LONG : __ xvneg_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_FLOAT : __ xvbitrevi_w($dst$$FloatRegister, $src$$FloatRegister, 0x1f); break; ++ case T_DOUBLE: __ xvbitrevi_d($dst$$FloatRegister, $src$$FloatRegister, 0x3f); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vneg_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_SHORT : __ vneg_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_INT : __ vneg_w($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_LONG : __ vneg_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_FLOAT : __ vbitrevi_w($dst$$FloatRegister, $src$$FloatRegister, 0x1f); break; ++ case T_DOUBLE: __ vbitrevi_d($dst$$FloatRegister, $src$$FloatRegister, 0x3f); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- SQRT ------------------------------------- ++ ++instruct sqrtVF(vReg dst, vReg src) %{ ++ match(Set dst (SqrtVF src)); ++ match(Set dst (SqrtVD src)); ++ format %{ "(x)vfsqrt.s $dst, $src\t# @sqrtVF" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch(Matcher::vector_element_basic_type(this)) 
{ ++ case T_FLOAT : __ xvfsqrt_s($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_DOUBLE: __ xvfsqrt_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ vfsqrt_s($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_DOUBLE: __ vfsqrt_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- MADD ------------------------------------- ++ ++instruct maddV(vReg dst, vReg src1, vReg src2) %{ ++ match(Set dst (AddVB dst (MulVB src1 src2))); ++ match(Set dst (AddVS dst (MulVS src1 src2))); ++ match(Set dst (AddVI dst (MulVI src1 src2))); ++ match(Set dst (AddVL dst (MulVL src1 src2))); ++ format %{ "(x)vmadd $dst, $src1, $src2\t# @maddV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvmadd_b($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT: __ xvmadd_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ xvmadd_w($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ xvmadd_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vmadd_b($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT: __ vmadd_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ vmadd_w($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ vmadd_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// src1 * src2 + src3 ++instruct fmaddV(vReg dst, vReg src1, vReg src2, vReg src3) %{ ++ match(Set dst (FmaVF src3 (Binary src1 src2))); ++ match(Set dst (FmaVD src3 (Binary src1 src2))); ++ format %{ "(x)vfmadd $dst, $src1, $src2, $src3\t# @fmaddV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ xvfmadd_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ case T_DOUBLE: __ xvfmadd_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ vfmadd_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ case T_DOUBLE: __ vfmadd_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- MSUB ------------------------------------- ++ ++instruct msubV(vReg dst, vReg src1, vReg src2) %{ ++ match(Set dst (SubVB dst (MulVB src1 src2))); ++ match(Set dst (SubVS dst (MulVS src1 src2))); ++ match(Set dst (SubVI dst (MulVI src1 src2))); ++ match(Set dst (SubVL dst (MulVL src1 src2))); ++ format %{ "(x)vmsub $dst, $src1, $src2\t# @msubV" %} ++ ins_encode %{ ++ if 
(Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvmsub_b($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT: __ xvmsub_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ xvmsub_w($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ xvmsub_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vmsub_b($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_SHORT: __ vmsub_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_INT : __ vmsub_w($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case T_LONG : __ vmsub_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// src1 * src2 - src3 ++instruct fmsubV(vReg dst, vReg src1, vReg src2, vReg src3) %{ ++ match(Set dst (FmaVF (NegVF src3) (Binary src1 src2))); ++ match(Set dst (FmaVD (NegVD src3) (Binary src1 src2))); ++ format %{ "(x)vfmsub $dst, $src1, $src2, $src3\t# @fmsubV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ xvfmsub_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ case T_DOUBLE: __ xvfmsub_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ vfmsub_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ case T_DOUBLE: __ vfmsub_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- FNMADD ----------------------------------- ++ ++// -src1 * src2 - src3 ++instruct fnmaddV(vReg dst, vReg src1, vReg src2, vReg src3) %{ ++ match(Set dst (FmaVF (NegVF src3) (Binary (NegVF src1) src2))); ++ match(Set dst (FmaVF (NegVF src3) (Binary src1 (NegVF src2)))); ++ match(Set dst (FmaVD (NegVD src3) (Binary (NegVD src1) src2))); ++ match(Set dst (FmaVD (NegVD src3) (Binary src1 (NegVD src2)))); ++ format %{ "(x)vfnmadd $dst, $src1, $src2, $src3\t# @fnmaddV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ xvfnmadd_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ case T_DOUBLE: __ xvfnmadd_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ vfnmadd_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ case T_DOUBLE: __ vfnmadd_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- FNMSUB
----------------------------------- ++ ++// -src1 * src2 + src3 ++instruct fnmsubV(vReg dst, vReg src1, vReg src2, vReg src3) %{ ++ match(Set dst (FmaVF src3 (Binary (NegVF src1) src2))); ++ match(Set dst (FmaVF src3 (Binary src1 (NegVF src2)))); ++ match(Set dst (FmaVD src3 (Binary (NegVD src1) src2))); ++ match(Set dst (FmaVD src3 (Binary src1 (NegVD src2)))); ++ format %{ "(x)vfnmsub $dst, $src1, $src2, $src3\t# @fnmsubV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ xvfnmsub_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ case T_DOUBLE: __ xvfnmsub_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch(Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : __ vfnmsub_s($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ case T_DOUBLE: __ vfnmsub_d($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $src3$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------- Vector Multiply-Add Shorts into Integer -------------------- ++ ++instruct muladd8Sto4I(vReg dst, vReg src1, vReg src2) %{ ++ predicate(Matcher::vector_length(n->in(1)) == 8 && Matcher::vector_element_basic_type(n->in(1)) == T_SHORT); ++ match(Set dst (MulAddVS2VI src1 src2)); ++ format %{ "muladdvs2vi $dst, $src1, $src2\t# @muladd8Sto4I" %} ++ ins_encode %{ ++ DEBUG_ONLY(Unimplemented()); // unverified ++ __ vmulwev_w_h(fscratch, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ vmulwod_w_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ vadd_w($dst$$FloatRegister, fscratch, $dst$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct muladd16Sto8I(vReg dst, vReg src1, vReg src2) %{ ++ predicate(Matcher::vector_length(n->in(1)) == 16 && Matcher::vector_element_basic_type(n->in(1)) == T_SHORT); ++ match(Set dst (MulAddVS2VI src1 src2)); ++ format %{ "muladdvs2vi $dst, $src1, $src2\t# @muladd16Sto8I" %} ++ ins_encode %{ ++ DEBUG_ONLY(Unimplemented()); // unverified ++ __ xvmulwev_w_h(fscratch, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ xvmulwod_w_h($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); ++ __ xvadd_w($dst$$FloatRegister, fscratch, $dst$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ------------------------------ Shift --------------------------------------- ++ ++instruct shiftcntV(vReg dst, mRegI cnt) %{ ++ match(Set dst (LShiftCntV cnt)); ++ match(Set dst (RShiftCntV cnt)); ++ format %{ "(x)vreplgr2vr.b $dst, $cnt\t# @shiftcntV" %} ++ ins_encode %{ ++ switch (Matcher::vector_length_in_bytes(this)) { ++ case 4: ++ case 8: ++ case 16: __ vreplgr2vr_b($dst$$FloatRegister, $cnt$$Register); break; ++ case 32: __ xvreplgr2vr_b($dst$$FloatRegister, $cnt$$Register); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ------------------------------ LeftShift ----------------------------------- ++ ++instruct sllV(vReg dst, vReg src, vReg shift) %{ ++ match(Set dst (LShiftVB src shift)); ++ match(Set dst (LShiftVS src shift)); ++ match(Set dst (LShiftVI src shift)); ++ match(Set dst (LShiftVL src shift)); ++ format %{ "(x)vsll $dst, $src, $shift\t# @sllV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 
16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ xvsll_b(fscratch, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_CHAR : ++ case T_SHORT : __ xvsll_h(fscratch, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_INT : __ xvsll_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_LONG : __ xvsll_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ xvslti_bu($dst$$FloatRegister, $shift$$FloatRegister, 8); ++ __ xvand_v($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ case T_CHAR : ++ case T_SHORT : __ xvslti_bu($dst$$FloatRegister, $shift$$FloatRegister, 16); ++ __ xvand_v($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ default: ++ break; // do nothing ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ vsll_b(fscratch, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_CHAR : ++ case T_SHORT : __ vsll_h(fscratch, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_INT : __ vsll_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_LONG : __ vsll_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ vslti_bu($dst$$FloatRegister, $shift$$FloatRegister, 8); ++ __ vand_v($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ case T_CHAR : ++ case T_SHORT : __ vslti_bu($dst$$FloatRegister, $shift$$FloatRegister, 16); ++ __ vand_v($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ default: ++ break; // do nothing ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct sllV_imm(vReg dst, vReg src, immI shift) %{ ++ match(Set dst (LShiftVB src (LShiftCntV shift))); ++ match(Set dst (LShiftVS src (LShiftCntV shift))); ++ match(Set dst (LShiftVI src (LShiftCntV shift))); ++ match(Set dst (LShiftVL src (LShiftCntV shift))); ++ format %{ "(x)vslli $dst, $src, $shift\t# @sllV_imm" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : $shift$$constant >= 8 ? ++ __ xvxor_v($dst$$FloatRegister, $dst$$FloatRegister, $dst$$FloatRegister) : ++ __ xvslli_b($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_CHAR : ++ case T_SHORT : $shift$$constant >= 16 ? ++ __ xvxor_v($dst$$FloatRegister, $dst$$FloatRegister, $dst$$FloatRegister) : ++ __ xvslli_h($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_INT : __ xvslli_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_LONG : __ xvslli_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : $shift$$constant >= 8 ? ++ __ vxor_v($dst$$FloatRegister, $dst$$FloatRegister, $dst$$FloatRegister) : ++ __ vslli_b($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_CHAR : ++ case T_SHORT : $shift$$constant >= 16 ? 
++ __ vxor_v($dst$$FloatRegister, $dst$$FloatRegister, $dst$$FloatRegister) : ++ __ vslli_h($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_INT : __ vslli_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_LONG : __ vslli_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------- LogicalRightShift ---------------------------------- ++ ++instruct srlV(vReg dst, vReg src, vReg shift) %{ ++ match(Set dst (URShiftVB src shift)); ++ match(Set dst (URShiftVS src shift)); ++ match(Set dst (URShiftVI src shift)); ++ match(Set dst (URShiftVL src shift)); ++ format %{ "(x)vsrl $dst, $src, $shift\t# @srlV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ xvsrl_b(fscratch, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_CHAR : ++ case T_SHORT : __ xvsrl_h(fscratch, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_INT : __ xvsrl_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_LONG : __ xvsrl_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ xvslti_bu($dst$$FloatRegister, $shift$$FloatRegister, 8); ++ __ xvand_v($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ case T_CHAR : ++ case T_SHORT : __ xvslti_bu($dst$$FloatRegister, $shift$$FloatRegister, 16); ++ __ xvand_v($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ default: ++ break; // do nothing ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ vsrl_b(fscratch, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_CHAR : ++ case T_SHORT : __ vsrl_h(fscratch, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_INT : __ vsrl_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_LONG : __ vsrl_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ vslti_bu($dst$$FloatRegister, $shift$$FloatRegister, 8); ++ __ vand_v($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ case T_CHAR : ++ case T_SHORT : __ vslti_bu($dst$$FloatRegister, $shift$$FloatRegister, 16); ++ __ vand_v($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ default: ++ break; // do nothing ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct srlV_imm(vReg dst, vReg src, immI shift) %{ ++ match(Set dst (URShiftVB src (RShiftCntV shift))); ++ match(Set dst (URShiftVS src (RShiftCntV shift))); ++ match(Set dst (URShiftVI src (RShiftCntV shift))); ++ match(Set dst (URShiftVL src (RShiftCntV shift))); ++ format %{ "(x)vsrli $dst, $src, $shift\t# @srlV_imm" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : $shift$$constant >= 8 ? ++ __ xvxor_v($dst$$FloatRegister, $dst$$FloatRegister, $dst$$FloatRegister) : ++ __ xvsrli_b($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_CHAR : ++ case T_SHORT : $shift$$constant >= 16 ? 
++ __ xvxor_v($dst$$FloatRegister, $dst$$FloatRegister, $dst$$FloatRegister) : ++ __ xvsrli_h($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_INT : __ xvsrli_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_LONG : __ xvsrli_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : $shift$$constant >= 8 ? ++ __ vxor_v($dst$$FloatRegister, $dst$$FloatRegister, $dst$$FloatRegister) : ++ __ vsrli_b($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_CHAR : ++ case T_SHORT : $shift$$constant >= 16 ? ++ __ vxor_v($dst$$FloatRegister, $dst$$FloatRegister, $dst$$FloatRegister) : ++ __ vsrli_h($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_INT : __ vsrli_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_LONG : __ vsrli_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ------------------------- ArithmeticRightShift ----------------------------- ++ ++instruct sraV(vReg dst, vReg src, vReg shift) %{ ++ match(Set dst (RShiftVB src shift)); ++ match(Set dst (RShiftVS src shift)); ++ match(Set dst (RShiftVI src shift)); ++ match(Set dst (RShiftVL src shift)); ++ format %{ "(x)vsra $dst, $src, $shift\t# @sraV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ xvslti_bu(fscratch, $shift$$FloatRegister, 8); ++ __ xvorn_v(fscratch, $shift$$FloatRegister, fscratch); break; ++ case T_CHAR : ++ case T_SHORT : __ xvslti_bu(fscratch, $shift$$FloatRegister, 16); ++ __ xvorn_v(fscratch, $shift$$FloatRegister, fscratch); break; ++ default: ++ break; // do nothing ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ xvsra_b($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_CHAR : ++ case T_SHORT : __ xvsra_h($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_INT : __ xvsra_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_LONG : __ xvsra_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ vslti_bu(fscratch, $shift$$FloatRegister, 8); ++ __ vorn_v(fscratch, $shift$$FloatRegister, fscratch); break; ++ case T_CHAR : ++ case T_SHORT : __ vslti_bu(fscratch, $shift$$FloatRegister, 16); ++ __ vorn_v(fscratch, $shift$$FloatRegister, fscratch); break; ++ default: ++ break; // do nothing ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : __ vsra_b($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_CHAR : ++ case T_SHORT : __ vsra_h($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_INT : __ vsra_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_LONG : __ vsra_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct sraV_imm(vReg dst, vReg src, immI shift) %{ ++ match(Set dst (RShiftVB src 
(RShiftCntV shift))); ++ match(Set dst (RShiftVS src (RShiftCntV shift))); ++ match(Set dst (RShiftVI src (RShiftCntV shift))); ++ match(Set dst (RShiftVL src (RShiftCntV shift))); ++ format %{ "(x)vsrai $dst, $src, $shift\t# @sraV_imm" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : $shift$$constant >= 8 ? ++ __ xvsrai_b($dst$$FloatRegister, $src$$FloatRegister, 7) : ++ __ xvsrai_b($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_CHAR : ++ case T_SHORT : $shift$$constant >= 16 ? ++ __ xvsrai_h($dst$$FloatRegister, $src$$FloatRegister, 15) : ++ __ xvsrai_h($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_INT : __ xvsrai_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_LONG : __ xvsrai_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BOOLEAN: ++ case T_BYTE : $shift$$constant >= 8 ? ++ __ vsrai_b($dst$$FloatRegister, $src$$FloatRegister, 7) : ++ __ vsrai_b($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_CHAR : ++ case T_SHORT : $shift$$constant >= 16 ? ++ __ vsrai_h($dst$$FloatRegister, $src$$FloatRegister, 15) : ++ __ vsrai_h($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_INT : __ vsrai_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_LONG : __ vsrai_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------------- RotateRightV --------------------------------- ++ ++instruct rotrV(vReg dst, vReg src, vReg shift) %{ ++ match(Set dst (RotateRightV src shift)); ++ format %{ "(x)vrotr $dst, $src, $shift\t# @rotrV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_INT : __ xvrotr_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_LONG: __ xvrotr_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_INT : __ vrotr_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ case T_LONG: __ vrotr_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct rotrV_imm(vReg dst, vReg src, immI shift) %{ ++ match(Set dst (RotateRightV src shift)); ++ format %{ "(x)vrotri $dst, $src, $shift\t# @rotrV_imm" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_INT : __ xvrotri_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_LONG: __ xvrotri_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_INT : __ vrotri_w($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ case T_LONG: __ vrotri_d($dst$$FloatRegister, $src$$FloatRegister, $shift$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ 
%} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ------------------------------ RotateLeftV --------------------------------- ++ ++instruct rotlV(vReg dst, vReg src, vReg shift) %{ ++ match(Set dst (RotateLeftV src shift)); ++ format %{ "(x)vrotl $dst, $src, $shift\t# @rotlV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_INT : __ xvneg_w(fscratch, $shift$$FloatRegister); break; ++ case T_LONG: __ xvneg_d(fscratch, $shift$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_INT : __ xvrotr_w($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_LONG: __ xvrotr_d($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_INT : __ vneg_w(fscratch, $shift$$FloatRegister); break; ++ case T_LONG: __ vneg_d(fscratch, $shift$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_INT : __ vrotr_w($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ case T_LONG: __ vrotr_d($dst$$FloatRegister, $src$$FloatRegister, fscratch); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct rotlV_imm(vReg dst, vReg src, immI shift) %{ ++ match(Set dst (RotateLeftV src shift)); ++ format %{ "(x)vrotli $dst, $src, $shift\t# @rotlV_imm" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_INT : __ xvrotri_w($dst$$FloatRegister, $src$$FloatRegister, (-$shift$$constant) & 0x1f); break; ++ case T_LONG: __ xvrotri_d($dst$$FloatRegister, $src$$FloatRegister, (-$shift$$constant) & 0x3f); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_INT : __ vrotri_w($dst$$FloatRegister, $src$$FloatRegister, (-$shift$$constant) & 0x1f); break; ++ case T_LONG: __ vrotri_d($dst$$FloatRegister, $src$$FloatRegister, (-$shift$$constant) & 0x3f); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- AND -------------------------------------- ++ ++instruct andV(vReg dst, vReg src1, vReg src2) %{ ++ match(Set dst (AndV src1 src2)); ++ format %{ "(x)vand.v $dst, $src1, $src2\t# @andV" %} ++ ins_encode %{ ++ switch (Matcher::vector_length_in_bytes(this)) { ++ case 4: ++ case 8: ++ case 16: __ vand_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case 32: __ xvand_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct andVB_imm(vReg dst, vReg src, immIU8 imm) %{ ++ match(Set dst (AndV src (ReplicateB imm))); ++ format %{ "(x)vandi.b $dst, $src, $imm\t# @andVB_imm" %} ++ ins_encode %{ ++ switch (Matcher::vector_length(this)) { ++ case 4: ++ case 8: ++ case 16: __ vandi_b($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ case 32: __ xvandi_b($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- OR --------------------------------------- ++ ++instruct orV(vReg dst, vReg src1, vReg src2) %{ ++ 
match(Set dst (OrV src1 src2)); ++ format %{ "(x)vor.v $dst, $src1, $src2\t# @orV" %} ++ ins_encode %{ ++ switch (Matcher::vector_length_in_bytes(this)) { ++ case 4: ++ case 8: ++ case 16: __ vor_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case 32: __ xvor_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct orVB_imm(vReg dst, vReg src, immIU8 imm) %{ ++ match(Set dst (OrV src (ReplicateB imm))); ++ format %{ "(x)vori.b $dst, $src, $imm\t# @orVB_imm" %} ++ ins_encode %{ ++ switch (Matcher::vector_length(this)) { ++ case 4: ++ case 8: ++ case 16: __ vori_b($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ case 32: __ xvori_b($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- XOR -------------------------------------- ++ ++instruct xorV(vReg dst, vReg src1, vReg src2) %{ ++ match(Set dst (XorV src1 src2)); ++ format %{ "(x)vxor.v $dst, $src1, $src2\t# @xorV" %} ++ ins_encode %{ ++ switch (Matcher::vector_length_in_bytes(this)) { ++ case 4: ++ case 8: ++ case 16: __ vxor_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case 32: __ xvxor_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct xor16B_imm(vReg dst, vReg src, immIU8 imm) %{ ++ match(Set dst (XorV src (ReplicateB imm))); ++ format %{ "(x)vxori.b $dst, $src, $imm\t# @xor16B_imm" %} ++ ins_encode %{ ++ switch (Matcher::vector_length(this)) { ++ case 4: ++ case 8: ++ case 16: __ vxori_b($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ case 32: __ xvxori_b($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- NOR -------------------------------------- ++ ++instruct norV(vReg dst, vReg src1, vReg src2, immI_M1 m1) %{ ++ match(Set dst (XorV (OrV src1 src2) (ReplicateB m1))); ++ match(Set dst (XorV (OrV src1 src2) (ReplicateS m1))); ++ match(Set dst (XorV (OrV src1 src2) (ReplicateI m1))); ++ format %{ "(x)vnor.v $dst, $src1, $src2\t# @norV" %} ++ ins_encode %{ ++ switch (Matcher::vector_length_in_bytes(this)) { ++ case 4: ++ case 8: ++ case 16: __ vnor_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case 32: __ xvnor_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct norVB_imm(vReg dst, vReg src, immIU8 imm, immI_M1 m1) %{ ++ match(Set dst (XorV (OrV src (ReplicateB imm)) (ReplicateB m1))); ++ format %{ "(x)vnori.b $dst, $src, $imm\t# @norVB_imm" %} ++ ins_encode %{ ++ switch (Matcher::vector_length(this)) { ++ case 4: ++ case 8: ++ case 16: __ vnori_b($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ case 32: __ xvnori_b($dst$$FloatRegister, $src$$FloatRegister, $imm$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- ANDN ------------------------------------- ++ ++instruct andnV(vReg dst, vReg src1, vReg src2, immI_M1 m1) %{ ++ match(Set dst (AndV src2 (XorV src1 (ReplicateB m1)))); ++ match(Set dst 
(AndV src2 (XorV src1 (ReplicateS m1)))); ++ match(Set dst (AndV src2 (XorV src1 (ReplicateI m1)))); ++ format %{ "(x)vandn.v $dst, $src1, $src2\t# @andnV" %} ++ ins_encode %{ ++ switch (Matcher::vector_length_in_bytes(this)) { ++ case 4: ++ case 8: ++ case 16: __ vandn_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case 32: __ xvandn_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------------- ORN -------------------------------------- ++ ++instruct ornV(vReg dst, vReg src1, vReg src2, immI_M1 m1) %{ ++ match(Set dst (OrV src1 (XorV src2 (ReplicateB m1)))); ++ match(Set dst (OrV src1 (XorV src2 (ReplicateS m1)))); ++ match(Set dst (OrV src1 (XorV src2 (ReplicateI m1)))); ++ format %{ "(x)vorn.v $dst, $src1, $src2\t# @ornV" %} ++ ins_encode %{ ++ switch (Matcher::vector_length_in_bytes(this)) { ++ case 4: ++ case 8: ++ case 16: __ vorn_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ case 32: __ xvorn_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------------- Reduction Add -------------------------------- ++ ++instruct reduceV(mRegI dst, mRegI src, vReg vsrc, vReg tmp1, vReg tmp2) %{ ++ match(Set dst (AddReductionVI src vsrc)); ++ match(Set dst (MulReductionVI src vsrc)); ++ match(Set dst (MaxReductionV src vsrc)); ++ match(Set dst (MinReductionV src vsrc)); ++ match(Set dst (AndReductionV src vsrc)); ++ match(Set dst (OrReductionV src vsrc)); ++ match(Set dst (XorReductionV src vsrc)); ++ effect(TEMP_DEF dst, TEMP tmp1, TEMP tmp2); ++ format %{ "(x)vreduce $dst, $src, $vsrc\t# TEMP($tmp1, $tmp2) @reduceV" %} ++ ins_encode %{ ++ __ reduce($dst$$Register, $src$$Register, $vsrc$$FloatRegister, $tmp1$$FloatRegister, $tmp2$$FloatRegister, ++ Matcher::vector_element_basic_type(this, $vsrc), this->ideal_Opcode(), Matcher::vector_length_in_bytes(this, $vsrc)); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct reduceVL(mRegL dst, mRegL src, vReg vsrc, vReg tmp1, vReg tmp2) %{ ++ match(Set dst (AddReductionVL src vsrc)); ++ match(Set dst (MulReductionVL src vsrc)); ++ match(Set dst (MaxReductionV src vsrc)); ++ match(Set dst (MinReductionV src vsrc)); ++ match(Set dst (AndReductionV src vsrc)); ++ match(Set dst (OrReductionV src vsrc)); ++ match(Set dst (XorReductionV src vsrc)); ++ effect(TEMP_DEF dst, TEMP tmp1, TEMP tmp2); ++ format %{ "(x)vreduce $dst, $src, $vsrc\t# TEMP($tmp1, $tmp2) @reduceVL" %} ++ ins_encode %{ ++ __ reduce($dst$$Register, $src$$Register, $vsrc$$FloatRegister, $tmp1$$FloatRegister, $tmp2$$FloatRegister, ++ T_LONG, this->ideal_Opcode(), Matcher::vector_length_in_bytes(this, $vsrc)); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct reduceVF(regF dst, regF src, vReg vsrc, vReg tmp) %{ ++ match(Set dst (AddReductionVF src vsrc)); ++ match(Set dst (MulReductionVF src vsrc)); ++ effect(TEMP_DEF dst, TEMP tmp); ++ format %{ "(x)vreduce $dst, $src, $vsrc\t# TEMP($tmp) @reduceVF" %} ++ ins_encode %{ ++ __ reduce($dst$$FloatRegister, $src$$FloatRegister, $vsrc$$FloatRegister, $tmp$$FloatRegister, ++ T_FLOAT, this->ideal_Opcode(), Matcher::vector_length_in_bytes(this, $vsrc)); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct reduceVD(regD dst, regD src, vReg vsrc, vReg tmp) %{ ++ match(Set dst (AddReductionVD src vsrc)); ++ match(Set dst (MulReductionVD src 
vsrc)); ++ effect(TEMP_DEF dst, TEMP tmp); ++ format %{ "(x)vreduce $dst, $src, $vsrc\t# TEMP($tmp) @reduceVD" %} ++ ins_encode %{ ++ __ reduce($dst$$FloatRegister, $src$$FloatRegister, $vsrc$$FloatRegister, $tmp$$FloatRegister, ++ T_DOUBLE, this->ideal_Opcode(), Matcher::vector_length_in_bytes(this, $vsrc)); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ------------------------------ Vector Round --------------------------------- ++ ++instruct round_float_lsx(vReg dst, vReg src, vReg vtemp1, vReg vtemp2) %{ ++ predicate(Matcher::vector_length_in_bytes(n) <= 16); ++ match(Set dst (RoundVF src)); ++ effect(TEMP_DEF dst, TEMP vtemp1, TEMP vtemp2); ++ format %{ "round_float_lsx $dst, $src\t# " ++ "TEMP($vtemp1, $vtemp2) @round_float_lsx" %} ++ ins_encode %{ ++ __ java_round_float_lsx($dst$$FloatRegister, ++ $src$$FloatRegister, ++ $vtemp1$$FloatRegister, ++ $vtemp2$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct round_float_lasx(vReg dst, vReg src, vReg vtemp1, vReg vtemp2) %{ ++ predicate(Matcher::vector_length_in_bytes(n) > 16); ++ match(Set dst (RoundVF src)); ++ effect(TEMP_DEF dst, TEMP vtemp1, TEMP vtemp2); ++ format %{ "round_float_lasx $dst, $src\t# " ++ "TEMP($vtemp1, $vtemp2) @round_float_lasx" %} ++ ins_encode %{ ++ __ java_round_float_lasx($dst$$FloatRegister, ++ $src$$FloatRegister, ++ $vtemp1$$FloatRegister, ++ $vtemp2$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct round_double_lsx(vReg dst, vReg src, vReg vtemp1, vReg vtemp2) %{ ++ predicate(Matcher::vector_length_in_bytes(n) <= 16); ++ match(Set dst (RoundVD src)); ++ effect(TEMP_DEF dst, TEMP vtemp1, TEMP vtemp2); ++ format %{ "round_double_lsx $dst, $src\t# " ++ "TEMP($vtemp1, $vtemp2) @round_double_lsx" %} ++ ins_encode %{ ++ __ java_round_double_lsx($dst$$FloatRegister, ++ $src$$FloatRegister, ++ $vtemp1$$FloatRegister, ++ $vtemp2$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct round_double_lasx(vReg dst, vReg src, vReg vtemp1, vReg vtemp2) %{ ++ predicate(Matcher::vector_length_in_bytes(n) > 16); ++ match(Set dst (RoundVD src)); ++ effect(TEMP_DEF dst, TEMP vtemp1, TEMP vtemp2); ++ format %{ "round_double_lasx $dst, $src\t# " ++ "TEMP($vtemp1, $vtemp2) @round_double_lasx" %} ++ ins_encode %{ ++ __ java_round_double_lasx($dst$$FloatRegister, ++ $src$$FloatRegister, ++ $vtemp1$$FloatRegister, ++ $vtemp2$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ------------------------------ RoundDoubleModeV ---------------------------- ++ ++instruct roundVD(vReg dst, vReg src, immI rmode) %{ ++ match(Set dst (RoundDoubleModeV src rmode)); ++ format %{ "(x)vfrint $dst, $src, $rmode\t# @roundVD" %} ++ ins_encode %{ ++ if (Matcher::vector_length(this) == 4) { ++ switch ($rmode$$constant) { ++ case RoundDoubleModeNode::rmode_rint: __ xvfrintrne_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ case RoundDoubleModeNode::rmode_floor: __ xvfrintrm_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ case RoundDoubleModeNode::rmode_ceil: __ xvfrintrp_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ } ++ } else if (Matcher::vector_length(this) == 2) { ++ switch ($rmode$$constant) { ++ case RoundDoubleModeNode::rmode_rint: __ vfrintrne_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ case RoundDoubleModeNode::rmode_floor: __ vfrintrm_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ case RoundDoubleModeNode::rmode_ceil: __ vfrintrp_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( 
pipe_slow ); ++%} ++ ++// ---------------------------- Vector Cast B2X ------------------------------- ++ ++instruct cvtVB(vReg dst, vReg src) %{ ++ match(Set dst (VectorCastB2X src)); ++ format %{ "(x)vconvert $dst, $src\t# @cvtVB" %} ++ ins_encode %{ ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_SHORT : __ vext2xv_h_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_INT : __ vext2xv_w_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_LONG : __ vext2xv_d_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_FLOAT : __ vext2xv_w_b($dst$$FloatRegister, $src$$FloatRegister); ++ Matcher::vector_length_in_bytes(this) > 16 ? ++ __ xvffint_s_w($dst$$FloatRegister, $dst$$FloatRegister) : ++ __ vffint_s_w($dst$$FloatRegister, $dst$$FloatRegister); break; ++ case T_DOUBLE: __ vext2xv_d_b($dst$$FloatRegister, $src$$FloatRegister); ++ assert(Matcher::vector_length_in_bytes(this) > 16, "only support 4Bto4D"); ++ __ xvffint_d_l($dst$$FloatRegister, $dst$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------- Vector Cast S2X -------------------------------- ++ ++instruct cvtVS(vReg dst, vReg src) %{ ++ match(Set dst (VectorCastS2X src)); ++ format %{ "(x)vconvert $dst, $src\t# @cvtVS" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this, $src) > 16 && Matcher::vector_element_basic_type(this) == T_BYTE) { ++ __ xvpermi_q(fscratch, $src$$FloatRegister, 0x00); ++ __ xvpermi_q($dst$$FloatRegister, $src$$FloatRegister, 0x11); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : Matcher::vector_length_in_bytes(this, $src) > 16 ? ++ __ vsrlni_b_h($dst$$FloatRegister, fscratch, 0) : ++ __ vsrlni_b_h($dst$$FloatRegister, $src$$FloatRegister, 0); break; ++ case T_INT : __ vext2xv_w_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_LONG : __ vext2xv_d_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_FLOAT : __ vext2xv_w_h($dst$$FloatRegister, $src$$FloatRegister); ++ Matcher::vector_length_in_bytes(this) > 16 ? ++ __ xvffint_s_w($dst$$FloatRegister, $dst$$FloatRegister) : ++ __ vffint_s_w($dst$$FloatRegister, $dst$$FloatRegister); break; ++ case T_DOUBLE: __ vext2xv_d_h($dst$$FloatRegister, $src$$FloatRegister); ++ Matcher::vector_length_in_bytes(this) > 16 ? ++ __ xvffint_d_l($dst$$FloatRegister, $dst$$FloatRegister) : ++ __ vffint_d_l($dst$$FloatRegister, $dst$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// --------------------------- Vector Cast I2X -------------------------------- ++ ++instruct cvtVI(vReg dst, vReg src) %{ ++ match(Set dst (VectorCastI2X src)); ++ format %{ "(x)vconvert $dst, $src\t# @cvtVI" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this, $src) > 16 && type2aelembytes(Matcher::vector_element_basic_type(this)) < 4) { ++ __ xvpermi_q(fscratch, $src$$FloatRegister, 0x00); ++ __ xvpermi_q($dst$$FloatRegister, $src$$FloatRegister, 0x11); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : if (Matcher::vector_length_in_bytes(this, $src) > 16) { ++ __ vsrlni_h_w($dst$$FloatRegister, fscratch, 0); ++ __ vsrlni_b_h($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ } else { ++ __ vsrlni_h_w($dst$$FloatRegister, $src$$FloatRegister, 0); ++ __ vsrlni_b_h($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ } ++ break; ++ case T_SHORT : Matcher::vector_length_in_bytes(this, $src) > 16 ? 
++ __ vsrlni_h_w($dst$$FloatRegister, fscratch, 0) : ++ __ vsrlni_h_w($dst$$FloatRegister, $src$$FloatRegister, 0); break; ++ case T_LONG : __ vext2xv_d_w($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_FLOAT : Matcher::vector_length_in_bytes(this) > 16 ? ++ __ xvffint_s_w($dst$$FloatRegister, $src$$FloatRegister) : ++ __ vffint_s_w($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_DOUBLE: __ vext2xv_d_w($dst$$FloatRegister, $src$$FloatRegister); ++ Matcher::vector_length_in_bytes(this) > 16 ? ++ __ xvffint_d_l($dst$$FloatRegister, $dst$$FloatRegister) : ++ __ vffint_d_l($dst$$FloatRegister, $dst$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------------- Vector Cast L2X ------------------------------ ++ ++instruct cvtVL(vReg dst, vReg src) %{ ++ match(Set dst (VectorCastL2X src)); ++ format %{ "(x)vconvert $dst, $src\t# @cvtVL" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this, $src) > 16 && type2aelembytes(Matcher::vector_element_basic_type(this)) < 8) { ++ __ xvpermi_q(fscratch, $src$$FloatRegister, 0x00); ++ __ xvpermi_q($dst$$FloatRegister, $src$$FloatRegister, 0x11); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : assert(Matcher::vector_length_in_bytes(this, $src) > 16, "only support 4Lto4B"); ++ __ vsrlni_w_d($dst$$FloatRegister, fscratch, 0); ++ __ vsrlni_h_w($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ __ vsrlni_b_h($dst$$FloatRegister, $dst$$FloatRegister, 0); break; ++ case T_SHORT : if (Matcher::vector_length_in_bytes(this, $src) > 16) { ++ __ vsrlni_w_d($dst$$FloatRegister, fscratch, 0); ++ __ vsrlni_h_w($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ } else { ++ __ vsrlni_w_d($dst$$FloatRegister, $src$$FloatRegister, 0); ++ __ vsrlni_h_w($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ } ++ break; ++ case T_INT : Matcher::vector_length_in_bytes(this, $src) > 16 ? ++ __ vsrlni_w_d($dst$$FloatRegister, fscratch, 0) : ++ __ vsrlni_w_d($dst$$FloatRegister, $src$$FloatRegister, 0); break; ++ case T_FLOAT : Matcher::vector_length_in_bytes(this, $src) > 16 ? ++ __ vffint_s_l($dst$$FloatRegister, $dst$$FloatRegister, fscratch) : ++ __ vffint_s_l($dst$$FloatRegister, $src$$FloatRegister, $src$$FloatRegister); break; ++ case T_DOUBLE: Matcher::vector_length_in_bytes(this) > 16 ? 
++ __ xvffint_d_l($dst$$FloatRegister, $src$$FloatRegister) : ++ __ vffint_d_l($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------------- Vector Cast F2X ------------------------------ ++ ++instruct cvtVF(vReg dst, vReg src) %{ ++ match(Set dst (VectorCastF2X src)); ++ format %{ "(x)vconvert $dst, $src\t# @cvtVF" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this, $src) > 16 && type2aelembytes(Matcher::vector_element_basic_type(this)) < 4) { ++ __ xvftintrz_w_s(fscratch, $src$$FloatRegister); ++ __ xvpermi_q($dst$$FloatRegister, fscratch, 0x11); ++ } else if (Matcher::vector_length_in_bytes(this) > 16 && type2aelembytes(Matcher::vector_element_basic_type(this)) > 4) { ++ __ xvpermi_d($dst$$FloatRegister, $src$$FloatRegister, 0b01010000); ++ } else if (Matcher::vector_length_in_bytes(this, $src) <= 16 && type2aelembytes(Matcher::vector_element_basic_type(this)) < 4) { ++ __ vftintrz_w_s($dst$$FloatRegister, $src$$FloatRegister); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : if (Matcher::vector_length_in_bytes(this, $src) > 16) { ++ __ vsrlni_h_w($dst$$FloatRegister, fscratch, 0); ++ __ vsrlni_b_h($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ } else { ++ __ vsrlni_h_w($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ __ vsrlni_b_h($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ } ++ break; ++ case T_SHORT : Matcher::vector_length_in_bytes(this, $src) > 16 ? ++ __ vsrlni_h_w($dst$$FloatRegister, fscratch, 0) : ++ __ vsrlni_h_w($dst$$FloatRegister, $dst$$FloatRegister, 0); break; ++ case T_INT : Matcher::vector_length_in_bytes(this, $src) > 16 ? ++ __ xvftintrz_w_s($dst$$FloatRegister, $src$$FloatRegister) : ++ __ vftintrz_w_s($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_LONG : Matcher::vector_length_in_bytes(this) > 16 ? ++ __ xvftintrzl_l_s($dst$$FloatRegister, $dst$$FloatRegister) : ++ __ vftintrzl_l_s($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_DOUBLE: Matcher::vector_length_in_bytes(this) > 16 ? 
++ __ xvfcvtl_d_s($dst$$FloatRegister, $dst$$FloatRegister) : ++ __ vfcvtl_d_s($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ---------------------------- Vector Cast D2X ------------------------------- ++ ++instruct cvtVD(vReg dst, vReg src) %{ ++ match(Set dst (VectorCastD2X src)); ++ format %{ "(x)vconvert $dst, $src\t# @cvtVD" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this, $src) > 16 && type2aelembytes(Matcher::vector_element_basic_type(this)) < 8) { ++ __ xvpermi_q(fscratch, $src$$FloatRegister, 0x11); ++ if (Matcher::vector_element_basic_type(this) != T_FLOAT) ++ __ vftintrz_w_d($dst$$FloatRegister, fscratch, $src$$FloatRegister); ++ } else if (Matcher::vector_length_in_bytes(this, $src) <= 16 && type2aelembytes(Matcher::vector_element_basic_type(this)) < 8) { ++ if (Matcher::vector_element_basic_type(this) != T_FLOAT) ++ __ vftintrz_w_d($dst$$FloatRegister, $src$$FloatRegister, $src$$FloatRegister); ++ } ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : assert(Matcher::vector_length_in_bytes(this, $src) > 16, "only support 4Dto4B"); ++ __ vsrlni_h_w($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ __ vsrlni_b_h($dst$$FloatRegister, $dst$$FloatRegister, 0); break; ++ case T_SHORT: __ vsrlni_h_w($dst$$FloatRegister, $dst$$FloatRegister, 0); break; ++ case T_INT : break; ++ case T_LONG : Matcher::vector_length_in_bytes(this) > 16 ? ++ __ xvftintrz_l_d($dst$$FloatRegister, $src$$FloatRegister) : ++ __ vftintrz_l_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_FLOAT: Matcher::vector_length_in_bytes(this, $src) > 16 ? ++ __ vfcvt_s_d($dst$$FloatRegister, fscratch, $src$$FloatRegister) : ++ __ vfcvt_s_d($dst$$FloatRegister, $src$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ---------------------------- Vector Cast HF2F ------------------------------- ++ ++instruct cvtHFtoF(vReg dst, vReg src) %{ ++ match(Set dst (VectorCastHF2F src)); ++ format %{ "(x)vconvert $dst, $src\t# @cvtHFtoF" %} ++ ins_encode %{ ++ switch(Matcher::vector_length(this)) { ++ case 2: ++ case 4: __ vfcvtl_s_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ case 8: __ xvpermi_d($dst$$FloatRegister, $src$$FloatRegister, 0b01010000); ++ __ xvfcvtl_s_h($dst$$FloatRegister, $dst$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ---------------------------- Vector Cast F2HF ------------------------------- ++ ++instruct cvtFtoHF(vReg dst, vReg src) %{ ++ match(Set dst (VectorCastF2HF src)); ++ format %{ "(x)vconvert $dst, $src\t# @cvtFtoHF" %} ++ ins_encode %{ ++ switch(Matcher::vector_length(this)) { ++ case 2: ++ case 4: __ vfcvt_h_s($dst$$FloatRegister, $src$$FloatRegister, $src$$FloatRegister); break; ++ case 8: __ xvfcvt_h_s($dst$$FloatRegister, $src$$FloatRegister, $src$$FloatRegister); ++ __ xvpermi_d($dst$$FloatRegister, $dst$$FloatRegister, 0b11011000); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ------------------------------ VectorReinterpret --------------------------- ++ ++instruct reinterpretV(vReg dst, vReg src) %{ ++ match(Set dst (VectorReinterpret src)); ++ format %{ "(x)vreinterpret $dst, $src\t# @reinterpretV" %} ++ ins_encode %{ ++ uint length_in_bytes_src = Matcher::vector_length_in_bytes(this, $src); ++ uint length_in_bytes_dst = 
Matcher::vector_length_in_bytes(this); ++ if ($dst$$FloatRegister != $src$$FloatRegister) { ++ if (length_in_bytes_dst > 16) ++ __ xvori_b($dst$$FloatRegister, $src$$FloatRegister, 0); ++ else ++ __ vori_b($dst$$FloatRegister, $src$$FloatRegister, 0); ++ } ++ if (length_in_bytes_dst > length_in_bytes_src) { ++ if (length_in_bytes_dst == 32) { ++ switch (length_in_bytes_src) { ++ case 4: __ xvinsgr2vr_w($dst$$FloatRegister, R0, 1); ++ case 8: __ xvinsgr2vr_d($dst$$FloatRegister, R0, 1); ++ case 16: __ xvinsgr2vr_d($dst$$FloatRegister, R0, 2); ++ __ xvinsgr2vr_d($dst$$FloatRegister, R0, 3); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (length_in_bytes_dst == 16) { ++ switch (length_in_bytes_src) { ++ case 4: __ vinsgr2vr_w($dst$$FloatRegister, R0, 1); ++ case 8: __ vinsgr2vr_d($dst$$FloatRegister, R0, 1); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (length_in_bytes_dst == 8) { ++ assert(length_in_bytes_src == 4, "invalid vector length"); ++ __ vinsgr2vr_w($dst$$FloatRegister, R0, 1); ++ } else { ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( empty ); ++%} ++ ++// ------------------------------ VectorInsert -------------------------------- ++ ++instruct insertV(vReg dst, mRegI val, immIU4 idx) %{ ++ predicate(Matcher::vector_length_in_bytes(n) <= 16); ++ match(Set dst (VectorInsert (Binary dst val) idx)); ++ format %{ "(x)vinsgr2vr $dst, $val, $idx\t# @insertV" %} ++ ins_encode %{ ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vinsgr2vr_b($dst$$FloatRegister, $val$$Register, $idx$$constant); break; ++ case T_SHORT: __ vinsgr2vr_h($dst$$FloatRegister, $val$$Register, $idx$$constant); break; ++ case T_INT : __ vinsgr2vr_w($dst$$FloatRegister, $val$$Register, $idx$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct insertVL(vReg dst, mRegL val, immIU2 idx) %{ ++ match(Set dst (VectorInsert (Binary dst val) idx)); ++ format %{ "(x)vinsgr2vr.d $dst, $val, $idx\t# @insertVL" %} ++ ins_encode %{ ++ switch (Matcher::vector_length(this)) { ++ case 2: __ vinsgr2vr_d($dst$$FloatRegister, $val$$Register, $idx$$constant); break; ++ case 4: __ xvinsgr2vr_d($dst$$FloatRegister, $val$$Register, $idx$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct insertVF(vReg dst, regF val, immIU3 idx) %{ ++ match(Set dst (VectorInsert (Binary dst val) idx)); ++ format %{ "(x)vinsert $dst, $val, $idx\t# @insertVF" %} ++ ins_encode %{ ++ switch (Matcher::vector_length(this)) { ++ case 2: ++ case 4: __ movfr2gr_s(AT, $val$$FloatRegister); ++ __ vinsgr2vr_w($dst$$FloatRegister, AT, $idx$$constant); break; ++ case 8: __ xvinsve0_w($dst$$FloatRegister, $val$$FloatRegister, $idx$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct insertVD(vReg dst, regD val, immIU2 idx) %{ ++ match(Set dst (VectorInsert (Binary dst val) idx)); ++ format %{ "(x)vinsert $dst, $val, $idx\t# @insertVD" %} ++ ins_encode %{ ++ switch (Matcher::vector_length(this)) { ++ case 2: __ movfr2gr_d(AT, $val$$FloatRegister); ++ __ vinsgr2vr_d($dst$$FloatRegister, AT, $idx$$constant); break; ++ case 4: __ xvinsve0_d($dst$$FloatRegister, $val$$FloatRegister, $idx$$constant); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct insert32B(vReg dst, mRegI val, immIU5 idx) %{ ++ predicate(Matcher::vector_length(n) == 32 && Matcher::vector_element_basic_type(n) == 
T_BYTE); ++ match(Set dst (VectorInsert (Binary dst val) idx)); ++ format %{ "(x)vinsert $dst, $val, $idx\t# @insert32B" %} ++ ins_encode %{ ++ int idx = $idx$$constant; ++ int msbw, lsbw; ++ switch (idx % 4) { ++ case 0: msbw = 7, lsbw = 0; break; ++ case 1: msbw = 15, lsbw = 8; break; ++ case 2: msbw = 23, lsbw = 16; break; ++ case 3: msbw = 31, lsbw = 24; break; ++ default: ++ ShouldNotReachHere(); ++ } ++ __ xvpickve2gr_w(SCR1, $dst$$FloatRegister, idx >> 2); ++ __ bstrins_w(SCR1, $val$$Register, msbw, lsbw); ++ __ xvinsgr2vr_w($dst$$FloatRegister, SCR1, idx >> 2); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct insert16S(vReg dst, mRegI val, immIU4 idx) %{ ++ predicate(Matcher::vector_length(n) == 16 && Matcher::vector_element_basic_type(n) == T_SHORT); ++ match(Set dst (VectorInsert (Binary dst val) idx)); ++ format %{ "(x)vinsert $dst, $val, $idx\t# @insert16S" %} ++ ins_encode %{ ++ int idx = $idx$$constant; ++ int msbw = (idx % 2) ? 31 : 15; ++ int lsbw = (idx % 2) ? 16 : 0; ++ __ xvpickve2gr_w(SCR1, $dst$$FloatRegister, idx >> 1); ++ __ bstrins_w(SCR1, $val$$Register, msbw, lsbw); ++ __ xvinsgr2vr_w($dst$$FloatRegister, SCR1, idx >> 1); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct insert8I(vReg dst, mRegI val, immIU3 idx) %{ ++ predicate(Matcher::vector_length(n) == 8 && Matcher::vector_element_basic_type(n) == T_INT); ++ match(Set dst (VectorInsert (Binary dst val) idx)); ++ format %{ "(x)vinsgr2vr.w $dst, $val, $idx\t# @insert8I" %} ++ ins_encode %{ ++ __ xvinsgr2vr_w($dst$$FloatRegister, $val$$Register, $idx$$constant); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// -------------------------------- Vector Blend ------------------------------ ++ ++instruct blendV(vReg dst, vReg src1, vReg src2, vReg mask) ++%{ ++ match(Set dst (VectorBlend (Binary src1 src2) mask)); ++ format %{ "(x)vbitsel.v $dst, $src1, $src2, $mask\t# @blendV" %} ++ ins_encode %{ ++ switch (Matcher::vector_length_in_bytes(this)) { ++ case 4: ++ case 8: ++ case 16: __ vbitsel_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $mask$$FloatRegister); break; ++ case 32: __ xvbitsel_v($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, $mask$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// -------------------------------- LoadMask ---------------------------------- ++ ++instruct loadmaskV(vReg dst, vReg src) %{ ++ match(Set dst (VectorLoadMask src)); ++ format %{ "(x)vloadmask $dst, $src\t# @loadmaskV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvneg_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_SHORT : __ vext2xv_h_b($dst$$FloatRegister, $src$$FloatRegister); ++ __ xvneg_h($dst$$FloatRegister, $dst$$FloatRegister); break; ++ case T_FLOAT : ++ case T_INT : __ vext2xv_w_b($dst$$FloatRegister, $src$$FloatRegister); ++ __ xvneg_w($dst$$FloatRegister, $dst$$FloatRegister); break; ++ case T_DOUBLE: ++ case T_LONG : __ vext2xv_d_b($dst$$FloatRegister, $src$$FloatRegister); ++ __ xvneg_d($dst$$FloatRegister, $dst$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vneg_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_SHORT: __ vext2xv_h_b($dst$$FloatRegister, $src$$FloatRegister); ++ __ vneg_h($dst$$FloatRegister, $dst$$FloatRegister); break; ++ case T_FLOAT: ++ case T_INT : __ 
vext2xv_w_b($dst$$FloatRegister, $src$$FloatRegister); ++ __ vneg_w($dst$$FloatRegister, $dst$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++//-------------------------------- StoreMask ---------------------------------- ++ ++instruct storemaskV(vReg dst, vReg src, immIU4 size) %{ ++ match(Set dst (VectorStoreMask src size)); ++ format %{ "(x)vstoremask $dst, $src\t# @storemaskV" %} ++ ins_encode %{ ++ uint size = $size$$constant; ++ if (Matcher::vector_length_in_bytes(this, $src) > 16 && size != 1 /* byte */) ++ __ xvpermi_d(fscratch, $src$$FloatRegister, 0b00001110); ++ ++ switch (size) { ++ case 8: /* long or double */ ++ __ vsrlni_w_d(fscratch, $src$$FloatRegister, 0); ++ __ vsrlni_h_w(fscratch, fscratch, 0); ++ __ vsrlni_b_h(fscratch, fscratch, 0); ++ __ vneg_b($dst$$FloatRegister, fscratch); ++ break; ++ case 4: /* int or float */ ++ __ vsrlni_h_w(fscratch, $src$$FloatRegister, 0); ++ __ vsrlni_b_h(fscratch, fscratch, 0); ++ __ vneg_b($dst$$FloatRegister, fscratch); ++ break; ++ case 2: /* short */ ++ __ vsrlni_b_h(fscratch, $src$$FloatRegister, 0); ++ __ vneg_b($dst$$FloatRegister, fscratch); ++ break; ++ case 1: /* byte */ ++ if (Matcher::vector_length_in_bytes(this, $src) > 16) ++ __ xvneg_b($dst$$FloatRegister, $src$$FloatRegister); ++ else ++ __ vneg_b($dst$$FloatRegister, $src$$FloatRegister); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------------- VectorMaskCast ----------------------------------- ++ ++instruct vmaskcast_eq(vReg dst) %{ ++ predicate(Matcher::vector_length_in_bytes(n) == Matcher::vector_length_in_bytes(n->in(1))); ++ match(Set dst (VectorMaskCast dst)); ++ format %{ "(x)vmaskcast $dst\t# @vmaskcast_eq" %} ++ ins_encode(/* empty encoding */); ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct vmaskcast_gt(vReg dst, vReg src) %{ ++ predicate(Matcher::vector_length_in_bytes(n) > Matcher::vector_length_in_bytes(n->in(1))); ++ match(Set dst (VectorMaskCast src)); ++ format %{ "(x)vmaskcast $dst\t# @vmaskcast_gt" %} ++ ins_encode %{ ++ if (Matcher::vector_element_basic_type(this, $src) == T_BYTE) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_SHORT : __ vext2xv_h_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_FLOAT : ++ case T_INT : __ vext2xv_w_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_DOUBLE: ++ case T_LONG : __ vext2xv_d_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (Matcher::vector_element_basic_type(this, $src) == T_SHORT) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_FLOAT : ++ case T_INT : __ vext2xv_w_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_DOUBLE: ++ case T_LONG : __ vext2xv_d_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else if (type2aelembytes(Matcher::vector_element_basic_type(this, $src)) == 4 /* int or float */) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_DOUBLE: ++ case T_LONG : __ vext2xv_d_w($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct vmaskcast_lt(vReg dst, vReg src) %{ ++ predicate(Matcher::vector_length_in_bytes(n) < Matcher::vector_length_in_bytes(n->in(1))); ++ match(Set dst (VectorMaskCast src)); ++ effect(TEMP_DEF dst); ++ format %{ "(x)vmaskcast 
$dst\t# @vmaskcast_lt" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this, $src) > 16) ++ __ xvpermi_d($dst$$FloatRegister, $src$$FloatRegister, 0b00001110); ++ ++ if (type2aelembytes(Matcher::vector_element_basic_type(this, $src)) == 8 /* long or double */) { ++ if (type2aelembytes(Matcher::vector_element_basic_type(this)) <= 4) { ++ __ vsrlni_w_d($dst$$FloatRegister, $src$$FloatRegister, 0); ++ if (type2aelembytes(Matcher::vector_element_basic_type(this)) <= 2) { ++ __ vsrlni_h_w($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ if (type2aelembytes(Matcher::vector_element_basic_type(this)) == 1) { ++ __ vsrlni_b_h($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ } ++ } ++ } ++ } else if (type2aelembytes(Matcher::vector_element_basic_type(this, $src)) == 4 /* int or float */) { ++ if (type2aelembytes(Matcher::vector_element_basic_type(this)) <= 2) { ++ __ vsrlni_h_w($dst$$FloatRegister, $src$$FloatRegister, 0); ++ if (type2aelembytes(Matcher::vector_element_basic_type(this)) == 1) { ++ __ vsrlni_b_h($dst$$FloatRegister, $dst$$FloatRegister, 0); ++ } ++ } ++ } else if (Matcher::vector_element_basic_type(this, $src) == T_SHORT) { ++ __ vsrlni_b_h($dst$$FloatRegister, $src$$FloatRegister, 0); ++ } else { ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------------- VectorTest ----------------------------------- ++ ++instruct anytrue_in_maskV16_branch(cmpOp cop, vReg op1, vReg op2, label labl) %{ ++ predicate(Matcher::vector_length_in_bytes(n->in(2)->in(1)) == 16 && static_cast<const VectorTestNode*>(n->in(2))->get_predicate() == BoolTest::ne); ++ match(If cop (VectorTest op1 op2)); ++ effect(USE labl); ++ format %{ "b$cop $op1, $op2(not used), $labl\t# @anytrue_in_maskV16_branch" %} ++ ++ ins_encode %{ ++ Label &L = *($labl$$label); ++ // No need to use op2, op2 is all ones. ++ __ vseteqz_v(FCC0, $op1$$FloatRegister); ++ switch($cop$$cmpcode) { ++ case 0x01: // EQ ++ __ bcnez(FCC0, L); ++ break; ++ case 0x02: // NE ++ __ bceqz(FCC0, L); ++ break; ++ default: ++ Unimplemented(); ++ } ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct alltrue_in_maskV16_branch(cmpOp cop, vReg op1, vReg op2, label labl) %{ ++ predicate(Matcher::vector_length_in_bytes(n->in(2)->in(1)) == 16 && static_cast<const VectorTestNode*>(n->in(2))->get_predicate() == BoolTest::overflow); ++ match(If cop (VectorTest op1 op2)); ++ effect(USE labl); ++ format %{ "b$cop $op1, $op2(not used), $labl\t# @alltrue_in_maskV16_branch" %} ++ ++ ins_encode %{ ++ Label &L = *($labl$$label); ++ // No need to use op2, op2 is all ones. ++ __ vsetallnez_b(FCC0, $op1$$FloatRegister); ++ switch($cop$$cmpcode) { ++ case 0x01: // EQ ++ __ bcnez(FCC0, L); ++ break; ++ case 0x02: // NE ++ __ bceqz(FCC0, L); ++ break; ++ default: ++ Unimplemented(); ++ } ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct CMoveI_anytrue_in_maskV16(mRegI dst, mRegIorL2I src1, mRegIorL2I src2, vReg op1, vReg op2, cmpOp cop, regF tmp1, regF tmp2) ++%{ ++ predicate(Matcher::vector_length_in_bytes(n->in(1)->in(2)->in(1)) == 16 && static_cast<const VectorTestNode*>(n->in(1)->in(2))->get_predicate() == BoolTest::ne); ++ match(Set dst (CMoveI (Binary cop (VectorTest op1 op2)) (Binary src1 src2))); ++ effect(TEMP tmp1, TEMP tmp2); ++ format %{ "cmovei_vtest($cop) $dst, $src1, $src2, $op1, $op2(not used)\t# TEMP($tmp1, $tmp2) @CMoveI_anytrue_in_maskV16" %} ++ ins_encode %{ ++ // No need to use op2, op2 is all ones.
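++ // vseteqz.v sets FCC0 only when $op1 is entirely zero, i.e. when no mask lane is active; the fsel below then selects src1 or src2 accordingly.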
++ __ vseteqz_v(FCC0, $op1$$FloatRegister); ++ __ movgr2fr_w($tmp1$$FloatRegister, $src1$$Register); ++ __ movgr2fr_w($tmp2$$FloatRegister, $src2$$Register); ++ switch($cop$$cmpcode) { ++ case 0x01: // EQ ++ __ fsel($tmp1$$FloatRegister, $tmp1$$FloatRegister, $tmp2$$FloatRegister, FCC0); ++ break; ++ case 0x02: // NE ++ __ fsel($tmp1$$FloatRegister, $tmp2$$FloatRegister, $tmp1$$FloatRegister, FCC0); ++ break; ++ default: ++ Unimplemented(); ++ } ++ __ movfr2gr_s($dst$$Register, $tmp1$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct CMoveI_alltrue_in_maskV16(mRegI dst, mRegIorL2I src1, mRegIorL2I src2, vReg op1, vReg op2, cmpOp cop, regF tmp1, regF tmp2) ++%{ ++ predicate(Matcher::vector_length_in_bytes(n->in(1)->in(2)->in(1)) == 16 && static_cast<const VectorTestNode*>(n->in(1)->in(2))->get_predicate() == BoolTest::overflow); ++ match(Set dst (CMoveI (Binary cop (VectorTest op1 op2)) (Binary src1 src2))); ++ effect(TEMP tmp1, TEMP tmp2); ++ format %{ "cmovei_vtest($cop) $dst, $src1, $src2, $op1, $op2(not used)\t# TEMP($tmp1, $tmp2) @CMoveI_alltrue_in_maskV16" %} ++ ins_encode %{ ++ // No need to use op2, op2 is all ones. ++ __ vsetallnez_b(FCC0, $op1$$FloatRegister); ++ __ movgr2fr_w($tmp1$$FloatRegister, $src1$$Register); ++ __ movgr2fr_w($tmp2$$FloatRegister, $src2$$Register); ++ switch($cop$$cmpcode) { ++ case 0x01: // EQ ++ __ fsel($tmp1$$FloatRegister, $tmp1$$FloatRegister, $tmp2$$FloatRegister, FCC0); ++ break; ++ case 0x02: // NE ++ __ fsel($tmp1$$FloatRegister, $tmp2$$FloatRegister, $tmp1$$FloatRegister, FCC0); ++ break; ++ default: ++ Unimplemented(); ++ } ++ __ movfr2gr_s($dst$$Register, $tmp1$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct anytrue_in_maskV32_branch(cmpOp cop, vReg op1, vReg op2, label labl) %{ ++ predicate(Matcher::vector_length_in_bytes(n->in(2)->in(1)) == 32 && static_cast<const VectorTestNode*>(n->in(2))->get_predicate() == BoolTest::ne); ++ match(If cop (VectorTest op1 op2)); ++ effect(USE labl); ++ format %{ "b$cop $op1, $op2(not used), $labl\t# @anytrue_in_maskV32_branch" %} ++ ++ ins_encode %{ ++ Label &L = *($labl$$label); ++ // No need to use op2, op2 is all ones. ++ __ xvseteqz_v(FCC0, $op1$$FloatRegister); ++ switch($cop$$cmpcode) { ++ case 0x01: // EQ ++ __ bcnez(FCC0, L); ++ break; ++ case 0x02: // NE ++ __ bceqz(FCC0, L); ++ break; ++ default: ++ Unimplemented(); ++ } ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct alltrue_in_maskV32_branch(cmpOp cop, vReg op1, vReg op2, label labl) %{ ++ predicate(Matcher::vector_length_in_bytes(n->in(2)->in(1)) == 32 && static_cast<const VectorTestNode*>(n->in(2))->get_predicate() == BoolTest::overflow); ++ match(If cop (VectorTest op1 op2)); ++ effect(USE labl); ++ format %{ "b$cop $op1, $op2(not used), $labl\t# @alltrue_in_maskV32_branch" %} ++ ++ ins_encode %{ ++ Label &L = *($labl$$label); ++ // No need to use op2, op2 is all ones.
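++ // xvsetallnez.b sets FCC0 only when every byte of $op1 is non-zero, which for an all-ones/all-zero mask means every lane is active.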
++ __ xvsetallnez_b(FCC0, $op1$$FloatRegister); ++ switch($cop$$cmpcode) { ++ case 0x01: // EQ ++ __ bcnez(FCC0, L); ++ break; ++ case 0x02: // NE ++ __ bceqz(FCC0, L); ++ break; ++ default: ++ Unimplemented(); ++ } ++ %} ++ ++ ins_pc_relative(1); ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct CMoveI_anytrue_in_maskV32(mRegI dst, mRegIorL2I src1, mRegIorL2I src2, vReg op1, vReg op2, cmpOp cop, regF tmp1, regF tmp2) ++%{ ++ predicate(Matcher::vector_length_in_bytes(n->in(1)->in(2)->in(1)) == 32 && static_cast<const VectorTestNode*>(n->in(1)->in(2))->get_predicate() == BoolTest::ne); ++ match(Set dst (CMoveI (Binary cop (VectorTest op1 op2)) (Binary src1 src2))); ++ effect(TEMP tmp1, TEMP tmp2); ++ format %{ "cmovei_xvtest($cop) $dst, $src1, $src2, $op1, $op2(not used)\t# TEMP($tmp1, $tmp2) @CMoveI_anytrue_in_maskV32" %} ++ ins_encode %{ ++ // No need to use op2, op2 is all ones. ++ __ xvseteqz_v(FCC0, $op1$$FloatRegister); ++ __ movgr2fr_w($tmp1$$FloatRegister, $src1$$Register); ++ __ movgr2fr_w($tmp2$$FloatRegister, $src2$$Register); ++ switch($cop$$cmpcode) { ++ case 0x01: // EQ ++ __ fsel($tmp1$$FloatRegister, $tmp1$$FloatRegister, $tmp2$$FloatRegister, FCC0); ++ break; ++ case 0x02: // NE ++ __ fsel($tmp1$$FloatRegister, $tmp2$$FloatRegister, $tmp1$$FloatRegister, FCC0); ++ break; ++ default: ++ Unimplemented(); ++ } ++ __ movfr2gr_s($dst$$Register, $tmp1$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct CMoveI_alltrue_in_maskV32(mRegI dst, mRegIorL2I src1, mRegIorL2I src2, vReg op1, vReg op2, cmpOp cop, regF tmp1, regF tmp2) ++%{ ++ predicate(Matcher::vector_length_in_bytes(n->in(1)->in(2)->in(1)) == 32 && static_cast<const VectorTestNode*>(n->in(1)->in(2))->get_predicate() == BoolTest::overflow); ++ match(Set dst (CMoveI (Binary cop (VectorTest op1 op2)) (Binary src1 src2))); ++ effect(TEMP tmp1, TEMP tmp2); ++ format %{ "cmovei_xvtest($cop) $dst, $src1, $src2, $op1, $op2(not used)\t# TEMP($tmp1, $tmp2) @CMoveI_alltrue_in_maskV32" %} ++ ins_encode %{ ++ // No need to use op2, op2 is all ones. ++ __ xvsetallnez_b(FCC0, $op1$$FloatRegister); ++ __ movgr2fr_w($tmp1$$FloatRegister, $src1$$Register); ++ __ movgr2fr_w($tmp2$$FloatRegister, $src2$$Register); ++ switch($cop$$cmpcode) { ++ case 0x01: // EQ ++ __ fsel($tmp1$$FloatRegister, $tmp1$$FloatRegister, $tmp2$$FloatRegister, FCC0); ++ break; ++ case 0x02: // NE ++ __ fsel($tmp1$$FloatRegister, $tmp2$$FloatRegister, $tmp1$$FloatRegister, FCC0); ++ break; ++ default: ++ Unimplemented(); ++ } ++ __ movfr2gr_s($dst$$Register, $tmp1$$FloatRegister); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------------- VectorMaskTrueCount ---------------------------- ++ ++instruct mask_truecountV(mRegI dst, vReg src, vReg tmp) %{ ++ match(Set dst (VectorMaskTrueCount src)); ++ effect(TEMP tmp); ++ format %{ "(x)vmask_truecount $dst, $src\t# TEMP($tmp) @mask_truecountV" %} ++ ins_encode %{ ++ // Input "src" is a vector of boolean represented as bytes with ++ // 0x00/0x01 as element values.
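++ // Each active lane contributes exactly one set bit, so the true count equals the population count of the mask register, accumulated across lanes below.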
++ if (Matcher::vector_length(this, $src) == 4) { ++ __ vpcnt_w(fscratch, $src$$FloatRegister); ++ __ vpickve2gr_b($dst$$Register, fscratch, 0); ++ } else if (Matcher::vector_length(this, $src) == 8) { ++ __ vpcnt_d(fscratch, $src$$FloatRegister); ++ __ vpickve2gr_b($dst$$Register, fscratch, 0); ++ } else if (Matcher::vector_length(this, $src) == 16) { ++ __ vpcnt_d(fscratch, $src$$FloatRegister); ++ __ vhaddw_q_d(fscratch, fscratch, fscratch); ++ __ vpickve2gr_b($dst$$Register, fscratch, 0); ++ } else if (Matcher::vector_length(this, $src) == 32) { ++ __ xvpcnt_d($tmp$$FloatRegister, $src$$FloatRegister); ++ __ xvhaddw_q_d($tmp$$FloatRegister, $tmp$$FloatRegister, $tmp$$FloatRegister); ++ __ xvpermi_d(fscratch, $tmp$$FloatRegister, 0b00001110); ++ __ vadd_b($tmp$$FloatRegister, $tmp$$FloatRegister, fscratch); ++ __ vpickve2gr_b($dst$$Register, $tmp$$FloatRegister, 0); ++ } else { ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------------- VectorMaskFirstTrue ---------------------------- ++ ++instruct mask_first_trueV(mRegI dst, vReg src, vReg tmp) %{ ++ match(Set dst (VectorMaskFirstTrue src)); ++ effect(TEMP tmp); ++ format %{ "(x)vmask_first_true $dst, $src\t# TEMP($tmp) @mask_first_trueV" %} ++ ins_encode %{ ++ // Returns the index of the first active lane of the ++ // vector mask, or 4/8/16/32 (VLENGTH) if no lane is active. ++ // ++ // Input "src" is a vector of boolean represented as ++ // bytes with 0x00/0x01 as element values. ++ ++ if (Matcher::vector_length(this, $src) == 4) { ++ __ movfr2gr_s($dst$$Register, $src$$FloatRegister); ++ __ ctz_w($dst$$Register, $dst$$Register); ++ __ srli_w($dst$$Register, $dst$$Register, 3); ++ } else if (Matcher::vector_length(this, $src) == 8) { ++ __ movfr2gr_d($dst$$Register, $src$$FloatRegister); ++ __ ctz_d($dst$$Register, $dst$$Register); ++ __ srli_w($dst$$Register, $dst$$Register, 3); ++ } else if (Matcher::vector_length(this, $src) == 16) { ++ __ vneg_b(fscratch, $src$$FloatRegister); ++ __ vfrstpi_b(fscratch, fscratch, 0); ++ __ vpickve2gr_b($dst$$Register, fscratch, 0); ++ } else if (Matcher::vector_length(this, $src) == 32) { ++ Label DONE; ++ __ xvneg_b($tmp$$FloatRegister, $src$$FloatRegister); ++ __ xvfrstpi_b($tmp$$FloatRegister, $tmp$$FloatRegister, 0); ++ __ xvpermi_q(fscratch, $tmp$$FloatRegister, 0x01); ++ __ vpickve2gr_b($dst$$Register, $tmp$$FloatRegister, 0); ++ __ li(AT, (long)16); ++ __ blt($dst$$Register, AT, DONE); ++ __ vpickve2gr_b($dst$$Register, fscratch, 0); ++ __ add_w($dst$$Register, $dst$$Register, AT); ++ __ bind(DONE); ++ } else { ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------------- VectorMaskLastTrue ---------------------------- ++ ++instruct mask_last_trueV(mRegI dst, vReg src) %{ ++ match(Set dst (VectorMaskLastTrue src)); ++ format %{ "(x)vmask_last_true $dst, $src\t# @mask_last_trueV" %} ++ ins_encode %{ ++ // Returns the index of the last active lane of the ++ // vector mask, or -1 if no lane is active. ++ // ++ // Input "src" is a vector of boolean represented as ++ // bytes with 0x00/0x01 as element values. 
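++ // The last-true index is recovered with a count-leading-zeros over the mask bits; an all-zero mask gives clz equal to the register width, which the arithmetic below maps to -1.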
++ ++ if (Matcher::vector_length(this, $src) == 4) { ++ __ movfr2gr_s($dst$$Register, $src$$FloatRegister); ++ __ clz_w($dst$$Register, $dst$$Register); ++ __ srli_w($dst$$Register, $dst$$Register, 3); ++ __ addi_w($dst$$Register, $dst$$Register, -3); ++ __ sub_w($dst$$Register, R0, $dst$$Register); ++ } else if (Matcher::vector_length(this, $src) == 8) { ++ __ movfr2gr_d($dst$$Register, $src$$FloatRegister); ++ __ clz_d($dst$$Register, $dst$$Register); ++ __ srli_w($dst$$Register, $dst$$Register, 3); ++ __ addi_w($dst$$Register, $dst$$Register, -7); ++ __ sub_w($dst$$Register, R0, $dst$$Register); ++ } else if (Matcher::vector_length(this, $src) == 16) { ++ Label FIRST_TRUE_INDEX; ++ __ vpickve2gr_d($dst$$Register, $src$$FloatRegister, 1); ++ __ li(AT, (long)15); ++ __ bnez($dst$$Register, FIRST_TRUE_INDEX); ++ ++ __ vpickve2gr_d($dst$$Register, $src$$FloatRegister, 0); ++ __ li(AT, (long)7); ++ __ bind(FIRST_TRUE_INDEX); ++ __ clz_d($dst$$Register, $dst$$Register); ++ __ srli_w($dst$$Register, $dst$$Register, 3); ++ __ sub_w($dst$$Register, AT, $dst$$Register); ++ } else if (Matcher::vector_length(this, $src) == 32) { ++ Label FIRST_TRUE_INDEX; ++ __ xvpickve2gr_d($dst$$Register, $src$$FloatRegister, 3); ++ __ li(AT, (long)31); ++ __ bnez($dst$$Register, FIRST_TRUE_INDEX); ++ ++ __ xvpickve2gr_d($dst$$Register, $src$$FloatRegister, 2); ++ __ li(AT, (long)23); ++ __ bnez($dst$$Register, FIRST_TRUE_INDEX); ++ ++ __ xvpickve2gr_d($dst$$Register, $src$$FloatRegister, 1); ++ __ li(AT, (long)15); ++ __ bnez($dst$$Register, FIRST_TRUE_INDEX); ++ ++ __ xvpickve2gr_d($dst$$Register, $src$$FloatRegister, 0); ++ __ li(AT, (long)7); ++ __ bind(FIRST_TRUE_INDEX); ++ __ clz_d($dst$$Register, $dst$$Register); ++ __ srli_w($dst$$Register, $dst$$Register, 3); ++ __ sub_w($dst$$Register, AT, $dst$$Register); ++ } else { ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ----------------------------- Vector comparison ---------------------------- ++ ++instruct cmpV(vReg dst, vReg src1, vReg src2, immI cond) ++%{ ++ match(Set dst (VectorMaskCmp (Binary src1 src2) cond)); ++ format %{ "(x)vcompare $dst, $src1, $src2, $cond\t# @cmpV" %} ++ ins_encode %{ ++ BasicType bt = Matcher::vector_element_basic_type(this); ++ __ vector_compare($dst$$FloatRegister, $src1$$FloatRegister, $src2$$FloatRegister, ++ bt, $cond$$constant, Matcher::vector_length_in_bytes(this)); ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ---------------------------- LOAD_IOTA_INDICES ----------------------------- ++ ++instruct loadconV(vReg dst, immI_0 src) %{ ++ match(Set dst (VectorLoadConst src)); ++ format %{ "(x)vld_con $dst, CONSTANT_MEMORY\t# @loadconV" %} ++ ins_encode %{ ++ // The iota indices are ordered by type B/S/I/L/F/D, and the offset between two types is 32. 
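++ // That is: B -> 0, S -> 32, I -> 64, L -> 96; the floating-point tables follow the integral ones, so F -> 128 and D -> 160 (hence the extra 64 added below).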
++ BasicType bt = Matcher::vector_element_basic_type(this); ++ int offset = exact_log2(type2aelembytes(bt)) << 5; ++ if (is_floating_point_type(bt)) { ++ offset += 64; ++ } ++ __ li(AT, (long)(StubRoutines::la::vector_iota_indices() + offset)); ++ switch (Matcher::vector_length_in_bytes(this)) { ++ case 4: ++ case 8: ++ case 16: __ vld($dst$$FloatRegister, AT, (int)0); break; ++ case 32: __ xvld($dst$$FloatRegister, AT, (int)0); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ------------------------------ Populate Index to a Vector ------------------ ++ ++instruct populateIndexV(vReg dst, mRegI src1, immI_1 src2) %{ ++ match(Set dst (PopulateIndex src1 src2)); ++ format %{ "(x)vpopulate_index $dst, $src1, $src2\t# @populateIndexV" %} ++ ins_encode %{ ++ assert($src2$$constant == 1, "required"); ++ BasicType bt = Matcher::vector_element_basic_type(this); ++ int offset = exact_log2(type2aelembytes(bt)) << 5; ++ __ li(AT, (long)(StubRoutines::la::vector_iota_indices() + offset)); ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ __ xvld(fscratch, AT, (int)0); ++ switch (bt) { ++ case T_BYTE : __ xvreplgr2vr_b($dst$$FloatRegister, $src1$$Register); ++ __ xvadd_b($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ case T_SHORT : __ xvreplgr2vr_h($dst$$FloatRegister, $src1$$Register); ++ __ xvadd_h($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ case T_INT : __ xvreplgr2vr_w($dst$$FloatRegister, $src1$$Register); ++ __ xvadd_w($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ case T_LONG : __ xvreplgr2vr_d($dst$$FloatRegister, $src1$$Register); ++ __ xvadd_d($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ __ vld(fscratch, AT, (int)0); ++ switch (bt) { ++ case T_BYTE : __ vreplgr2vr_b($dst$$FloatRegister, $src1$$Register); ++ __ vadd_b($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ case T_SHORT : __ vreplgr2vr_h($dst$$FloatRegister, $src1$$Register); ++ __ vadd_h($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ case T_INT : __ vreplgr2vr_w($dst$$FloatRegister, $src1$$Register); ++ __ vadd_w($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ case T_LONG : __ vreplgr2vr_d($dst$$FloatRegister, $src1$$Register); ++ __ vadd_d($dst$$FloatRegister, $dst$$FloatRegister, fscratch); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ---------------------------- LOAD_SHUFFLE ---------------------------------- ++ ++instruct loadShuffleVB(vReg dst) %{ ++ predicate(Matcher::vector_element_basic_type(n) == T_BYTE); ++ match(Set dst (VectorLoadShuffle dst)); ++ format %{ "(x)vld_shuffle $dst\t# @loadShuffleVB" %} ++ ins_encode %{ ++ // empty ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++instruct loadShuffleV(vReg dst, vReg src) %{ ++ predicate(Matcher::vector_element_basic_type(n) != T_BYTE); ++ match(Set dst (VectorLoadShuffle src)); ++ format %{ "(x)vld_shuffle $dst, $src\t# @loadShuffleV" %} ++ ins_encode %{ ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_SHORT : __ vext2xv_hu_bu($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_FLOAT : ++ case T_INT : __ vext2xv_wu_bu($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_DOUBLE: ++ case T_LONG : __ vext2xv_du_bu($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ---------------------------- 
Rearrange ------------------------------------- ++ ++instruct rearrangeV(vReg dst, vReg src, vReg tmp) %{ ++ match(Set dst (VectorRearrange src dst)); ++ effect(TEMP tmp); ++ format %{ "(x)vrearrange $dst, $src, $dst\t# TEMP($tmp) @rearrangeV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ __ xvpermi_q($tmp$$FloatRegister, $src$$FloatRegister, 0x00); ++ __ xvpermi_q(fscratch, $src$$FloatRegister, 0x11); ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvshuf_b($dst$$FloatRegister, fscratch, $tmp$$FloatRegister, $dst$$FloatRegister); break; ++ case T_SHORT : __ xvshuf_h($dst$$FloatRegister, fscratch, $tmp$$FloatRegister); break; ++ case T_FLOAT : ++ case T_INT : __ xvshuf_w($dst$$FloatRegister, fscratch, $tmp$$FloatRegister); break; ++ case T_DOUBLE: ++ case T_LONG : __ xvshuf_d($dst$$FloatRegister, fscratch, $tmp$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vshuf_b($dst$$FloatRegister, $src$$FloatRegister, $src$$FloatRegister, $dst$$FloatRegister); break; ++ case T_SHORT : __ vshuf_h($dst$$FloatRegister, $src$$FloatRegister, $src$$FloatRegister); break; ++ case T_FLOAT : ++ case T_INT : __ vshuf_w($dst$$FloatRegister, $src$$FloatRegister, $src$$FloatRegister); break; ++ case T_DOUBLE: ++ case T_LONG : __ vshuf_d($dst$$FloatRegister, $src$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ---------------------------- PopCount -------------------------------------- ++ ++instruct popcountV(vReg dst, vReg src) %{ ++ match(Set dst (PopCountVI src)); ++ match(Set dst (PopCountVL src)); ++ format %{ "(x)vpcnt $dst, $src\t# @popcountV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvpcnt_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_SHORT : __ xvpcnt_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_INT : __ xvpcnt_w($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_LONG : __ xvpcnt_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ vpcnt_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_SHORT : __ vpcnt_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_INT : __ vpcnt_w($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_LONG : __ vpcnt_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ---------------------------- CountLeadingZerosV -------------------------------------- ++ ++instruct clzV(vReg dst, vReg src) %{ ++ match(Set dst (CountLeadingZerosV src)); ++ format %{ "(x)vclz $dst, $src\t# @clzV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ xvclz_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_SHORT : __ xvclz_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_INT : __ xvclz_w($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_LONG : __ xvclz_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } else { ++ switch (Matcher::vector_element_basic_type(this)) { ++ case T_BYTE : __ 
vclz_b($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_SHORT : __ vclz_h($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_INT : __ vclz_w($dst$$FloatRegister, $src$$FloatRegister); break; ++ case T_LONG : __ vclz_d($dst$$FloatRegister, $src$$FloatRegister); break; ++ default: ++ ShouldNotReachHere(); ++ } ++ } ++ %} ++ ins_pipe( pipe_slow ); ++%} ++ ++// ------------------------------ Vector signum -------------------------------- ++ ++instruct signumV(vReg dst, vReg src, vReg zero, vReg one, vReg tmp) %{ ++ match(Set dst (SignumVF src (Binary zero one))); ++ match(Set dst (SignumVD src (Binary zero one))); ++ effect(TEMP_DEF dst, TEMP tmp); ++ format %{ "(x)vsignum $dst, $src, $zero, $one\t# TEMP($tmp) @signumV" %} ++ ins_encode %{ ++ if (Matcher::vector_length_in_bytes(this) > 16) { ++ switch (Matcher::vector_element_basic_type(this, $src)) { ++ case T_FLOAT: __ xvfcmp_clt_s($dst$$FloatRegister, $zero$$FloatRegister, $src$$FloatRegister); ++ __ xvfcmp_clt_s($tmp$$FloatRegister, $src$$FloatRegister, $zero$$FloatRegister); ++ __ xvor_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister); ++ __ xvsrli_w($dst$$FloatRegister, $dst$$FloatRegister, 1); ++ break; ++ case T_DOUBLE: __ xvfcmp_clt_d($dst$$FloatRegister, $zero$$FloatRegister, $src$$FloatRegister); ++ __ xvfcmp_clt_d($tmp$$FloatRegister, $src$$FloatRegister, $zero$$FloatRegister); ++ __ xvor_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister); ++ __ xvsrli_d($dst$$FloatRegister, $dst$$FloatRegister, 1); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ __ xvbitsel_v($dst$$FloatRegister, $src$$FloatRegister, $one$$FloatRegister, $dst$$FloatRegister); ++ } else { ++ switch (Matcher::vector_element_basic_type(this, $src)) { ++ case T_FLOAT: __ vfcmp_clt_s($dst$$FloatRegister, $zero$$FloatRegister, $src$$FloatRegister); ++ __ vfcmp_clt_s($tmp$$FloatRegister, $src$$FloatRegister, $zero$$FloatRegister); ++ __ vor_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister); ++ __ vsrli_w($dst$$FloatRegister, $dst$$FloatRegister, 1); ++ break; ++ case T_DOUBLE: __ vfcmp_clt_d($dst$$FloatRegister, $zero$$FloatRegister, $src$$FloatRegister); ++ __ vfcmp_clt_d($tmp$$FloatRegister, $src$$FloatRegister, $zero$$FloatRegister); ++ __ vor_v($dst$$FloatRegister, $dst$$FloatRegister, $tmp$$FloatRegister); ++ __ vsrli_d($dst$$FloatRegister, $dst$$FloatRegister, 1); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ __ vbitsel_v($dst$$FloatRegister, $src$$FloatRegister, $one$$FloatRegister, $dst$$FloatRegister); ++ } ++ %} ++ ins_pipe(pipe_slow); ++%} ++ ++ ++//----------PEEPHOLE RULES----------------------------------------------------- ++// These must follow all instruction definitions as they use the names ++// defined in the instructions definitions. ++// ++// peepmatch ( root_instr_name [preceeding_instruction]* ); ++// ++// peepconstraint %{ ++// (instruction_number.operand_name relational_op instruction_number.operand_name ++// [, ...] 
); ++// // instruction numbers are zero-based using left to right order in peepmatch ++// ++// peepreplace ( instr_name ( [instruction_number.operand_name]* ) ); ++// // provide an instruction_number.operand_name for each operand that appears ++// // in the replacement instruction's match rule ++// ++// ---------VM FLAGS--------------------------------------------------------- ++// ++// All peephole optimizations can be turned off using -XX:-OptoPeephole ++// ++// Each peephole rule is given an identifying number starting with zero and ++// increasing by one in the order seen by the parser. An individual peephole ++// can be enabled, and all others disabled, by using -XX:OptoPeepholeAt=# ++// on the command-line. ++// ++// ---------CURRENT LIMITATIONS---------------------------------------------- ++// ++// Only match adjacent instructions in same basic block ++// Only equality constraints ++// Only constraints between operands, not (0.dest_reg == EAX_enc) ++// Only one replacement instruction ++// ++// ---------EXAMPLE---------------------------------------------------------- ++// ++// // pertinent parts of existing instructions in architecture description ++// instruct movI(eRegI dst, eRegI src) %{ ++// match(Set dst (CopyI src)); ++// %} ++// ++// instruct incI_eReg(eRegI dst, immI_1 src, eFlagsReg cr) %{ ++// match(Set dst (AddI dst src)); ++// effect(KILL cr); ++// %} ++// ++// // Change (inc mov) to lea ++// peephole %{ ++// // increment preceded by register-register move ++// peepmatch ( incI_eReg movI ); ++// // require that the destination register of the increment ++// // match the destination register of the move ++// peepconstraint ( 0.dst == 1.dst ); ++// // construct a replacement instruction that sets ++// // the destination to ( move's source register + one ) ++// peepreplace ( leaI_eReg_immI( 0.dst 1.src 0.src ) ); ++// %} ++// ++// Implementation no longer uses movX instructions since ++// machine-independent system no longer uses CopyX nodes. ++// ++// peephole %{ ++// peepmatch ( incI_eReg movI ); ++// peepconstraint ( 0.dst == 1.dst ); ++// peepreplace ( leaI_eReg_immI( 0.dst 1.src 0.src ) ); ++// %} ++// ++// peephole %{ ++// peepmatch ( decI_eReg movI ); ++// peepconstraint ( 0.dst == 1.dst ); ++// peepreplace ( leaI_eReg_immI( 0.dst 1.src 0.src ) ); ++// %} ++// ++// peephole %{ ++// peepmatch ( addI_eReg_imm movI ); ++// peepconstraint ( 0.dst == 1.dst ); ++// peepreplace ( leaI_eReg_immI( 0.dst 1.src 0.src ) ); ++// %} ++// ++// peephole %{ ++// peepmatch ( addP_eReg_imm movP ); ++// peepconstraint ( 0.dst == 1.dst ); ++// peepreplace ( leaP_eReg_immI( 0.dst 1.src 0.src ) ); ++// %} ++ ++// // Change load of spilled value to only a spill ++// instruct storeI(memory mem, eRegI src) %{ ++// match(Set mem (StoreI mem src)); ++// %} ++// ++// instruct loadI(eRegI dst, memory mem) %{ ++// match(Set dst (LoadI mem)); ++// %} ++// ++//peephole %{ ++// peepmatch ( loadI storeI ); ++// peepconstraint ( 1.src == 0.dst, 1.mem == 0.mem ); ++// peepreplace ( storeI( 1.mem 1.mem 1.src ) ); ++//%} ++ ++//----------SMARTSPILL RULES--------------------------------------------------- ++// These must follow all instruction definitions as they use the names ++// defined in the instructions definitions. 
++ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/loongarch.ad b/src/hotspot/cpu/loongarch/loongarch.ad +--- a/src/hotspot/cpu/loongarch/loongarch.ad 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/loongarch.ad 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,25 @@ ++// ++// Copyright (c) 2011, 2012, Oracle and/or its affiliates. All rights reserved. ++// Copyright (c) 2015, 2021, Loongson Technology. All rights reserved. ++// DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++// ++// This code is free software; you can redistribute it and/or modify it ++// under the terms of the GNU General Public License version 2 only, as ++// published by the Free Software Foundation. ++// ++// This code is distributed in the hope that it will be useful, but WITHOUT ++// ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++// FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++// version 2 for more details (a copy is included in the LICENSE file that ++// accompanied this code). ++// ++// You should have received a copy of the GNU General Public License version ++// 2 along with this work; if not, write to the Free Software Foundation, ++// Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++// ++// Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++// or visit www.oracle.com if you need additional information or have any ++// questions. ++// ++// ++ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/macroAssembler_loongarch_chacha.cpp b/src/hotspot/cpu/loongarch/macroAssembler_loongarch_chacha.cpp +--- a/src/hotspot/cpu/loongarch/macroAssembler_loongarch_chacha.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/macroAssembler_loongarch_chacha.cpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,86 @@ ++/* ++ * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++ ++#include "asm/assembler.hpp" ++#include "asm/assembler.inline.hpp" ++#include "macroAssembler_loongarch.hpp" ++#include "memory/resourceArea.hpp" ++#include "runtime/stubRoutines.hpp" ++ ++/** ++ * Perform the quarter round calculations on values contained within ++ * four SIMD registers. 
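++ * The rotations below use vrotri.w (rotate right) with amounts 16, 20, 24 and 25, which are equivalent to the ChaCha20 left rotations by 16, 12, 8 and 7 bits.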
++ * ++ * @param aVec the SIMD register containing only the "a" values ++ * @param bVec the SIMD register containing only the "b" values ++ * @param cVec the SIMD register containing only the "c" values ++ * @param dVec the SIMD register containing only the "d" values ++ */ ++void MacroAssembler::cc20_quarter_round(FloatRegister aVec, FloatRegister bVec, ++ FloatRegister cVec, FloatRegister dVec) { ++ ++ // a += b, d ^= a, d <<<= 16 ++ vadd_w(aVec, aVec, bVec); ++ vxor_v(dVec, dVec, aVec); ++ vrotri_w(dVec, dVec, 16); ++ ++ // c += d, b ^= c, b <<<= 12 ++ vadd_w(cVec, cVec, dVec); ++ vxor_v(bVec, bVec, cVec); ++ vrotri_w(bVec, bVec, 20); ++ ++ // a += b, d ^= a, d <<<= 8 ++ vadd_w(aVec, aVec, bVec); ++ vxor_v(dVec, dVec, aVec); ++ vrotri_w(dVec, dVec, 24); ++ ++ // c += d, b ^= c, b <<<= 7 ++ vadd_w(cVec, cVec, dVec); ++ vxor_v(bVec, bVec, cVec); ++ vrotri_w(bVec, bVec, 25); ++} ++ ++/** ++ * Shift the b, c, and d vectors between columnar and diagonal representations. ++ * Note that the "a" vector does not shift. ++ * ++ * @param bVec the SIMD register containing only the "b" values ++ * @param cVec the SIMD register containing only the "c" values ++ * @param dVec the SIMD register containing only the "d" values ++ * @param colToDiag true if moving columnar to diagonal, false if ++ * moving diagonal back to columnar. ++ */ ++void MacroAssembler::cc20_shift_lane_org(FloatRegister bVec, FloatRegister cVec, ++ FloatRegister dVec, bool colToDiag) { ++ int bShift = colToDiag ? 0b00111001 : 0b10010011; ++ int cShift = 0b01001110; ++ int dShift = colToDiag ? 0b10010011 : 0b00111001; ++ ++ vshuf4i_w(bVec, bVec, bShift); ++ vshuf4i_w(cVec, cVec, cShift); ++ vshuf4i_w(dVec, dVec, dShift); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/macroAssembler_loongarch.cpp b/src/hotspot/cpu/loongarch/macroAssembler_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/macroAssembler_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/macroAssembler_loongarch.cpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,4238 @@ ++/* ++ * Copyright (c) 1997, 2014, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2017, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/assembler.hpp" ++#include "asm/assembler.inline.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "compiler/disassembler.hpp" ++#include "compiler/oopMap.hpp" ++#include "gc/shared/barrierSet.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "gc/shared/collectedHeap.inline.hpp" ++#include "interpreter/bytecodeHistogram.hpp" ++#include "interpreter/interpreter.hpp" ++#include "jvm.h" ++#include "memory/resourceArea.hpp" ++#include "memory/universe.hpp" ++#include "nativeInst_loongarch.hpp" ++#include "oops/compressedOops.inline.hpp" ++#include "oops/klass.inline.hpp" ++#include "prims/methodHandles.hpp" ++#include "runtime/interfaceSupport.inline.hpp" ++#include "runtime/jniHandles.inline.hpp" ++#include "runtime/objectMonitor.hpp" ++#include "runtime/os.hpp" ++#include "runtime/safepoint.hpp" ++#include "runtime/safepointMechanism.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "utilities/macros.hpp" ++ ++#ifdef COMPILER2 ++#include "opto/compile.hpp" ++#include "opto/output.hpp" ++#endif ++ ++#if INCLUDE_ZGC ++#include "gc/z/zThreadLocalData.hpp" ++#endif ++ ++// Implementation of MacroAssembler ++ ++void MacroAssembler::pd_patch_instruction(address branch, address target, const char* file, int line) { ++ jint& stub_inst = *(jint*)branch; ++ jint *pc = (jint *)branch; ++ ++ if (high(stub_inst, 7) == pcaddu18i_op) { ++ // far: ++ // pcaddu18i reg, si20 ++ // jirl r0, reg, si18 ++ ++ assert(high(pc[1], 6) == jirl_op, "Not a branch label patch"); ++ jlong offs = target - branch; ++ CodeBuffer cb(branch, 2 * BytesPerInstWord); ++ MacroAssembler masm(&cb); ++ if (reachable_from_branch_short(offs)) { ++ // convert far to short ++#define __ masm. 
++ __ b(target); ++ __ nop(); ++#undef __ ++ } else { ++ masm.patchable_jump_far(R0, offs); ++ } ++ return; ++ } else if (high(stub_inst, 7) == pcaddi_op) { ++ // see MacroAssembler::set_last_Java_frame: ++ // pcaddi reg, si20 ++ ++ jint offs = (target - branch) >> 2; ++ guarantee(is_simm(offs, 20), "Not signed 20-bit offset"); ++ CodeBuffer cb(branch, 1 * BytesPerInstWord); ++ MacroAssembler masm(&cb); ++ masm.pcaddi(as_Register(low(stub_inst, 5)), offs); ++ return; ++ } else if (high(stub_inst, 7) == pcaddu12i_op) { ++ // pc-relative ++ jlong offs = target - branch; ++ guarantee(is_simm(offs, 32), "Not signed 32-bit offset"); ++ jint si12, si20; ++ jint& stub_instNext = *(jint*)(branch+4); ++ split_simm32(offs, si12, si20); ++ CodeBuffer cb(branch, 2 * BytesPerInstWord); ++ MacroAssembler masm(&cb); ++ masm.pcaddu12i(as_Register(low(stub_inst, 5)), si20); ++ masm.addi_d(as_Register(low((stub_instNext), 5)), as_Register(low((stub_instNext) >> 5, 5)), si12); ++ return; ++ } else if (high(stub_inst, 7) == lu12i_w_op) { ++ // long call (absolute) ++ CodeBuffer cb(branch, 3 * BytesPerInstWord); ++ MacroAssembler masm(&cb); ++ masm.call_long(target); ++ return; ++ } ++ ++ stub_inst = patched_branch(target - branch, stub_inst, 0); ++} ++ ++bool MacroAssembler::reachable_from_branch_short(jlong offs) { ++ if (ForceUnreachable) { ++ return false; ++ } ++ return is_simm(offs >> 2, 26); ++} ++ ++void MacroAssembler::patchable_jump_far(Register ra, jlong offs) { ++ jint si18, si20; ++ guarantee(is_simm(offs, 38), "Not signed 38-bit offset"); ++ split_simm38(offs, si18, si20); ++ pcaddu18i(AT, si20); ++ jirl(ra, AT, si18); ++} ++ ++void MacroAssembler::patchable_jump(address target, bool force_patchable) { ++ assert(ReservedCodeCacheSize < 4*G, "branch out of range"); ++ assert(CodeCache::find_blob(target) != nullptr, ++ "destination of jump not found in code cache"); ++ if (force_patchable || patchable_branches()) { ++ jlong offs = target - pc(); ++ if (reachable_from_branch_short(offs)) { // Short jump ++ b(offset26(target)); ++ nop(); ++ } else { // Far jump ++ patchable_jump_far(R0, offs); ++ } ++ } else { // Real short jump ++ b(offset26(target)); ++ } ++} ++ ++void MacroAssembler::patchable_call(address target, address call_site) { ++ jlong offs = target - (call_site ? call_site : pc()); ++ if (reachable_from_branch_short(offs - BytesPerInstWord)) { // Short call ++ nop(); ++ bl((offs - BytesPerInstWord) >> 2); ++ } else { // Far call ++ patchable_jump_far(RA, offs); ++ } ++} ++ ++// Maybe emit a call via a trampoline. If the code cache is small ++// trampolines won't be emitted. ++address MacroAssembler::trampoline_call(AddressLiteral entry, CodeBuffer* cbuf) { ++ assert(entry.rspec().type() == relocInfo::runtime_call_type || ++ entry.rspec().type() == relocInfo::opt_virtual_call_type || ++ entry.rspec().type() == relocInfo::static_call_type || ++ entry.rspec().type() == relocInfo::virtual_call_type, "wrong reloc type"); ++ ++ address target = entry.target(); ++ ++ // We need a trampoline if branches are far. 
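++ // When the target may be out of range of a direct bl, emit_trampoline_stub() places a stub in the stub section that loads the 64-bit destination and jumps through it; the bl emitted below can later be bound to that stub instead of the real target.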
++ if (far_branches()) { ++ if (!in_scratch_emit_size()) { ++ address stub = emit_trampoline_stub(offset(), target); ++ if (stub == nullptr) { ++ postcond(pc() == badAddress); ++ return nullptr; // CodeCache is full ++ } ++ } ++ target = pc(); ++ } ++ ++ if (cbuf != nullptr) { cbuf->set_insts_mark(); } ++ relocate(entry.rspec()); ++ bl(target); ++ ++ // just need to return a non-null address ++ postcond(pc() != badAddress); ++ return pc(); ++} ++ ++// Emit a trampoline stub for a call to a target which is too far away. ++// ++// code sequences: ++// ++// call-site: ++// branch-and-link to or ++// ++// Related trampoline stub for this call site in the stub section: ++// load the call target from the constant pool ++// branch (RA still points to the call site above) ++ ++address MacroAssembler::emit_trampoline_stub(int insts_call_instruction_offset, ++ address dest) { ++ // Start the stub ++ address stub = start_a_stub(NativeInstruction::nop_instruction_size ++ + NativeCallTrampolineStub::instruction_size); ++ if (stub == nullptr) { ++ return nullptr; // CodeBuffer::expand failed ++ } ++ ++ // Create a trampoline stub relocation which relates this trampoline stub ++ // with the call instruction at insts_call_instruction_offset in the ++ // instructions code-section. ++ align(wordSize); ++ relocate(trampoline_stub_Relocation::spec(code()->insts()->start() ++ + insts_call_instruction_offset)); ++ const int stub_start_offset = offset(); ++ ++ // Now, create the trampoline stub's code: ++ // - load the call ++ // - call ++ pcaddi(AT, 0); ++ ld_d(AT, AT, 16); ++ jr(AT); ++ nop(); //align ++ assert(offset() - stub_start_offset == NativeCallTrampolineStub::data_offset, ++ "should be"); ++ emit_int64((int64_t)dest); ++ ++ const address stub_start_addr = addr_at(stub_start_offset); ++ ++ NativeInstruction* ni = nativeInstruction_at(stub_start_addr); ++ assert(ni->is_NativeCallTrampolineStub_at(), "doesn't look like a trampoline"); ++ ++ end_a_stub(); ++ return stub_start_addr; ++} ++ ++void MacroAssembler::beq_far(Register rs, Register rt, address entry) { ++ if (is_simm16((entry - pc()) >> 2)) { // Short jump ++ beq(rs, rt, offset16(entry)); ++ } else { // Far jump ++ Label not_jump; ++ bne(rs, rt, not_jump); ++ b_far(entry); ++ bind(not_jump); ++ } ++} ++ ++void MacroAssembler::beq_far(Register rs, Register rt, Label& L) { ++ if (L.is_bound()) { ++ beq_far(rs, rt, target(L)); ++ } else { ++ Label not_jump; ++ bne(rs, rt, not_jump); ++ b_far(L); ++ bind(not_jump); ++ } ++} ++ ++void MacroAssembler::bne_far(Register rs, Register rt, address entry) { ++ if (is_simm16((entry - pc()) >> 2)) { // Short jump ++ bne(rs, rt, offset16(entry)); ++ } else { // Far jump ++ Label not_jump; ++ beq(rs, rt, not_jump); ++ b_far(entry); ++ bind(not_jump); ++ } ++} ++ ++void MacroAssembler::bne_far(Register rs, Register rt, Label& L) { ++ if (L.is_bound()) { ++ bne_far(rs, rt, target(L)); ++ } else { ++ Label not_jump; ++ beq(rs, rt, not_jump); ++ b_far(L); ++ bind(not_jump); ++ } ++} ++ ++void MacroAssembler::blt_far(Register rs, Register rt, address entry, bool is_signed) { ++ if (is_simm16((entry - pc()) >> 2)) { // Short jump ++ if (is_signed) { ++ blt(rs, rt, offset16(entry)); ++ } else { ++ bltu(rs, rt, offset16(entry)); ++ } ++ } else { // Far jump ++ Label not_jump; ++ if (is_signed) { ++ bge(rs, rt, not_jump); ++ } else { ++ bgeu(rs, rt, not_jump); ++ } ++ b_far(entry); ++ bind(not_jump); ++ } ++} ++ ++void MacroAssembler::blt_far(Register rs, Register rt, Label& L, bool is_signed) { ++ if (L.is_bound()) { ++ 
blt_far(rs, rt, target(L), is_signed); ++ } else { ++ Label not_jump; ++ if (is_signed) { ++ bge(rs, rt, not_jump); ++ } else { ++ bgeu(rs, rt, not_jump); ++ } ++ b_far(L); ++ bind(not_jump); ++ } ++} ++ ++void MacroAssembler::bge_far(Register rs, Register rt, address entry, bool is_signed) { ++ if (is_simm16((entry - pc()) >> 2)) { // Short jump ++ if (is_signed) { ++ bge(rs, rt, offset16(entry)); ++ } else { ++ bgeu(rs, rt, offset16(entry)); ++ } ++ } else { // Far jump ++ Label not_jump; ++ if (is_signed) { ++ blt(rs, rt, not_jump); ++ } else { ++ bltu(rs, rt, not_jump); ++ } ++ b_far(entry); ++ bind(not_jump); ++ } ++} ++ ++void MacroAssembler::bge_far(Register rs, Register rt, Label& L, bool is_signed) { ++ if (L.is_bound()) { ++ bge_far(rs, rt, target(L), is_signed); ++ } else { ++ Label not_jump; ++ if (is_signed) { ++ blt(rs, rt, not_jump); ++ } else { ++ bltu(rs, rt, not_jump); ++ } ++ b_far(L); ++ bind(not_jump); ++ } ++} ++ ++void MacroAssembler::b_far(Label& L) { ++ if (L.is_bound()) { ++ b_far(target(L)); ++ } else { ++ L.add_patch_at(code(), locator()); ++ if (ForceUnreachable) { ++ patchable_jump_far(R0, 0); ++ } else { ++ b(0); ++ } ++ } ++} ++ ++void MacroAssembler::b_far(address entry) { ++ jlong offs = entry - pc(); ++ if (reachable_from_branch_short(offs)) { // Short jump ++ b(offset26(entry)); ++ } else { // Far jump ++ patchable_jump_far(R0, offs); ++ } ++} ++ ++// tmp_reg1 and tmp_reg2 should be saved outside of atomic_inc32 (caller saved). ++void MacroAssembler::atomic_inc32(address counter_addr, int inc, Register tmp_reg1, Register tmp_reg2) { ++ li(tmp_reg1, inc); ++ li(tmp_reg2, counter_addr); ++ amadd_w(R0, tmp_reg1, tmp_reg2); ++} ++ ++// Writes to stack successive pages until offset reached to check for ++// stack overflow + shadow pages. This clobbers tmp. ++void MacroAssembler::bang_stack_size(Register size, Register tmp) { ++ assert_different_registers(tmp, size, AT); ++ move(tmp, SP); ++ // Bang stack for total size given plus shadow page size. ++ // Bang one page at a time because large size can bang beyond yellow and ++ // red zones. ++ Label loop; ++ li(AT, (int)os::vm_page_size()); ++ bind(loop); ++ sub_d(tmp, tmp, AT); ++ sub_d(size, size, AT); ++ st_d(size, tmp, 0); ++ blt(R0, size, loop); ++ ++ // Bang down shadow pages too. ++ // At this point, (tmp-0) is the last address touched, so don't ++ // touch it again. (It was touched as (tmp-pagesize) but then tmp ++ // was post-decremented.) Skip this address by starting at i=1, and ++ // touch a few more pages below. N.B. It is important to touch all ++ // the way down to and including i=StackShadowPages. ++ for (int i = 0; i < (int)(StackOverflow::stack_shadow_zone_size() / (int)os::vm_page_size()) - 1; i++) { ++ // this could be any sized move but this is can be a debugging crumb ++ // so the bigger the better. ++ sub_d(tmp, tmp, AT); ++ st_d(size, tmp, 0); ++ } ++} ++ ++void MacroAssembler::reserved_stack_check() { ++ // testing if reserved zone needs to be enabled ++ Label no_reserved_zone_enabling; ++ ++ ld_d(AT, Address(TREG, JavaThread::reserved_stack_activation_offset())); ++ sub_d(AT, SP, AT); ++ blt(AT, R0, no_reserved_zone_enabling); ++ ++ enter(); // RA and FP are live. ++ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::enable_stack_reserved_zone), TREG); ++ leave(); ++ ++ // We have already removed our own frame. ++ // throw_delayed_StackOverflowError will think that it's been ++ // called by our caller. 
++ li(AT, (long)StubRoutines::throw_delayed_StackOverflowError_entry()); ++ jr(AT); ++ should_not_reach_here(); ++ ++ bind(no_reserved_zone_enabling); ++} ++ ++// the stack pointer adjustment is needed. see InterpreterMacroAssembler::super_call_VM_leaf ++// this method will handle the stack problem, you need not to preserve the stack space for the argument now ++void MacroAssembler::call_VM_leaf_base(address entry_point, int number_of_arguments) { ++ assert(number_of_arguments <= 4, "just check"); ++ assert(StackAlignmentInBytes == 16, "must be"); ++ move(AT, SP); ++ bstrins_d(SP, R0, 3, 0); ++ addi_d(SP, SP, -(StackAlignmentInBytes)); ++ st_d(AT, SP, 0); ++ call(entry_point, relocInfo::runtime_call_type); ++ ld_d(SP, SP, 0); ++} ++ ++ ++void MacroAssembler::jmp(address entry) { ++ jlong offs = entry - pc(); ++ if (reachable_from_branch_short(offs)) { // Short jump ++ b(offset26(entry)); ++ } else { // Far jump ++ patchable_jump_far(R0, offs); ++ } ++} ++ ++void MacroAssembler::jmp(address entry, relocInfo::relocType rtype) { ++ switch (rtype) { ++ case relocInfo::none: ++ jmp(entry); ++ break; ++ default: ++ { ++ InstructionMark im(this); ++ relocate(rtype); ++ patchable_jump(entry); ++ } ++ break; ++ } ++} ++ ++void MacroAssembler::jmp_far(Label& L) { ++ if (L.is_bound()) { ++ assert(target(L) != nullptr, "jmp most probably wrong"); ++ patchable_jump(target(L), true /* force patchable */); ++ } else { ++ L.add_patch_at(code(), locator()); ++ patchable_jump_far(R0, 0); ++ } ++} ++ ++// Move an oop into a register. ++void MacroAssembler::movoop(Register dst, jobject obj) { ++ int oop_index; ++ if (obj == nullptr) { ++ oop_index = oop_recorder()->allocate_oop_index(obj); ++ } else { ++#ifdef ASSERT ++ { ++ ThreadInVMfromUnknown tiv; ++ assert(Universe::heap()->is_in(JNIHandles::resolve(obj)), "should be real oop"); ++ } ++#endif ++ oop_index = oop_recorder()->find_index(obj); ++ } ++ RelocationHolder rspec = oop_Relocation::spec(oop_index); ++ ++ if (BarrierSet::barrier_set()->barrier_set_assembler()->supports_instruction_patching()) { ++ relocate(rspec); ++ patchable_li52(dst, (long)obj); ++ } else { ++ address dummy = address(uintptr_t(pc()) & -wordSize); // A nearby aligned address ++ relocate(rspec); ++ patchable_li52(dst, (long)dummy); ++ } ++} ++ ++void MacroAssembler::mov_metadata(Address dst, Metadata* obj) { ++ int oop_index; ++ if (obj) { ++ oop_index = oop_recorder()->find_index(obj); ++ } else { ++ oop_index = oop_recorder()->allocate_metadata_index(obj); ++ } ++ relocate(metadata_Relocation::spec(oop_index)); ++ patchable_li52(AT, (long)obj); ++ st_d(AT, dst); ++} ++ ++void MacroAssembler::mov_metadata(Register dst, Metadata* obj) { ++ int oop_index; ++ if (obj) { ++ oop_index = oop_recorder()->find_index(obj); ++ } else { ++ oop_index = oop_recorder()->allocate_metadata_index(obj); ++ } ++ relocate(metadata_Relocation::spec(oop_index)); ++ patchable_li52(dst, (long)obj); ++} ++ ++void MacroAssembler::call(address entry) { ++ jlong offs = entry - pc(); ++ if (reachable_from_branch_short(offs)) { // Short call (pc-rel) ++ bl(offset26(entry)); ++ } else if (is_simm(offs, 38)) { // Far call (pc-rel) ++ patchable_jump_far(RA, offs); ++ } else { // Long call (absolute) ++ call_long(entry); ++ } ++} ++ ++void MacroAssembler::call(address entry, relocInfo::relocType rtype) { ++ switch (rtype) { ++ case relocInfo::none: ++ call(entry); ++ break; ++ case relocInfo::runtime_call_type: ++ if (!is_simm(entry - pc(), 38)) { ++ call_long(entry); ++ break; ++ } ++ // fallthrough ++ default: 
++ { ++ InstructionMark im(this); ++ relocate(rtype); ++ patchable_call(entry); ++ } ++ break; ++ } ++} ++ ++void MacroAssembler::call(address entry, RelocationHolder& rh){ ++ switch (rh.type()) { ++ case relocInfo::none: ++ call(entry); ++ break; ++ case relocInfo::runtime_call_type: ++ if (!is_simm(entry - pc(), 38)) { ++ call_long(entry); ++ break; ++ } ++ // fallthrough ++ default: ++ { ++ InstructionMark im(this); ++ relocate(rh); ++ patchable_call(entry); ++ } ++ break; ++ } ++} ++ ++void MacroAssembler::call_long(address entry) { ++ jlong value = (jlong)entry; ++ lu12i_w(AT, split_low20(value >> 12)); ++ lu32i_d(AT, split_low20(value >> 32)); ++ jirl(RA, AT, split_low12(value)); ++} ++ ++address MacroAssembler::ic_call(address entry, jint method_index) { ++ RelocationHolder rh = virtual_call_Relocation::spec(pc(), method_index); ++ patchable_li52(IC_Klass, (long)Universe::non_oop_word()); ++ assert(entry != nullptr, "call most probably wrong"); ++ InstructionMark im(this); ++ return trampoline_call(AddressLiteral(entry, rh)); ++} ++ ++void MacroAssembler::emit_static_call_stub() { ++ // Code stream for loading method may be changed. ++ ibar(0); ++ ++ // static stub relocation also tags the Method* in the code-stream. ++ mov_metadata(Rmethod, nullptr); ++ // This is recognized as unresolved by relocs/nativeInst/ic code ++ ++ patchable_jump(pc()); ++} ++ ++void MacroAssembler::c2bool(Register r) { ++ sltu(r, R0, r); ++} ++ ++#ifndef PRODUCT ++extern "C" void findpc(intptr_t x); ++#endif ++ ++void MacroAssembler::debug(char* msg/*, RegistersForDebugging* regs*/) { ++ if ( ShowMessageBoxOnError ) { ++ JavaThreadState saved_state = JavaThread::current()->thread_state(); ++ JavaThread::current()->set_thread_state(_thread_in_vm); ++ { ++ // In order to get locks work, we need to fake a in_VM state ++ ttyLocker ttyl; ++ ::tty->print_cr("EXECUTION STOPPED: %s\n", msg); ++ if (CountBytecodes || TraceBytecodes || StopInterpreterAt) { ++ BytecodeCounter::print(); ++ } ++ } ++ } ++ fatal("DEBUG MESSAGE: %s", msg); ++} ++ ++void MacroAssembler::stop(const char* msg) { ++#ifndef PRODUCT ++ block_comment(msg); ++#endif ++ csrrd(R0, 0); ++ emit_int64((uintptr_t)msg); ++} ++ ++void MacroAssembler::increment(Register reg, int imm) { ++ if (!imm) return; ++ if (is_simm(imm, 12)) { ++ addi_d(reg, reg, imm); ++ } else { ++ li(AT, imm); ++ add_d(reg, reg, AT); ++ } ++} ++ ++void MacroAssembler::decrement(Register reg, int imm) { ++ increment(reg, -imm); ++} ++ ++void MacroAssembler::increment(Address addr, int imm) { ++ if (!imm) return; ++ assert(is_simm(imm, 12), "must be"); ++ ld_d(AT, addr); ++ addi_d(AT, AT, imm); ++ st_d(AT, addr); ++} ++ ++void MacroAssembler::decrement(Address addr, int imm) { ++ increment(addr, -imm); ++} ++ ++void MacroAssembler::call_VM(Register oop_result, ++ address entry_point, ++ bool check_exceptions) { ++ call_VM_helper(oop_result, entry_point, 0, check_exceptions); ++} ++ ++void MacroAssembler::call_VM(Register oop_result, ++ address entry_point, ++ Register arg_1, ++ bool check_exceptions) { ++ if (arg_1!=A1) move(A1, arg_1); ++ call_VM_helper(oop_result, entry_point, 1, check_exceptions); ++} ++ ++void MacroAssembler::call_VM(Register oop_result, ++ address entry_point, ++ Register arg_1, ++ Register arg_2, ++ bool check_exceptions) { ++ if (arg_1 != A1) move(A1, arg_1); ++ assert(arg_2 != A1, "smashed argument"); ++ if (arg_2 != A2) move(A2, arg_2); ++ call_VM_helper(oop_result, entry_point, 2, check_exceptions); ++} ++ ++void MacroAssembler::call_VM(Register 
oop_result, ++ address entry_point, ++ Register arg_1, ++ Register arg_2, ++ Register arg_3, ++ bool check_exceptions) { ++ if (arg_1 != A1) move(A1, arg_1); ++ assert(arg_2 != A1, "smashed argument"); ++ if (arg_2 != A2) move(A2, arg_2); ++ assert(arg_3 != A1 && arg_3 != A2, "smashed argument"); ++ if (arg_3 != A3) move(A3, arg_3); ++ call_VM_helper(oop_result, entry_point, 3, check_exceptions); ++} ++ ++void MacroAssembler::call_VM(Register oop_result, ++ Register last_java_sp, ++ address entry_point, ++ int number_of_arguments, ++ bool check_exceptions) { ++ call_VM_base(oop_result, NOREG, last_java_sp, entry_point, number_of_arguments, check_exceptions); ++} ++ ++void MacroAssembler::call_VM(Register oop_result, ++ Register last_java_sp, ++ address entry_point, ++ Register arg_1, ++ bool check_exceptions) { ++ if (arg_1 != A1) move(A1, arg_1); ++ call_VM(oop_result, last_java_sp, entry_point, 1, check_exceptions); ++} ++ ++void MacroAssembler::call_VM(Register oop_result, ++ Register last_java_sp, ++ address entry_point, ++ Register arg_1, ++ Register arg_2, ++ bool check_exceptions) { ++ if (arg_1 != A1) move(A1, arg_1); ++ assert(arg_2 != A1, "smashed argument"); ++ if (arg_2 != A2) move(A2, arg_2); ++ call_VM(oop_result, last_java_sp, entry_point, 2, check_exceptions); ++} ++ ++void MacroAssembler::call_VM(Register oop_result, ++ Register last_java_sp, ++ address entry_point, ++ Register arg_1, ++ Register arg_2, ++ Register arg_3, ++ bool check_exceptions) { ++ if (arg_1 != A1) move(A1, arg_1); ++ assert(arg_2 != A1, "smashed argument"); ++ if (arg_2 != A2) move(A2, arg_2); ++ assert(arg_3 != A1 && arg_3 != A2, "smashed argument"); ++ if (arg_3 != A3) move(A3, arg_3); ++ call_VM(oop_result, last_java_sp, entry_point, 3, check_exceptions); ++} ++ ++void MacroAssembler::call_VM_base(Register oop_result, ++ Register java_thread, ++ Register last_java_sp, ++ address entry_point, ++ int number_of_arguments, ++ bool check_exceptions) { ++ // determine java_thread register ++ if (!java_thread->is_valid()) { ++ java_thread = TREG; ++ } ++ // determine last_java_sp register ++ if (!last_java_sp->is_valid()) { ++ last_java_sp = SP; ++ } ++ // debugging support ++ assert(number_of_arguments >= 0 , "cannot have negative number of arguments"); ++ assert(number_of_arguments <= 4 , "cannot have negative number of arguments"); ++ assert(java_thread != oop_result , "cannot use the same register for java_thread & oop_result"); ++ assert(java_thread != last_java_sp, "cannot use the same register for java_thread & last_java_sp"); ++ ++ assert(last_java_sp != FP, "this code doesn't work for last_java_sp == fp, which currently can't portably work anyway since C2 doesn't save fp"); ++ ++ // set last Java frame before call ++ Label before_call; ++ bind(before_call); ++ set_last_Java_frame(java_thread, last_java_sp, FP, before_call); ++ ++ // do the call ++ move(A0, java_thread); ++ call(entry_point, relocInfo::runtime_call_type); ++ ++ // restore the thread (cannot use the pushed argument since arguments ++ // may be overwritten by C code generated by an optimizing compiler); ++ // however can use the register value directly if it is callee saved. 
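++  // TREG is expected to be callee saved across the C call, which is why it is
++  // only re-verified under ASSERT below rather than reloaded.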
++ ++#ifdef ASSERT ++ { ++ Label L; ++ get_thread(AT); ++ beq(java_thread, AT, L); ++ stop("MacroAssembler::call_VM_base: TREG not callee saved?"); ++ bind(L); ++ } ++#endif ++ ++ // discard thread and arguments ++ ld_d(SP, Address(java_thread, JavaThread::last_Java_sp_offset())); ++ // reset last Java frame ++ reset_last_Java_frame(java_thread, false); ++ ++ check_and_handle_popframe(java_thread); ++ check_and_handle_earlyret(java_thread); ++ if (check_exceptions) { ++ // check for pending exceptions (java_thread is set upon return) ++ Label L; ++ ld_d(AT, java_thread, in_bytes(Thread::pending_exception_offset())); ++ beq(AT, R0, L); ++ // reload RA that may have been modified by the entry_point ++ lipc(RA, before_call); ++ jmp(StubRoutines::forward_exception_entry(), relocInfo::runtime_call_type); ++ bind(L); ++ } ++ ++ // get oop result if there is one and reset the value in the thread ++ if (oop_result->is_valid()) { ++ ld_d(oop_result, java_thread, in_bytes(JavaThread::vm_result_offset())); ++ st_d(R0, java_thread, in_bytes(JavaThread::vm_result_offset())); ++ verify_oop(oop_result); ++ } ++} ++ ++void MacroAssembler::call_VM_helper(Register oop_result, address entry_point, int number_of_arguments, bool check_exceptions) { ++ move(V0, SP); ++ //we also reserve space for java_thread here ++ assert(StackAlignmentInBytes == 16, "must be"); ++ bstrins_d(SP, R0, 3, 0); ++ call_VM_base(oop_result, NOREG, V0, entry_point, number_of_arguments, check_exceptions); ++} ++ ++void MacroAssembler::call_VM_leaf(address entry_point, int number_of_arguments) { ++ call_VM_leaf_base(entry_point, number_of_arguments); ++} ++ ++void MacroAssembler::call_VM_leaf(address entry_point, Register arg_0) { ++ if (arg_0 != A0) move(A0, arg_0); ++ call_VM_leaf(entry_point, 1); ++} ++ ++void MacroAssembler::call_VM_leaf(address entry_point, Register arg_0, Register arg_1) { ++ if (arg_0 != A0) move(A0, arg_0); ++ assert(arg_1 != A0, "smashed argument"); ++ if (arg_1 != A1) move(A1, arg_1); ++ call_VM_leaf(entry_point, 2); ++} ++ ++void MacroAssembler::call_VM_leaf(address entry_point, Register arg_0, Register arg_1, Register arg_2) { ++ if (arg_0 != A0) move(A0, arg_0); ++ assert(arg_1 != A0, "smashed argument"); ++ if (arg_1 != A1) move(A1, arg_1); ++ assert(arg_2 != A0 && arg_2 != A1, "smashed argument"); ++ if (arg_2 != A2) move(A2, arg_2); ++ call_VM_leaf(entry_point, 3); ++} ++ ++void MacroAssembler::super_call_VM_leaf(address entry_point) { ++ MacroAssembler::call_VM_leaf_base(entry_point, 0); ++} ++ ++void MacroAssembler::super_call_VM_leaf(address entry_point, ++ Register arg_1) { ++ if (arg_1 != A0) move(A0, arg_1); ++ MacroAssembler::call_VM_leaf_base(entry_point, 1); ++} ++ ++void MacroAssembler::super_call_VM_leaf(address entry_point, ++ Register arg_1, ++ Register arg_2) { ++ if (arg_1 != A0) move(A0, arg_1); ++ assert(arg_2 != A0, "smashed argument"); ++ if (arg_2 != A1) move(A1, arg_2); ++ MacroAssembler::call_VM_leaf_base(entry_point, 2); ++} ++ ++void MacroAssembler::super_call_VM_leaf(address entry_point, ++ Register arg_1, ++ Register arg_2, ++ Register arg_3) { ++ if (arg_1 != A0) move(A0, arg_1); ++ assert(arg_2 != A0, "smashed argument"); ++ if (arg_2 != A1) move(A1, arg_2); ++ assert(arg_3 != A0 && arg_3 != A1, "smashed argument"); ++ if (arg_3 != A2) move(A2, arg_3); ++ MacroAssembler::call_VM_leaf_base(entry_point, 3); ++} ++ ++// these are no-ops overridden by InterpreterMacroAssembler ++void MacroAssembler::check_and_handle_earlyret(Register java_thread) {} ++ ++void 
MacroAssembler::check_and_handle_popframe(Register java_thread) {} ++ ++void MacroAssembler::null_check(Register reg, int offset) { ++ if (needs_explicit_null_check(offset)) { ++ // provoke OS null exception if reg is null by ++ // accessing M[reg] w/o changing any (non-CC) registers ++ // NOTE: cmpl is plenty here to provoke a segv ++ ld_w(AT, reg, 0); ++ } else { ++ // nothing to do, (later) access of M[reg + offset] ++ // will provoke OS null exception if reg is null ++ } ++} ++ ++void MacroAssembler::enter() { ++ push2(RA, FP); ++ addi_d(FP, SP, 2 * wordSize); ++} ++ ++void MacroAssembler::leave() { ++ addi_d(SP, FP, -2 * wordSize); ++ pop2(RA, FP); ++} ++ ++void MacroAssembler::build_frame(int framesize) { ++ assert(framesize >= 2 * wordSize, "framesize must include space for FP/RA"); ++ assert(framesize % (2 * wordSize) == 0, "must preserve 2 * wordSize alignment"); ++ if (Assembler::is_simm(-framesize, 12)) { ++ addi_d(SP, SP, -framesize); ++ st_d(FP, Address(SP, framesize - 2 * wordSize)); ++ st_d(RA, Address(SP, framesize - 1 * wordSize)); ++ if (PreserveFramePointer) ++ addi_d(FP, SP, framesize); ++ } else { ++ addi_d(SP, SP, -2 * wordSize); ++ st_d(FP, Address(SP, 0 * wordSize)); ++ st_d(RA, Address(SP, 1 * wordSize)); ++ if (PreserveFramePointer) ++ addi_d(FP, SP, 2 * wordSize); ++ li(SCR1, framesize - 2 * wordSize); ++ sub_d(SP, SP, SCR1); ++ } ++ verify_cross_modify_fence_not_required(); ++} ++ ++void MacroAssembler::remove_frame(int framesize) { ++ assert(framesize >= 2 * wordSize, "framesize must include space for FP/RA"); ++ assert(framesize % (2*wordSize) == 0, "must preserve 2*wordSize alignment"); ++ if (Assembler::is_simm(framesize, 12)) { ++ ld_d(FP, Address(SP, framesize - 2 * wordSize)); ++ ld_d(RA, Address(SP, framesize - 1 * wordSize)); ++ addi_d(SP, SP, framesize); ++ } else { ++ li(SCR1, framesize - 2 * wordSize); ++ add_d(SP, SP, SCR1); ++ ld_d(FP, Address(SP, 0 * wordSize)); ++ ld_d(RA, Address(SP, 1 * wordSize)); ++ addi_d(SP, SP, 2 * wordSize); ++ } ++} ++ ++void MacroAssembler::unimplemented(const char* what) { ++ const char* buf = nullptr; ++ { ++ ResourceMark rm; ++ stringStream ss; ++ ss.print("unimplemented: %s", what); ++ buf = code_string(ss.as_string()); ++ } ++ stop(buf); ++} ++ ++// get_thread() can be called anywhere inside generated code so we ++// need to save whatever non-callee save context might get clobbered ++// by the call to Thread::current() or, indeed, the call setup code. 
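++// Here that means spilling the caller-saved GPRs (A0-T8 plus FP and RA, minus
++// the destination register) around a runtime call to Thread::current().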
++void MacroAssembler::get_thread(Register thread) { ++ // save all call-clobbered int regs except thread ++ RegSet caller_saved_gpr = RegSet::range(A0, T8) + FP + RA - thread; ++ ++ push(caller_saved_gpr); ++ ++ call(CAST_FROM_FN_PTR(address, Thread::current), relocInfo::runtime_call_type); ++ ++ if (thread != A0) { ++ move(thread, A0); ++ } ++ ++ pop(caller_saved_gpr); ++} ++ ++void MacroAssembler::reset_last_Java_frame(Register java_thread, bool clear_fp) { ++ // determine java_thread register ++ if (!java_thread->is_valid()) { ++ java_thread = TREG; ++ } ++ // we must set sp to zero to clear frame ++ st_d(R0, Address(java_thread, JavaThread::last_Java_sp_offset())); ++ // must clear fp, so that compiled frames are not confused; it is possible ++ // that we need it only for debugging ++ if(clear_fp) { ++ st_d(R0, Address(java_thread, JavaThread::last_Java_fp_offset())); ++ } ++ ++ // Always clear the pc because it could have been set by make_walkable() ++ st_d(R0, Address(java_thread, JavaThread::last_Java_pc_offset())); ++} ++ ++void MacroAssembler::reset_last_Java_frame(bool clear_fp) { ++ // we must set sp to zero to clear frame ++ st_d(R0, TREG, in_bytes(JavaThread::last_Java_sp_offset())); ++ // must clear fp, so that compiled frames are not confused; it is ++ // possible that we need it only for debugging ++ if (clear_fp) { ++ st_d(R0, TREG, in_bytes(JavaThread::last_Java_fp_offset())); ++ } ++ ++ // Always clear the pc because it could have been set by make_walkable() ++ st_d(R0, TREG, in_bytes(JavaThread::last_Java_pc_offset())); ++} ++ ++void MacroAssembler::safepoint_poll(Label& slow_path, Register thread_reg, bool at_return, bool acquire, bool in_nmethod) { ++ if (acquire) { ++ ld_d(AT, thread_reg, in_bytes(JavaThread::polling_word_offset())); ++ membar(Assembler::Membar_mask_bits(LoadLoad|LoadStore)); ++ } else { ++ ld_d(AT, thread_reg, in_bytes(JavaThread::polling_word_offset())); ++ } ++ if (at_return) { ++ // Note that when in_nmethod is set, the stack pointer is incremented before the poll. Therefore, ++ // we may safely use the sp instead to perform the stack watermark check. ++ blt_far(AT, in_nmethod ? SP : FP, slow_path, false /* signed */); ++ } else { ++ andi(AT, AT, SafepointMechanism::poll_bit()); ++ bnez(AT, slow_path); ++ } ++} ++ ++// Calls to C land ++// ++// When entering C land, the fp, & sp of the last Java frame have to be recorded ++// in the (thread-local) JavaThread object. When leaving C land, the last Java fp ++// has to be reset to 0. This is required to allow proper stack traversal. 
++void MacroAssembler::set_last_Java_frame(Register java_thread, ++ Register last_java_sp, ++ Register last_java_fp, ++ Label& last_java_pc) { ++ // determine java_thread register ++ if (!java_thread->is_valid()) { ++ java_thread = TREG; ++ } ++ ++ // determine last_java_sp register ++ if (!last_java_sp->is_valid()) { ++ last_java_sp = SP; ++ } ++ ++ // last_java_fp is optional ++ if (last_java_fp->is_valid()) { ++ st_d(last_java_fp, Address(java_thread, JavaThread::last_Java_fp_offset())); ++ } ++ ++ // last_java_pc ++ lipc(AT, last_java_pc); ++ st_d(AT, Address(java_thread, JavaThread::frame_anchor_offset() + ++ JavaFrameAnchor::last_Java_pc_offset())); ++ ++ st_d(last_java_sp, Address(java_thread, JavaThread::last_Java_sp_offset())); ++} ++ ++void MacroAssembler::set_last_Java_frame(Register last_java_sp, ++ Register last_java_fp, ++ Label& last_java_pc) { ++ set_last_Java_frame(NOREG, last_java_sp, last_java_fp, last_java_pc); ++} ++ ++void MacroAssembler::set_last_Java_frame(Register last_java_sp, ++ Register last_java_fp, ++ Register last_java_pc) { ++ // determine last_java_sp register ++ if (!last_java_sp->is_valid()) { ++ last_java_sp = SP; ++ } ++ ++ // last_java_fp is optional ++ if (last_java_fp->is_valid()) { ++ st_d(last_java_fp, Address(TREG, JavaThread::last_Java_fp_offset())); ++ } ++ ++ // last_java_pc is optional ++ if (last_java_pc->is_valid()) { ++ st_d(last_java_pc, Address(TREG, JavaThread::frame_anchor_offset() + ++ JavaFrameAnchor::last_Java_pc_offset())); ++ } ++ ++ st_d(last_java_sp, Address(TREG, JavaThread::last_Java_sp_offset())); ++} ++ ++// Defines obj, preserves var_size_in_bytes, okay for t2 == var_size_in_bytes. ++void MacroAssembler::tlab_allocate(Register obj, ++ Register var_size_in_bytes, ++ int con_size_in_bytes, ++ Register t1, ++ Register t2, ++ Label& slow_case) { ++ BarrierSetAssembler *bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ bs->tlab_allocate(this, obj, var_size_in_bytes, con_size_in_bytes, t1, t2, slow_case); ++} ++ ++void MacroAssembler::incr_allocated_bytes(Register thread, ++ Register var_size_in_bytes, ++ int con_size_in_bytes, ++ Register t1) { ++ if (!thread->is_valid()) { ++ thread = TREG; ++ } ++ ++ ld_d(AT, Address(thread, JavaThread::allocated_bytes_offset())); ++ if (var_size_in_bytes->is_valid()) { ++ add_d(AT, AT, var_size_in_bytes); ++ } else { ++ addi_d(AT, AT, con_size_in_bytes); ++ } ++ st_d(AT, Address(thread, JavaThread::allocated_bytes_offset())); ++} ++ ++void MacroAssembler::li(Register rd, jlong value) { ++ jlong hi12 = bitfield(value, 52, 12); ++ jlong lo52 = bitfield(value, 0, 52); ++ ++ if (hi12 != 0 && lo52 == 0) { ++ lu52i_d(rd, R0, hi12); ++ } else { ++ jlong hi20 = bitfield(value, 32, 20); ++ jlong lo20 = bitfield(value, 12, 20); ++ jlong lo12 = bitfield(value, 0, 12); ++ ++ if (lo20 == 0) { ++ ori(rd, R0, lo12); ++ } else if (bitfield(simm12(lo12), 12, 20) == lo20) { ++ addi_w(rd, R0, simm12(lo12)); ++ } else { ++ lu12i_w(rd, lo20); ++ if (lo12 != 0) ++ ori(rd, rd, lo12); ++ } ++ if (hi20 != bitfield(simm20(lo20), 20, 20)) ++ lu32i_d(rd, hi20); ++ if (hi12 != bitfield(simm20(hi20), 20, 12)) ++ lu52i_d(rd, rd, hi12); ++ } ++} ++ ++void MacroAssembler::patchable_li52(Register rd, jlong value) { ++ int count = 0; ++ ++ if (value <= max_jint && value >= min_jint) { ++ if (is_simm(value, 12)) { ++ addi_d(rd, R0, value); ++ count++; ++ } else if (is_uimm(value, 12)) { ++ ori(rd, R0, value); ++ count++; ++ } else { ++ lu12i_w(rd, split_low20(value >> 12)); ++ count++; ++ if (split_low12(value)) { ++ 
ori(rd, rd, split_low12(value)); ++ count++; ++ } ++ } ++ } else if (is_simm(value, 52)) { ++ lu12i_w(rd, split_low20(value >> 12)); ++ count++; ++ if (split_low12(value)) { ++ ori(rd, rd, split_low12(value)); ++ count++; ++ } ++ lu32i_d(rd, split_low20(value >> 32)); ++ count++; ++ } else { ++ tty->print_cr("value = 0x%lx", value); ++ guarantee(false, "Not supported yet !"); ++ } ++ ++ while (count < 3) { ++ nop(); ++ count++; ++ } ++} ++ ++void MacroAssembler::lipc(Register rd, Label& L) { ++ if (L.is_bound()) { ++ jint offs = (target(L) - pc()) >> 2; ++ guarantee(is_simm(offs, 20), "Not signed 20-bit offset"); ++ pcaddi(rd, offs); ++ } else { ++ InstructionMark im(this); ++ L.add_patch_at(code(), locator()); ++ pcaddi(rd, 0); ++ } ++} ++ ++void MacroAssembler::set_narrow_klass(Register dst, Klass* k) { ++ assert(UseCompressedClassPointers, "should only be used for compressed header"); ++ assert(oop_recorder() != nullptr, "this assembler needs an OopRecorder"); ++ ++ int klass_index = oop_recorder()->find_index(k); ++ RelocationHolder rspec = metadata_Relocation::spec(klass_index); ++ long narrowKlass = (long)CompressedKlassPointers::encode(k); ++ ++ relocate(rspec, Assembler::narrow_oop_operand); ++ patchable_li52(dst, narrowKlass); ++} ++ ++void MacroAssembler::set_narrow_oop(Register dst, jobject obj) { ++ assert(UseCompressedOops, "should only be used for compressed header"); ++ assert(oop_recorder() != nullptr, "this assembler needs an OopRecorder"); ++ ++ int oop_index = oop_recorder()->find_index(obj); ++ RelocationHolder rspec = oop_Relocation::spec(oop_index); ++ ++ relocate(rspec, Assembler::narrow_oop_operand); ++ patchable_li52(dst, oop_index); ++} ++ ++// ((OopHandle)result).resolve(); ++void MacroAssembler::resolve_oop_handle(Register result, Register tmp1, Register tmp2) { ++ // OopHandle::resolve is an indirection. ++ access_load_at(T_OBJECT, IN_NATIVE, result, Address(result, 0), tmp1, tmp2); ++} ++ ++// ((WeakHandle)result).resolve(); ++void MacroAssembler::resolve_weak_handle(Register result, Register tmp1, Register tmp2) { ++ assert_different_registers(result, tmp1, tmp2); ++ Label resolved; ++ ++ // A null weak handle resolves to null. ++ beqz(result, resolved); ++ ++ // Only 64 bit platforms support GCs that require a tmp register ++ // WeakHandle::resolve is an indirection like jweak. ++ access_load_at(T_OBJECT, IN_NATIVE | ON_PHANTOM_OOP_REF, ++ result, Address(result), tmp1, tmp2); ++ bind(resolved); ++} ++ ++void MacroAssembler::load_mirror(Register mirror, Register method, Register tmp1, Register tmp2) { ++ ld_d(mirror, Address(method, Method::const_offset())); ++ ld_d(mirror, Address(mirror, ConstMethod::constants_offset())); ++ ld_d(mirror, Address(mirror, ConstantPool::pool_holder_offset())); ++ ld_d(mirror, Address(mirror, Klass::java_mirror_offset())); ++ resolve_oop_handle(mirror, tmp1, tmp2); ++} ++ ++void MacroAssembler::_verify_oop(Register reg, const char* s, const char* file, int line) { ++ if (!VerifyOops) return; ++ ++ const char* bx = nullptr; ++ { ++ ResourceMark rm; ++ stringStream ss; ++ ss.print("verify_oop: %s: %s (%s:%d)", reg->name(), s, file, line); ++ bx = code_string(ss.as_string()); ++ } ++ ++ push(RegSet::of(RA, SCR1, c_rarg0, c_rarg1)); ++ ++ move(c_rarg1, reg); ++ // The length of the instruction sequence emitted should be independent ++ // of the value of the local char buffer address so that the size of mach ++ // nodes for scratch emit and normal emit matches. 
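++  // patchable_li52 always occupies exactly three instruction slots, padding
++  // with nops when fewer are needed, which keeps that length fixed.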
++ patchable_li52(c_rarg0, (long)bx); ++ ++ // call indirectly to solve generation ordering problem ++ li(SCR1, StubRoutines::verify_oop_subroutine_entry_address()); ++ ld_d(SCR1, SCR1, 0); ++ jalr(SCR1); ++ ++ pop(RegSet::of(RA, SCR1, c_rarg0, c_rarg1)); ++} ++ ++void MacroAssembler::_verify_oop_addr(Address addr, const char* s, const char* file, int line) { ++ if (!VerifyOops) return; ++ ++ const char* bx = nullptr; ++ { ++ ResourceMark rm; ++ stringStream ss; ++ ss.print("verify_oop_addr: %s (%s:%d)", s, file, line); ++ bx = code_string(ss.as_string()); ++ } ++ ++ push(RegSet::of(RA, SCR1, c_rarg0, c_rarg1)); ++ ++ // addr may contain sp so we will have to adjust it based on the ++ // pushes that we just did. ++ if (addr.uses(SP)) { ++ lea(c_rarg1, addr); ++ ld_d(c_rarg1, Address(c_rarg1, 4 * wordSize)); ++ } else { ++ ld_d(c_rarg1, addr); ++ } ++ ++ // The length of the instruction sequence emitted should be independent ++ // of the value of the local char buffer address so that the size of mach ++ // nodes for scratch emit and normal emit matches. ++ patchable_li52(c_rarg0, (long)bx); ++ ++ // call indirectly to solve generation ordering problem ++ li(SCR1, StubRoutines::verify_oop_subroutine_entry_address()); ++ ld_d(SCR1, SCR1, 0); ++ jalr(SCR1); ++ ++ pop(RegSet::of(RA, SCR1, c_rarg0, c_rarg1)); ++} ++ ++void MacroAssembler::verify_tlab(Register t1, Register t2) { ++#ifdef ASSERT ++ assert_different_registers(t1, t2, AT); ++ if (UseTLAB && VerifyOops) { ++ Label next, ok; ++ ++ get_thread(t1); ++ ++ ld_d(t2, Address(t1, JavaThread::tlab_top_offset())); ++ ld_d(AT, Address(t1, JavaThread::tlab_start_offset())); ++ bgeu(t2, AT, next); ++ ++ stop("assert(top >= start)"); ++ ++ bind(next); ++ ld_d(AT, Address(t1, JavaThread::tlab_end_offset())); ++ bgeu(AT, t2, ok); ++ ++ stop("assert(top <= end)"); ++ ++ bind(ok); ++ ++ } ++#endif ++} ++ ++void MacroAssembler::bswap_h(Register dst, Register src) { ++ revb_2h(dst, src); ++ ext_w_h(dst, dst); // sign extension of the lower 16 bits ++} ++ ++void MacroAssembler::bswap_hu(Register dst, Register src) { ++ revb_2h(dst, src); ++ bstrpick_d(dst, dst, 15, 0); // zero extension of the lower 16 bits ++} ++ ++void MacroAssembler::bswap_w(Register dst, Register src) { ++ revb_2w(dst, src); ++ slli_w(dst, dst, 0); // keep sign, clear upper bits ++} ++ ++void MacroAssembler::cmpxchg(Address addr, Register oldval, Register newval, ++ Register resflag, bool retold, bool acquire, ++ bool weak, bool exchange) { ++ assert(oldval != resflag, "oldval != resflag"); ++ assert(newval != resflag, "newval != resflag"); ++ assert(addr.base() != resflag, "addr.base() != resflag"); ++ Label again, succ, fail; ++ ++ if (UseAMCAS) { ++ move(resflag, oldval/* compare_value */); ++ if (addr.disp() != 0) { ++ assert_different_registers(AT, oldval); ++ assert_different_registers(AT, newval); ++ assert_different_registers(AT, resflag); ++ ++ if (Assembler::is_simm(addr.disp(), 12)) { ++ addi_d(AT, addr.base(), addr.disp()); ++ } else { ++ li(AT, addr.disp()); ++ add_d(AT, addr.base(), AT); ++ } ++ amcas_db_d(resflag, newval, AT); ++ } else { ++ amcas_db_d(resflag, newval, addr.base()); ++ } ++ bne(resflag, oldval, fail); ++ if (!exchange) { ++ ori(resflag, R0, 1); ++ } ++ b(succ); ++ bind(fail); ++ if (retold && oldval != R0) { ++ move(oldval, resflag); ++ } ++ if (!exchange) { ++ move(resflag, R0); ++ } ++ bind(succ); ++ ++ } else { ++ bind(again); ++ ll_d(resflag, addr); ++ bne(resflag, oldval, fail); ++ move(resflag, newval); ++ sc_d(resflag, addr); ++ if (weak) { ++ 
b(succ); ++ } else { ++ beqz(resflag, again); ++ } ++ if (exchange) { ++ move(resflag, oldval); ++ } ++ b(succ); ++ ++ bind(fail); ++ if (acquire) { ++ membar(Assembler::Membar_mask_bits(LoadLoad|LoadStore)); ++ } else { ++ dbar(0x700); ++ } ++ if (retold && oldval != R0) ++ move(oldval, resflag); ++ if (!exchange) { ++ move(resflag, R0); ++ } ++ bind(succ); ++ } ++} ++ ++void MacroAssembler::cmpxchg(Address addr, Register oldval, Register newval, ++ Register tmp, bool retold, bool acquire, Label& succ, Label* fail) { ++ assert(oldval != tmp, "oldval != tmp"); ++ assert(newval != tmp, "newval != tmp"); ++ Label again, neq; ++ ++ if (UseAMCAS) { ++ move(tmp, oldval); ++ if (addr.disp() != 0) { ++ assert_different_registers(AT, oldval); ++ assert_different_registers(AT, newval); ++ assert_different_registers(AT, tmp); ++ ++ if (Assembler::is_simm(addr.disp(), 12)) { ++ addi_d(AT, addr.base(), addr.disp()); ++ } else { ++ li(AT, addr.disp()); ++ add_d(AT, addr.base(), AT); ++ } ++ amcas_db_d(tmp, newval, AT); ++ } else { ++ amcas_db_d(tmp, newval, addr.base()); ++ } ++ bne(tmp, oldval, neq); ++ b(succ); ++ bind(neq); ++ if (fail) { ++ b(*fail); ++ } ++ } else { ++ bind(again); ++ ll_d(tmp, addr); ++ bne(tmp, oldval, neq); ++ move(tmp, newval); ++ sc_d(tmp, addr); ++ beqz(tmp, again); ++ b(succ); ++ bind(neq); ++ if (acquire) { ++ membar(Assembler::Membar_mask_bits(LoadLoad|LoadStore)); ++ } else { ++ dbar(0x700); ++ } ++ if (retold && oldval != R0) ++ move(oldval, tmp); ++ if (fail) ++ b(*fail); ++ } ++} ++ ++void MacroAssembler::cmpxchg32(Address addr, Register oldval, Register newval, ++ Register resflag, bool sign, bool retold, bool acquire, ++ bool weak, bool exchange) { ++ assert(oldval != resflag, "oldval != resflag"); ++ assert(newval != resflag, "newval != resflag"); ++ assert(addr.base() != resflag, "addr.base() != resflag"); ++ Label again, succ, fail; ++ ++ if (UseAMCAS) { ++ move(resflag, oldval/* compare_value */); ++ if (addr.disp() != 0) { ++ assert_different_registers(AT, oldval); ++ assert_different_registers(AT, newval); ++ assert_different_registers(AT, resflag); ++ ++ if (Assembler::is_simm(addr.disp(), 12)) { ++ addi_d(AT, addr.base(), addr.disp()); ++ } else { ++ li(AT, addr.disp()); ++ add_d(AT, addr.base(), AT); ++ } ++ amcas_db_w(resflag, newval, AT); ++ } else { ++ amcas_db_w(resflag, newval, addr.base()); ++ } ++ if (!sign) { ++ lu32i_d(resflag, 0); ++ } ++ bne(resflag, oldval, fail); ++ if (!exchange) { ++ ori(resflag, R0, 1); ++ } ++ b(succ); ++ bind(fail); ++ if (retold && oldval != R0) { ++ move(oldval, resflag); ++ } ++ if (!exchange) { ++ move(resflag, R0); ++ } ++ bind(succ); ++ } else { ++ bind(again); ++ ll_w(resflag, addr); ++ if (!sign) ++ lu32i_d(resflag, 0); ++ bne(resflag, oldval, fail); ++ move(resflag, newval); ++ sc_w(resflag, addr); ++ if (weak) { ++ b(succ); ++ } else { ++ beqz(resflag, again); ++ } ++ if (exchange) { ++ move(resflag, oldval); ++ } ++ b(succ); ++ ++ bind(fail); ++ if (acquire) { ++ membar(Assembler::Membar_mask_bits(LoadLoad|LoadStore)); ++ } else { ++ dbar(0x700); ++ } ++ if (retold && oldval != R0) ++ move(oldval, resflag); ++ if (!exchange) { ++ move(resflag, R0); ++ } ++ bind(succ); ++ } ++} ++ ++void MacroAssembler::cmpxchg32(Address addr, Register oldval, Register newval, Register tmp, ++ bool sign, bool retold, bool acquire, Label& succ, Label* fail) { ++ assert(oldval != tmp, "oldval != tmp"); ++ assert(newval != tmp, "newval != tmp"); ++ Label again, neq; ++ ++ if (UseAMCAS) { ++ move(tmp, oldval); ++ if (addr.disp() != 
0) { ++ assert_different_registers(AT, oldval); ++ assert_different_registers(AT, newval); ++ assert_different_registers(AT, tmp); ++ ++ if (Assembler::is_simm(addr.disp(), 12)) { ++ addi_d(AT, addr.base(), addr.disp()); ++ } else { ++ li(AT, addr.disp()); ++ add_d(AT, addr.base(), AT); ++ } ++ amcas_db_w(tmp, newval, AT); ++ } else { ++ amcas_db_w(tmp, newval, addr.base()); ++ } ++ if (!sign) { ++ lu32i_d(tmp, 0); ++ } ++ bne(tmp, oldval, neq); ++ b(succ); ++ bind(neq); ++ if (fail) { ++ b(*fail); ++ } ++ } else { ++ bind(again); ++ ll_w(tmp, addr); ++ if (!sign) ++ lu32i_d(tmp, 0); ++ bne(tmp, oldval, neq); ++ move(tmp, newval); ++ sc_w(tmp, addr); ++ beqz(tmp, again); ++ b(succ); ++ ++ bind(neq); ++ if (acquire) { ++ membar(Assembler::Membar_mask_bits(LoadLoad|LoadStore)); ++ } else { ++ dbar(0x700); ++ } ++ if (retold && oldval != R0) ++ move(oldval, tmp); ++ if (fail) ++ b(*fail); ++ } ++} ++ ++void MacroAssembler::cmpxchg16(Address addr, Register oldval, Register newval, ++ Register resflag, bool sign, bool retold, bool acquire, ++ bool weak, bool exchange) { ++ assert(oldval != resflag, "oldval != resflag"); ++ assert(newval != resflag, "newval != resflag"); ++ assert(addr.base() != resflag, "addr.base() != resflag"); ++ assert(UseAMCAS == true, "UseAMCAS == true"); ++ Label again, succ, fail; ++ ++ move(resflag, oldval/* compare_value */); ++ if (addr.disp() != 0) { ++ assert_different_registers(AT, oldval); ++ assert_different_registers(AT, newval); ++ assert_different_registers(AT, resflag); ++ ++ if (Assembler::is_simm(addr.disp(), 12)) { ++ addi_d(AT, addr.base(), addr.disp()); ++ } else { ++ li(AT, addr.disp()); ++ add_d(AT, addr.base(), AT); ++ } ++ amcas_db_h(resflag, newval, AT); ++ } else { ++ amcas_db_h(resflag, newval, addr.base()); ++ } ++ if (!sign) { ++ bstrpick_w(resflag, resflag, 15, 0); ++ } ++ bne(resflag, oldval, fail); ++ if (!exchange) { ++ ori(resflag, R0, 1); ++ } ++ b(succ); ++ bind(fail); ++ if (retold && oldval != R0) { ++ move(oldval, resflag); ++ } ++ if (!exchange) { ++ move(resflag, R0); ++ } ++ bind(succ); ++} ++ ++void MacroAssembler::cmpxchg16(Address addr, Register oldval, Register newval, Register tmp, ++ bool sign, bool retold, bool acquire, Label& succ, Label* fail) { ++ assert(oldval != tmp, "oldval != tmp"); ++ assert(newval != tmp, "newval != tmp"); ++ assert(UseAMCAS == true, "UseAMCAS == true"); ++ Label again, neq; ++ ++ move(tmp, oldval); ++ if (addr.disp() != 0) { ++ assert_different_registers(AT, oldval); ++ assert_different_registers(AT, newval); ++ assert_different_registers(AT, tmp); ++ ++ if (Assembler::is_simm(addr.disp(), 12)) { ++ addi_d(AT, addr.base(), addr.disp()); ++ } else { ++ li(AT, addr.disp()); ++ add_d(AT, addr.base(), AT); ++ } ++ amcas_db_h(tmp, newval, AT); ++ } else { ++ amcas_db_h(tmp, newval, addr.base()); ++ } ++ if (!sign) { ++ bstrpick_w(tmp, tmp, 15, 0); ++ } ++ bne(tmp, oldval, neq); ++ b(succ); ++ bind(neq); ++ if (retold && oldval != R0) { ++ move(oldval, tmp); ++ } ++ if (fail) { ++ b(*fail); ++ } ++} ++ ++void MacroAssembler::cmpxchg8(Address addr, Register oldval, Register newval, ++ Register resflag, bool sign, bool retold, bool acquire, ++ bool weak, bool exchange) { ++ assert(oldval != resflag, "oldval != resflag"); ++ assert(newval != resflag, "newval != resflag"); ++ assert(addr.base() != resflag, "addr.base() != resflag"); ++ assert(UseAMCAS == true, "UseAMCAS == true"); ++ Label again, succ, fail; ++ ++ move(resflag, oldval/* compare_value */); ++ if (addr.disp() != 0) { ++ 
assert_different_registers(AT, oldval); ++ assert_different_registers(AT, newval); ++ assert_different_registers(AT, resflag); ++ ++ if (Assembler::is_simm(addr.disp(), 12)) { ++ addi_d(AT, addr.base(), addr.disp()); ++ } else { ++ li(AT, addr.disp()); ++ add_d(AT, addr.base(), AT); ++ } ++ amcas_db_b(resflag, newval, AT); ++ } else { ++ amcas_db_b(resflag, newval, addr.base()); ++ } ++ if (!sign) { ++ andi(resflag, resflag, 0xFF); ++ } ++ bne(resflag, oldval, fail); ++ if (!exchange) { ++ ori(resflag, R0, 1); ++ } ++ b(succ); ++ bind(fail); ++ if (retold && oldval != R0) { ++ move(oldval, resflag); ++ } ++ if (!exchange) { ++ move(resflag, R0); ++ } ++ bind(succ); ++} ++ ++void MacroAssembler::cmpxchg8(Address addr, Register oldval, Register newval, Register tmp, ++ bool sign, bool retold, bool acquire, Label& succ, Label* fail) { ++ assert(oldval != tmp, "oldval != tmp"); ++ assert(newval != tmp, "newval != tmp"); ++ assert(UseAMCAS == true, "UseAMCAS == true"); ++ Label again, neq; ++ ++ move(tmp, oldval); ++ if (addr.disp() != 0) { ++ assert_different_registers(AT, oldval); ++ assert_different_registers(AT, newval); ++ assert_different_registers(AT, tmp); ++ ++ if (Assembler::is_simm(addr.disp(), 12)) { ++ addi_d(AT, addr.base(), addr.disp()); ++ } else { ++ li(AT, addr.disp()); ++ add_d(AT, addr.base(), AT); ++ } ++ amcas_db_b(tmp, newval, AT); ++ } else { ++ amcas_db_b(tmp, newval, addr.base()); ++ } ++ if (!sign) { ++ andi(tmp, tmp, 0xFF); ++ } ++ bne(tmp, oldval, neq); ++ b(succ); ++ bind(neq); ++ if (retold && oldval != R0) { ++ move(oldval, tmp); ++ } ++ if (fail) { ++ b(*fail); ++ } ++} ++ ++void MacroAssembler::push_cont_fastpath(Register java_thread) { ++ if (!Continuations::enabled()) return; ++ Label done; ++ ld_d(AT, Address(java_thread, JavaThread::cont_fastpath_offset())); ++ bgeu(AT, SP, done); ++ st_d(SP, Address(java_thread, JavaThread::cont_fastpath_offset())); ++ bind(done); ++} ++ ++void MacroAssembler::pop_cont_fastpath(Register java_thread) { ++ if (!Continuations::enabled()) return; ++ Label done; ++ ld_d(AT, Address(java_thread, JavaThread::cont_fastpath_offset())); ++ bltu(SP, AT, done); ++ st_d(R0, Address(java_thread, JavaThread::cont_fastpath_offset())); ++ bind(done); ++} ++ ++void MacroAssembler::align(int modulus) { ++ while (offset() % modulus != 0) nop(); ++} ++ ++void MacroAssembler::post_call_nop() { ++ if (!Continuations::enabled()) return; ++ InstructionMark im(this); ++ relocate(post_call_nop_Relocation::spec()); ++ // pick 2 instructions to save oopmap(8 bits) and offset(24 bits) ++ nop(); ++ ori(R0, R0, 0); ++ ori(R0, R0, 0); ++} ++ ++// SCR2 is allocable in C2 Compiler ++static RegSet caller_saved_regset = RegSet::range(A0, A7) + RegSet::range(T0, T8) + RegSet::of(FP, RA) - RegSet::of(SCR1); ++static FloatRegSet caller_saved_fpu_regset = FloatRegSet::range(F0, F23); ++ ++void MacroAssembler::push_call_clobbered_registers_except(RegSet exclude) { ++ push(caller_saved_regset - exclude); ++ push_fpu(caller_saved_fpu_regset); ++} ++ ++void MacroAssembler::pop_call_clobbered_registers_except(RegSet exclude) { ++ pop_fpu(caller_saved_fpu_regset); ++ pop(caller_saved_regset - exclude); ++} ++ ++void MacroAssembler::push2(Register reg1, Register reg2) { ++ addi_d(SP, SP, -16); ++ st_d(reg1, SP, 8); ++ st_d(reg2, SP, 0); ++} ++ ++void MacroAssembler::pop2(Register reg1, Register reg2) { ++ ld_d(reg1, SP, 8); ++ ld_d(reg2, SP, 0); ++ addi_d(SP, SP, 16); ++} ++ ++void MacroAssembler::push(unsigned int bitset) { ++ unsigned char regs[31]; ++ int count = 0; 
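++  // Collect the GPR numbers encoded in 'bitset' (bit i selects register i);
++  // bit 0 (R0) is never saved, and the frame bump below stays 16-byte aligned.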
++ ++ bitset >>= 1; ++ for (int reg = 1; reg < 31; reg++) { ++ if (1 & bitset) ++ regs[count++] = reg; ++ bitset >>= 1; ++ } ++ ++ addi_d(SP, SP, -align_up(count, 2) * wordSize); ++ for (int i = 0; i < count; i ++) ++ st_d(as_Register(regs[i]), SP, i * wordSize); ++} ++ ++void MacroAssembler::pop(unsigned int bitset) { ++ unsigned char regs[31]; ++ int count = 0; ++ ++ bitset >>= 1; ++ for (int reg = 1; reg < 31; reg++) { ++ if (1 & bitset) ++ regs[count++] = reg; ++ bitset >>= 1; ++ } ++ ++ for (int i = 0; i < count; i ++) ++ ld_d(as_Register(regs[i]), SP, i * wordSize); ++ addi_d(SP, SP, align_up(count, 2) * wordSize); ++} ++ ++void MacroAssembler::push_fpu(unsigned int bitset) { ++ unsigned char regs[32]; ++ int count = 0; ++ ++ if (bitset == 0) ++ return; ++ ++ for (int reg = 0; reg <= 31; reg++) { ++ if (1 & bitset) ++ regs[count++] = reg; ++ bitset >>= 1; ++ } ++ ++ addi_d(SP, SP, -align_up(count, 2) * wordSize); ++ for (int i = 0; i < count; i++) ++ fst_d(as_FloatRegister(regs[i]), SP, i * wordSize); ++} ++ ++void MacroAssembler::pop_fpu(unsigned int bitset) { ++ unsigned char regs[32]; ++ int count = 0; ++ ++ if (bitset == 0) ++ return; ++ ++ for (int reg = 0; reg <= 31; reg++) { ++ if (1 & bitset) ++ regs[count++] = reg; ++ bitset >>= 1; ++ } ++ ++ for (int i = 0; i < count; i++) ++ fld_d(as_FloatRegister(regs[i]), SP, i * wordSize); ++ addi_d(SP, SP, align_up(count, 2) * wordSize); ++} ++ ++static int vpr_offset(int off) { ++ int slots_per_vpr = 0; ++ ++ if (UseLASX) ++ slots_per_vpr = FloatRegister::slots_per_lasx_register; ++ else if (UseLSX) ++ slots_per_vpr = FloatRegister::slots_per_lsx_register; ++ ++ return off * slots_per_vpr * VMRegImpl::stack_slot_size; ++} ++ ++void MacroAssembler::push_vp(unsigned int bitset) { ++ unsigned char regs[32]; ++ int count = 0; ++ ++ if (bitset == 0) ++ return; ++ ++ for (int reg = 0; reg <= 31; reg++) { ++ if (1 & bitset) ++ regs[count++] = reg; ++ bitset >>= 1; ++ } ++ ++ addi_d(SP, SP, vpr_offset(-align_up(count, 2))); ++ ++ for (int i = 0; i < count; i++) { ++ int off = vpr_offset(i); ++ if (UseLASX) ++ xvst(as_FloatRegister(regs[i]), SP, off); ++ else if (UseLSX) ++ vst(as_FloatRegister(regs[i]), SP, off); ++ } ++} ++ ++void MacroAssembler::pop_vp(unsigned int bitset) { ++ unsigned char regs[32]; ++ int count = 0; ++ ++ if (bitset == 0) ++ return; ++ ++ for (int reg = 0; reg <= 31; reg++) { ++ if (1 & bitset) ++ regs[count++] = reg; ++ bitset >>= 1; ++ } ++ ++ for (int i = 0; i < count; i++) { ++ int off = vpr_offset(i); ++ if (UseLASX) ++ xvld(as_FloatRegister(regs[i]), SP, off); ++ else if (UseLSX) ++ vld(as_FloatRegister(regs[i]), SP, off); ++ } ++ ++ addi_d(SP, SP, vpr_offset(align_up(count, 2))); ++} ++ ++void MacroAssembler::load_method_holder(Register holder, Register method) { ++ ld_d(holder, Address(method, Method::const_offset())); // ConstMethod* ++ ld_d(holder, Address(holder, ConstMethod::constants_offset())); // ConstantPool* ++ ld_d(holder, Address(holder, ConstantPool::pool_holder_offset())); // InstanceKlass* ++} ++ ++void MacroAssembler::load_method_holder_cld(Register rresult, Register rmethod) { ++ load_method_holder(rresult, rmethod); ++ ld_d(rresult, Address(rresult, InstanceKlass::class_loader_data_offset())); ++} ++ ++// for UseCompressedOops Option ++void MacroAssembler::load_klass(Register dst, Register src) { ++ if(UseCompressedClassPointers){ ++ ld_wu(dst, Address(src, oopDesc::klass_offset_in_bytes())); ++ decode_klass_not_null(dst); ++ } else { ++ ld_d(dst, src, oopDesc::klass_offset_in_bytes()); ++ } ++} 
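++
++// Note: with compressed class pointers, store_klass encodes 'src' in place, so
++// callers must not rely on src being preserved.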
++ ++void MacroAssembler::store_klass(Register dst, Register src) { ++ if(UseCompressedClassPointers){ ++ encode_klass_not_null(src); ++ st_w(src, dst, oopDesc::klass_offset_in_bytes()); ++ } else { ++ st_d(src, dst, oopDesc::klass_offset_in_bytes()); ++ } ++} ++ ++void MacroAssembler::store_klass_gap(Register dst, Register src) { ++ if (UseCompressedClassPointers) { ++ st_w(src, dst, oopDesc::klass_gap_offset_in_bytes()); ++ } ++} ++ ++void MacroAssembler::access_load_at(BasicType type, DecoratorSet decorators, Register dst, Address src, ++ Register tmp1, Register tmp2) { ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ decorators = AccessInternal::decorator_fixup(decorators, type); ++ bool as_raw = (decorators & AS_RAW) != 0; ++ if (as_raw) { ++ bs->BarrierSetAssembler::load_at(this, decorators, type, dst, src, tmp1, tmp2); ++ } else { ++ bs->load_at(this, decorators, type, dst, src, tmp1, tmp2); ++ } ++} ++ ++void MacroAssembler::access_store_at(BasicType type, DecoratorSet decorators, Address dst, Register val, ++ Register tmp1, Register tmp2, Register tmp3) { ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ decorators = AccessInternal::decorator_fixup(decorators, type); ++ bool as_raw = (decorators & AS_RAW) != 0; ++ if (as_raw) { ++ bs->BarrierSetAssembler::store_at(this, decorators, type, dst, val, tmp1, tmp2, tmp3); ++ } else { ++ bs->store_at(this, decorators, type, dst, val, tmp1, tmp2, tmp3); ++ } ++} ++ ++void MacroAssembler::load_heap_oop(Register dst, Address src, Register tmp1, ++ Register tmp2, DecoratorSet decorators) { ++ access_load_at(T_OBJECT, IN_HEAP | decorators, dst, src, tmp1, tmp2); ++} ++ ++// Doesn't do verification, generates fixed size code ++void MacroAssembler::load_heap_oop_not_null(Register dst, Address src, Register tmp1, ++ Register tmp2, DecoratorSet decorators) { ++ access_load_at(T_OBJECT, IN_HEAP | IS_NOT_NULL | decorators, dst, src, tmp1, tmp2); ++} ++ ++void MacroAssembler::store_heap_oop(Address dst, Register val, Register tmp1, ++ Register tmp2, Register tmp3, DecoratorSet decorators) { ++ access_store_at(T_OBJECT, IN_HEAP | decorators, dst, val, tmp1, tmp2, tmp3); ++} ++ ++// Used for storing NULLs. ++void MacroAssembler::store_heap_oop_null(Address dst) { ++ access_store_at(T_OBJECT, IN_HEAP, dst, noreg, noreg, noreg, noreg); ++} ++ ++#ifdef ASSERT ++void MacroAssembler::verify_heapbase(const char* msg) { ++ assert (UseCompressedOops || UseCompressedClassPointers, "should be compressed"); ++ assert (Universe::heap() != nullptr, "java heap should be initialized"); ++} ++#endif ++ ++// Algorithm must match oop.inline.hpp encode_heap_oop. 
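++// Roughly: narrow = (oop == nullptr) ? 0 : (oop - heap_base) >> shift; the null
++// case is handled branchlessly with maskeqz.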
++void MacroAssembler::encode_heap_oop(Register r) { ++#ifdef ASSERT ++ verify_heapbase("MacroAssembler::encode_heap_oop:heap base corrupted?"); ++#endif ++ verify_oop_msg(r, "broken oop in encode_heap_oop"); ++ if (CompressedOops::base() == nullptr) { ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ srli_d(r, r, LogMinObjAlignmentInBytes); ++ } ++ return; ++ } ++ ++ sub_d(AT, r, S5_heapbase); ++ maskeqz(r, AT, r); ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ srli_d(r, r, LogMinObjAlignmentInBytes); ++ } ++} ++ ++void MacroAssembler::encode_heap_oop(Register dst, Register src) { ++#ifdef ASSERT ++ verify_heapbase("MacroAssembler::encode_heap_oop:heap base corrupted?"); ++#endif ++ verify_oop_msg(src, "broken oop in encode_heap_oop"); ++ if (CompressedOops::base() == nullptr) { ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ srli_d(dst, src, LogMinObjAlignmentInBytes); ++ } else { ++ if (dst != src) { ++ move(dst, src); ++ } ++ } ++ return; ++ } ++ ++ sub_d(AT, src, S5_heapbase); ++ maskeqz(dst, AT, src); ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ srli_d(dst, dst, LogMinObjAlignmentInBytes); ++ } ++} ++ ++void MacroAssembler::encode_heap_oop_not_null(Register r) { ++ assert (UseCompressedOops, "should be compressed"); ++#ifdef ASSERT ++ if (CheckCompressedOops) { ++ Label ok; ++ bne(r, R0, ok); ++ stop("null oop passed to encode_heap_oop_not_null"); ++ bind(ok); ++ } ++#endif ++ verify_oop_msg(r, "broken oop in encode_heap_oop_not_null"); ++ if (CompressedOops::base() != nullptr) { ++ sub_d(r, r, S5_heapbase); ++ } ++ if (CompressedOops::shift() != 0) { ++ assert (LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ srli_d(r, r, LogMinObjAlignmentInBytes); ++ } ++ ++} ++ ++void MacroAssembler::encode_heap_oop_not_null(Register dst, Register src) { ++ assert (UseCompressedOops, "should be compressed"); ++#ifdef ASSERT ++ if (CheckCompressedOops) { ++ Label ok; ++ bne(src, R0, ok); ++ stop("null oop passed to encode_heap_oop_not_null2"); ++ bind(ok); ++ } ++#endif ++ verify_oop_msg(src, "broken oop in encode_heap_oop_not_null2"); ++ if (CompressedOops::base() == nullptr) { ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ srli_d(dst, src, LogMinObjAlignmentInBytes); ++ } else { ++ if (dst != src) { ++ move(dst, src); ++ } ++ } ++ return; ++ } ++ ++ sub_d(dst, src, S5_heapbase); ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ srli_d(dst, dst, LogMinObjAlignmentInBytes); ++ } ++} ++ ++void MacroAssembler::decode_heap_oop(Register r) { ++#ifdef ASSERT ++ verify_heapbase("MacroAssembler::decode_heap_oop corrupted?"); ++#endif ++ if (CompressedOops::base() == nullptr) { ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ slli_d(r, r, LogMinObjAlignmentInBytes); ++ } ++ return; ++ } ++ ++ move(AT, r); ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ if (LogMinObjAlignmentInBytes <= 4) { ++ alsl_d(r, r, S5_heapbase, LogMinObjAlignmentInBytes - 1); ++ } 
else { ++ slli_d(r, r, LogMinObjAlignmentInBytes); ++ add_d(r, r, S5_heapbase); ++ } ++ } else { ++ add_d(r, r, S5_heapbase); ++ } ++ maskeqz(r, r, AT); ++ verify_oop_msg(r, "broken oop in decode_heap_oop"); ++} ++ ++void MacroAssembler::decode_heap_oop(Register dst, Register src) { ++#ifdef ASSERT ++ verify_heapbase("MacroAssembler::decode_heap_oop corrupted?"); ++#endif ++ if (CompressedOops::base() == nullptr) { ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ slli_d(dst, src, LogMinObjAlignmentInBytes); ++ } else { ++ if (dst != src) { ++ move(dst, src); ++ } ++ } ++ return; ++ } ++ ++ Register cond; ++ if (dst == src) { ++ cond = AT; ++ move(cond, src); ++ } else { ++ cond = src; ++ } ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ if (LogMinObjAlignmentInBytes <= 4) { ++ alsl_d(dst, src, S5_heapbase, LogMinObjAlignmentInBytes - 1); ++ } else { ++ slli_d(dst, src, LogMinObjAlignmentInBytes); ++ add_d(dst, dst, S5_heapbase); ++ } ++ } else { ++ add_d(dst, src, S5_heapbase); ++ } ++ maskeqz(dst, dst, cond); ++ verify_oop_msg(dst, "broken oop in decode_heap_oop"); ++} ++ ++void MacroAssembler::decode_heap_oop_not_null(Register r) { ++ // Note: it will change flags ++ assert(UseCompressedOops, "should only be used for compressed headers"); ++ assert(Universe::heap() != nullptr, "java heap should be initialized"); ++ // Cannot assert, unverified entry point counts instructions (see .ad file) ++ // vtableStubs also counts instructions in pd_code_size_limit. ++ // Also do not verify_oop as this is called by verify_oop. ++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ if (CompressedOops::base() != nullptr) { ++ if (LogMinObjAlignmentInBytes <= 4) { ++ alsl_d(r, r, S5_heapbase, LogMinObjAlignmentInBytes - 1); ++ } else { ++ slli_d(r, r, LogMinObjAlignmentInBytes); ++ add_d(r, r, S5_heapbase); ++ } ++ } else { ++ slli_d(r, r, LogMinObjAlignmentInBytes); ++ } ++ } else { ++ assert(CompressedOops::base() == nullptr, "sanity"); ++ } ++} ++ ++void MacroAssembler::decode_heap_oop_not_null(Register dst, Register src) { ++ assert(UseCompressedOops, "should only be used for compressed headers"); ++ assert(Universe::heap() != nullptr, "java heap should be initialized"); ++ // Cannot assert, unverified entry point counts instructions (see .ad file) ++ // vtableStubs also counts instructions in pd_code_size_limit. ++ // Also do not verify_oop as this is called by verify_oop. 
++ if (CompressedOops::shift() != 0) { ++ assert(LogMinObjAlignmentInBytes == CompressedOops::shift(), "decode alg wrong"); ++ if (CompressedOops::base() != nullptr) { ++ if (LogMinObjAlignmentInBytes <= 4) { ++ alsl_d(dst, src, S5_heapbase, LogMinObjAlignmentInBytes - 1); ++ } else { ++ slli_d(dst, src, LogMinObjAlignmentInBytes); ++ add_d(dst, dst, S5_heapbase); ++ } ++ } else { ++ slli_d(dst, src, LogMinObjAlignmentInBytes); ++ } ++ } else { ++ assert (CompressedOops::base() == nullptr, "sanity"); ++ if (dst != src) { ++ move(dst, src); ++ } ++ } ++} ++ ++void MacroAssembler::encode_klass_not_null(Register r) { ++ if (CompressedKlassPointers::base() != nullptr) { ++ if (((uint64_t)CompressedKlassPointers::base() & 0xffffffff) == 0 ++ && CompressedKlassPointers::shift() == 0) { ++ bstrpick_d(r, r, 31, 0); ++ return; ++ } ++ assert(r != AT, "Encoding a klass in AT"); ++ li(AT, (int64_t)CompressedKlassPointers::base()); ++ sub_d(r, r, AT); ++ } ++ if (CompressedKlassPointers::shift() != 0) { ++ assert (LogKlassAlignmentInBytes == CompressedKlassPointers::shift(), "decode alg wrong"); ++ srli_d(r, r, LogKlassAlignmentInBytes); ++ } ++} ++ ++void MacroAssembler::encode_klass_not_null(Register dst, Register src) { ++ if (dst == src) { ++ encode_klass_not_null(src); ++ } else { ++ if (CompressedKlassPointers::base() != nullptr) { ++ if (((uint64_t)CompressedKlassPointers::base() & 0xffffffff) == 0 ++ && CompressedKlassPointers::shift() == 0) { ++ bstrpick_d(dst, src, 31, 0); ++ return; ++ } ++ li(dst, (int64_t)CompressedKlassPointers::base()); ++ sub_d(dst, src, dst); ++ if (CompressedKlassPointers::shift() != 0) { ++ assert (LogKlassAlignmentInBytes == CompressedKlassPointers::shift(), "decode alg wrong"); ++ srli_d(dst, dst, LogKlassAlignmentInBytes); ++ } ++ } else { ++ if (CompressedKlassPointers::shift() != 0) { ++ assert (LogKlassAlignmentInBytes == CompressedKlassPointers::shift(), "decode alg wrong"); ++ srli_d(dst, src, LogKlassAlignmentInBytes); ++ } else { ++ move(dst, src); ++ } ++ } ++ } ++} ++ ++void MacroAssembler::decode_klass_not_null(Register r) { ++ assert(UseCompressedClassPointers, "should only be used for compressed headers"); ++ assert(r != AT, "Decoding a klass in AT"); ++ // Cannot assert, unverified entry point counts instructions (see .ad file) ++ // vtableStubs also counts instructions in pd_code_size_limit. ++ // Also do not verify_oop as this is called by verify_oop. 
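++  // Roughly: klass = (narrow_klass << shift) + klass_base, with AT used as a
++  // scratch register for the base when it cannot be folded into lu32i_d.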
++ if (CompressedKlassPointers::base() != nullptr) { ++ if (CompressedKlassPointers::shift() == 0) { ++ if (((uint64_t)CompressedKlassPointers::base() & 0xffffffff) == 0) { ++ lu32i_d(r, (uint64_t)CompressedKlassPointers::base() >> 32); ++ } else { ++ li(AT, (int64_t)CompressedKlassPointers::base()); ++ add_d(r, r, AT); ++ } ++ } else { ++ assert(LogKlassAlignmentInBytes == CompressedKlassPointers::shift(), "decode alg wrong"); ++ assert(LogKlassAlignmentInBytes == Address::times_8, "klass not aligned on 64bits?"); ++ li(AT, (int64_t)CompressedKlassPointers::base()); ++ alsl_d(r, r, AT, Address::times_8 - 1); ++ } ++ } else { ++ if (CompressedKlassPointers::shift() != 0) { ++ assert(LogKlassAlignmentInBytes == CompressedKlassPointers::shift(), "decode alg wrong"); ++ slli_d(r, r, LogKlassAlignmentInBytes); ++ } ++ } ++} ++ ++void MacroAssembler::decode_klass_not_null(Register dst, Register src) { ++ assert(UseCompressedClassPointers, "should only be used for compressed headers"); ++ if (dst == src) { ++ decode_klass_not_null(dst); ++ } else { ++ // Cannot assert, unverified entry point counts instructions (see .ad file) ++ // vtableStubs also counts instructions in pd_code_size_limit. ++ // Also do not verify_oop as this is called by verify_oop. ++ if (CompressedKlassPointers::base() != nullptr) { ++ if (CompressedKlassPointers::shift() == 0) { ++ if (((uint64_t)CompressedKlassPointers::base() & 0xffffffff) == 0) { ++ move(dst, src); ++ lu32i_d(dst, (uint64_t)CompressedKlassPointers::base() >> 32); ++ } else { ++ li(dst, (int64_t)CompressedKlassPointers::base()); ++ add_d(dst, dst, src); ++ } ++ } else { ++ assert(LogKlassAlignmentInBytes == CompressedKlassPointers::shift(), "decode alg wrong"); ++ assert(LogKlassAlignmentInBytes == Address::times_8, "klass not aligned on 64bits?"); ++ li(dst, (int64_t)CompressedKlassPointers::base()); ++ alsl_d(dst, src, dst, Address::times_8 - 1); ++ } ++ } else { ++ if (CompressedKlassPointers::shift() != 0) { ++ assert(LogKlassAlignmentInBytes == CompressedKlassPointers::shift(), "decode alg wrong"); ++ slli_d(dst, src, LogKlassAlignmentInBytes); ++ } else { ++ move(dst, src); ++ } ++ } ++ } ++} ++ ++void MacroAssembler::reinit_heapbase() { ++ if (UseCompressedOops) { ++ if (Universe::heap() != nullptr) { ++ if (CompressedOops::base() == nullptr) { ++ move(S5_heapbase, R0); ++ } else { ++ li(S5_heapbase, (int64_t)CompressedOops::ptrs_base()); ++ } ++ } else { ++ li(S5_heapbase, (intptr_t)CompressedOops::ptrs_base_addr()); ++ ld_d(S5_heapbase, S5_heapbase, 0); ++ } ++ } ++} ++ ++void MacroAssembler::check_klass_subtype(Register sub_klass, ++ Register super_klass, ++ Register temp_reg, ++ Label& L_success) { ++//implement ind gen_subtype_check ++ Label L_failure; ++ check_klass_subtype_fast_path(sub_klass, super_klass, temp_reg, &L_success, &L_failure, nullptr); ++ check_klass_subtype_slow_path(sub_klass, super_klass, temp_reg, noreg, &L_success, nullptr); ++ bind(L_failure); ++} ++ ++void MacroAssembler::check_klass_subtype_fast_path(Register sub_klass, ++ Register super_klass, ++ Register temp_reg, ++ Label* L_success, ++ Label* L_failure, ++ Label* L_slow_path, ++ RegisterOrConstant super_check_offset) { ++ assert_different_registers(sub_klass, super_klass, temp_reg); ++ bool must_load_sco = (super_check_offset.constant_or_zero() == -1); ++ if (super_check_offset.is_register()) { ++ assert_different_registers(sub_klass, super_klass, ++ super_check_offset.as_register()); ++ } else if (must_load_sco) { ++ assert(temp_reg != noreg, "supply either a temp 
or a register offset"); ++ } ++ ++ Label L_fallthrough; ++ int label_nulls = 0; ++ if (L_success == nullptr) { L_success = &L_fallthrough; label_nulls++; } ++ if (L_failure == nullptr) { L_failure = &L_fallthrough; label_nulls++; } ++ if (L_slow_path == nullptr) { L_slow_path = &L_fallthrough; label_nulls++; } ++ assert(label_nulls <= 1, "at most one null in the batch"); ++ ++ int sc_offset = in_bytes(Klass::secondary_super_cache_offset()); ++ int sco_offset = in_bytes(Klass::super_check_offset_offset()); ++ // If the pointers are equal, we are done (e.g., String[] elements). ++ // This self-check enables sharing of secondary supertype arrays among ++ // non-primary types such as array-of-interface. Otherwise, each such ++ // type would need its own customized SSA. ++ // We move this check to the front of the fast path because many ++ // type checks are in fact trivially successful in this manner, ++ // so we get a nicely predicted branch right at the start of the check. ++ beq(sub_klass, super_klass, *L_success); ++ // Check the supertype display: ++ if (must_load_sco) { ++ ld_wu(temp_reg, super_klass, sco_offset); ++ super_check_offset = RegisterOrConstant(temp_reg); ++ } ++ add_d(AT, sub_klass, super_check_offset.register_or_noreg()); ++ ld_d(AT, AT, super_check_offset.constant_or_zero()); ++ ++ // This check has worked decisively for primary supers. ++ // Secondary supers are sought in the super_cache ('super_cache_addr'). ++ // (Secondary supers are interfaces and very deeply nested subtypes.) ++ // This works in the same check above because of a tricky aliasing ++ // between the super_cache and the primary super display elements. ++ // (The 'super_check_addr' can address either, as the case requires.) ++ // Note that the cache is updated below if it does not help us find ++ // what we need immediately. ++ // So if it was a primary super, we can just fail immediately. ++ // Otherwise, it's the slow path for us (no success at this point). ++ ++ if (super_check_offset.is_register()) { ++ beq(super_klass, AT, *L_success); ++ addi_d(AT, super_check_offset.as_register(), -sc_offset); ++ if (L_failure == &L_fallthrough) { ++ beq(AT, R0, *L_slow_path); ++ } else { ++ bne_far(AT, R0, *L_failure); ++ b(*L_slow_path); ++ } ++ } else if (super_check_offset.as_constant() == sc_offset) { ++ // Need a slow path; fast failure is impossible. ++ if (L_slow_path == &L_fallthrough) { ++ beq(super_klass, AT, *L_success); ++ } else { ++ bne(super_klass, AT, *L_slow_path); ++ b(*L_success); ++ } ++ } else { ++ // No slow path; it's a fast decision. 
++    if (L_failure == &L_fallthrough) {
++      beq(super_klass, AT, *L_success);
++    } else {
++      bne_far(super_klass, AT, *L_failure);
++      b(*L_success);
++    }
++  }
++
++  bind(L_fallthrough);
++}
++
++template
++void MacroAssembler::check_klass_subtype_slow_path<false>(Register sub_klass,
++                                                   Register super_klass,
++                                                   Register temp_reg,
++                                                   Register temp2_reg,
++                                                   Label* L_success,
++                                                   Label* L_failure,
++                                                   bool set_cond_codes);
++template
++void MacroAssembler::check_klass_subtype_slow_path<true>(Register sub_klass,
++                                                   Register super_klass,
++                                                   Register temp_reg,
++                                                   Register temp2_reg,
++                                                   Label* L_success,
++                                                   Label* L_failure,
++                                                   bool set_cond_codes);
++template <bool LONG_JMP>
++void MacroAssembler::check_klass_subtype_slow_path(Register sub_klass,
++                                                   Register super_klass,
++                                                   Register temp_reg,
++                                                   Register temp2_reg,
++                                                   Label* L_success,
++                                                   Label* L_failure,
++                                                   bool set_cond_codes) {
++  if (!LONG_JMP) {
++    if (temp2_reg == noreg)
++      temp2_reg = TSR;
++  }
++  assert_different_registers(sub_klass, super_klass, temp_reg, temp2_reg);
++#define IS_A_TEMP(reg) ((reg) == temp_reg || (reg) == temp2_reg)
++
++  Label L_fallthrough;
++  int label_nulls = 0;
++  if (L_success == nullptr) { L_success = &L_fallthrough; label_nulls++; }
++  if (L_failure == nullptr) { L_failure = &L_fallthrough; label_nulls++; }
++  assert(label_nulls <= 1, "at most one null in the batch");
++
++  // a couple of useful fields in sub_klass:
++  int ss_offset = in_bytes(Klass::secondary_supers_offset());
++  int sc_offset = in_bytes(Klass::secondary_super_cache_offset());
++  Address secondary_supers_addr(sub_klass, ss_offset);
++  Address super_cache_addr( sub_klass, sc_offset);
++
++  // Do a linear scan of the secondary super-klass chain.
++  // This code is rarely used, so simplicity is a virtue here.
++  // The repne_scan instruction uses fixed registers, which we must spill.
++  // Don't worry too much about pre-existing connections with the input regs.
++
++#ifndef PRODUCT
++  int* pst_counter = &SharedRuntime::_partial_subtype_ctr;
++  ExternalAddress pst_counter_addr((address) pst_counter);
++#endif //PRODUCT
++
++  // We will consult the secondary-super array.
++  ld_d(temp_reg, secondary_supers_addr);
++  // Load the array length.
++  ld_w(temp2_reg, Address(temp_reg, Array<Klass*>::length_offset_in_bytes()));
++  // Skip to start of data.
++  addi_d(temp_reg, temp_reg, Array<Klass*>::base_offset_in_bytes());
++
++  Label Loop, subtype;
++  bind(Loop);
++
++  if (LONG_JMP) {
++    Label not_taken;
++    bne(temp2_reg, R0, not_taken);
++    jmp_far(*L_failure);
++    bind(not_taken);
++  } else {
++    beqz(temp2_reg, *L_failure);
++  }
++
++  ld_d(AT, temp_reg, 0);
++  addi_d(temp_reg, temp_reg, 1 * wordSize);
++  beq(AT, super_klass, subtype);
++  addi_d(temp2_reg, temp2_reg, -1);
++  b(Loop);
++
++  bind(subtype);
++  st_d(super_klass, super_cache_addr);
++  if (L_success != &L_fallthrough) {
++    if (LONG_JMP)
++      jmp_far(*L_success);
++    else
++      b(*L_success);
++  }
++
++  // Success. Cache the super we found and proceed in triumph.
++#undef IS_A_TEMP
++
++  bind(L_fallthrough);
++}
++
++void MacroAssembler::clinit_barrier(Register klass, Register scratch, Label* L_fast_path, Label* L_slow_path) {
++
++  assert(L_fast_path != nullptr || L_slow_path != nullptr, "at least one is required");
++  assert_different_registers(klass, TREG, scratch);
++
++  Label L_fallthrough;
++  if (L_fast_path == nullptr) {
++    L_fast_path = &L_fallthrough;
++  } else if (L_slow_path == nullptr) {
++    L_slow_path = &L_fallthrough;
++  }
++
++  // Fast path check: class is fully initialized
++  ld_b(scratch, Address(klass, InstanceKlass::init_state_offset()));
++  addi_d(scratch, scratch, -InstanceKlass::fully_initialized);
++  beqz(scratch, *L_fast_path);
++
++  // Fast path check: current thread is initializer thread
++  ld_d(scratch, Address(klass, InstanceKlass::init_thread_offset()));
++  if (L_slow_path == &L_fallthrough) {
++    beq(TREG, scratch, *L_fast_path);
++    bind(*L_slow_path);
++  } else if (L_fast_path == &L_fallthrough) {
++    bne(TREG, scratch, *L_slow_path);
++    bind(*L_fast_path);
++  } else {
++    Unimplemented();
++  }
++}
++
++void MacroAssembler::get_vm_result(Register oop_result, Register java_thread) {
++  ld_d(oop_result, Address(java_thread, JavaThread::vm_result_offset()));
++  st_d(R0, Address(java_thread, JavaThread::vm_result_offset()));
++  verify_oop_msg(oop_result, "broken oop in call_VM_base");
++}
++
++void MacroAssembler::get_vm_result_2(Register metadata_result, Register java_thread) {
++  ld_d(metadata_result, Address(java_thread, JavaThread::vm_result_2_offset()));
++  st_d(R0, Address(java_thread, JavaThread::vm_result_2_offset()));
++}
++
++Address MacroAssembler::argument_address(RegisterOrConstant arg_slot,
++                                         int extra_slot_offset) {
++  // cf. TemplateTable::prepare_invoke(), if (load_receiver).
++  int stackElementSize = Interpreter::stackElementSize;
++  int offset = Interpreter::expr_offset_in_bytes(extra_slot_offset+0);
++#ifdef ASSERT
++  int offset1 = Interpreter::expr_offset_in_bytes(extra_slot_offset+1);
++  assert(offset1 - offset == stackElementSize, "correct arithmetic");
++#endif
++  Register scale_reg = noreg;
++  Address::ScaleFactor scale_factor = Address::no_scale;
++  if (arg_slot.is_constant()) {
++    offset += arg_slot.as_constant() * stackElementSize;
++  } else {
++    scale_reg = arg_slot.as_register();
++    scale_factor = Address::times(stackElementSize);
++  }
++  return Address(SP, scale_reg, scale_factor, offset);
++}
++
++SkipIfEqual::~SkipIfEqual() {
++  _masm->bind(_label);
++}
++
++void MacroAssembler::load_sized_value(Register dst, Address src, size_t size_in_bytes, bool is_signed, Register dst2) {
++  switch (size_in_bytes) {
++  case 8: ld_d(dst, src); break;
++  case 4: ld_w(dst, src); break;
++  case 2: is_signed ? ld_h(dst, src) : ld_hu(dst, src); break;
++  case 1: is_signed ? ld_b( dst, src) : ld_bu( dst, src); break;
++  default: ShouldNotReachHere();
++  }
++}
++
++void MacroAssembler::store_sized_value(Address dst, Register src, size_t size_in_bytes, Register src2) {
++  switch (size_in_bytes) {
++  case 8: st_d(src, dst); break;
++  case 4: st_w(src, dst); break;
++  case 2: st_h(src, dst); break;
++  case 1: st_b(src, dst); break;
++  default: ShouldNotReachHere();
++  }
++}
++
++// Look up the method for a megamorphic invokeinterface call.
++// The target method is determined by <intf_klass, itable_index>.
++// The receiver klass is in recv_klass.
++// On success, the result will be in method_result, and execution falls through.
++// On failure, execution transfers to the given label.
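Before the assembly that follows, a hedged sketch of the scan it performs: walk the itableOffsetEntry records that sit at the end of the vtable until the requested interface, or a null terminator, is found. The type and names here are illustrative stand-ins, not HotSpot's.

#include <cstdint>

// Hypothetical flattened itable: (interface, offset) pairs, terminated by a null interface.
struct ItableOffsetEntryModel {
  const void* interface;   // the interface klass this entry describes
  uint32_t    offset;      // where that interface's method block starts
};

// Returns the method-block offset for 'intf', or -1 when the receiver class
// does not implement it (the L_no_such_interface case below).
int64_t lookup_itable_offset(const ItableOffsetEntryModel* scan, const void* intf) {
  for (; scan->interface != nullptr; ++scan) {
    if (scan->interface == intf) {
      return scan->offset;
    }
  }
  return -1;
}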
++void MacroAssembler::lookup_interface_method(Register recv_klass, ++ Register intf_klass, ++ RegisterOrConstant itable_index, ++ Register method_result, ++ Register scan_temp, ++ Label& L_no_such_interface, ++ bool return_method) { ++ assert_different_registers(recv_klass, intf_klass, scan_temp, AT); ++ assert_different_registers(method_result, intf_klass, scan_temp, AT); ++ assert(recv_klass != method_result || !return_method, ++ "recv_klass can be destroyed when method isn't needed"); ++ ++ assert(itable_index.is_constant() || itable_index.as_register() == method_result, ++ "caller must use same register for non-constant itable index as for method"); ++ ++ // Compute start of first itableOffsetEntry (which is at the end of the vtable) ++ int vtable_base = in_bytes(Klass::vtable_start_offset()); ++ int itentry_off = in_bytes(itableMethodEntry::method_offset()); ++ int scan_step = itableOffsetEntry::size() * wordSize; ++ int vte_size = vtableEntry::size() * wordSize; ++ Address::ScaleFactor times_vte_scale = Address::times_ptr; ++ assert(vte_size == wordSize, "else adjust times_vte_scale"); ++ ++ ld_w(scan_temp, Address(recv_klass, Klass::vtable_length_offset())); ++ ++ // %%% Could store the aligned, prescaled offset in the klassoop. ++ alsl_d(scan_temp, scan_temp, recv_klass, times_vte_scale - 1); ++ addi_d(scan_temp, scan_temp, vtable_base); ++ ++ if (return_method) { ++ // Adjust recv_klass by scaled itable_index, so we can free itable_index. ++ if (itable_index.is_constant()) { ++ li(AT, (itable_index.as_constant() * itableMethodEntry::size() * wordSize) + itentry_off); ++ add_d(recv_klass, recv_klass, AT); ++ } else { ++ assert(itableMethodEntry::size() * wordSize == wordSize, "adjust the scaling in the code below"); ++ alsl_d(AT, itable_index.as_register(), recv_klass, (int)Address::times_ptr - 1); ++ addi_d(recv_klass, AT, itentry_off); ++ } ++ } ++ ++ Label search, found_method; ++ ++ ld_d(method_result, Address(scan_temp, itableOffsetEntry::interface_offset())); ++ beq(intf_klass, method_result, found_method); ++ ++ bind(search); ++ // Check that the previous entry is non-null. A null entry means that ++ // the receiver class doesn't implement the interface, and wasn't the ++ // same as when the caller was compiled. ++ beqz(method_result, L_no_such_interface); ++ addi_d(scan_temp, scan_temp, scan_step); ++ ld_d(method_result, Address(scan_temp, itableOffsetEntry::interface_offset())); ++ bne(intf_klass, method_result, search); ++ ++ bind(found_method); ++ if (return_method) { ++ // Got a hit. 
++ ld_wu(scan_temp, Address(scan_temp, itableOffsetEntry::offset_offset())); ++ ldx_d(method_result, recv_klass, scan_temp); ++ } ++} ++ ++// virtual method calling ++void MacroAssembler::lookup_virtual_method(Register recv_klass, ++ RegisterOrConstant vtable_index, ++ Register method_result) { ++ assert(vtableEntry::size() * wordSize == wordSize, "else adjust the scaling in the code below"); ++ ++ if (vtable_index.is_constant()) { ++ li(AT, vtable_index.as_constant()); ++ alsl_d(AT, AT, recv_klass, Address::times_ptr - 1); ++ } else { ++ alsl_d(AT, vtable_index.as_register(), recv_klass, Address::times_ptr - 1); ++ } ++ ++ ld_d(method_result, AT, in_bytes(Klass::vtable_start_offset() + vtableEntry::method_offset())); ++} ++ ++void MacroAssembler::load_byte_map_base(Register reg) { ++ CardTable::CardValue* byte_map_base = ++ ((CardTableBarrierSet*)(BarrierSet::barrier_set()))->card_table()->byte_map_base(); ++ ++ // Strictly speaking the byte_map_base isn't an address at all, and it might ++ // even be negative. It is thus materialised as a constant. ++ li(reg, (uint64_t)byte_map_base); ++} ++ ++void MacroAssembler::resolve_jobject(Register value, Register tmp1, Register tmp2) { ++ assert_different_registers(value, tmp1, tmp2); ++ Label done, tagged, weak_tagged; ++ ++ beqz(value, done); // Use null as-is. ++ // Test for tag. ++ andi(AT, value, JNIHandles::tag_mask); ++ bnez(AT, tagged); ++ ++ // Resolve local handle ++ access_load_at(T_OBJECT, IN_NATIVE | AS_RAW, value, Address(value, 0), tmp1, tmp2); ++ verify_oop(value); ++ b(done); ++ ++ bind(tagged); ++ // Test for jweak tag. ++ andi(AT, value, JNIHandles::TypeTag::weak_global); ++ bnez(AT, weak_tagged); ++ ++ // Resolve global handle ++ access_load_at(T_OBJECT, IN_NATIVE, value, ++ Address(value, -JNIHandles::TypeTag::global), tmp1, tmp2); ++ verify_oop(value); ++ b(done); ++ ++ bind(weak_tagged); ++ // Resolve jweak. ++ access_load_at(T_OBJECT, IN_NATIVE | ON_PHANTOM_OOP_REF, ++ value, Address(value, -JNIHandles::TypeTag::weak_global), tmp1, tmp2); ++ verify_oop(value); ++ bind(done); ++} ++ ++void MacroAssembler::resolve_global_jobject(Register value, Register tmp1, Register tmp2) { ++ assert_different_registers(value, tmp1, tmp2); ++ Label done; ++ ++ beqz(value, done); // Use null as-is. ++ ++#ifdef ASSERT ++ { ++ Label valid_global_tag; ++ andi(AT, value, JNIHandles::TypeTag::global); // Test for global tag. 
++ bnez(AT, valid_global_tag); ++ stop("non global jobject using resolve_global_jobject"); ++ bind(valid_global_tag); ++ } ++#endif ++ ++ // Resolve global handle ++ access_load_at(T_OBJECT, IN_NATIVE, value, ++ Address(value, -JNIHandles::TypeTag::global), tmp1, tmp2); ++ verify_oop(value); ++ ++ bind(done); ++} ++ ++void MacroAssembler::lea(Register rd, Address src) { ++ Register dst = rd; ++ Register base = src.base(); ++ Register index = src.index(); ++ ++ int scale = src.scale(); ++ int disp = src.disp(); ++ ++ if (index == noreg) { ++ if (is_simm(disp, 12)) { ++ addi_d(dst, base, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ add_d(dst, base, AT); ++ } ++ } else { ++ if (scale == 0) { ++ if (disp == 0) { ++ add_d(dst, base, index); ++ } else if (is_simm(disp, 12)) { ++ add_d(AT, base, index); ++ addi_d(dst, AT, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ add_d(AT, base, AT); ++ add_d(dst, AT, index); ++ } ++ } else { ++ if (disp == 0) { ++ alsl_d(dst, index, base, scale - 1); ++ } else if (is_simm(disp, 12)) { ++ alsl_d(AT, index, base, scale - 1); ++ addi_d(dst, AT, disp); ++ } else { ++ lu12i_w(AT, split_low20(disp >> 12)); ++ if (split_low12(disp)) ++ ori(AT, AT, split_low12(disp)); ++ add_d(AT, AT, base); ++ alsl_d(dst, index, AT, scale - 1); ++ } ++ } ++ } ++} ++ ++void MacroAssembler::lea(Register dst, AddressLiteral adr) { ++ code_section()->relocate(pc(), adr.rspec()); ++ pcaddi(dst, (adr.target() - pc()) >> 2); ++} ++ ++void MacroAssembler::lea_long(Register dst, AddressLiteral adr) { ++ code_section()->relocate(pc(), adr.rspec()); ++ jint si12, si20; ++ split_simm32((adr.target() - pc()), si12, si20); ++ pcaddu12i(dst, si20); ++ addi_d(dst, dst, si12); ++} ++ ++int MacroAssembler::patched_branch(int dest_pos, int inst, int inst_pos) { ++ int v = (dest_pos - inst_pos) >> 2; ++ switch(high(inst, 6)) { ++ case beq_op: ++ case bne_op: ++ case blt_op: ++ case bge_op: ++ case bltu_op: ++ case bgeu_op: ++#ifndef PRODUCT ++ if(!is_simm16(v)) ++ { ++ tty->print_cr("must be simm16"); ++ tty->print_cr("Inst: %x", inst); ++ tty->print_cr("Op: %x", high(inst, 6)); ++ } ++#endif ++ assert(is_simm16(v), "must be simm16"); ++ ++ inst &= 0xfc0003ff; ++ inst |= ((v & 0xffff) << 10); ++ break; ++ case beqz_op: ++ case bnez_op: ++ case bccondz_op: ++ assert(is_simm(v, 21), "must be simm21"); ++#ifndef PRODUCT ++ if(!is_simm(v, 21)) ++ { ++ tty->print_cr("must be simm21"); ++ tty->print_cr("Inst: %x", inst); ++ } ++#endif ++ ++ inst &= 0xfc0003e0; ++ inst |= ( ((v & 0xffff) << 10) | ((v >> 16) & 0x1f) ); ++ break; ++ case b_op: ++ case bl_op: ++ assert(is_simm(v, 26), "must be simm26"); ++#ifndef PRODUCT ++ if(!is_simm(v, 26)) ++ { ++ tty->print_cr("must be simm26"); ++ tty->print_cr("Inst: %x", inst); ++ } ++#endif ++ ++ inst &= 0xfc000000; ++ inst |= ( ((v & 0xffff) << 10) | ((v >> 16) & 0x3ff) ); ++ break; ++ default: ++ ShouldNotReachHere(); ++ break; ++ } ++ return inst; ++} ++ ++void MacroAssembler::cmp_cmov_zero(Register op1, ++ Register op2, ++ Register dst, ++ Register src, ++ CMCompare cmp, ++ bool is_signed) { ++ switch (cmp) { ++ case EQ: ++ sub_d(AT, op1, op2); ++ maskeqz(dst, src, AT); ++ break; ++ ++ case NE: ++ sub_d(AT, op1, op2); ++ masknez(dst, src, AT); ++ break; ++ ++ case GT: ++ if (is_signed) { ++ slt(AT, op2, op1); ++ } else { ++ sltu(AT, op2, op1); ++ } ++ masknez(dst, src, AT); ++ break; ++ ++ case GE: ++ if 
(is_signed) { ++ slt(AT, op1, op2); ++ } else { ++ sltu(AT, op1, op2); ++ } ++ maskeqz(dst, src, AT); ++ break; ++ ++ case LT: ++ if (is_signed) { ++ slt(AT, op1, op2); ++ } else { ++ sltu(AT, op1, op2); ++ } ++ masknez(dst, src, AT); ++ break; ++ ++ case LE: ++ if (is_signed) { ++ slt(AT, op2, op1); ++ } else { ++ sltu(AT, op2, op1); ++ } ++ maskeqz(dst, src, AT); ++ break; ++ ++ default: ++ Unimplemented(); ++ } ++} ++ ++void MacroAssembler::cmp_cmov(Register op1, ++ Register op2, ++ Register dst, ++ Register src1, ++ Register src2, ++ CMCompare cmp, ++ bool is_signed) { ++ switch (cmp) { ++ case EQ: ++ sub_d(AT, op1, op2); ++ if (dst == src2) { ++ masknez(dst, src2, AT); ++ maskeqz(AT, src1, AT); ++ } else { ++ maskeqz(dst, src1, AT); ++ masknez(AT, src2, AT); ++ } ++ break; ++ ++ case NE: ++ sub_d(AT, op1, op2); ++ if (dst == src2) { ++ maskeqz(dst, src2, AT); ++ masknez(AT, src1, AT); ++ } else { ++ masknez(dst, src1, AT); ++ maskeqz(AT, src2, AT); ++ } ++ break; ++ ++ case GT: ++ if (is_signed) { ++ slt(AT, op2, op1); ++ } else { ++ sltu(AT, op2, op1); ++ } ++ if(dst == src2) { ++ maskeqz(dst, src2, AT); ++ masknez(AT, src1, AT); ++ } else { ++ masknez(dst, src1, AT); ++ maskeqz(AT, src2, AT); ++ } ++ break; ++ case GE: ++ if (is_signed) { ++ slt(AT, op1, op2); ++ } else { ++ sltu(AT, op1, op2); ++ } ++ if(dst == src2) { ++ masknez(dst, src2, AT); ++ maskeqz(AT, src1, AT); ++ } else { ++ maskeqz(dst, src1, AT); ++ masknez(AT, src2, AT); ++ } ++ break; ++ ++ case LT: ++ if (is_signed) { ++ slt(AT, op1, op2); ++ } else { ++ sltu(AT, op1, op2); ++ } ++ if(dst == src2) { ++ maskeqz(dst, src2, AT); ++ masknez(AT, src1, AT); ++ } else { ++ masknez(dst, src1, AT); ++ maskeqz(AT, src2, AT); ++ } ++ break; ++ case LE: ++ if (is_signed) { ++ slt(AT, op2, op1); ++ } else { ++ sltu(AT, op2, op1); ++ } ++ if(dst == src2) { ++ masknez(dst, src2, AT); ++ maskeqz(AT, src1, AT); ++ } else { ++ maskeqz(dst, src1, AT); ++ masknez(AT, src2, AT); ++ } ++ break; ++ default: ++ Unimplemented(); ++ } ++ OR(dst, dst, AT); ++} ++ ++void MacroAssembler::cmp_cmov(Register op1, ++ Register op2, ++ Register dst, ++ Register src, ++ CMCompare cmp, ++ bool is_signed) { ++ switch (cmp) { ++ case EQ: ++ sub_d(AT, op1, op2); ++ maskeqz(dst, dst, AT); ++ masknez(AT, src, AT); ++ break; ++ ++ case NE: ++ sub_d(AT, op1, op2); ++ masknez(dst, dst, AT); ++ maskeqz(AT, src, AT); ++ break; ++ ++ case GT: ++ if (is_signed) { ++ slt(AT, op2, op1); ++ } else { ++ sltu(AT, op2, op1); ++ } ++ masknez(dst, dst, AT); ++ maskeqz(AT, src, AT); ++ break; ++ ++ case GE: ++ if (is_signed) { ++ slt(AT, op1, op2); ++ } else { ++ sltu(AT, op1, op2); ++ } ++ maskeqz(dst, dst, AT); ++ masknez(AT, src, AT); ++ break; ++ ++ case LT: ++ if (is_signed) { ++ slt(AT, op1, op2); ++ } else { ++ sltu(AT, op1, op2); ++ } ++ masknez(dst, dst, AT); ++ maskeqz(AT, src, AT); ++ break; ++ ++ case LE: ++ if (is_signed) { ++ slt(AT, op2, op1); ++ } else { ++ sltu(AT, op2, op1); ++ } ++ maskeqz(dst, dst, AT); ++ masknez(AT, src, AT); ++ break; ++ ++ default: ++ Unimplemented(); ++ } ++ OR(dst, dst, AT); ++} ++ ++ ++void MacroAssembler::cmp_cmov(FloatRegister op1, ++ FloatRegister op2, ++ Register dst, ++ Register src, ++ FloatRegister tmp1, ++ FloatRegister tmp2, ++ CMCompare cmp, ++ bool is_float) { ++ movgr2fr_d(tmp1, dst); ++ movgr2fr_d(tmp2, src); ++ ++ switch(cmp) { ++ case EQ: ++ if (is_float) { ++ fcmp_ceq_s(FCC0, op1, op2); ++ } else { ++ fcmp_ceq_d(FCC0, op1, op2); ++ } ++ fsel(tmp1, tmp1, tmp2, FCC0); ++ break; ++ ++ case NE: ++ if (is_float) { ++ 
fcmp_ceq_s(FCC0, op1, op2); ++ } else { ++ fcmp_ceq_d(FCC0, op1, op2); ++ } ++ fsel(tmp1, tmp2, tmp1, FCC0); ++ break; ++ ++ case GT: ++ if (is_float) { ++ fcmp_cule_s(FCC0, op1, op2); ++ } else { ++ fcmp_cule_d(FCC0, op1, op2); ++ } ++ fsel(tmp1, tmp2, tmp1, FCC0); ++ break; ++ ++ case GE: ++ if (is_float) { ++ fcmp_cult_s(FCC0, op1, op2); ++ } else { ++ fcmp_cult_d(FCC0, op1, op2); ++ } ++ fsel(tmp1, tmp2, tmp1, FCC0); ++ break; ++ ++ case LT: ++ if (is_float) { ++ fcmp_cult_s(FCC0, op1, op2); ++ } else { ++ fcmp_cult_d(FCC0, op1, op2); ++ } ++ fsel(tmp1, tmp1, tmp2, FCC0); ++ break; ++ ++ case LE: ++ if (is_float) { ++ fcmp_cule_s(FCC0, op1, op2); ++ } else { ++ fcmp_cule_d(FCC0, op1, op2); ++ } ++ fsel(tmp1, tmp1, tmp2, FCC0); ++ break; ++ ++ default: ++ Unimplemented(); ++ } ++ ++ movfr2gr_d(dst, tmp1); ++} ++ ++void MacroAssembler::cmp_cmov(FloatRegister op1, ++ FloatRegister op2, ++ FloatRegister dst, ++ FloatRegister src, ++ CMCompare cmp, ++ bool is_float) { ++ switch(cmp) { ++ case EQ: ++ if (!is_float) { ++ fcmp_ceq_d(FCC0, op1, op2); ++ } else { ++ fcmp_ceq_s(FCC0, op1, op2); ++ } ++ fsel(dst, dst, src, FCC0); ++ break; ++ ++ case NE: ++ if (!is_float) { ++ fcmp_ceq_d(FCC0, op1, op2); ++ } else { ++ fcmp_ceq_s(FCC0, op1, op2); ++ } ++ fsel(dst, src, dst, FCC0); ++ break; ++ ++ case GT: ++ if (!is_float) { ++ fcmp_cule_d(FCC0, op1, op2); ++ } else { ++ fcmp_cule_s(FCC0, op1, op2); ++ } ++ fsel(dst, src, dst, FCC0); ++ break; ++ ++ case GE: ++ if (!is_float) { ++ fcmp_cult_d(FCC0, op1, op2); ++ } else { ++ fcmp_cult_s(FCC0, op1, op2); ++ } ++ fsel(dst, src, dst, FCC0); ++ break; ++ ++ case LT: ++ if (!is_float) { ++ fcmp_cult_d(FCC0, op1, op2); ++ } else { ++ fcmp_cult_s(FCC0, op1, op2); ++ } ++ fsel(dst, dst, src, FCC0); ++ break; ++ ++ case LE: ++ if (!is_float) { ++ fcmp_cule_d(FCC0, op1, op2); ++ } else { ++ fcmp_cule_s(FCC0, op1, op2); ++ } ++ fsel(dst, dst, src, FCC0); ++ break; ++ ++ default: ++ Unimplemented(); ++ } ++} ++ ++void MacroAssembler::cmp_cmov(Register op1, ++ Register op2, ++ FloatRegister dst, ++ FloatRegister src, ++ CMCompare cmp, ++ bool is_signed) { ++ switch (cmp) { ++ case EQ: ++ case NE: ++ sub_d(AT, op1, op2); ++ sltu(AT, R0, AT); ++ break; ++ ++ case GT: ++ case LE: ++ if (is_signed) { ++ slt(AT, op2, op1); ++ } else { ++ sltu(AT, op2, op1); ++ } ++ break; ++ ++ case GE: ++ case LT: ++ if (is_signed) { ++ slt(AT, op1, op2); ++ } else { ++ sltu(AT, op1, op2); ++ } ++ break; ++ ++ default: ++ Unimplemented(); ++ } ++ ++ if (UseGR2CF) { ++ movgr2cf(FCC0, AT); ++ } else { ++ movgr2fr_w(fscratch, AT); ++ movfr2cf(FCC0, fscratch); ++ } ++ ++ switch (cmp) { ++ case EQ: ++ case GE: ++ case LE: ++ fsel(dst, src, dst, FCC0); ++ break; ++ ++ case NE: ++ case GT: ++ case LT: ++ fsel(dst, dst, src, FCC0); ++ break; ++ ++ default: ++ Unimplemented(); ++ } ++} ++ ++void MacroAssembler::membar(Membar_mask_bits hint){ ++ address prev = pc() - NativeInstruction::sync_instruction_size; ++ address last = code()->last_insn(); ++ if (last != nullptr && ((NativeInstruction*)last)->is_sync() && prev == last) { ++ code()->set_last_insn(nullptr); ++ NativeMembar *membar = (NativeMembar*)prev; ++ // merged membar ++ // e.g. 
LoadLoad and LoadLoad|LoadStore to LoadLoad|LoadStore ++ membar->set_hint(membar->get_hint() & (~hint & 0xF)); ++ block_comment("merged membar"); ++ } else { ++ code()->set_last_insn(pc()); ++ Assembler::membar(hint); ++ } ++} ++ ++/** ++ * Emits code to update CRC-32 with a byte value according to constants in table ++ * ++ * @param [in,out]crc Register containing the crc. ++ * @param [in]val Register containing the byte to fold into the CRC. ++ * @param [in]table Register containing the table of crc constants. ++ * ++ * uint32_t crc; ++ * val = crc_table[(val ^ crc) & 0xFF]; ++ * crc = val ^ (crc >> 8); ++**/ ++void MacroAssembler::update_byte_crc32(Register crc, Register val, Register table) { ++ xorr(val, val, crc); ++ andi(val, val, 0xff); ++ ld_w(val, Address(table, val, Address::times_4, 0)); ++ srli_w(crc, crc, 8); ++ xorr(crc, val, crc); ++} ++ ++/** ++ * @param crc register containing existing CRC (32-bit) ++ * @param buf register pointing to input byte buffer (byte*) ++ * @param len register containing number of bytes ++ * @param tmp scratch register ++**/ ++void MacroAssembler::kernel_crc32(Register crc, Register buf, Register len, Register tmp) { ++ Label CRC_by64_loop, CRC_by4_loop, CRC_by1_loop, CRC_less64, CRC_by64_pre, CRC_by32_loop, CRC_less32, L_exit; ++ assert_different_registers(crc, buf, len, tmp); ++ ++ nor(crc, crc, R0); ++ ++ addi_d(len, len, -64); ++ bge(len, R0, CRC_by64_loop); ++ addi_d(len, len, 64-4); ++ bge(len, R0, CRC_by4_loop); ++ addi_d(len, len, 4); ++ blt(R0, len, CRC_by1_loop); ++ b(L_exit); ++ ++ bind(CRC_by64_loop); ++ ld_d(tmp, buf, 0); ++ crc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 8); ++ crc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 16); ++ crc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 24); ++ crc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 32); ++ crc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 40); ++ crc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 48); ++ crc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 56); ++ crc_w_d_w(crc, tmp, crc); ++ addi_d(buf, buf, 64); ++ addi_d(len, len, -64); ++ bge(len, R0, CRC_by64_loop); ++ addi_d(len, len, 64-4); ++ bge(len, R0, CRC_by4_loop); ++ addi_d(len, len, 4); ++ blt(R0, len, CRC_by1_loop); ++ b(L_exit); ++ ++ bind(CRC_by4_loop); ++ ld_w(tmp, buf, 0); ++ crc_w_w_w(crc, tmp, crc); ++ addi_d(buf, buf, 4); ++ addi_d(len, len, -4); ++ bge(len, R0, CRC_by4_loop); ++ addi_d(len, len, 4); ++ bge(R0, len, L_exit); ++ ++ bind(CRC_by1_loop); ++ ld_b(tmp, buf, 0); ++ crc_w_b_w(crc, tmp, crc); ++ addi_d(buf, buf, 1); ++ addi_d(len, len, -1); ++ blt(R0, len, CRC_by1_loop); ++ ++ bind(L_exit); ++ nor(crc, crc, R0); ++} ++ ++/** ++ * @param crc register containing existing CRC (32-bit) ++ * @param buf register pointing to input byte buffer (byte*) ++ * @param len register containing number of bytes ++ * @param tmp scratch register ++**/ ++void MacroAssembler::kernel_crc32c(Register crc, Register buf, Register len, Register tmp) { ++ Label CRC_by64_loop, CRC_by4_loop, CRC_by1_loop, CRC_less64, CRC_by64_pre, CRC_by32_loop, CRC_less32, L_exit; ++ assert_different_registers(crc, buf, len, tmp); ++ ++ addi_d(len, len, -64); ++ bge(len, R0, CRC_by64_loop); ++ addi_d(len, len, 64-4); ++ bge(len, R0, CRC_by4_loop); ++ addi_d(len, len, 4); ++ blt(R0, len, CRC_by1_loop); ++ b(L_exit); ++ ++ bind(CRC_by64_loop); ++ ld_d(tmp, buf, 0); ++ crcc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 8); ++ crcc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 16); ++ crcc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 24); ++ crcc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 32); ++ 
crcc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 40); ++ crcc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 48); ++ crcc_w_d_w(crc, tmp, crc); ++ ld_d(tmp, buf, 56); ++ crcc_w_d_w(crc, tmp, crc); ++ addi_d(buf, buf, 64); ++ addi_d(len, len, -64); ++ bge(len, R0, CRC_by64_loop); ++ addi_d(len, len, 64-4); ++ bge(len, R0, CRC_by4_loop); ++ addi_d(len, len, 4); ++ blt(R0, len, CRC_by1_loop); ++ b(L_exit); ++ ++ bind(CRC_by4_loop); ++ ld_w(tmp, buf, 0); ++ crcc_w_w_w(crc, tmp, crc); ++ addi_d(buf, buf, 4); ++ addi_d(len, len, -4); ++ bge(len, R0, CRC_by4_loop); ++ addi_d(len, len, 4); ++ bge(R0, len, L_exit); ++ ++ bind(CRC_by1_loop); ++ ld_b(tmp, buf, 0); ++ crcc_w_b_w(crc, tmp, crc); ++ addi_d(buf, buf, 1); ++ addi_d(len, len, -1); ++ blt(R0, len, CRC_by1_loop); ++ ++ bind(L_exit); ++} ++ ++// Search for Non-ASCII character (Negative byte value) in a byte array, ++// return the index of the first such character, otherwise the length ++// of the array segment searched. ++// ..\jdk\src\java.base\share\classes\java\lang\StringCoding.java ++// @IntrinsicCandidate ++// public static int countPositives(byte[] ba, int off, int len) { ++// for (int i = off; i < off + len; i++) { ++// if (ba[i] < 0) { ++// return i - off; ++// } ++// } ++// return len; ++// } ++void MacroAssembler::count_positives(Register src, Register len, Register result, ++ Register tmp1, Register tmp2) { ++ Label Loop, Negative, Once, Done; ++ ++ move(result, R0); ++ beqz(len, Done); ++ ++ addi_w(tmp2, len, -8); ++ blt(tmp2, R0, Once); ++ ++ li(tmp1, 0x8080808080808080); ++ ++ bind(Loop); ++ ldx_d(AT, src, result); ++ andr(AT, AT, tmp1); ++ bnez(AT, Negative); ++ addi_w(result, result, 8); ++ bge(tmp2, result, Loop); ++ ++ beq(result, len, Done); ++ ldx_d(AT, src, tmp2); ++ andr(AT, AT, tmp1); ++ move(result, tmp2); ++ ++ bind(Negative); ++ ctz_d(AT, AT); ++ srai_w(AT, AT, 3); ++ add_w(result, result, AT); ++ b(Done); ++ ++ bind(Once); ++ ldx_b(tmp1, src, result); ++ blt(tmp1, R0, Done); ++ addi_w(result, result, 1); ++ blt(result, len, Once); ++ ++ bind(Done); ++} ++ ++// Compress char[] to byte[]. len must be positive int. ++// jtreg: TestStringIntrinsicRangeChecks.java ++void MacroAssembler::char_array_compress(Register src, Register dst, ++ Register len, Register result, ++ Register tmp1, Register tmp2, Register tmp3, ++ FloatRegister vtemp1, FloatRegister vtemp2, ++ FloatRegister vtemp3, FloatRegister vtemp4) { ++ encode_iso_array(src, dst, len, result, tmp1, tmp2, tmp3, false, vtemp1, vtemp2, vtemp3, vtemp4); ++ // Adjust result: result == len ? len : 0 ++ sub_w(tmp1, result, len); ++ masknez(result, result, tmp1); ++} ++ ++// Inflate byte[] to char[]. len must be positive int. ++// jtreg:test/jdk/sun/nio/cs/FindDecoderBugs.java ++void MacroAssembler::byte_array_inflate(Register src, Register dst, Register len, ++ Register tmp1, Register tmp2, ++ FloatRegister vtemp1, FloatRegister vtemp2) { ++ Label L_loop, L_small, L_small_loop, L_last, L_done; ++ ++ bge(R0, len, L_done); ++ ++ addi_w(tmp2, len, -16); ++ blt(tmp2, R0, L_small); ++ ++ move(tmp1, R0); ++ alsl_d(AT, len, dst, 0); // AT = dst + len * 2 ++ vxor_v(fscratch, fscratch, fscratch); ++ ++ // load and inflate 16 chars per loop ++ bind(L_loop); ++ vldx(vtemp1, src, tmp1); ++ addi_w(tmp1, tmp1, 16); ++ ++ // 0x0000000000000000a1b2c3d4e5f6g7h8 -> 0x00a100b200c300d4..... ++ vilvl_b(vtemp2, fscratch, vtemp1); ++ vst(vtemp2, dst, 0); ++ ++ // 0xa1b2c3d4e5f6g7h80000000000000000 -> 0x00a100b200c300d4..... 
++ vilvh_b(vtemp1, fscratch, vtemp1); ++ vst(vtemp1, dst, 16); ++ ++ addi_d(dst, dst, 32); ++ bge(tmp2, tmp1, L_loop); ++ ++ // inflate the last 16 chars ++ beq(len, tmp1, L_done); ++ addi_d(AT, AT, -32); ++ vldx(vtemp1, src, tmp2); ++ vilvl_b(vtemp2, fscratch, vtemp1); ++ vst(vtemp2, AT, 0); ++ vilvh_b(vtemp1, fscratch, vtemp1); ++ vst(vtemp1, AT, 16); ++ b(L_done); ++ ++ bind(L_small); ++ li(AT, 4); ++ blt(len, AT, L_last); ++ ++ bind(L_small_loop); ++ ld_wu(tmp1, src, 0); ++ addi_d(src, src, 4); ++ addi_w(len, len, -4); ++ ++ // 0x00000000a1b2c3d4 -> 0x00a100b200c300d4 ++ bstrpick_d(tmp2, tmp1, 7, 0); ++ srli_d(tmp1, tmp1, 8); ++ bstrins_d(tmp2, tmp1, 23, 16); ++ srli_d(tmp1, tmp1, 8); ++ bstrins_d(tmp2, tmp1, 39, 32); ++ srli_d(tmp1, tmp1, 8); ++ bstrins_d(tmp2, tmp1, 55, 48); ++ ++ st_d(tmp2, dst, 0); ++ addi_d(dst, dst, 8); ++ bge(len, AT, L_small_loop); ++ ++ bind(L_last); ++ beqz(len, L_done); ++ ld_bu(AT, src, 0); ++ st_h(AT, dst, 0); ++ addi_w(len, len, -1); ++ ++ beqz(len, L_done); ++ ld_bu(AT, src, 1); ++ st_h(AT, dst, 2); ++ addi_w(len, len, -1); ++ ++ beqz(len, L_done); ++ ld_bu(AT, src, 2); ++ st_h(AT, dst, 4); ++ ++ bind(L_done); ++} ++ ++// Intrinsic for ++// ++// - java.lang.StringCoding::implEncodeISOArray ++// - java.lang.StringCoding::implEncodeAsciiArray ++// ++// This version always returns the number of characters copied. ++void MacroAssembler::encode_iso_array(Register src, Register dst, ++ Register len, Register result, ++ Register tmp1, Register tmp2, ++ Register tmp3, bool ascii, ++ FloatRegister vtemp1, FloatRegister vtemp2, ++ FloatRegister vtemp3, FloatRegister vtemp4) { ++ const FloatRegister shuf_index = vtemp3; ++ const FloatRegister latin_mask = vtemp4; ++ ++ Label Deal8, Loop8, Loop32, Done, Once; ++ ++ move(result, R0); // init in case of bad value ++ bge(R0, len, Done); ++ ++ li(tmp3, ascii ? 0xff80ff80ff80ff80 : 0xff00ff00ff00ff00); ++ srai_w(AT, len, 4); ++ beqz(AT, Deal8); ++ ++ li(tmp1, StubRoutines::la::string_compress_index()); ++ vld(shuf_index, tmp1, 0); ++ vreplgr2vr_d(latin_mask, tmp3); ++ ++ bind(Loop32); ++ beqz(AT, Deal8); ++ ++ vld(vtemp1, src, 0); ++ vld(vtemp2, src, 16); ++ addi_w(AT, AT, -1); ++ ++ vor_v(fscratch, vtemp1, vtemp2); ++ vand_v(fscratch, fscratch, latin_mask); ++ vseteqz_v(FCC0, fscratch); // not latin-1, apply slow path ++ bceqz(FCC0, Once); ++ ++ vshuf_b(fscratch, vtemp2, vtemp1, shuf_index); ++ ++ vstx(fscratch, dst, result); ++ addi_d(src, src, 32); ++ addi_w(result, result, 16); ++ b(Loop32); ++ ++ bind(Deal8); ++ bstrpick_w(AT, len, 3, 2); ++ ++ bind(Loop8); ++ beqz(AT, Once); ++ ld_d(tmp1, src, 0); ++ andr(tmp2, tmp3, tmp1); // not latin-1, apply slow path ++ bnez(tmp2, Once); ++ ++ // 0x00a100b200c300d4 -> 0x00000000a1b2c3d4 ++ srli_d(tmp2, tmp1, 8); ++ orr(tmp2, tmp2, tmp1); // 0x00a1a1b2b2c3c3d4 ++ bstrpick_d(tmp1, tmp2, 47, 32); // 0x0000a1b2 ++ slli_d(tmp1, tmp1, 16); // 0xa1b20000 ++ bstrins_d(tmp1, tmp2, 15, 0); // 0xa1b2c3d4 ++ ++ stx_w(tmp1, dst, result); ++ addi_w(AT, AT, -1); ++ addi_d(src, src, 8); ++ addi_w(result, result, 4); ++ b(Loop8); ++ ++ bind(Once); ++ beq(len, result, Done); ++ ld_hu(tmp1, src, 0); ++ andr(tmp2, tmp3, tmp1); // not latin-1, stop here ++ bnez(tmp2, Done); ++ stx_b(tmp1, dst, result); ++ addi_d(src, src, 2); ++ addi_w(result, result, 1); ++ b(Once); ++ ++ bind(Done); ++} ++ ++// Math.round employs the ties-to-positive round mode, ++// which is not a typically conversion method defined ++// in the IEEE-754-2008. 
For single-precision floating-point values,
++// the following algorithm can be used to effectively
++// implement rounding via standard operations.
++void MacroAssembler::java_round_float(Register dst,
++                                      FloatRegister src,
++                                      FloatRegister vtemp1) {
++  block_comment("java_round_float: { ");
++
++  Label L_abnormal, L_done;
++
++  li(AT, StubRoutines::la::round_float_imm());
++
++  // if src is -0.5f, return 0 as result
++  fld_s(vtemp1, AT, 0);
++  fcmp_ceq_s(FCC0, vtemp1, src);
++  bceqz(FCC0, L_abnormal);
++  move(dst, R0);
++  b(L_done);
++
++  // else, floor src with the magic number
++  bind(L_abnormal);
++  fld_s(vtemp1, AT, 4);
++  fadd_s(fscratch, vtemp1, src);
++  ftintrm_w_s(fscratch, fscratch);
++  movfr2gr_s(dst, fscratch);
++
++  bind(L_done);
++
++  block_comment("} java_round_float");
++}
++
++void MacroAssembler::java_round_float_lsx(FloatRegister dst,
++                                          FloatRegister src,
++                                          FloatRegister vtemp1,
++                                          FloatRegister vtemp2) {
++  block_comment("java_round_float_lsx: { ");
++  li(AT, StubRoutines::la::round_float_imm());
++  vldrepl_w(vtemp1, AT, 0); // repl -0.5f
++  vldrepl_w(vtemp2, AT, 1); // repl 0.49999997f
++
++  vfcmp_cne_s(fscratch, src, vtemp1); // generate the mask
++  vand_v(fscratch, fscratch, src); // clear the special
++  vfadd_s(dst, fscratch, vtemp2); // plus the magic
++  vftintrm_w_s(dst, dst); // floor the result
++  block_comment("} java_round_float_lsx");
++}
++
++void MacroAssembler::java_round_float_lasx(FloatRegister dst,
++                                           FloatRegister src,
++                                           FloatRegister vtemp1,
++                                           FloatRegister vtemp2) {
++  block_comment("java_round_float_lasx: { ");
++  li(AT, StubRoutines::la::round_float_imm());
++  xvldrepl_w(vtemp1, AT, 0); // repl -0.5f
++  xvldrepl_w(vtemp2, AT, 1); // repl 0.49999997f
++
++  xvfcmp_cne_s(fscratch, src, vtemp1); // generate the mask
++  xvand_v(fscratch, fscratch, src); // clear the special
++  xvfadd_s(dst, fscratch, vtemp2); // plus the magic
++  xvftintrm_w_s(dst, dst); // floor the result
++  block_comment("} java_round_float_lasx");
++}
++
++// Math.round employs the ties-to-positive rounding mode,
++// which is not one of the rounding modes defined
++// in IEEE 754-2008. For double-precision floating-point values,
++// the following algorithm can be used to effectively
++// implement rounding via standard operations.
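A scalar C++ illustration of the trick used by both the float routines above and the double routine below, under the assumption stated in the comments (add the largest value strictly below 0.5, then floor, with -0.5 special-cased). NaN handling and integer-overflow clamping are out of scope for this sketch.

#include <cmath>
#include <cstdint>

// Rounds like Math.round(float): to nearest, ties toward +infinity.
// Adding 0.5f and flooring mis-rounds inputs such as 0.49999997f (the sum
// rounds up to 1.0f), so the largest float below 0.5f is added instead; the
// one input that this substitution breaks, -0.5f (which must round to 0), is
// special-cased first, mirroring the two constants loaded above.
int32_t round_float_model(float x) {
  const float just_below_half = 0.49999997f;   // largest float strictly below 0.5f
  if (x == -0.5f) {
    return 0;                                  // special case, handled before the add
  }
  return (int32_t)std::floor(x + just_below_half);
}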
++void MacroAssembler::java_round_double(Register dst, ++ FloatRegister src, ++ FloatRegister vtemp1) { ++ block_comment("java_round_double: { "); ++ ++ Label L_abnormal, L_done; ++ ++ li(AT, StubRoutines::la::round_double_imm()); ++ ++ // if src is -0.5d, return 0 as result ++ fld_d(vtemp1, AT, 0); ++ fcmp_ceq_d(FCC0, vtemp1, src); ++ bceqz(FCC0, L_abnormal); ++ move(dst, R0); ++ b(L_done); ++ ++ // else, floor src with the magic number ++ bind(L_abnormal); ++ fld_d(vtemp1, AT, 8); ++ fadd_d(fscratch, vtemp1, src); ++ ftintrm_l_d(fscratch, fscratch); ++ movfr2gr_d(dst, fscratch); ++ ++ bind(L_done); ++ ++ block_comment("} java_round_double"); ++} ++ ++void MacroAssembler::java_round_double_lsx(FloatRegister dst, ++ FloatRegister src, ++ FloatRegister vtemp1, ++ FloatRegister vtemp2) { ++ block_comment("java_round_double_lsx: { "); ++ li(AT, StubRoutines::la::round_double_imm()); ++ vldrepl_d(vtemp1, AT, 0); // repl -0.5d ++ vldrepl_d(vtemp2, AT, 1); // repl 0.49999999999999994d ++ ++ vfcmp_cne_d(fscratch, src, vtemp1); // generate the mask ++ vand_v(fscratch, fscratch, src); // clear the special ++ vfadd_d(dst, fscratch, vtemp2); // plus the magic ++ vftintrm_l_d(dst, dst); // floor the result ++ block_comment("} java_round_double_lsx"); ++} ++ ++void MacroAssembler::java_round_double_lasx(FloatRegister dst, ++ FloatRegister src, ++ FloatRegister vtemp1, ++ FloatRegister vtemp2) { ++ block_comment("java_round_double_lasx: { "); ++ li(AT, StubRoutines::la::round_double_imm()); ++ xvldrepl_d(vtemp1, AT, 0); // repl -0.5d ++ xvldrepl_d(vtemp2, AT, 1); // repl 0.49999999999999994d ++ ++ xvfcmp_cne_d(fscratch, src, vtemp1); // generate the mask ++ xvand_v(fscratch, fscratch, src); // clear the special ++ xvfadd_d(dst, fscratch, vtemp2); // plus the magic ++ xvftintrm_l_d(dst, dst); // floor the result ++ block_comment("} java_round_double_lasx"); ++} ++ ++// Code for BigInteger::mulAdd intrinsic ++// out = c_rarg0 ++// in = c_rarg1 ++// offset = c_rarg2 (already out.length-offset) ++// len = c_rarg3 ++// k = c_rarg4 ++// ++// pseudo code from java implementation: ++// long kLong = k & LONG_MASK; ++// carry = 0; ++// offset = out.length-offset - 1; ++// for (int j = len - 1; j >= 0; j--) { ++// product = (in[j] & LONG_MASK) * kLong + (out[offset] & LONG_MASK) + carry; ++// out[offset--] = (int)product; ++// carry = product >>> 32; ++// } ++// return (int)carry; ++void MacroAssembler::mul_add(Register out, Register in, Register offset, ++ Register len, Register k) { ++ Label L_tail_loop, L_unroll, L_end; ++ ++ move(SCR2, out); ++ move(out, R0); // should clear out ++ bge(R0, len, L_end); ++ ++ alsl_d(offset, offset, SCR2, LogBytesPerInt - 1); ++ alsl_d(in, len, in, LogBytesPerInt - 1); ++ ++ const int unroll = 16; ++ li(SCR2, unroll); ++ blt(len, SCR2, L_tail_loop); ++ ++ bind(L_unroll); ++ ++ addi_d(in, in, -unroll * BytesPerInt); ++ addi_d(offset, offset, -unroll * BytesPerInt); ++ ++ for (int i = unroll - 1; i >= 0; i--) { ++ ld_wu(SCR1, in, i * BytesPerInt); ++ mulw_d_wu(SCR1, SCR1, k); ++ add_d(out, out, SCR1); // out as scratch ++ ld_wu(SCR1, offset, i * BytesPerInt); ++ add_d(SCR1, SCR1, out); ++ st_w(SCR1, offset, i * BytesPerInt); ++ srli_d(out, SCR1, 32); // keep carry ++ } ++ ++ sub_w(len, len, SCR2); ++ bge(len, SCR2, L_unroll); ++ ++ bge(R0, len, L_end); // check tail ++ ++ bind(L_tail_loop); ++ ++ addi_d(in, in, -BytesPerInt); ++ ld_wu(SCR1, in, 0); ++ mulw_d_wu(SCR1, SCR1, k); ++ add_d(out, out, SCR1); // out as scratch ++ ++ addi_d(offset, offset, -BytesPerInt); ++ ld_wu(SCR1, 
offset, 0); ++ add_d(SCR1, SCR1, out); ++ st_w(SCR1, offset, 0); ++ ++ srli_d(out, SCR1, 32); // keep carry ++ ++ addi_w(len, len, -1); ++ blt(R0, len, L_tail_loop); ++ ++ bind(L_end); ++} ++ ++#ifndef PRODUCT ++void MacroAssembler::verify_cross_modify_fence_not_required() { ++ if (VerifyCrossModifyFence) { ++ // Check if thread needs a cross modify fence. ++ ld_bu(SCR1, Address(TREG, in_bytes(JavaThread::requires_cross_modify_fence_offset()))); ++ Label fence_not_required; ++ beqz(SCR1, fence_not_required); ++ // If it does then fail. ++ move(A0, TREG); ++ call(CAST_FROM_FN_PTR(address, JavaThread::verify_cross_modify_fence_failure)); ++ bind(fence_not_required); ++ } ++} ++#endif ++ ++// The java_calling_convention describes stack locations as ideal slots on ++// a frame with no abi restrictions. Since we must observe abi restrictions ++// (like the placement of the register window) the slots must be biased by ++// the following value. ++static int reg2offset_in(VMReg r) { ++ // Account for saved rfp and lr ++ // This should really be in_preserve_stack_slots ++ return r->reg2stack() * VMRegImpl::stack_slot_size; ++} ++ ++static int reg2offset_out(VMReg r) { ++ return (r->reg2stack() + SharedRuntime::out_preserve_stack_slots()) * VMRegImpl::stack_slot_size; ++} ++ ++// A simple move of integer like type ++void MacroAssembler::simple_move32(VMRegPair src, VMRegPair dst, Register tmp) { ++ if (src.first()->is_stack()) { ++ if (dst.first()->is_stack()) { ++ // stack to stack ++ ld_w(tmp, FP, reg2offset_in(src.first())); ++ st_d(tmp, SP, reg2offset_out(dst.first())); ++ } else { ++ // stack to reg ++ ld_w(dst.first()->as_Register(), FP, reg2offset_in(src.first())); ++ } ++ } else if (dst.first()->is_stack()) { ++ // reg to stack ++ st_d(src.first()->as_Register(), SP, reg2offset_out(dst.first())); ++ } else { ++ if (dst.first() != src.first()) { ++ // 32bits extend sign ++ add_w(dst.first()->as_Register(), src.first()->as_Register(), R0); ++ } ++ } ++} ++ ++// An oop arg. Must pass a handle not the oop itself ++void MacroAssembler::object_move( ++ OopMap* map, ++ int oop_handle_offset, ++ int framesize_in_slots, ++ VMRegPair src, ++ VMRegPair dst, ++ bool is_receiver, ++ int* receiver_offset) { ++ ++ // must pass a handle. First figure out the location we use as a handle ++ Register rHandle = dst.first()->is_stack() ? 
T5 : dst.first()->as_Register();
++
++  if (src.first()->is_stack()) {
++    // Oop is already on the stack as an argument
++    Label nil;
++    move(rHandle, R0);
++    ld_d(AT, FP, reg2offset_in(src.first()));
++    beqz(AT, nil);
++    lea(rHandle, Address(FP, reg2offset_in(src.first())));
++    bind(nil);
++
++    int offset_in_older_frame = src.first()->reg2stack()
++        + SharedRuntime::out_preserve_stack_slots();
++    map->set_oop(VMRegImpl::stack2reg(offset_in_older_frame + framesize_in_slots));
++    if (is_receiver) {
++      *receiver_offset = (offset_in_older_frame + framesize_in_slots) * VMRegImpl::stack_slot_size;
++    }
++  } else {
++    // Oop is in a register; we must store it to the space we reserve
++    // on the stack for oop_handles and pass a handle if oop is non-null
++    const Register rOop = src.first()->as_Register();
++    assert((rOop->encoding() >= A0->encoding()) && (rOop->encoding() <= T0->encoding()),"wrong register");
++    // Important: refer to java_calling_convention
++    int oop_slot = (rOop->encoding() - j_rarg1->encoding()) * VMRegImpl::slots_per_word + oop_handle_offset;
++    int offset = oop_slot*VMRegImpl::stack_slot_size;
++
++    Label skip;
++    st_d(rOop, SP, offset);
++    map->set_oop(VMRegImpl::stack2reg(oop_slot));
++    move(rHandle, R0);
++    beqz(rOop, skip);
++    lea(rHandle, Address(SP, offset));
++    bind(skip);
++
++    if (is_receiver) {
++      *receiver_offset = offset;
++    }
++  }
++
++  // If arg is on the stack then place it, otherwise it is already in the correct reg.
++  if (dst.first()->is_stack()) {
++    st_d(rHandle, Address(SP, reg2offset_out(dst.first())));
++  }
++}
++
++// Referring to c_calling_convention, float and/or double argument shuffling may
++// use an integer register for spilling. So we need to capture and deal with these
++// kinds of situations in the float_move and double_move methods.
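Before the float/double moves below, a small hedged model of the handle convention that object_move above implements: a null oop becomes a null handle, any other oop is spilled to a stack slot recorded in the oop map, and the address of that slot is what gets passed. The types and names here are illustrative only.

#include <cstdint>

typedef std::uintptr_t oop_model;     // hypothetical stand-ins, not HotSpot types
typedef oop_model*     handle_model;

// A null oop is passed as a null handle; any other oop is spilled to a known
// stack slot and the *address* of that slot becomes the handle.
handle_model oop_to_handle(oop_model oop, oop_model* spill_slot) {
  if (oop == 0) {
    return nullptr;      // the beqz(rOop, skip) case above
  }
  *spill_slot = oop;     // matches st_d(rOop, SP, offset)
  return spill_slot;     // matches lea(rHandle, Address(SP, offset))
}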
++
++// A float move
++void MacroAssembler::float_move(VMRegPair src, VMRegPair dst, Register tmp) {
++  assert(!src.second()->is_valid() && !dst.second()->is_valid(), "bad float_move");
++  if (src.first()->is_stack()) {
++    // stack to stack/reg
++    if (dst.first()->is_stack()) {
++      ld_w(tmp, FP, reg2offset_in(src.first()));
++      st_w(tmp, SP, reg2offset_out(dst.first()));
++    } else if (dst.first()->is_FloatRegister()) {
++      fld_s(dst.first()->as_FloatRegister(), FP, reg2offset_in(src.first()));
++    } else {
++      ld_w(dst.first()->as_Register(), FP, reg2offset_in(src.first()));
++    }
++  } else {
++    // reg to stack/reg
++    if (dst.first()->is_stack()) {
++      fst_s(src.first()->as_FloatRegister(), SP, reg2offset_out(dst.first()));
++    } else if (dst.first()->is_FloatRegister()) {
++      fmov_s(dst.first()->as_FloatRegister(), src.first()->as_FloatRegister());
++    } else {
++      movfr2gr_s(dst.first()->as_Register(), src.first()->as_FloatRegister());
++    }
++  }
++}
++
++// A long move
++void MacroAssembler::long_move(VMRegPair src, VMRegPair dst, Register tmp) {
++  if (src.first()->is_stack()) {
++    if (dst.first()->is_stack()) {
++      ld_d(tmp, FP, reg2offset_in(src.first()));
++      st_d(tmp, SP, reg2offset_out(dst.first()));
++    } else {
++      ld_d(dst.first()->as_Register(), FP, reg2offset_in(src.first()));
++    }
++  } else {
++    if (dst.first()->is_stack()) {
++      st_d(src.first()->as_Register(), SP, reg2offset_out(dst.first()));
++    } else {
++      move(dst.first()->as_Register(), src.first()->as_Register());
++    }
++  }
++}
++
++// A double move
++void MacroAssembler::double_move(VMRegPair src, VMRegPair dst, Register tmp) {
++  if (src.first()->is_stack()) {
++    // source is all stack
++    if (dst.first()->is_stack()) {
++      ld_d(tmp, FP, reg2offset_in(src.first()));
++      st_d(tmp, SP, reg2offset_out(dst.first()));
++    } else if (dst.first()->is_FloatRegister()) {
++      fld_d(dst.first()->as_FloatRegister(), FP, reg2offset_in(src.first()));
++    } else {
++      ld_d(dst.first()->as_Register(), FP, reg2offset_in(src.first()));
++    }
++  } else {
++    // reg to stack/reg
++    if (dst.first()->is_stack()) {
++      fst_d(src.first()->as_FloatRegister(), SP, reg2offset_out(dst.first()));
++    } else if (dst.first()->is_FloatRegister()) {
++      fmov_d(dst.first()->as_FloatRegister(), src.first()->as_FloatRegister());
++    } else {
++      movfr2gr_d(dst.first()->as_Register(), src.first()->as_FloatRegister());
++    }
++  }
++}
++
++// Implements lightweight-locking.
++// Branches to slow upon failure to lock the object.
++// Falls through upon success.
++//
++// - obj: the object to be locked
++// - hdr: the header, already loaded from obj, will be destroyed
++// - flag: as cr for c2, but only as temporary register for c1/interpreter
++// - tmp: temporary registers, will be destroyed
++void MacroAssembler::lightweight_lock(Register obj, Register hdr, Register flag, Register tmp, Label& slow) {
++  assert(LockingMode == LM_LIGHTWEIGHT, "only used with new lightweight locking");
++  assert_different_registers(obj, hdr, flag, tmp);
++
++  // Check if we would have space on lock-stack for the object.
++  ld_wu(flag, Address(TREG, JavaThread::lock_stack_top_offset()));
++  li(tmp, (unsigned)LockStack::end_offset());
++  sltu(flag, flag, tmp);
++  beqz(flag, slow);
++
++  // Load (object->mark() | 1) into hdr
++  ori(hdr, hdr, markWord::unlocked_value);
++  // Clear lock-bits, into tmp
++  xori(tmp, hdr, markWord::unlocked_value);
++  // Try to swing header from unlocked to locked
++  cmpxchg(/*addr*/ Address(obj, 0), /*old*/ hdr, /*new*/ tmp, /*flag*/ flag, /*retold*/ true, /*barrier*/true);
++  beqz(flag, slow);
++
++  // After successful lock, push object on lock-stack
++  ld_wu(tmp, Address(TREG, JavaThread::lock_stack_top_offset()));
++  stx_d(obj, TREG, tmp);
++  addi_w(tmp, tmp, oopSize);
++  st_w(tmp, Address(TREG, JavaThread::lock_stack_top_offset()));
++}
++
++// Implements lightweight-unlocking.
++// Branches to slow upon failure.
++// Falls through upon success.
++//
++// - obj: the object to be unlocked
++// - hdr: the (pre-loaded) header of the object
++// - flag: as cr for c2, but only as temporary register for c1/interpreter
++// - tmp: temporary registers
++void MacroAssembler::lightweight_unlock(Register obj, Register hdr, Register flag, Register tmp, Label& slow) {
++  assert(LockingMode == LM_LIGHTWEIGHT, "only used with new lightweight locking");
++  assert_different_registers(obj, hdr, tmp, flag);
++
++#ifdef ASSERT
++  {
++    // The following checks rely on the fact that LockStack is only ever modified by
++    // its owning thread, even if the lock got inflated concurrently; removal of LockStack
++    // entries after inflation will happen delayed in that case.
++
++    // Check for lock-stack underflow.
++    Label stack_ok;
++    ld_wu(tmp, Address(TREG, JavaThread::lock_stack_top_offset()));
++    li(flag, (unsigned)LockStack::start_offset());
++    bltu(flag, tmp, stack_ok);
++    stop("Lock-stack underflow");
++    bind(stack_ok);
++  }
++  {
++    // Check if the top of the lock-stack matches the unlocked object.
++    Label tos_ok;
++    addi_w(tmp, tmp, -oopSize);
++    ldx_d(tmp, TREG, tmp);
++    beq(tmp, obj, tos_ok);
++    stop("Top of lock-stack does not match the unlocked object");
++    bind(tos_ok);
++  }
++  {
++    // Check that hdr is fast-locked.
++ Label hdr_ok; ++ andi(tmp, hdr, markWord::lock_mask_in_place); ++ beqz(tmp, hdr_ok); ++ stop("Header is not fast-locked"); ++ bind(hdr_ok); ++ } ++#endif ++ ++ // Load the new header (unlocked) into tmp ++ ori(tmp, hdr, markWord::unlocked_value); ++ ++ // Try to swing header from locked to unlocked ++ cmpxchg(/*addr*/ Address(obj, 0), /*old*/ hdr, /*new*/ tmp, /*flag*/ flag, /**/true, /*barrier*/ true); ++ beqz(flag, slow); ++ ++ // After successful unlock, pop object from lock-stack ++ ld_wu(tmp, Address(TREG, JavaThread::lock_stack_top_offset())); ++ addi_w(tmp, tmp, -oopSize); ++#ifdef ASSERT ++ stx_d(R0, TREG, tmp); ++#endif ++ st_w(tmp, Address(TREG, JavaThread::lock_stack_top_offset())); ++} ++ ++#if INCLUDE_ZGC ++void MacroAssembler::patchable_li16(Register rd, uint16_t value) { ++ int count = 0; ++ ++ if (is_simm(value, 12)) { ++ addi_d(rd, R0, value); ++ count++; ++ } else if (is_uimm(value, 12)) { ++ ori(rd, R0, value); ++ count++; ++ } else { ++ lu12i_w(rd, split_low20(value >> 12)); ++ count++; ++ if (split_low12(value)) { ++ ori(rd, rd, split_low12(value)); ++ count++; ++ } ++ } ++ ++ while (count < 2) { ++ nop(); ++ count++; ++ } ++} ++ ++void MacroAssembler::z_color(Register dst, Register src, Register tmp) { ++ assert_different_registers(dst, tmp); ++ assert_different_registers(src, tmp); ++ relocate(barrier_Relocation::spec(), ZBarrierRelocationFormatStoreGoodBits); ++ if (src != noreg) { ++ patchable_li16(tmp, barrier_Relocation::unpatched); ++ slli_d(dst, src, ZPointerLoadShift); ++ orr(dst, dst, tmp); ++ } else { ++ patchable_li16(dst, barrier_Relocation::unpatched); ++ } ++} ++ ++void MacroAssembler::z_uncolor(Register ref) { ++ srli_d(ref, ref, ZPointerLoadShift); ++} ++ ++void MacroAssembler::check_color(Register ref, Register tmp, bool on_non_strong) { ++ assert_different_registers(ref, tmp); ++ int relocFormat = on_non_strong ? ZBarrierRelocationFormatMarkBadMask ++ : ZBarrierRelocationFormatLoadBadMask; ++ relocate(barrier_Relocation::spec(), relocFormat); ++ patchable_li16(tmp, barrier_Relocation::unpatched); ++ andr(tmp, ref, tmp); ++} ++#endif +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/macroAssembler_loongarch.hpp b/src/hotspot/cpu/loongarch/macroAssembler_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/macroAssembler_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/macroAssembler_loongarch.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,809 @@ ++/* ++ * Copyright (c) 1997, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_MACROASSEMBLER_LOONGARCH_HPP ++#define CPU_LOONGARCH_MACROASSEMBLER_LOONGARCH_HPP ++ ++#include "asm/assembler.hpp" ++#include "code/vmreg.hpp" ++#include "runtime/rtmLocking.hpp" ++#include "utilities/macros.hpp" ++ ++class OopMap; ++ ++#if INCLUDE_ZGC ++const int ZBarrierRelocationFormatLoadBadMask = 0; ++const int ZBarrierRelocationFormatMarkBadMask = 1; ++const int ZBarrierRelocationFormatStoreGoodBits = 2; ++const int ZBarrierRelocationFormatStoreBadMask = 3; ++#endif ++ ++// MacroAssembler extends Assembler by frequently used macros. ++// ++// Instructions for which a 'better' code sequence exists depending ++// on arguments should also go in here. ++ ++class MacroAssembler: public Assembler { ++ friend class LIR_Assembler; ++ friend class Runtime1; // as_Address() ++ ++ public: ++ // Compare code ++ typedef enum { ++ EQ = 0x01, ++ NE = 0x02, ++ GT = 0x03, ++ GE = 0x04, ++ LT = 0x05, ++ LE = 0x06 ++ } CMCompare; ++ ++ public: ++ // Support for VM calls ++ // ++ // This is the base routine called by the different versions of call_VM_leaf. The interpreter ++ // may customize this version by overriding it for its purposes (e.g., to save/restore ++ // additional registers when doing a VM call). ++ #define VIRTUAL virtual ++ ++ VIRTUAL void call_VM_leaf_base( ++ address entry_point, // the entry point ++ int number_of_arguments // the number of arguments to pop after the call ++ ); ++ ++ protected: ++ // This is the base routine called by the different versions of call_VM. The interpreter ++ // may customize this version by overriding it for its purposes (e.g., to save/restore ++ // additional registers when doing a VM call). ++ // ++ // If no java_thread register is specified (noreg) than TREG will be used instead. call_VM_base ++ // returns the register which contains the thread upon return. If a thread register has been ++ // specified, the return value will correspond to that register. If no last_java_sp is specified ++ // (noreg) than sp will be used instead. ++ VIRTUAL void call_VM_base( // returns the register containing the thread upon return ++ Register oop_result, // where an oop-result ends up if any; use noreg otherwise ++ Register java_thread, // the thread if computed before ; use noreg otherwise ++ Register last_java_sp, // to set up last_Java_frame in stubs; use noreg otherwise ++ address entry_point, // the entry point ++ int number_of_arguments, // the number of arguments (w/o thread) to pop after the call ++ bool check_exceptions // whether to check for pending exceptions after return ++ ); ++ ++ void call_VM_helper(Register oop_result, address entry_point, int number_of_arguments, bool check_exceptions = true); ++ ++ // helpers for FPU flag access ++ // tmp is a temporary register, if none is available use noreg ++ ++ public: ++ MacroAssembler(CodeBuffer* code) : Assembler(code) {} ++ ++ // These routines should emit JVMTI PopFrame and ForceEarlyReturn handling code. ++ // The implementation is only non-empty for the InterpreterMacroAssembler, ++ // as only the interpreter handles PopFrame and ForceEarlyReturn requests. 
++ virtual void check_and_handle_popframe(Register java_thread); ++ virtual void check_and_handle_earlyret(Register java_thread); ++ ++ // Support for null-checks ++ // ++ // Generates code that causes a null OS exception if the content of reg is null. ++ // If the accessed location is M[reg + offset] and the offset is known, provide the ++ // offset. No explicit code generation is needed if the offset is within a certain ++ // range (0 <= offset <= page_size). ++ ++ void null_check(Register reg, int offset = -1); ++ static bool needs_explicit_null_check(intptr_t offset); ++ static bool uses_implicit_null_check(void* address); ++ ++ // Required platform-specific helpers for Label::patch_instructions. ++ // They _shadow_ the declarations in AbstractAssembler, which are undefined. ++ static void pd_patch_instruction(address branch, address target, const char* file = nullptr, int line = 0); ++ ++ // Return whether code is emitted to a scratch blob. ++ virtual bool in_scratch_emit_size() { ++ return false; ++ } ++ ++ address emit_trampoline_stub(int insts_call_instruction_offset, address target); ++ ++ void push_cont_fastpath(Register java_thread); ++ void pop_cont_fastpath(Register java_thread); ++ ++ void flt_to_flt16(Register dst, FloatRegister src, FloatRegister tmp) { ++ vfcvt_h_s(tmp, src, src); ++ vpickve2gr_h(dst, tmp, 0); ++ } ++ ++ void flt16_to_flt(FloatRegister dst, Register src, FloatRegister tmp) { ++ vinsgr2vr_h(tmp, src, 0); ++ vfcvtl_s_h(dst, tmp); ++ } ++ ++ // Alignment ++ void align(int modulus); ++ ++ void post_call_nop(); ++ ++ // Stack frame creation/removal ++ void enter(); ++ void leave(); ++ ++ // Frame creation and destruction shared between JITs. ++ void build_frame(int framesize); ++ void remove_frame(int framesize); ++ ++ // Support for getting the JavaThread pointer (i.e.; a reference to thread-local information) ++ // The pointer will be loaded into the thread register. ++ void get_thread(Register thread); ++ ++ // support for argument shuffling ++ void simple_move32(VMRegPair src, VMRegPair dst, Register tmp = SCR1); ++ void float_move(VMRegPair src, VMRegPair dst, Register tmp = SCR1); ++ void long_move(VMRegPair src, VMRegPair dst, Register tmp = SCR1); ++ void double_move(VMRegPair src, VMRegPair dst, Register tmp = SCR1); ++ void object_move( ++ OopMap* map, ++ int oop_handle_offset, ++ int framesize_in_slots, ++ VMRegPair src, ++ VMRegPair dst, ++ bool is_receiver, ++ int* receiver_offset); ++ ++ // Support for VM calls ++ // ++ // It is imperative that all calls into the VM are handled via the call_VM macros. ++ // They make sure that the stack linkage is setup correctly. call_VM's correspond ++ // to ENTRY/ENTRY_X entry points while call_VM_leaf's correspond to LEAF entry points. 
++ ++ ++ void call_VM(Register oop_result, ++ address entry_point, ++ bool check_exceptions = true); ++ void call_VM(Register oop_result, ++ address entry_point, ++ Register arg_1, ++ bool check_exceptions = true); ++ void call_VM(Register oop_result, ++ address entry_point, ++ Register arg_1, Register arg_2, ++ bool check_exceptions = true); ++ void call_VM(Register oop_result, ++ address entry_point, ++ Register arg_1, Register arg_2, Register arg_3, ++ bool check_exceptions = true); ++ ++ // Overloadings with last_Java_sp ++ void call_VM(Register oop_result, ++ Register last_java_sp, ++ address entry_point, ++ int number_of_arguments = 0, ++ bool check_exceptions = true); ++ void call_VM(Register oop_result, ++ Register last_java_sp, ++ address entry_point, ++ Register arg_1, bool ++ check_exceptions = true); ++ void call_VM(Register oop_result, ++ Register last_java_sp, ++ address entry_point, ++ Register arg_1, Register arg_2, ++ bool check_exceptions = true); ++ void call_VM(Register oop_result, ++ Register last_java_sp, ++ address entry_point, ++ Register arg_1, Register arg_2, Register arg_3, ++ bool check_exceptions = true); ++ ++ void get_vm_result (Register oop_result, Register thread); ++ void get_vm_result_2(Register metadata_result, Register thread); ++ void call_VM_leaf(address entry_point, ++ int number_of_arguments = 0); ++ void call_VM_leaf(address entry_point, ++ Register arg_1); ++ void call_VM_leaf(address entry_point, ++ Register arg_1, Register arg_2); ++ void call_VM_leaf(address entry_point, ++ Register arg_1, Register arg_2, Register arg_3); ++ ++ // Super call_VM calls - correspond to MacroAssembler::call_VM(_leaf) calls ++ void super_call_VM_leaf(address entry_point); ++ void super_call_VM_leaf(address entry_point, Register arg_1); ++ void super_call_VM_leaf(address entry_point, Register arg_1, Register arg_2); ++ void super_call_VM_leaf(address entry_point, Register arg_1, Register arg_2, Register arg_3); ++ ++ // last Java Frame (fills frame anchor) ++ void set_last_Java_frame(Register thread, ++ Register last_java_sp, ++ Register last_java_fp, ++ Label& last_java_pc); ++ ++ // thread in the default location (S6) ++ void set_last_Java_frame(Register last_java_sp, ++ Register last_java_fp, ++ Label& last_java_pc); ++ ++ void set_last_Java_frame(Register last_java_sp, ++ Register last_java_fp, ++ Register last_java_pc); ++ ++ void reset_last_Java_frame(Register thread, bool clear_fp); ++ ++ // thread in the default location (S6) ++ void reset_last_Java_frame(bool clear_fp); ++ ++ // jobjects ++ void resolve_jobject(Register value, Register tmp1, Register tmp2); ++ void resolve_global_jobject(Register value, Register tmp1, Register tmp2); ++ ++ // C 'boolean' to Java boolean: x == 0 ? 
0 : 1 ++ void c2bool(Register x); ++ ++ void resolve_weak_handle(Register result, Register tmp1, Register tmp2); ++ void resolve_oop_handle(Register result, Register tmp1, Register tmp2); ++ void load_mirror(Register dst, Register method, Register tmp1, Register tmp2); ++ ++ void load_method_holder_cld(Register rresult, Register rmethod); ++ void load_method_holder(Register holder, Register method); ++ ++ // oop manipulations ++ void load_klass(Register dst, Register src); ++ void store_klass(Register dst, Register src); ++ ++ void access_load_at(BasicType type, DecoratorSet decorators, Register dst, Address src, ++ Register tmp1, Register tmp2); ++ void access_store_at(BasicType type, DecoratorSet decorators, Address dst, Register val, ++ Register tmp1, Register tmp2, Register tmp3); ++ ++ void load_heap_oop(Register dst, Address src, Register tmp1, ++ Register tmp2, DecoratorSet decorators = 0); ++ void load_heap_oop_not_null(Register dst, Address src, Register tmp1, ++ Register tmp2, DecoratorSet decorators = 0); ++ void store_heap_oop(Address dst, Register val, Register tmp1, ++ Register tmp2, Register tmp3, DecoratorSet decorators = 0); ++ ++ // Used for storing null. All other oop constants should be ++ // stored using routines that take a jobject. ++ void store_heap_oop_null(Address dst); ++ ++ void store_klass_gap(Register dst, Register src); ++ ++ void encode_heap_oop(Register r); ++ void encode_heap_oop(Register dst, Register src); ++ void decode_heap_oop(Register r); ++ void decode_heap_oop(Register dst, Register src); ++ void encode_heap_oop_not_null(Register r); ++ void decode_heap_oop_not_null(Register r); ++ void encode_heap_oop_not_null(Register dst, Register src); ++ void decode_heap_oop_not_null(Register dst, Register src); ++ ++ void encode_klass_not_null(Register r); ++ void decode_klass_not_null(Register r); ++ void encode_klass_not_null(Register dst, Register src); ++ void decode_klass_not_null(Register dst, Register src); ++ ++ // if heap base register is used - reinit it with the correct value ++ void reinit_heapbase(); ++ ++ DEBUG_ONLY(void verify_heapbase(const char* msg);) ++ ++ void set_narrow_klass(Register dst, Klass* k); ++ void set_narrow_oop(Register dst, jobject obj); ++ ++ // Sign extension ++ void sign_extend_short(Register reg) { ext_w_h(reg, reg); } ++ void sign_extend_byte(Register reg) { ext_w_b(reg, reg); } ++ ++ // java.lang.Math::round intrinsics ++ void java_round_float(Register dst, FloatRegister src, ++ FloatRegister vtemp1); ++ void java_round_float_lsx(FloatRegister dst, FloatRegister src, ++ FloatRegister vtemp1, FloatRegister vtemp2); ++ void java_round_float_lasx(FloatRegister dst, FloatRegister src, ++ FloatRegister vtemp1, FloatRegister vtemp2); ++ void java_round_double(Register dst, FloatRegister src, ++ FloatRegister vtemp1); ++ void java_round_double_lsx(FloatRegister dst, FloatRegister src, ++ FloatRegister vtemp1, FloatRegister vtemp2); ++ void java_round_double_lasx(FloatRegister dst, FloatRegister src, ++ FloatRegister vtemp1, FloatRegister vtemp2); ++ ++ // allocation ++ void tlab_allocate( ++ Register obj, // result: pointer to object after successful allocation ++ Register var_size_in_bytes, // object size in bytes if unknown at compile time; invalid otherwise ++ int con_size_in_bytes, // object size in bytes if known at compile time ++ Register t1, // temp register ++ Register t2, // temp register ++ Label& slow_case // continuation point if fast allocation fails ++ ); ++ void incr_allocated_bytes(Register thread, ++ Register 
var_size_in_bytes, int con_size_in_bytes, ++ Register t1 = noreg); ++ // interface method calling ++ void lookup_interface_method(Register recv_klass, ++ Register intf_klass, ++ RegisterOrConstant itable_index, ++ Register method_result, ++ Register scan_temp, ++ Label& no_such_interface, ++ bool return_method = true); ++ ++ // virtual method calling ++ void lookup_virtual_method(Register recv_klass, ++ RegisterOrConstant vtable_index, ++ Register method_result); ++ ++ // Test sub_klass against super_klass, with fast and slow paths. ++ ++ // The fast path produces a tri-state answer: yes / no / maybe-slow. ++ // One of the three labels can be null, meaning take the fall-through. ++ // If super_check_offset is -1, the value is loaded up from super_klass. ++ // No registers are killed, except temp_reg. ++ void check_klass_subtype_fast_path(Register sub_klass, ++ Register super_klass, ++ Register temp_reg, ++ Label* L_success, ++ Label* L_failure, ++ Label* L_slow_path, ++ RegisterOrConstant super_check_offset = RegisterOrConstant(-1)); ++ ++ // The rest of the type check; must be wired to a corresponding fast path. ++ // It does not repeat the fast path logic, so don't use it standalone. ++ // The temp_reg and temp2_reg can be noreg, if no temps are available. ++ // Updates the sub's secondary super cache as necessary. ++ // If set_cond_codes, condition codes will be Z on success, NZ on failure. ++ template ++ void check_klass_subtype_slow_path(Register sub_klass, ++ Register super_klass, ++ Register temp_reg, ++ Register temp2_reg, ++ Label* L_success, ++ Label* L_failure, ++ bool set_cond_codes = false); ++ ++ // Simplified, combined version, good for typical uses. ++ // Falls through on failure. ++ void check_klass_subtype(Register sub_klass, ++ Register super_klass, ++ Register temp_reg, ++ Label& L_success); ++ ++ void clinit_barrier(Register klass, ++ Register scratch, ++ Label* L_fast_path = nullptr, ++ Label* L_slow_path = nullptr); ++ ++ ++ // Debugging ++ ++ // only if +VerifyOops ++ void _verify_oop(Register reg, const char* s, const char* file, int line); ++ void _verify_oop_addr(Address addr, const char* s, const char* file, int line); ++ ++ void _verify_oop_checked(Register reg, const char* s, const char* file, int line) { ++ if (VerifyOops) { ++ _verify_oop(reg, s, file, line); ++ } ++ } ++ void _verify_oop_addr_checked(Address reg, const char* s, const char* file, int line) { ++ if (VerifyOops) { ++ _verify_oop_addr(reg, s, file, line); ++ } ++ } ++ ++ void verify_oop_subroutine(); ++ ++ // TODO: verify method and klass metadata (compare against vptr?) 
++  void _verify_method_ptr(Register reg, const char * msg, const char * file, int line) {}
++  void _verify_klass_ptr(Register reg, const char * msg, const char * file, int line){}
++
++#define verify_oop(reg) _verify_oop_checked(reg, "broken oop " #reg, __FILE__, __LINE__)
++#define verify_oop_msg(reg, msg) _verify_oop_checked(reg, "broken oop " #reg ", " #msg, __FILE__, __LINE__)
++#define verify_oop_addr(addr) _verify_oop_addr_checked(addr, "broken oop addr " #addr, __FILE__, __LINE__)
++#define verify_method_ptr(reg) _verify_method_ptr(reg, "broken method " #reg, __FILE__, __LINE__)
++#define verify_klass_ptr(reg) _verify_method_ptr(reg, "broken klass " #reg, __FILE__, __LINE__)
++
++  // prints msg, dumps registers and stops execution
++  void stop(const char* msg);
++
++  static void debug(char* msg/*, RegistersForDebugging* regs*/);
++  static void debug64(char* msg, int64_t pc, int64_t regs[]);
++
++  void untested() { stop("untested"); }
++
++  void unimplemented(const char* what = "");
++
++  void should_not_reach_here() { stop("should not reach here"); }
++
++  // Stack overflow checking
++  void bang_stack_with_offset(int offset) {
++    // stack grows down, caller passes positive offset
++    assert(offset > 0, "must bang with negative offset");
++    if (offset <= 2048) {
++      // fits the 12-bit signed immediate of st_w
++      st_w(A0, SP, -offset);
++    } else if (offset <= 32768 && !(offset & 3)) {
++      // word-aligned offsets up to 32KB fit the scaled 14-bit immediate of stptr_w
++      stptr_w(A0, SP, -offset);
++    } else {
++      li(AT, offset);
++      sub_d(AT, SP, AT);
++      st_w(A0, AT, 0);
++    }
++  }
++
++  // Writes to stack successive pages until offset reached to check for
++  // stack overflow + shadow pages. Also, clobbers tmp
++  void bang_stack_size(Register size, Register tmp);
++
++  // Check for reserved stack access in method being exited (for JIT)
++  void reserved_stack_check();
++
++  void safepoint_poll(Label& slow_path, Register thread_reg, bool at_return, bool acquire, bool in_nmethod);
++
++  void verify_tlab(Register t1, Register t2);
++
++  // the following two might use the AT register; be sure you have no meaningful data in AT before you call them
++  void increment(Register reg, int imm);
++  void decrement(Register reg, int imm);
++  void increment(Address addr, int imm = 1);
++  void decrement(Address addr, int imm = 1);
++
++  // Helper functions for statistics gathering.
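++  // (atomic_inc32 below atomically adds inc to the 32-bit counter at
++  //  counter_addr; tmp_reg1 and tmp_reg2 are used as scratch.)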
++ void atomic_inc32(address counter_addr, int inc, Register tmp_reg1, Register tmp_reg2); ++ ++ // Calls ++ void call(address entry); ++ void call(address entry, relocInfo::relocType rtype); ++ void call(address entry, RelocationHolder& rh); ++ void call_long(address entry); ++ ++ address trampoline_call(AddressLiteral entry, CodeBuffer *cbuf = nullptr); ++ ++ static const unsigned long branch_range = NOT_DEBUG(128 * M) DEBUG_ONLY(2 * M); ++ ++ static bool far_branches() { ++ if (ForceUnreachable) { ++ return true; ++ } else { ++ return ReservedCodeCacheSize > branch_range; ++ } ++ } ++ ++ // Emit the CompiledIC call idiom ++ address ic_call(address entry, jint method_index = 0); ++ ++ void emit_static_call_stub(); ++ ++ // Jumps ++ void jmp(address entry); ++ void jmp(address entry, relocInfo::relocType rtype); ++ void jmp_far(Label& L); // patchable ++ ++ /* branches may exceed 16-bit offset */ ++ void b_far(address entry); ++ void b_far(Label& L); ++ ++ void bne_far (Register rs, Register rt, address entry); ++ void bne_far (Register rs, Register rt, Label& L); ++ ++ void beq_far (Register rs, Register rt, address entry); ++ void beq_far (Register rs, Register rt, Label& L); ++ ++ void blt_far (Register rs, Register rt, address entry, bool is_signed); ++ void blt_far (Register rs, Register rt, Label& L, bool is_signed); ++ ++ void bge_far (Register rs, Register rt, address entry, bool is_signed); ++ void bge_far (Register rs, Register rt, Label& L, bool is_signed); ++ ++ static bool patchable_branches() { ++ const unsigned long branch_range = NOT_DEBUG(128 * M) DEBUG_ONLY(2 * M); ++ return ReservedCodeCacheSize > branch_range; ++ } ++ ++ static bool reachable_from_branch_short(jlong offs); ++ ++ void patchable_jump_far(Register ra, jlong offs); ++ void patchable_jump(address target, bool force_patchable = false); ++ void patchable_call(address target, address call_size = 0); ++ ++ // Floating ++ void generate_dsin_dcos(bool isCos, address npio2_hw, address two_over_pi, ++ address pio2, address dsin_coef, address dcos_coef); ++ ++ // Data ++ ++ // Load and store values by size and signed-ness ++ void load_sized_value(Register dst, Address src, size_t size_in_bytes, bool is_signed, Register dst2 = noreg); ++ void store_sized_value(Address dst, Register src, size_t size_in_bytes, Register src2 = noreg); ++ ++ // swap the two byte of the low 16-bit halfword ++ void bswap_h(Register dst, Register src); ++ void bswap_hu(Register dst, Register src); ++ ++ // convert big endian integer to little endian integer ++ void bswap_w(Register dst, Register src); ++ ++ void cmpxchg(Address addr, Register oldval, Register newval, Register resflag, ++ bool retold, bool acquire, bool weak = false, bool exchange = false); ++ void cmpxchg(Address addr, Register oldval, Register newval, Register tmp, ++ bool retold, bool acquire, Label& succ, Label* fail = nullptr); ++ void cmpxchg32(Address addr, Register oldval, Register newval, Register resflag, ++ bool sign, bool retold, bool acquire, bool weak = false, bool exchange = false); ++ void cmpxchg32(Address addr, Register oldval, Register newval, Register tmp, ++ bool sign, bool retold, bool acquire, Label& succ, Label* fail = nullptr); ++ void cmpxchg16(Address addr, Register oldval, Register newval, Register resflag, ++ bool sign, bool retold, bool acquire, bool weak = false, bool exchange = false); ++ void cmpxchg16(Address addr, Register oldval, Register newval, Register tmp, ++ bool sign, bool retold, bool acquire, Label& succ, Label* fail = nullptr); ++ 
void cmpxchg8(Address addr, Register oldval, Register newval, Register resflag, ++ bool sign, bool retold, bool acquire, bool weak = false, bool exchange = false); ++ void cmpxchg8(Address addr, Register oldval, Register newval, Register tmp, ++ bool sign, bool retold, bool acquire, Label& succ, Label* fail = nullptr); ++ ++ void push (Register reg) { addi_d(SP, SP, -8); st_d (reg, SP, 0); } ++ void push (FloatRegister reg) { addi_d(SP, SP, -8); fst_d (reg, SP, 0); } ++ void pop (Register reg) { ld_d (reg, SP, 0); addi_d(SP, SP, 8); } ++ void pop (FloatRegister reg) { fld_d (reg, SP, 0); addi_d(SP, SP, 8); } ++ void pop () { addi_d(SP, SP, 8); } ++ void pop2 () { addi_d(SP, SP, 16); } ++ void push2(Register reg1, Register reg2); ++ void pop2 (Register reg1, Register reg2); ++ // Push and pop everything that might be clobbered by a native ++ // runtime call except SCR1 and SCR2. (They are always scratch, ++ // so we don't have to protect them.) Only save the lower 64 bits ++ // of each vector register. Additional registers can be excluded ++ // in a passed RegSet. ++ void push_call_clobbered_registers_except(RegSet exclude); ++ void pop_call_clobbered_registers_except(RegSet exclude); ++ ++ void push_call_clobbered_registers() { ++ push_call_clobbered_registers_except(RegSet()); ++ } ++ void pop_call_clobbered_registers() { ++ pop_call_clobbered_registers_except(RegSet()); ++ } ++ void push(RegSet regs) { if (regs.bits()) push(regs.bits()); } ++ void pop(RegSet regs) { if (regs.bits()) pop(regs.bits()); } ++ void push_fpu(FloatRegSet regs) { if (regs.bits()) push_fpu(regs.bits()); } ++ void pop_fpu(FloatRegSet regs) { if (regs.bits()) pop_fpu(regs.bits()); } ++ void push_vp(FloatRegSet regs) { if (regs.bits()) push_vp(regs.bits()); } ++ void pop_vp(FloatRegSet regs) { if (regs.bits()) pop_vp(regs.bits()); } ++ ++ void li(Register rd, jlong value); ++ void li(Register rd, address addr) { li(rd, (long)addr); } ++ void patchable_li52(Register rd, jlong value); ++ void lipc(Register rd, Label& L); ++ ++ void move(Register rd, Register rs) { orr(rd, rs, R0); } ++ void move_u32(Register rd, Register rs) { add_w(rd, rs, R0); } ++ void mov_metadata(Register dst, Metadata* obj); ++ void mov_metadata(Address dst, Metadata* obj); ++ ++ // Load the base of the cardtable byte map into reg. 
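++  // (An inlined card-table post-barrier typically computes, in pseudo-assembly:
++  //    card = byte_map_base + (store_addr >> card_shift);  *card = dirty;
++  //  which is why the byte map base needs to be materialized in a register.)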
++ void load_byte_map_base(Register reg); ++ ++ // method handles (JSR 292) ++ Address argument_address(RegisterOrConstant arg_slot, int extra_slot_offset = 0); ++ ++ ++ // LA added: ++ void jr (Register reg) { jirl(R0, reg, 0); } ++ void jalr(Register reg) { jirl(RA, reg, 0); } ++ void nop () { andi(R0, R0, 0); } ++ void andr(Register rd, Register rj, Register rk) { AND(rd, rj, rk); } ++ void xorr(Register rd, Register rj, Register rk) { XOR(rd, rj, rk); } ++ void orr (Register rd, Register rj, Register rk) { OR(rd, rj, rk); } ++ void lea (Register rd, Address src); ++ void lea(Register dst, AddressLiteral adr); ++ void lea_long(Register dst, AddressLiteral adr); ++ static int patched_branch(int dest_pos, int inst, int inst_pos); ++ ++ // Conditional move ++ void cmp_cmov_zero(Register op1, ++ Register op2, ++ Register dst, ++ Register src, ++ CMCompare cmp = EQ, ++ bool is_signed = true); ++ void cmp_cmov(Register op1, ++ Register op2, ++ Register dst, ++ Register src1, ++ Register src2, ++ CMCompare cmp = EQ, ++ bool is_signed = true); ++ void cmp_cmov(Register op1, ++ Register op2, ++ Register dst, ++ Register src, ++ CMCompare cmp = EQ, ++ bool is_signed = true); ++ void cmp_cmov(FloatRegister op1, ++ FloatRegister op2, ++ Register dst, ++ Register src, ++ FloatRegister tmp1, ++ FloatRegister tmp2, ++ CMCompare cmp = EQ, ++ bool is_float = true); ++ void cmp_cmov(FloatRegister op1, ++ FloatRegister op2, ++ FloatRegister dst, ++ FloatRegister src, ++ CMCompare cmp = EQ, ++ bool is_float = true); ++ void cmp_cmov(Register op1, ++ Register op2, ++ FloatRegister dst, ++ FloatRegister src, ++ CMCompare cmp = EQ, ++ bool is_signed = true); ++ ++ void membar(Membar_mask_bits hint); ++ ++ void bind(Label& L) { ++ Assembler::bind(L); ++ code()->clear_last_insn(); ++ } ++ ++ // ChaCha20 functions support block ++ void cc20_quarter_round(FloatRegister aVec, FloatRegister bVec, ++ FloatRegister cVec, FloatRegister dVec); ++ void cc20_shift_lane_org(FloatRegister bVec, FloatRegister cVec, ++ FloatRegister dVec, bool colToDiag); ++ ++ // CRC32 code for java.util.zip.CRC32::update() intrinsic. ++ void update_byte_crc32(Register crc, Register val, Register table); ++ ++ // CRC32 code for java.util.zip.CRC32::updateBytes() intrinsic. ++ void kernel_crc32(Register crc, Register buf, Register len, Register tmp); ++ ++ // CRC32C code for java.util.zip.CRC32C::updateBytes() intrinsic. ++ void kernel_crc32c(Register crc, Register buf, Register len, Register tmp); ++ ++ // Code for java.lang.StringCoding::countPositives intrinsic. ++ void count_positives(Register src, Register len, Register result, ++ Register tmp1, Register tmp2); ++ ++ // Code for java.lang.StringUTF16::compress intrinsic. ++ void char_array_compress(Register src, Register dst, ++ Register len, Register result, ++ Register tmp1, Register tmp2, Register tmp3, ++ FloatRegister vtemp1, FloatRegister vtemp2, ++ FloatRegister vtemp3, FloatRegister vtemp4); ++ ++ // Code for java.lang.StringLatin1::inflate intrinsic. ++ void byte_array_inflate(Register src, Register dst, Register len, ++ Register tmp1, Register tmp2, ++ FloatRegister vtemp1, FloatRegister vtemp2); ++ ++ // Encode UTF16 to ISO_8859_1 or ASCII. ++ // Return len on success or position of first mismatch. 
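++  // (Presumably the same contract as the shared encodeISOArray intrinsic:
++  //  with ascii set only chars below 0x80 are accepted, otherwise anything
++  //  that fits in one ISO-8859-1 byte; returning the mismatch position lets
++  //  the caller fall back to a slower path.)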
++ void encode_iso_array(Register src, Register dst, ++ Register len, Register result, ++ Register tmp1, Register tmp2, ++ Register tmp3, bool ascii, ++ FloatRegister vtemp1, FloatRegister vtemp2, ++ FloatRegister vtemp3, FloatRegister vtemp4); ++ ++ // Code for java.math.BigInteger::mulAdd intrinsic. ++ void mul_add(Register out, Register in, Register offset, ++ Register len, Register k); ++ ++ void movoop(Register dst, jobject obj); ++ ++ // Helpers for the array_fill() macro ++ inline void tiny_fill_0_24(Register to, Register value); ++ ++ // Inner part of the generate_fill() stub ++ inline void array_fill(BasicType t, Register to, ++ Register value, Register count, bool aligned); ++ inline void array_fill_lsx(BasicType t, Register to, ++ Register value, Register count); ++ inline void array_fill_lasx(BasicType t, Register to, ++ Register value, Register count); ++ ++#undef VIRTUAL ++ ++ void cast_primitive_type(BasicType type, Register reg) { ++ switch (type) { ++ case T_BOOLEAN: c2bool(reg); break; ++ case T_CHAR : bstrpick_d(reg, reg, 15, 0); break; ++ case T_BYTE : sign_extend_byte (reg); break; ++ case T_SHORT : sign_extend_short(reg); break; ++ case T_INT : add_w(reg, reg, R0); break; ++ case T_LONG : /* nothing to do */ break; ++ case T_VOID : /* nothing to do */ break; ++ case T_FLOAT : /* nothing to do */ break; ++ case T_DOUBLE : /* nothing to do */ break; ++ default: ShouldNotReachHere(); ++ } ++ } ++ ++ void lightweight_lock(Register obj, Register hdr, Register flag, Register tmp, Label& slow); ++ void lightweight_unlock(Register obj, Register hdr, Register flag, Register tmp, Label& slow); ++ ++#if INCLUDE_ZGC ++ void patchable_li16(Register rd, uint16_t value); ++ void z_color(Register dst, Register src, Register tmp); ++ void z_uncolor(Register ref); ++ void check_color(Register ref, Register tmp, bool on_non_strong); ++#endif ++ ++private: ++ void push(unsigned int bitset); ++ void pop(unsigned int bitset); ++ void push_fpu(unsigned int bitset); ++ void pop_fpu(unsigned int bitset); ++ void push_vp(unsigned int bitset); ++ void pop_vp(unsigned int bitset); ++ ++ // Check the current thread doesn't need a cross modify fence. ++ void verify_cross_modify_fence_not_required() PRODUCT_RETURN; ++ void generate_kernel_sin(FloatRegister x, bool iyIsOne, address dsin_coef); ++ void generate_kernel_cos(FloatRegister x, address dcos_coef); ++ void generate__ieee754_rem_pio2(address npio2_hw, address two_over_pi, address pio2); ++ void generate__kernel_rem_pio2(address two_over_pi, address pio2); ++}; ++ ++/** ++ * class SkipIfEqual: ++ * ++ * Instantiating this class will result in assembly code being output that will ++ * jump around any code emitted between the creation of the instance and it's ++ * automatic destruction at the end of a scope block, depending on the value of ++ * the flag passed to the constructor, which will be checked at run-time. 
++ */ ++class SkipIfEqual { ++private: ++ MacroAssembler* _masm; ++ Label _label; ++ ++public: ++ inline SkipIfEqual(MacroAssembler* masm, const bool* flag_addr, bool value) ++ : _masm(masm) { ++ _masm->li(AT, (address)flag_addr); ++ _masm->ld_b(AT, AT, 0); ++ if (value) { ++ _masm->bne(AT, R0, _label); ++ } else { ++ _masm->beq(AT, R0, _label); ++ } ++ } ++ ++ ~SkipIfEqual(); ++}; ++ ++#ifdef ASSERT ++inline bool AbstractAssembler::pd_check_instruction_mark() { return true; } ++#endif ++ ++struct tableswitch { ++ Register _reg; ++ int _insn_index; jint _first_key; jint _last_key; ++ Label _after; ++ Label _branches; ++}; ++ ++#endif // CPU_LOONGARCH_MACROASSEMBLER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/macroAssembler_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/macroAssembler_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/macroAssembler_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/macroAssembler_loongarch.inline.hpp 2024-02-20 10:42:36.158863448 +0800 +@@ -0,0 +1,937 @@ ++/* ++ * Copyright (c) 1997, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2017, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_MACROASSEMBLER_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_MACROASSEMBLER_LOONGARCH_INLINE_HPP ++ ++#include "asm/assembler.inline.hpp" ++#include "asm/macroAssembler.hpp" ++#include "asm/codeBuffer.hpp" ++#include "code/codeCache.hpp" ++ ++inline void MacroAssembler::tiny_fill_0_24(Register to, Register value) { ++ // 0: ++ jr(RA); ++ nop(); ++ nop(); ++ nop(); ++ ++ // 1: ++ st_b(value, to, 0); ++ jr(RA); ++ nop(); ++ nop(); ++ ++ // 2: ++ st_h(value, to, 0); ++ jr(RA); ++ nop(); ++ nop(); ++ ++ // 3: ++ st_h(value, to, 0); ++ st_b(value, to, 2); ++ jr(RA); ++ nop(); ++ ++ // 4: ++ st_w(value, to, 0); ++ jr(RA); ++ nop(); ++ nop(); ++ ++ // 5: ++ st_w(value, to, 0); ++ st_b(value, to, 4); ++ jr(RA); ++ nop(); ++ ++ // 6: ++ st_w(value, to, 0); ++ st_h(value, to, 4); ++ jr(RA); ++ nop(); ++ ++ // 7: ++ st_w(value, to, 0); ++ st_w(value, to, 3); ++ jr(RA); ++ nop(); ++ ++ // 8: ++ st_d(value, to, 0); ++ jr(RA); ++ nop(); ++ nop(); ++ ++ // 9: ++ st_d(value, to, 0); ++ st_b(value, to, 8); ++ jr(RA); ++ nop(); ++ ++ // 10: ++ st_d(value, to, 0); ++ st_h(value, to, 8); ++ jr(RA); ++ nop(); ++ ++ // 11: ++ st_d(value, to, 0); ++ st_w(value, to, 7); ++ jr(RA); ++ nop(); ++ ++ // 12: ++ st_d(value, to, 0); ++ st_w(value, to, 8); ++ jr(RA); ++ nop(); ++ ++ // 13: ++ st_d(value, to, 0); ++ st_d(value, to, 5); ++ jr(RA); ++ nop(); ++ ++ // 14: ++ st_d(value, to, 0); ++ st_d(value, to, 6); ++ jr(RA); ++ nop(); ++ ++ // 15: ++ st_d(value, to, 0); ++ st_d(value, to, 7); ++ jr(RA); ++ nop(); ++ ++ // 16: ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ jr(RA); ++ nop(); ++ ++ // 17: ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ st_b(value, to, 16); ++ jr(RA); ++ ++ // 18: ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ st_h(value, to, 16); ++ jr(RA); ++ ++ // 19: ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ st_w(value, to, 15); ++ jr(RA); ++ ++ // 20: ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ st_w(value, to, 16); ++ jr(RA); ++ ++ // 21: ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ st_d(value, to, 13); ++ jr(RA); ++ ++ // 22: ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ st_d(value, to, 14); ++ jr(RA); ++ ++ // 23: ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ st_d(value, to, 15); ++ jr(RA); ++ ++ // 24: ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ st_d(value, to, 16); ++ jr(RA); ++} ++ ++inline void MacroAssembler::array_fill(BasicType t, Register to, ++ Register value, Register count, ++ bool aligned) { ++ assert_different_registers(to, value, count, SCR1); ++ ++ Label L_small; ++ ++ int shift = -1; ++ switch (t) { ++ case T_BYTE: ++ shift = 0; ++ slti(SCR1, count, 25); ++ bstrins_d(value, value, 15, 8); // 8 bit -> 16 bit ++ bstrins_d(value, value, 31, 16); // 16 bit -> 32 bit ++ bstrins_d(value, value, 63, 32); // 32 bit -> 64 bit ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ add_d(count, to, count); ++ break; ++ case T_SHORT: ++ shift = 1; ++ slti(SCR1, count, 13); ++ bstrins_d(value, value, 31, 16); // 16 bit -> 32 bit ++ bstrins_d(value, value, 63, 32); // 32 bit -> 64 bit ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ alsl_d(count, count, to, shift - 1); ++ break; ++ case T_INT: ++ shift = 2; ++ slti(SCR1, count, 7); ++ bstrins_d(value, value, 63, 32); // 32 bit -> 64 bit ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ alsl_d(count, count, to, shift - 1); ++ break; ++ case T_LONG: ++ shift = 3; ++ slti(SCR1, count, 4); ++ bnez(SCR1, L_small); ++ // count denotes the end, in 
bytes ++ alsl_d(count, count, to, shift - 1); ++ break; ++ default: ShouldNotReachHere(); ++ } ++ ++ // nature aligned for store ++ if (!aligned) { ++ st_d(value, to, 0); ++ bstrins_d(to, R0, 2, 0); ++ addi_d(to, to, 8); ++ } ++ ++ // fill large chunks ++ Label L_loop64, L_lt64, L_lt32, L_lt16, L_lt8; ++ ++ addi_d(SCR1, count, -64); ++ blt(SCR1, to, L_lt64); ++ ++ bind(L_loop64); ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ st_d(value, to, 16); ++ st_d(value, to, 24); ++ st_d(value, to, 32); ++ st_d(value, to, 40); ++ st_d(value, to, 48); ++ st_d(value, to, 56); ++ addi_d(to, to, 64); ++ bge(SCR1, to, L_loop64); ++ ++ bind(L_lt64); ++ addi_d(SCR1, count, -32); ++ blt(SCR1, to, L_lt32); ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ st_d(value, to, 16); ++ st_d(value, to, 24); ++ addi_d(to, to, 32); ++ ++ bind(L_lt32); ++ addi_d(SCR1, count, -16); ++ blt(SCR1, to, L_lt16); ++ st_d(value, to, 0); ++ st_d(value, to, 8); ++ addi_d(to, to, 16); ++ ++ bind(L_lt16); ++ addi_d(SCR1, count, -8); ++ blt(SCR1, to, L_lt8); ++ st_d(value, to, 0); ++ ++ bind(L_lt8); ++ st_d(value, count, -8); ++ ++ jr(RA); ++ ++ // Short arrays (<= 24 bytes) ++ bind(L_small); ++ pcaddi(SCR1, 4); ++ slli_d(count, count, 4 + shift); ++ add_d(SCR1, SCR1, count); ++ jr(SCR1); ++ ++ tiny_fill_0_24(to, value); ++} ++ ++inline void MacroAssembler::array_fill_lsx(BasicType t, Register to, ++ Register value, Register count) { ++ assert(UseLSX, "should be"); ++ assert_different_registers(to, value, count, SCR1); ++ ++ Label L_small; ++ ++ int shift = -1; ++ switch (t) { ++ case T_BYTE: ++ shift = 0; ++ slti(SCR1, count, 49); ++ vreplgr2vr_b(fscratch, value); // 8 bit -> 128 bit ++ movfr2gr_d(value, fscratch); ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ add_d(count, to, count); ++ break; ++ case T_SHORT: ++ shift = 1; ++ slti(SCR1, count, 25); ++ vreplgr2vr_h(fscratch, value); // 16 bit -> 128 bit ++ movfr2gr_d(value, fscratch); ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ alsl_d(count, count, to, shift - 1); ++ break; ++ case T_INT: ++ shift = 2; ++ slti(SCR1, count, 13); ++ vreplgr2vr_w(fscratch, value); // 32 bit -> 128 bit ++ movfr2gr_d(value, fscratch); ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ alsl_d(count, count, to, shift - 1); ++ break; ++ case T_LONG: ++ shift = 3; ++ slti(SCR1, count, 7); ++ vreplgr2vr_d(fscratch, value); // 64 bit -> 128 bit ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ alsl_d(count, count, to, shift - 1); ++ break; ++ default: ShouldNotReachHere(); ++ } ++ ++ // nature aligned for store ++ vst(fscratch, to, 0); ++ bstrins_d(to, R0, 3, 0); ++ addi_d(to, to, 16); ++ ++ // fill large chunks ++ Label L_loop128, L_lt128, L_lt64, L_lt32, L_lt16; ++ ++ addi_d(SCR1, count, -128); ++ blt(SCR1, to, L_lt128); ++ ++ bind(L_loop128); ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ vst(fscratch, to, 32); ++ vst(fscratch, to, 48); ++ vst(fscratch, to, 64); ++ vst(fscratch, to, 80); ++ vst(fscratch, to, 96); ++ vst(fscratch, to, 112); ++ addi_d(to, to, 128); ++ bge(SCR1, to, L_loop128); ++ ++ bind(L_lt128); ++ addi_d(SCR1, count, -64); ++ blt(SCR1, to, L_lt64); ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ vst(fscratch, to, 32); ++ vst(fscratch, to, 48); ++ addi_d(to, to, 64); ++ ++ bind(L_lt64); ++ addi_d(SCR1, count, -32); ++ blt(SCR1, to, L_lt32); ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ addi_d(to, to, 32); ++ ++ bind(L_lt32); ++ addi_d(SCR1, count, -16); ++ blt(SCR1, to, L_lt16); ++ vst(fscratch, 
to, 0); ++ ++ bind(L_lt16); ++ vst(fscratch, count, -16); ++ ++ jr(RA); ++ ++ // Short arrays (<= 48 bytes) ++ bind(L_small); ++ pcaddi(SCR1, 4); ++ slli_d(count, count, 4 + shift); ++ add_d(SCR1, SCR1, count); ++ jr(SCR1); ++ ++ tiny_fill_0_24(to, value); ++ ++ // 25: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 9); ++ jr(RA); ++ nop(); ++ ++ // 26: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 10); ++ jr(RA); ++ nop(); ++ ++ // 27: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 11); ++ jr(RA); ++ nop(); ++ ++ // 28: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 12); ++ jr(RA); ++ nop(); ++ ++ // 29: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 13); ++ jr(RA); ++ nop(); ++ ++ // 30: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 14); ++ jr(RA); ++ nop(); ++ ++ // 31: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 15); ++ jr(RA); ++ nop(); ++ ++ // 32: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ jr(RA); ++ nop(); ++ ++ // 33: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ st_b(value, to, 32); ++ jr(RA); ++ ++ // 34: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ st_h(value, to, 32); ++ jr(RA); ++ ++ // 35: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ st_w(value, to, 31); ++ jr(RA); ++ ++ // 36: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ st_w(value, to, 32); ++ jr(RA); ++ ++ // 37: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ st_d(value, to, 29); ++ jr(RA); ++ ++ // 38: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ st_d(value, to, 30); ++ jr(RA); ++ ++ // 39: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ st_d(value, to, 31); ++ jr(RA); ++ ++ // 40: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ st_d(value, to, 32); ++ jr(RA); ++ ++ // 41: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ vst(fscratch, to, 25); ++ jr(RA); ++ ++ // 42: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ vst(fscratch, to, 26); ++ jr(RA); ++ ++ // 43: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ vst(fscratch, to, 27); ++ jr(RA); ++ ++ // 44: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ vst(fscratch, to, 28); ++ jr(RA); ++ ++ // 45: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ vst(fscratch, to, 29); ++ jr(RA); ++ ++ // 46: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ vst(fscratch, to, 30); ++ jr(RA); ++ ++ // 47: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ vst(fscratch, to, 31); ++ jr(RA); ++ ++ // 48: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 16); ++ vst(fscratch, to, 32); ++ jr(RA); ++} ++ ++inline void MacroAssembler::array_fill_lasx(BasicType t, Register to, ++ Register value, Register count) { ++ assert(UseLASX, "should be"); ++ assert_different_registers(to, value, count, SCR1); ++ ++ Label L_small; ++ ++ int shift = -1; ++ switch (t) { ++ case T_BYTE: ++ shift = 0; ++ slti(SCR1, count, 73); ++ xvreplgr2vr_b(fscratch, value); // 8 bit -> 256 bit ++ movfr2gr_d(value, fscratch); ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ add_d(count, to, count); ++ break; ++ case T_SHORT: ++ shift = 1; ++ slti(SCR1, count, 37); ++ xvreplgr2vr_h(fscratch, value); // 16 bit -> 256 bit ++ movfr2gr_d(value, fscratch); ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ alsl_d(count, count, to, shift - 1); ++ break; ++ case T_INT: ++ shift = 2; ++ slti(SCR1, count, 19); ++ xvreplgr2vr_w(fscratch, value); // 32 bit -> 256 bit ++ movfr2gr_d(value, fscratch); ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ alsl_d(count, count, to, shift - 1); ++ break; 
++ case T_LONG: ++ shift = 3; ++ slti(SCR1, count, 10); ++ xvreplgr2vr_d(fscratch, value); // 64 bit -> 256 bit ++ bnez(SCR1, L_small); ++ // count denotes the end, in bytes ++ alsl_d(count, count, to, shift - 1); ++ break; ++ default: ShouldNotReachHere(); ++ } ++ ++ // nature aligned for store ++ xvst(fscratch, to, 0); ++ bstrins_d(to, R0, 4, 0); ++ addi_d(to, to, 32); ++ ++ // fill large chunks ++ Label L_loop256, L_lt256, L_lt128, L_lt64, L_lt32; ++ ++ addi_d(SCR1, count, -256); ++ blt(SCR1, to, L_lt256); ++ ++ bind(L_loop256); ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ xvst(fscratch, to, 64); ++ xvst(fscratch, to, 96); ++ xvst(fscratch, to, 128); ++ xvst(fscratch, to, 160); ++ xvst(fscratch, to, 192); ++ xvst(fscratch, to, 224); ++ addi_d(to, to, 256); ++ bge(SCR1, to, L_loop256); ++ ++ bind(L_lt256); ++ addi_d(SCR1, count, -128); ++ blt(SCR1, to, L_lt128); ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ xvst(fscratch, to, 64); ++ xvst(fscratch, to, 96); ++ addi_d(to, to, 128); ++ ++ bind(L_lt128); ++ addi_d(SCR1, count, -64); ++ blt(SCR1, to, L_lt64); ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ addi_d(to, to, 64); ++ ++ bind(L_lt64); ++ addi_d(SCR1, count, -32); ++ blt(SCR1, to, L_lt32); ++ xvst(fscratch, to, 0); ++ ++ bind(L_lt32); ++ xvst(fscratch, count, -32); ++ ++ jr(RA); ++ ++ // Short arrays (<= 72 bytes) ++ bind(L_small); ++ pcaddi(SCR1, 4); ++ slli_d(count, count, 4 + shift); ++ add_d(SCR1, SCR1, count); ++ jr(SCR1); ++ ++ tiny_fill_0_24(to, value); ++ ++ // 25: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 9); ++ jr(RA); ++ nop(); ++ ++ // 26: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 10); ++ jr(RA); ++ nop(); ++ ++ // 27: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 11); ++ jr(RA); ++ nop(); ++ ++ // 28: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 12); ++ jr(RA); ++ nop(); ++ ++ // 29: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 13); ++ jr(RA); ++ nop(); ++ ++ // 30: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 14); ++ jr(RA); ++ nop(); ++ ++ // 31: ++ vst(fscratch, to, 0); ++ vst(fscratch, to, 15); ++ jr(RA); ++ nop(); ++ ++ // 32: ++ xvst(fscratch, to, 0); ++ jr(RA); ++ nop(); ++ nop(); ++ ++ // 33: ++ xvst(fscratch, to, 0); ++ st_b(value, to, 32); ++ jr(RA); ++ nop(); ++ ++ // 34: ++ xvst(fscratch, to, 0); ++ st_h(value, to, 32); ++ jr(RA); ++ nop(); ++ ++ // 35: ++ xvst(fscratch, to, 0); ++ st_w(value, to, 31); ++ jr(RA); ++ nop(); ++ ++ // 36: ++ xvst(fscratch, to, 0); ++ st_w(value, to, 32); ++ jr(RA); ++ nop(); ++ ++ // 37: ++ xvst(fscratch, to, 0); ++ st_d(value, to, 29); ++ jr(RA); ++ nop(); ++ ++ // 38: ++ xvst(fscratch, to, 0); ++ st_d(value, to, 30); ++ jr(RA); ++ nop(); ++ ++ // 39: ++ xvst(fscratch, to, 0); ++ st_d(value, to, 31); ++ jr(RA); ++ nop(); ++ ++ // 40: ++ xvst(fscratch, to, 0); ++ st_d(value, to, 32); ++ jr(RA); ++ nop(); ++ ++ // 41: ++ xvst(fscratch, to, 0); ++ vst(fscratch, to, 25); ++ jr(RA); ++ nop(); ++ ++ // 42: ++ xvst(fscratch, to, 0); ++ vst(fscratch, to, 26); ++ jr(RA); ++ nop(); ++ ++ // 43: ++ xvst(fscratch, to, 0); ++ vst(fscratch, to, 27); ++ jr(RA); ++ nop(); ++ ++ // 44: ++ xvst(fscratch, to, 0); ++ vst(fscratch, to, 28); ++ jr(RA); ++ nop(); ++ ++ // 45: ++ xvst(fscratch, to, 0); ++ vst(fscratch, to, 29); ++ jr(RA); ++ nop(); ++ ++ // 46: ++ xvst(fscratch, to, 0); ++ vst(fscratch, to, 30); ++ jr(RA); ++ nop(); ++ ++ // 47: ++ xvst(fscratch, to, 0); ++ vst(fscratch, to, 31); ++ jr(RA); ++ nop(); ++ ++ // 48: ++ xvst(fscratch, to, 0); ++ vst(fscratch, to, 32); ++ jr(RA); ++ nop(); ++ ++ // 
49: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 17); ++ jr(RA); ++ nop(); ++ ++ // 50: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 18); ++ jr(RA); ++ nop(); ++ ++ // 51: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 19); ++ jr(RA); ++ nop(); ++ ++ // 52: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 20); ++ jr(RA); ++ nop(); ++ ++ // 53: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 21); ++ jr(RA); ++ nop(); ++ ++ // 54: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 22); ++ jr(RA); ++ nop(); ++ ++ // 55: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 23); ++ jr(RA); ++ nop(); ++ ++ // 56: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 24); ++ jr(RA); ++ nop(); ++ ++ // 57: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 25); ++ jr(RA); ++ nop(); ++ ++ // 58: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 26); ++ jr(RA); ++ nop(); ++ ++ // 59: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 27); ++ jr(RA); ++ nop(); ++ ++ // 60: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 28); ++ jr(RA); ++ nop(); ++ ++ // 61: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 29); ++ jr(RA); ++ nop(); ++ ++ // 62: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 30); ++ jr(RA); ++ nop(); ++ ++ // 63: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 31); ++ jr(RA); ++ nop(); ++ ++ // 64: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ jr(RA); ++ nop(); ++ ++ // 65: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ st_b(value, to, 64); ++ jr(RA); ++ ++ // 66: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ st_h(value, to, 64); ++ jr(RA); ++ ++ // 67: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ st_w(value, to, 63); ++ jr(RA); ++ ++ // 68: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ st_w(value, to, 64); ++ jr(RA); ++ ++ // 69: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ st_d(value, to, 61); ++ jr(RA); ++ ++ // 70: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ st_d(value, to, 62); ++ jr(RA); ++ ++ // 71: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ st_d(value, to, 63); ++ jr(RA); ++ ++ // 72: ++ xvst(fscratch, to, 0); ++ xvst(fscratch, to, 32); ++ st_d(value, to, 64); ++ jr(RA); ++} ++ ++#endif // CPU_LOONGARCH_MACROASSEMBLER_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/macroAssembler_loongarch_trig.cpp b/src/hotspot/cpu/loongarch/macroAssembler_loongarch_trig.cpp +--- a/src/hotspot/cpu/loongarch/macroAssembler_loongarch_trig.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/macroAssembler_loongarch_trig.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,1625 @@ ++/* Copyright (c) 2018, 2020, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, Cavium. All rights reserved. (By BELLSOFT) ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/assembler.hpp" ++#include "asm/assembler.inline.hpp" ++#include "macroAssembler_loongarch.hpp" ++ ++// The following code is a optimized version of fdlibm sin/cos implementation ++// (C code is in share/runtime/sharedRuntimeTrig.cpp) adapted for LoongArch64. ++ ++// Please refer to sin/cos approximation via polynomial and ++// trigonometric argument reduction techniques to the following literature: ++// ++// [1] Muller, Jean-Michel, Nicolas Brisebarre, Florent De Dinechin, ++// Claude-Pierre Jeannerod, Vincent Lefevre, Guillaume Melquiond, ++// Nathalie Revol, Damien Stehlé, and Serge Torres: ++// Handbook of floating-point arithmetic. ++// Springer Science & Business Media, 2009. ++// [2] K. C. Ng ++// Argument Reduction for Huge Arguments: Good to the Last Bit ++// July 13, 1992, SunPro ++// ++// HOW TO READ THIS CODE: ++// This code consists of several functions. Each function has following header: ++// 1) Description ++// 2) C-pseudo code with differences from fdlibm marked by comments starting ++// with "NOTE". Check unmodified fdlibm code in ++// share/runtime/SharedRuntimeTrig.cpp ++// 3) Brief textual description of changes between fdlibm and current ++// implementation along with optimization notes (if applicable) ++// 4) Assumptions, input and output ++// 5) (Optional) additional notes about intrinsic implementation ++// Each function is separated in blocks which follow the pseudo-code structure ++// ++// HIGH-LEVEL ALGORITHM DESCRIPTION: ++// - entry point: generate_dsin_dcos(...); ++// - check corner cases: NaN, INF, tiny argument. ++// - check if |x| < Pi/4. Then approximate sin/cos via polynomial (kernel_sin/kernel_cos) ++// -- else proceed to argument reduction routine (__ieee754_rem_pio2) and ++// use reduced argument to get result via kernel_sin/kernel_cos ++// ++// HIGH-LEVEL CHANGES BETWEEN INTRINSICS AND FDLIBM: ++// 1) two_over_pi table fdlibm representation is int[], while intrinsic version ++// has these int values converted to double representation to load converted ++// double values directly (see stubRoutines_aarch4::_two_over_pi) ++// 2) Several loops are unrolled and vectorized: see comments in code after ++// labels: SKIP_F_LOAD, RECOMP_FOR1_CHECK, RECOMP_FOR2 ++// 3) fdlibm npio2_hw table now has "prefix" with constants used in ++// calculation. These constants are loaded from npio2_hw table instead of ++// constructing it in code (see stubRoutines_loongarch64.cpp) ++// 4) Polynomial coefficients for sin and cos are moved to table sin_coef ++// and cos_coef to use the same optimization as in 3). It allows to load most of ++// required constants via single instruction ++// ++// ++// ++///* __ieee754_rem_pio2(x,y) ++// * ++// * returns the remainder of x rem pi/2 in y[0]+y[1] (i.e. 
like x div pi/2) ++// * x is input argument, y[] is hi and low parts of reduced argument (x) ++// * uses __kernel_rem_pio2() ++// */ ++// // use tables(see stubRoutines_loongarch64.cpp): two_over_pi and modified npio2_hw ++// ++// BEGIN __ieee754_rem_pio2 PSEUDO CODE ++// ++//static int __ieee754_rem_pio2(double x, double *y) { ++// double z,w,t,r,fn; ++// double tx[3]; ++// int e0,i,j,nx,n,ix,hx,i0; ++// ++// i0 = ((*(int*)&two24A)>>30)^1; /* high word index */ ++// hx = *(i0+(int*)&x); /* high word of x */ ++// ix = hx&0x7fffffff; ++// if(ix<0x4002d97c) { /* |x| < 3pi/4, special case with n=+-1 */ ++// if(hx>0) { ++// z = x - pio2_1; ++// if(ix!=0x3ff921fb) { /* 33+53 bit pi is good enough */ ++// y[0] = z - pio2_1t; ++// y[1] = (z-y[0])-pio2_1t; ++// } else { /* near pi/2, use 33+33+53 bit pi */ ++// z -= pio2_2; ++// y[0] = z - pio2_2t; ++// y[1] = (z-y[0])-pio2_2t; ++// } ++// return 1; ++// } else { /* negative x */ ++// z = x + pio2_1; ++// if(ix!=0x3ff921fb) { /* 33+53 bit pi is good enough */ ++// y[0] = z + pio2_1t; ++// y[1] = (z-y[0])+pio2_1t; ++// } else { /* near pi/2, use 33+33+53 bit pi */ ++// z += pio2_2; ++// y[0] = z + pio2_2t; ++// y[1] = (z-y[0])+pio2_2t; ++// } ++// return -1; ++// } ++// } ++// if(ix<=0x413921fb) { /* |x| ~<= 2^19*(pi/2), medium size */ ++// t = fabsd(x); ++// n = (int) (t*invpio2+half); ++// fn = (double)n; ++// r = t-fn*pio2_1; ++// w = fn*pio2_1t; /* 1st round good to 85 bit */ ++// // NOTE: y[0] = r-w; is moved from if/else below to be before "if" ++// y[0] = r-w; ++// if(n<32&&ix!=npio2_hw[n-1]) { ++// // y[0] = r-w; /* quick check no cancellation */ // NOTE: moved earlier ++// } else { ++// j = ix>>20; ++// // y[0] = r-w; // NOTE: moved earlier ++// i = j-(((*(i0+(int*)&y[0]))>>20)&0x7ff); ++// if(i>16) { /* 2nd iteration needed, good to 118 */ ++// t = r; ++// w = fn*pio2_2; ++// r = t-w; ++// w = fn*pio2_2t-((t-r)-w); ++// y[0] = r-w; ++// i = j-(((*(i0+(int*)&y[0]))>>20)&0x7ff); ++// if(i>49) { /* 3rd iteration need, 151 bits acc */ ++// t = r; /* will cover all possible cases */ ++// w = fn*pio2_3; ++// r = t-w; ++// w = fn*pio2_3t-((t-r)-w); ++// y[0] = r-w; ++// } ++// } ++// } ++// y[1] = (r-y[0])-w; ++// if(hx<0) {y[0] = -y[0]; y[1] = -y[1]; return -n;} ++// else return n; ++// } ++// /* ++// * all other (large) arguments ++// */ ++// // NOTE: this check is removed, because it was checked in dsin/dcos ++// // if(ix>=0x7ff00000) { /* x is inf or NaN */ ++// // y[0]=y[1]=x-x; return 0; ++// // } ++// /* set z = scalbn(|x|,ilogb(x)-23) */ ++// *(1-i0+(int*)&z) = *(1-i0+(int*)&x); ++// e0 = (ix>>20)-1046; /* e0 = ilogb(z)-23; */ ++// *(i0+(int*)&z) = ix - (e0<<20); ++// ++// // NOTE: "for" loop below in unrolled. See comments in asm code ++// for(i=0;i<2;i++) { ++// tx[i] = (double)((int)(z)); ++// z = (z-tx[i])*two24A; ++// } ++// ++// tx[2] = z; ++// nx = 3; ++// ++// // NOTE: while(tx[nx-1]==zeroA) nx--; is unrolled. See comments in asm code ++// while(tx[nx-1]==zeroA) nx--; /* skip zero term */ ++// ++// n = __kernel_rem_pio2(tx,y,e0,nx,2,two_over_pi); ++// if(hx<0) {y[0] = -y[0]; y[1] = -y[1]; return -n;} ++// return n; ++//} ++// ++// END __ieee754_rem_pio2 PSEUDO CODE ++// ++// Changes between fdlibm and intrinsic for __ieee754_rem_pio2: ++// 1. INF/NaN check for huge argument is removed in comparison with fdlibm ++// code, because this check is already done in dcos/dsin code ++// 2. Most constants are now loaded from table instead of direct initialization ++// 3. Two loops are unrolled ++// Assumptions: ++// 1. 
Assume |X| >= PI/4 ++// 2. Assume SCR1 = 0x3fe921fb00000000 (~ PI/4) ++// 3. Assume ix = A3 ++// Input and output: ++// 1. Input: X = A0 ++// 2. Return n in A2, y[0] == y0 == FA4, y[1] == y1 == FA5 ++// NOTE: general purpose register names match local variable names in C code ++// NOTE: fpu registers are actively reused. See comments in code about their usage ++void MacroAssembler::generate__ieee754_rem_pio2(address npio2_hw, address two_over_pi, address pio2) { ++ const int64_t PIO2_1t = 0x3DD0B4611A626331ULL; ++ const int64_t PIO2_2 = 0x3DD0B4611A600000ULL; ++ const int64_t PIO2_2t = 0x3BA3198A2E037073ULL; ++ Label X_IS_NEGATIVE, X_IS_MEDIUM_OR_LARGE, X_IS_POSITIVE_LONG_PI, LARGE_ELSE, ++ REDUCTION_DONE, X_IS_MEDIUM_BRANCH_DONE, X_IS_LARGE, NX_SET, ++ X_IS_NEGATIVE_LONG_PI; ++ Register X = A0, n = A2, ix = A3, jv = A4, tmp5 = A5, jx = A6, ++ tmp3 = A7, iqBase = T0, ih = T1, i = T2; ++ FloatRegister v0 = FA0, v1 = FA1, v2 = FA2, v3 = FA3, v4 = FA4, v5 = FA5, v6 = FA6, v7 = FA7, ++ vt = FT1, v24 = FT8, v26 = FT10, v27 = FT11, v28 = FT12, v29 = FT13, v31 = FT15; ++ ++ push2(S0, S1); ++ ++ // initializing constants first ++ li(SCR1, 0x3ff921fb54400000); // PIO2_1 ++ li(SCR2, 0x4002d97c); // 3*PI/4 high word ++ movgr2fr_d(v1, SCR1); // v1 = PIO2_1 ++ bge(ix, SCR2, X_IS_MEDIUM_OR_LARGE); ++ ++ block_comment("if(ix<0x4002d97c) {... /* |x| ~< 3pi/4 */ "); { ++ blt(X, R0, X_IS_NEGATIVE); ++ ++ block_comment("if(hx>0) {"); { ++ fsub_d(v2, v0, v1); // v2 = z = x - pio2_1 ++ srli_d(SCR1, SCR1, 32); ++ li(n, 1); ++ beq(ix, SCR1, X_IS_POSITIVE_LONG_PI); ++ ++ block_comment("case: hx > 0 && ix!=0x3ff921fb {"); { /* 33+53 bit pi is good enough */ ++ li(SCR2, PIO2_1t); ++ movgr2fr_d(v27, SCR2); ++ fsub_d(v4, v2, v27); // v4 = y[0] = z - pio2_1t; ++ fsub_d(v5, v2, v4); ++ fsub_d(v5, v5, v27); // v5 = y[1] = (z-y[0])-pio2_1t ++ b(REDUCTION_DONE); ++ } ++ ++ block_comment("case: hx > 0 &*& ix==0x3ff921fb {"); { /* near pi/2, use 33+33+53 bit pi */ ++ bind(X_IS_POSITIVE_LONG_PI); ++ li(SCR1, PIO2_2); ++ li(SCR2, PIO2_2t); ++ movgr2fr_d(v27, SCR1); ++ movgr2fr_d(v6, SCR2); ++ fsub_d(v2, v2, v27); // z-= pio2_2 ++ fsub_d(v4, v2, v6); // y[0] = z - pio2_2t ++ fsub_d(v5, v2, v4); ++ fsub_d(v5, v5, v6); // v5 = (z - y[0]) - pio2_2t ++ b(REDUCTION_DONE); ++ } ++ } ++ ++ block_comment("case: hx <= 0)"); { ++ bind(X_IS_NEGATIVE); ++ fadd_d(v2, v0, v1); // v2 = z = x + pio2_1 ++ srli_d(SCR1, SCR1, 32); ++ li(n, -1); ++ beq(ix, SCR1, X_IS_NEGATIVE_LONG_PI); ++ ++ block_comment("case: hx <= 0 && ix!=0x3ff921fb) {"); { /* 33+53 bit pi is good enough */ ++ li(SCR2, PIO2_1t); ++ movgr2fr_d(v27, SCR2); ++ fadd_d(v4, v2, v27); // v4 = y[0] = z + pio2_1t; ++ fsub_d(v5, v2, v4); ++ fadd_d(v5, v5, v27); // v5 = y[1] = (z-y[0]) + pio2_1t ++ b(REDUCTION_DONE); ++ } ++ ++ block_comment("case: hx <= 0 && ix==0x3ff921fb"); { /* near pi/2, use 33+33+53 bit pi */ ++ bind(X_IS_NEGATIVE_LONG_PI); ++ li(SCR1, PIO2_2); ++ li(SCR2, PIO2_2t); ++ movgr2fr_d(v27, SCR1); ++ movgr2fr_d(v6, SCR2); ++ fadd_d(v2, v2, v27); // z += pio2_2 ++ fadd_d(v4, v2, v6); // y[0] = z + pio2_2t ++ fsub_d(v5, v2, v4); ++ fadd_d(v5, v5, v6); // v5 = (z - y[0]) + pio2_2t ++ b(REDUCTION_DONE); ++ } ++ } ++ } ++ bind(X_IS_MEDIUM_OR_LARGE); ++ li(SCR1, 0x413921fb); ++ blt(SCR1, ix, X_IS_LARGE); // ix < = 0x413921fb ? 
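++  // Note: the |x| ~< 3pi/4 cases were fully reduced above with n = +/-1;
++  // falling through here means 3pi/4 ~<= |x| ~<= 2^19*(pi/2), which is
++  // handled by the medium-size path below, while larger arguments branch
++  // to X_IS_LARGE and take the full __kernel_rem_pio2 reduction.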
++ ++ block_comment("|x| ~<= 2^19*(pi/2), medium size"); { ++ li(ih, npio2_hw); ++ fld_d(v4, ih, 0); ++ fld_d(v5, ih, 8); ++ fld_d(v6, ih, 16); ++ fld_d(v7, ih, 24); ++ fabs_d(v31, v0); // v31 = t = |x| ++ addi_d(ih, ih, 64); ++ fmadd_d(v2, v31, v5, v4); // v2 = t * invpio2 + half (invpio2 = 53 bits of 2/pi, half = 0.5) ++ ftintrz_w_d(vt, v2); // n = (int) v2 ++ movfr2gr_s(n, vt); ++ vfrintrz_d(v2, v2); ++ fnmsub_d(v3, v2, v6, v31); // v3 = r = t - fn * pio2_1 ++ fmul_d(v26, v2, v7); // v26 = w = fn * pio2_1t ++ fsub_d(v4, v3, v26); // y[0] = r - w. Calculated before branch ++ li(SCR1, 32); ++ blt(SCR1, n, LARGE_ELSE); ++ addi_w(tmp5, n, -1); // tmp5 = n - 1 ++ alsl_d(tmp5, tmp5, ih, 2 - 1); ++ ld_w(jv, tmp5, 0); ++ bne(ix, jv, X_IS_MEDIUM_BRANCH_DONE); ++ ++ block_comment("else block for if(n<32&&ix!=npio2_hw[n-1])"); { ++ bind(LARGE_ELSE); ++ movfr2gr_d(jx, v4); ++ srli_d(tmp5, ix, 20); // j = ix >> 20 ++ slli_d(jx, jx, 1); ++ srli_d(tmp3, jx, 32 + 20 + 1); // r7 = j-(((*(i0+(int*)&y[0]))>>20)&0x7ff); ++ sub_d(tmp3, tmp5, tmp3); ++ ++ block_comment("if(i>16)"); { ++ li(SCR1, 16); ++ bge(SCR1, tmp3, X_IS_MEDIUM_BRANCH_DONE); ++ // i > 16. 2nd iteration needed ++ fld_d(v6, ih, -32); ++ fld_d(v7, ih, -24); ++ fmov_d(v28, v3); // t = r ++ fmul_d(v29, v2, v6); // w = v29 = fn * pio2_2 ++ fsub_d(v3, v28, v29); // r = t - w ++ fsub_d(v31, v28, v3); // v31 = (t - r) ++ fsub_d(v31, v29, v31); // v31 = w - (t - r) = - ((t - r) - w) ++ fmadd_d(v26, v2, v7, v31); // v26 = w = fn*pio2_2t - ((t - r) - w) ++ fsub_d(v4, v3, v26); // y[0] = r - w ++ movfr2gr_d(jx, v4); ++ slli_d(jx, jx, 1); ++ srli_d(tmp3, jx, 32 + 20 + 1); // r7 = j-(((*(i0+(int*)&y[0]))>>20)&0x7ff); ++ sub_d(tmp3, tmp5, tmp3); ++ ++ block_comment("if(i>49)"); { ++ li(SCR1, 49); ++ bge(SCR1, tmp3, X_IS_MEDIUM_BRANCH_DONE); ++ // 3rd iteration need, 151 bits acc ++ fld_d(v6, ih, -16); ++ fld_d(v7, ih, -8); ++ fmov_d(v28, v3); // save "r" ++ fmul_d(v29, v2, v6); // v29 = fn * pio2_3 ++ fsub_d(v3, v28, v29); // r = r - w ++ fsub_d(v31, v28, v3); // v31 = (t - r) ++ fsub_d(v31, v29, v31); // v31 = w - (t - r) = - ((t - r) - w) ++ fmadd_d(v26, v2, v7, v31); // v26 = w = fn*pio2_3t - ((t - r) - w) ++ fsub_d(v4, v3, v26); // y[0] = r - w ++ } ++ } ++ } ++ block_comment("medium x tail"); { ++ bind(X_IS_MEDIUM_BRANCH_DONE); ++ fsub_d(v5, v3, v4); // v5 = y[1] = (r - y[0]) ++ fsub_d(v5, v5, v26); // v5 = y[1] = (r - y[0]) - w ++ blt(R0, X, REDUCTION_DONE); ++ fneg_d(v4, v4); ++ sub_w(n, R0, n); ++ fneg_d(v5, v5); ++ b(REDUCTION_DONE); ++ } ++ } ++ ++ block_comment("all other (large) arguments"); { ++ bind(X_IS_LARGE); ++ srli_d(SCR1, ix, 20); // ix >> 20 ++ li(tmp5, 0x4170000000000000); ++ addi_w(SCR1, SCR1, -1046); // e0 ++ movgr2fr_d(v24, tmp5); // init two24A value ++ slli_w(jv, SCR1, 20); // ix - (e0<<20) ++ sub_w(jv, ix, jv); ++ slli_d(jv, jv, 32); ++ addi_w(SCR2, SCR1, -3); ++ bstrins_d(jv, X, 31, 0); // jv = z ++ li(i, 24); ++ movgr2fr_d(v26, jv); // v26 = z ++ ++ block_comment("unrolled for(i=0;i<2;i++) {tx[i] = (double)((int)(z));z = (z-tx[i])*two24A;}"); { ++ // tx[0,1,2] = v6,v7,v26 ++ vfrintrz_d(v6, v26); // v6 = (double)((int)v26) ++ div_w(jv, SCR2, i); // jv = (e0 - 3)/24 ++ fsub_d(v26, v26, v6); ++ addi_d(SP, SP, -560); ++ fmul_d(v26, v26, v24); ++ vfrintrz_d(v7, v26); // v7 = (double)((int)v26) ++ li(jx, 2); // calculate jx as nx - 1, which is initially 2. 
Not a part of unrolled loop ++ fsub_d(v26, v26, v7); ++ } ++ ++ block_comment("nx calculation with unrolled while(tx[nx-1]==zeroA) nx--;"); { ++ vxor_v(vt, vt, vt); ++ fcmp_cne_d(FCC0, v26, vt); // if NE then jx == 2. else it's 1 or 0 ++ addi_d(iqBase, SP, 480); // base of iq[] ++ fmul_d(v3, v26, v24); ++ bcnez(FCC0, NX_SET); ++ fcmp_cne_d(FCC0, v7, vt); // v7 == 0 => jx = 0. Else jx = 1 ++ if (UseCF2GR) { ++ movcf2gr(jx, FCC0); ++ } else { ++ movcf2fr(vt, FCC0); ++ movfr2gr_s(jx, vt); ++ } ++ } ++ bind(NX_SET); ++ generate__kernel_rem_pio2(two_over_pi, pio2); ++ // now we have y[0] = v4, y[1] = v5 and n = r2 ++ bge(X, R0, REDUCTION_DONE); ++ fneg_d(v4, v4); ++ fneg_d(v5, v5); ++ sub_w(n, R0, n); ++ } ++ bind(REDUCTION_DONE); ++ ++ pop2(S0, S1); ++} ++ ++///* ++// * __kernel_rem_pio2(x,y,e0,nx,prec,ipio2) ++// * double x[],y[]; int e0,nx,prec; int ipio2[]; ++// * ++// * __kernel_rem_pio2 return the last three digits of N with ++// * y = x - N*pi/2 ++// * so that |y| < pi/2. ++// * ++// * The method is to compute the integer (mod 8) and fraction parts of ++// * (2/pi)*x without doing the full multiplication. In general we ++// * skip the part of the product that are known to be a huge integer ( ++// * more accurately, = 0 mod 8 ). Thus the number of operations are ++// * independent of the exponent of the input. ++// * ++// * NOTE: 2/pi int representation is converted to double ++// * // (2/pi) is represented by an array of 24-bit integers in ipio2[]. ++// * ++// * Input parameters: ++// * x[] The input value (must be positive) is broken into nx ++// * pieces of 24-bit integers in double precision format. ++// * x[i] will be the i-th 24 bit of x. The scaled exponent ++// * of x[0] is given in input parameter e0 (i.e., x[0]*2^e0 ++// * match x's up to 24 bits. ++// * ++// * Example of breaking a double positive z into x[0]+x[1]+x[2]: ++// * e0 = ilogb(z)-23 ++// * z = scalbn(z,-e0) ++// * for i = 0,1,2 ++// * x[i] = floor(z) ++// * z = (z-x[i])*2**24 ++// * ++// * ++// * y[] output result in an array of double precision numbers. ++// * The dimension of y[] is: ++// * 24-bit precision 1 ++// * 53-bit precision 2 ++// * 64-bit precision 2 ++// * 113-bit precision 3 ++// * The actual value is the sum of them. Thus for 113-bit ++// * precsion, one may have to do something like: ++// * ++// * long double t,w,r_head, r_tail; ++// * t = (long double)y[2] + (long double)y[1]; ++// * w = (long double)y[0]; ++// * r_head = t+w; ++// * r_tail = w - (r_head - t); ++// * ++// * e0 The exponent of x[0] ++// * ++// * nx dimension of x[] ++// * ++// * prec an integer indicating the precision: ++// * 0 24 bits (single) ++// * 1 53 bits (double) ++// * 2 64 bits (extended) ++// * 3 113 bits (quad) ++// * ++// * NOTE: ipio2[] array below is converted to double representation ++// * //ipio2[] ++// * // integer array, contains the (24*i)-th to (24*i+23)-th ++// * // bit of 2/pi after binary point. The corresponding ++// * // floating value is ++// * ++// * ipio2[i] * 2^(-24(i+1)). ++// * ++// * Here is the description of some local variables: ++// * ++// * jk jk+1 is the initial number of terms of ipio2[] needed ++// * in the computation. The recommended value is 2,3,4, ++// * 6 for single, double, extended,and quad. ++// * ++// * jz local integer variable indicating the number of ++// * terms of ipio2[] used. ++// * ++// * jx nx - 1 ++// * ++// * jv index for pointing to the suitable ipio2[] for the ++// * computation. In general, we want ++// * ( 2^e0*x[0] * ipio2[jv-1]*2^(-24jv) )/8 ++// * is an integer. 
Thus ++// * e0-3-24*jv >= 0 or (e0-3)/24 >= jv ++// * Hence jv = max(0,(e0-3)/24). ++// * ++// * jp jp+1 is the number of terms in PIo2[] needed, jp = jk. ++// * ++// * q[] double array with integral value, representing the ++// * 24-bits chunk of the product of x and 2/pi. ++// * ++// * q0 the corresponding exponent of q[0]. Note that the ++// * exponent for q[i] would be q0-24*i. ++// * ++// * PIo2[] double precision array, obtained by cutting pi/2 ++// * into 24 bits chunks. ++// * ++// * f[] ipio2[] in floating point ++// * ++// * iq[] integer array by breaking up q[] in 24-bits chunk. ++// * ++// * fq[] final product of x*(2/pi) in fq[0],..,fq[jk] ++// * ++// * ih integer. If >0 it indicates q[] is >= 0.5, hence ++// * it also indicates the *sign* of the result. ++// * ++// */ ++// ++// Use PIo2 table(see stubRoutines_loongarch64.cpp) ++// ++// BEGIN __kernel_rem_pio2 PSEUDO CODE ++// ++//static int __kernel_rem_pio2(double *x, double *y, int e0, int nx, int prec, /* NOTE: converted to double */ const double *ipio2 // const int *ipio2) { ++// int jz,jx,jv,jp,jk,carry,n,iq[20],i,j,k,m,q0,ih; ++// double z,fw,f[20],fq[20],q[20]; ++// ++// /* initialize jk*/ ++// // jk = init_jk[prec]; // NOTE: prec==2 for double. jk is always 4. ++// jp = jk; // NOTE: always 4 ++// ++// /* determine jx,jv,q0, note that 3>q0 */ ++// jx = nx-1; ++// jv = (e0-3)/24; if(jv<0) jv=0; ++// q0 = e0-24*(jv+1); ++// ++// /* set up f[0] to f[jx+jk] where f[jx+jk] = ipio2[jv+jk] */ ++// j = jv-jx; m = jx+jk; ++// ++// // NOTE: split into two for-loops: one with zeroB and one with ipio2[j]. It ++// // allows the use of wider loads/stores ++// for(i=0;i<=m;i++,j++) f[i] = (j<0)? zeroB : /* NOTE: converted to double */ ipio2[j]; //(double) ipio2[j]; ++// ++// // NOTE: unrolled and vectorized "for". See comments in asm code ++// /* compute q[0],q[1],...q[jk] */ ++// for (i=0;i<=jk;i++) { ++// for(j=0,fw=0.0;j<=jx;j++) fw += x[j]*f[jx+i-j]; q[i] = fw; ++// } ++// ++// jz = jk; ++//recompute: ++// /* distill q[] into iq[] reversingly */ ++// for(i=0,j=jz,z=q[jz];j>0;i++,j--) { ++// fw = (double)((int)(twon24* z)); ++// iq[i] = (int)(z-two24B*fw); ++// z = q[j-1]+fw; ++// } ++// ++// /* compute n */ ++// z = scalbnA(z,q0); /* actual value of z */ ++// z -= 8.0*floor(z*0.125); /* trim off integer >= 8 */ ++// n = (int) z; ++// z -= (double)n; ++// ih = 0; ++// if(q0>0) { /* need iq[jz-1] to determine n */ ++// i = (iq[jz-1]>>(24-q0)); n += i; ++// iq[jz-1] -= i<<(24-q0); ++// ih = iq[jz-1]>>(23-q0); ++// } ++// else if(q0==0) ih = iq[jz-1]>>23; ++// else if(z>=0.5) ih=2; ++// ++// if(ih>0) { /* q > 0.5 */ ++// n += 1; carry = 0; ++// for(i=0;i0) { /* rare case: chance is 1 in 12 */ ++// switch(q0) { ++// case 1: ++// iq[jz-1] &= 0x7fffff; break; ++// case 2: ++// iq[jz-1] &= 0x3fffff; break; ++// } ++// } ++// if(ih==2) { ++// z = one - z; ++// if(carry!=0) z -= scalbnA(one,q0); ++// } ++// } ++// ++// /* check if recomputation is needed */ ++// if(z==zeroB) { ++// j = 0; ++// for (i=jz-1;i>=jk;i--) j |= iq[i]; ++// if(j==0) { /* need recomputation */ ++// for(k=1;iq[jk-k]==0;k++); /* k = no. 
of terms needed */ ++// ++// for(i=jz+1;i<=jz+k;i++) { /* add q[jz+1] to q[jz+k] */ ++// f[jx+i] = /* NOTE: converted to double */ ipio2[jv+i]; //(double) ipio2[jv+i]; ++// for(j=0,fw=0.0;j<=jx;j++) fw += x[j]*f[jx+i-j]; ++// q[i] = fw; ++// } ++// jz += k; ++// goto recompute; ++// } ++// } ++// ++// /* chop off zero terms */ ++// if(z==0.0) { ++// jz -= 1; q0 -= 24; ++// while(iq[jz]==0) { jz--; q0-=24;} ++// } else { /* break z into 24-bit if necessary */ ++// z = scalbnA(z,-q0); ++// if(z>=two24B) { ++// fw = (double)((int)(twon24*z)); ++// iq[jz] = (int)(z-two24B*fw); ++// jz += 1; q0 += 24; ++// iq[jz] = (int) fw; ++// } else iq[jz] = (int) z ; ++// } ++// ++// /* convert integer "bit" chunk to floating-point value */ ++// fw = scalbnA(one,q0); ++// for(i=jz;i>=0;i--) { ++// q[i] = fw*(double)iq[i]; fw*=twon24; ++// } ++// ++// /* compute PIo2[0,...,jp]*q[jz,...,0] */ ++// for(i=jz;i>=0;i--) { ++// for(fw=0.0,k=0;k<=jp&&k<=jz-i;k++) fw += PIo2[k]*q[i+k]; ++// fq[jz-i] = fw; ++// } ++// ++// // NOTE: switch below is eliminated, because prec is always 2 for doubles ++// /* compress fq[] into y[] */ ++// //switch(prec) { ++// //case 0: ++// // fw = 0.0; ++// // for (i=jz;i>=0;i--) fw += fq[i]; ++// // y[0] = (ih==0)? fw: -fw; ++// // break; ++// //case 1: ++// //case 2: ++// fw = 0.0; ++// for (i=jz;i>=0;i--) fw += fq[i]; ++// y[0] = (ih==0)? fw: -fw; ++// fw = fq[0]-fw; ++// for (i=1;i<=jz;i++) fw += fq[i]; ++// y[1] = (ih==0)? fw: -fw; ++// // break; ++// //case 3: /* painful */ ++// // for (i=jz;i>0;i--) { ++// // fw = fq[i-1]+fq[i]; ++// // fq[i] += fq[i-1]-fw; ++// // fq[i-1] = fw; ++// // } ++// // for (i=jz;i>1;i--) { ++// // fw = fq[i-1]+fq[i]; ++// // fq[i] += fq[i-1]-fw; ++// // fq[i-1] = fw; ++// // } ++// // for (fw=0.0,i=jz;i>=2;i--) fw += fq[i]; ++// // if(ih==0) { ++// // y[0] = fq[0]; y[1] = fq[1]; y[2] = fw; ++// // } else { ++// // y[0] = -fq[0]; y[1] = -fq[1]; y[2] = -fw; ++// // } ++// //} ++// return n&7; ++//} ++// ++// END __kernel_rem_pio2 PSEUDO CODE ++// ++// Changes between fdlibm and intrinsic: ++// 1. One loop is unrolled and vectorized (see comments in code) ++// 2. One loop is split into 2 loops (see comments in code) ++// 3. Non-double code is removed(last switch). Several variables became ++// constants because of that (see comments in code) ++// 4. Use of jx, which is nx-1 instead of nx ++// Assumptions: ++// 1. Assume |X| >= PI/4 ++// Input and output: ++// 1. Input: X = A0, jx == nx - 1 == A6, e0 == SCR1 ++// 2. Return n in A2, y[0] == y0 == FA4, y[1] == y1 == FA5 ++// NOTE: general purpose register names match local variable names in C code ++// NOTE: fpu registers are actively reused. 
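++//
++// For reference only, a minimal standalone sketch (plain C, not part of the
++// generated stub; the helper name is illustrative) of the input decomposition
++// described above: |z| is split into a scaled exponent e0 and 24-bit "digits"
++// x[0..2] held in doubles, so that x[0]*2^e0 matches z up to 24 bits.
++//
++//   #include <math.h>
++//   static void split_into_24bit_chunks(double z, double x[3], int *e0) {
++//     *e0 = ilogb(z) - 23;          // scaled exponent, as in the comment above
++//     z = scalbn(z, -*e0);          // now 2^23 <= z < 2^24
++//     for (int i = 0; i < 3; i++) {
++//       x[i] = floor(z);            // i-th 24-bit piece, stored as a double
++//       z = (z - x[i]) * 0x1p24;    // shift the remainder up by 24 bits
++//     }
++//   }
++//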
See comments in code about their usage ++void MacroAssembler::generate__kernel_rem_pio2(address two_over_pi, address pio2) { ++ Label Q_DONE, JX_IS_0, JX_IS_2, COMP_INNER_LOOP, RECOMP_FOR2, Q0_ZERO_CMP_LT, ++ RECOMP_CHECK_DONE_NOT_ZERO, Q0_ZERO_CMP_DONE, COMP_FOR, Q0_ZERO_CMP_EQ, ++ INIT_F_ZERO, RECOMPUTE, IH_FOR_INCREMENT, IH_FOR_STORE, RECOMP_CHECK_DONE, ++ Z_IS_LESS_THAN_TWO24B, Z_IS_ZERO, FW_Y1_NO_NEGATION, ++ RECOMP_FW_UPDATED, Z_ZERO_CHECK_DONE, FW_FOR1, IH_AFTER_SWITCH, IH_HANDLED, ++ CONVERTION_FOR, FW_Y0_NO_NEGATION, FW_FOR1_DONE, FW_FOR2, FW_FOR2_DONE, ++ IH_FOR, SKIP_F_LOAD, RECOMP_FOR1, RECOMP_FIRST_FOR, INIT_F_COPY, ++ RECOMP_FOR1_CHECK; ++ Register tmp2 = A1, n = A2, jv = A4, tmp5 = A5, jx = A6, ++ tmp3 = A7, iqBase = T0, ih = T1, i = T2, tmp1 = T3, ++ jz = S0, j = T5, twoOverPiBase = T6, tmp4 = S1, qBase = T8; ++ FloatRegister v0 = FA0, v1 = FA1, v2 = FA2, v3 = FA3, v4 = FA4, v5 = FA5, v6 = FA6, v7 = FA7, ++ vt = FT1, v17 = FT2, v18 = FT3, v19 = FT4, v20 = FT5, v21 = FT6, v22 = FT7, v24 = FT8, ++ v25 = FT9, v26 = FT10, v27 = FT11, v28 = FT12, v29 = FT13, v30 = FT14, v31 = FT15; ++ // jp = jk == init_jk[prec] = init_jk[2] == {2,3,4,6}[2] == 4 ++ // jx = nx - 1 ++ li(twoOverPiBase, two_over_pi); ++ slti(SCR2, jv, 0); ++ addi_w(tmp4, jx, 4); // tmp4 = m = jx + jk = jx + 4. jx is in {0,1,2} so m is in [4,5,6] ++ masknez(jv, jv, SCR2); ++ if (UseLASX) ++ xvxor_v(v26, v26, v26); ++ else ++ vxor_v(v26, v26, v26); ++ addi_w(tmp5, jv, 1); // jv+1 ++ sub_w(j, jv, jx); ++ addi_d(qBase, SP, 320); // base of q[] ++ mul_w(SCR2, i, tmp5); // q0 = e0-24*(jv+1) ++ sub_w(SCR1, SCR1, SCR2); ++ // use double f[20], fq[20], q[20], iq[20] on stack, which is ++ // (20 + 20 + 20) x 8 + 20 x 4 = 560 bytes. From lower to upper addresses it ++ // will contain f[20], fq[20], q[20], iq[20] ++ // now initialize f[20] indexes 0..m (inclusive) ++ // for(i=0;i<=m;i++,j++) f[i] = (j<0)? zeroB : /* NOTE: converted to double */ ipio2[j]; // (double) ipio2[j]; ++ move(tmp5, SP); ++ ++ block_comment("for(i=0;i<=m;i++,j++) f[i] = (j<0)? zeroB : /* NOTE: converted to double */ ipio2[j]; // (double) ipio2[j];"); { ++ xorr(i, i, i); ++ bge(j, R0, INIT_F_COPY); ++ bind(INIT_F_ZERO); ++ if (UseLASX) { ++ xvst(v26, tmp5, 0); ++ } else { ++ vst(v26, tmp5, 0); ++ vst(v26, tmp5, 16); ++ } ++ addi_d(tmp5, tmp5, 32); ++ addi_w(i, i, 4); ++ addi_w(j, j, 4); ++ blt(j, R0, INIT_F_ZERO); ++ sub_w(i, i, j); ++ move(j, R0); ++ bind(INIT_F_COPY); ++ alsl_d(tmp1, j, twoOverPiBase, 3 - 1); // ipio2[j] start address ++ if (UseLASX) { ++ xvld(v18, tmp1, 0); ++ xvld(v19, tmp1, 32); ++ } else { ++ vld(v18, tmp1, 0); ++ vld(v19, tmp1, 16); ++ vld(v20, tmp1, 32); ++ vld(v21, tmp1, 48); ++ } ++ alsl_d(tmp5, i, SP, 3 - 1); ++ if (UseLASX) { ++ xvst(v18, tmp5, 0); ++ xvst(v19, tmp5, 32); ++ } else { ++ vst(v18, tmp5, 0); ++ vst(v19, tmp5, 16); ++ vst(v20, tmp5, 32); ++ vst(v21, tmp5, 48); ++ } ++ } ++ // v18..v21 can actually contain f[0..7] ++ beqz(i, SKIP_F_LOAD); // i == 0 => f[i] == f[0] => already loaded ++ if (UseLASX) { ++ xvld(v18, SP, 0); // load f[0..7] ++ xvld(v19, SP, 32); ++ } else { ++ vld(v18, SP, 0); // load f[0..7] ++ vld(v19, SP, 16); ++ vld(v20, SP, 32); ++ vld(v21, SP, 48); ++ } ++ bind(SKIP_F_LOAD); ++ // calculate 2^q0 and 2^-q0, which we'll need further. ++ // q0 is exponent. 
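++ // For reference, the bit trick used here as a minimal C sketch (illustrative
++ // helper, not generated code): a power of two 2^e is materialized by writing
++ // the biased exponent e+1023 into bits 62..52 of an IEEE-754 double, which is
++ // what the addi_w/slli_d/movgr2fr_d sequence below does in registers.
++ //
++ //   #include <stdint.h>
++ //   #include <string.h>
++ //   static double pow2_double(int e) {             // assumes -1022 <= e <= 1023
++ //     uint64_t bits = (uint64_t)(e + 1023) << 52;  // exponent field only
++ //     double d;
++ //     memcpy(&d, &bits, sizeof d);                 // reinterpret, like movgr2fr_d
++ //     return d;                                    // == 2^e
++ //   }
++ //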
So, calculate biased exponent(q0+1023) ++ sub_w(tmp4, R0, SCR1); ++ addi_w(tmp5, SCR1, 1023); ++ addi_w(tmp4, tmp4, 1023); ++ // Unroll following for(s) depending on jx in [0,1,2] ++ // for (i=0;i<=jk;i++) { ++ // for(j=0,fw=0.0;j<=jx;j++) fw += x[j]*f[jx+i-j]; q[i] = fw; ++ // } ++ // Unrolling for jx == 0 case: ++ // q[0] = x[0] * f[0] ++ // q[1] = x[0] * f[1] ++ // q[2] = x[0] * f[2] ++ // q[3] = x[0] * f[3] ++ // q[4] = x[0] * f[4] ++ // ++ // Vectorization for unrolled jx == 0 case: ++ // {q[0], q[1]} = {f[0], f[1]} * x[0] ++ // {q[2], q[3]} = {f[2], f[3]} * x[0] ++ // q[4] = f[4] * x[0] ++ // ++ // Unrolling for jx == 1 case: ++ // q[0] = x[0] * f[1] + x[1] * f[0] ++ // q[1] = x[0] * f[2] + x[1] * f[1] ++ // q[2] = x[0] * f[3] + x[1] * f[2] ++ // q[3] = x[0] * f[4] + x[1] * f[3] ++ // q[4] = x[0] * f[5] + x[1] * f[4] ++ // ++ // Vectorization for unrolled jx == 1 case: ++ // {q[0], q[1]} = {f[0], f[1]} * x[1] ++ // {q[2], q[3]} = {f[2], f[3]} * x[1] ++ // q[4] = f[4] * x[1] ++ // {q[0], q[1]} += {f[1], f[2]} * x[0] ++ // {q[2], q[3]} += {f[3], f[4]} * x[0] ++ // q[4] += f[5] * x[0] ++ // ++ // Unrolling for jx == 2 case: ++ // q[0] = x[0] * f[2] + x[1] * f[1] + x[2] * f[0] ++ // q[1] = x[0] * f[3] + x[1] * f[2] + x[2] * f[1] ++ // q[2] = x[0] * f[4] + x[1] * f[3] + x[2] * f[2] ++ // q[3] = x[0] * f[5] + x[1] * f[4] + x[2] * f[3] ++ // q[4] = x[0] * f[6] + x[1] * f[5] + x[2] * f[4] ++ // ++ // Vectorization for unrolled jx == 2 case: ++ // {q[0], q[1]} = {f[0], f[1]} * x[2] ++ // {q[2], q[3]} = {f[2], f[3]} * x[2] ++ // q[4] = f[4] * x[2] ++ // {q[0], q[1]} += {f[1], f[2]} * x[1] ++ // {q[2], q[3]} += {f[3], f[4]} * x[1] ++ // q[4] += f[5] * x[1] ++ // {q[0], q[1]} += {f[2], f[3]} * x[0] ++ // {q[2], q[3]} += {f[4], f[5]} * x[0] ++ // q[4] += f[6] * x[0] ++ block_comment("unrolled and vectorized computation of q[0]..q[jk]"); { ++ li(SCR2, 1); ++ slli_d(tmp5, tmp5, 52); // now it's 2^q0 double value ++ slli_d(tmp4, tmp4, 52); // now it's 2^-q0 double value ++ if (UseLASX) ++ xvpermi_d(v6, v6, 0); ++ else ++ vreplvei_d(v6, v6, 0); ++ blt(jx, SCR2, JX_IS_0); ++ addi_d(i, SP, 8); ++ if (UseLASX) { ++ xvld(v26, i, 0); // load f[1..4] ++ xvpermi_d(v3, v3, 0); ++ xvpermi_d(v7, v7, 0); ++ xvpermi_d(v20, v19, 85); ++ xvpermi_d(v21, v19, 170); ++ } else { ++ vld(v26, i, 0); // load f[1..4] ++ vld(v27, i, 16); ++ vreplvei_d(v3, v3, 0); ++ vreplvei_d(v7, v7, 0); ++ vreplvei_d(vt, v20, 1); ++ vreplvei_d(v21, v21, 0); ++ } ++ blt(SCR2, jx, JX_IS_2); ++ // jx == 1 ++ if (UseLASX) { ++ xvfmul_d(v28, v18, v7); // f[0,3] * x[1] ++ fmul_d(v30, v19, v7); // f[4] * x[1] ++ xvfmadd_d(v28, v26, v6, v28); ++ fmadd_d(v30, v6, v20, v30); // v30 += f[5] * x[0] ++ } else { ++ vfmul_d(v28, v18, v7); // f[0,1] * x[1] ++ vfmul_d(v29, v19, v7); // f[2,3] * x[1] ++ fmul_d(v30, v20, v7); // f[4] * x[1] ++ vfmadd_d(v28, v26, v6, v28); ++ vfmadd_d(v29, v27, v6, v29); ++ fmadd_d(v30, v6, vt, v30); // v30 += f[5] * x[0] ++ } ++ b(Q_DONE); ++ bind(JX_IS_2); ++ if (UseLASX) { ++ xvfmul_d(v28, v18, v3); // f[0,3] * x[2] ++ fmul_d(v30, v19, v3); // f[4] * x[2] ++ xvfmadd_d(v28, v26, v7, v28); ++ fmadd_d(v30, v7, v20, v30); // v30 += f[5] * x[1] ++ xvpermi_q(v18, v19, 3); ++ xvfmadd_d(v28, v18, v6, v28); ++ } else { ++ vfmul_d(v28, v18, v3); // f[0,1] * x[2] ++ vfmul_d(v29, v19, v3); // f[2,3] * x[2] ++ fmul_d(v30, v20, v3); // f[4] * x[2] ++ vfmadd_d(v28, v26, v7, v28); ++ vfmadd_d(v29, v27, v7, v29); ++ fmadd_d(v30, v7, vt, v30); // v30 += f[5] * x[1] ++ vfmadd_d(v28, v19, v6, v28); ++ vfmadd_d(v29, v20, v6, v29); ++ } ++ 
fmadd_d(v30, v6, v21, v30); // v30 += f[6] * x[0] ++ b(Q_DONE); ++ bind(JX_IS_0); ++ if (UseLASX) { ++ xvfmul_d(v28, v18, v6); // f[0,1] * x[0] ++ fmul_d(v30, v19, v6); // f[4] * x[0] ++ } else { ++ vfmul_d(v28, v18, v6); // f[0,1] * x[0] ++ vfmul_d(v29, v19, v6); // f[2,3] * x[0] ++ fmul_d(v30, v20, v6); // f[4] * x[0] ++ } ++ bind(Q_DONE); ++ if (UseLASX) { ++ xvst(v28, qBase, 0); // save calculated q[0]...q[jk] ++ } else { ++ vst(v28, qBase, 0); // save calculated q[0]...q[jk] ++ vst(v29, qBase, 16); ++ } ++ fst_d(v30, qBase, 32); ++ } ++ li(i, 0x3E70000000000000); ++ li(jz, 4); ++ movgr2fr_d(v17, i); // v17 = twon24 ++ movgr2fr_d(v30, tmp5); // 2^q0 ++ vldi(v21, -960); // 0.125 (0x3fc0000000000000) ++ vldi(v20, -992); // 8.0 (0x4020000000000000) ++ movgr2fr_d(v22, tmp4); // 2^-q0 ++ ++ block_comment("recompute loop"); { ++ bind(RECOMPUTE); ++ // for(i=0,j=jz,z=q[jz];j>0;i++,j--) { ++ // fw = (double)((int)(twon24* z)); ++ // iq[i] = (int)(z-two24A*fw); ++ // z = q[j-1]+fw; ++ // } ++ block_comment("distill q[] into iq[] reversingly"); { ++ xorr(i, i, i); ++ move(j, jz); ++ alsl_d(tmp2, jz, qBase, 3 - 1); // q[jz] address ++ fld_d(v18, tmp2, 0); // z = q[j] and moving address to q[j-1] ++ addi_d(tmp2, tmp2, -8); ++ bind(RECOMP_FIRST_FOR); ++ fld_d(v27, tmp2, 0); ++ addi_d(tmp2, tmp2, -8); ++ fmul_d(v29, v17, v18); // twon24*z ++ vfrintrz_d(v29, v29); // (double)(int) ++ fnmsub_d(v28, v24, v29, v18); // v28 = z-two24A*fw ++ ftintrz_w_d(vt, v28); // (int)(z-two24A*fw) ++ alsl_d(SCR2, i, iqBase, 2 - 1); ++ fst_s(vt, SCR2, 0); ++ fadd_d(v18, v27, v29); ++ addi_w(i, i, 1); ++ addi_w(j, j, -1); ++ blt(R0, j, RECOMP_FIRST_FOR); ++ } ++ // compute n ++ fmul_d(v18, v18, v30); ++ fmul_d(v2, v18, v21); ++ vfrintrm_d(v2, v2); // v2 = floor(v2) == rounding towards -inf ++ fnmsub_d(v18, v2, v20, v18); // z -= 8.0*floor(z*0.125); ++ li(ih, 2); ++ vfrintrz_d(v2, v18); // v2 = (double)((int)z) ++ ftintrz_w_d(vt, v18); // n = (int) z; ++ movfr2gr_s(n, vt); ++ fsub_d(v18, v18, v2); // z -= (double)n; ++ ++ block_comment("q0-dependent initialization"); { ++ blt(SCR1, R0, Q0_ZERO_CMP_LT); // if (q0 > 0) ++ addi_w(j, jz, -1); // j = jz - 1 ++ alsl_d(SCR2, j, iqBase, 2 - 1); ++ ld_w(tmp2, SCR2, 0); // tmp2 = iq[jz-1] ++ beq(SCR1, R0, Q0_ZERO_CMP_EQ); ++ li(tmp4, 24); ++ sub_w(tmp4, tmp4, SCR1); // == 24 - q0 ++ srl_w(i, tmp2, tmp4); // i = iq[jz-1] >> (24-q0) ++ sll_w(tmp5, i, tmp4); ++ sub_w(tmp2, tmp2, tmp5); // iq[jz-1] -= i<<(24-q0); ++ alsl_d(SCR2, j, iqBase, 2 - 1); ++ st_w(tmp2, SCR2, 0); // store iq[jz-1] ++ addi_w(SCR2, tmp4, -1); // == 23 - q0 ++ add_w(n, n, i); // n+=i ++ srl_w(ih, tmp2, SCR2); // ih = iq[jz-1] >> (23-q0) ++ b(Q0_ZERO_CMP_DONE); ++ bind(Q0_ZERO_CMP_EQ); ++ srli_d(ih, tmp2, 23); // ih = iq[z-1] >> 23 ++ b(Q0_ZERO_CMP_DONE); ++ bind(Q0_ZERO_CMP_LT); ++ vldi(v4, -928); // 0.5 (0x3fe0000000000000) ++ fcmp_clt_d(FCC0, v18, v4); ++ if (UseCF2GR) { ++ movcf2gr(SCR2, FCC0); ++ } else { ++ movcf2fr(vt, FCC0); ++ movfr2gr_s(SCR2, vt); ++ } ++ masknez(ih, ih, SCR2); // if (z<0.5) ih = 0 ++ } ++ bind(Q0_ZERO_CMP_DONE); ++ bge(R0, ih, IH_HANDLED); ++ ++ block_comment("if(ih>) {"); { ++ // use rscratch2 as carry ++ ++ block_comment("for(i=0;i0) {"); { ++ bge(R0, SCR1, IH_AFTER_SWITCH); ++ // tmp3 still has iq[jz-1] value. 
no need to reload ++ // now, zero high tmp3 bits (rscratch1 number of bits) ++ li(j, 0xffffffff); ++ addi_w(i, jz, -1); // set i to jz-1 ++ srl_d(j, j, SCR1); ++ srli_w(tmp1, j, 8); ++ andr(tmp3, tmp3, tmp1); // we have 24-bit-based constants ++ alsl_d(tmp1, i, iqBase, 2 - 1); ++ st_w(tmp3, tmp1, 0); // save iq[jz-1] ++ } ++ bind(IH_AFTER_SWITCH); ++ li(tmp1, 2); ++ bne(ih, tmp1, IH_HANDLED); ++ ++ block_comment("if(ih==2) {"); { ++ vldi(v25, -912); // 1.0 (0x3ff0000000000000) ++ fsub_d(v18, v25, v18); // z = one - z; ++ beqz(SCR2, IH_HANDLED); ++ fsub_d(v18, v18, v30); // z -= scalbnA(one,q0); ++ } ++ } ++ bind(IH_HANDLED); ++ // check if recomputation is needed ++ vxor_v(vt, vt, vt); ++ fcmp_cne_d(FCC0, v18, vt); ++ bcnez(FCC0, RECOMP_CHECK_DONE_NOT_ZERO); ++ ++ block_comment("if(z==zeroB) {"); { ++ ++ block_comment("for (i=jz-1;i>=jk;i--) j |= iq[i];"); { ++ addi_w(i, jz, -1); ++ xorr(j, j, j); ++ b(RECOMP_FOR1_CHECK); ++ bind(RECOMP_FOR1); ++ alsl_d(tmp1, i, iqBase, 2 - 1); ++ ld_w(tmp1, tmp1, 0); ++ orr(j, j, tmp1); ++ addi_w(i, i, -1); ++ bind(RECOMP_FOR1_CHECK); ++ li(SCR2, 4); ++ bge(i, SCR2, RECOMP_FOR1); ++ } ++ bnez(j, RECOMP_CHECK_DONE); ++ ++ block_comment("if(j==0) {"); { ++ // for(k=1;iq[jk-k]==0;k++); // let's unroll it. jk == 4. So, read ++ // iq[3], iq[2], iq[1], iq[0] until non-zero value ++ ld_d(tmp1, iqBase, 0); // iq[0..3] ++ ld_d(tmp3, iqBase, 8); ++ li(j, 2); ++ masknez(tmp1, tmp1, tmp3); // set register for further consideration ++ orr(tmp1, tmp1, tmp3); ++ masknez(j, j, tmp3); // set initial k. Use j as k ++ srli_d(SCR2, tmp1, 32); ++ sltu(SCR2, R0, SCR2); ++ addi_w(i, jz, 1); ++ add_w(j, j, SCR2); ++ ++ block_comment("for(i=jz+1;i<=jz+k;i++) {...}"); { ++ add_w(jz, i, j); // i = jz+1, j = k-1. j+i = jz+k (which is a new jz) ++ bind(RECOMP_FOR2); ++ add_w(tmp1, jv, i); ++ alsl_d(SCR2, tmp1, twoOverPiBase, 3 - 1); ++ fld_d(v29, SCR2, 0); ++ add_w(tmp2, jx, i); ++ alsl_d(SCR2, tmp2, SP, 3 - 1); ++ fst_d(v29, SCR2, 0); ++ // f[jx+i] = /* NOTE: converted to double */ ipio2[jv+i]; //(double) ipio2[jv+i]; ++ // since jx = 0, 1 or 2 we can unroll it: ++ // for(j=0,fw=0.0;j<=jx;j++) fw += x[j]*f[jx+i-j]; ++ // f[jx+i-j] == (for first iteration) f[jx+i], which is already v29 ++ alsl_d(tmp2, tmp2, SP, 3 - 1); // address of f[jx+i] ++ fld_d(v4, tmp2, -16); // load f[jx+i-2] and f[jx+i-1] ++ fld_d(v5, tmp2, -8); ++ fmul_d(v26, v6, v29); // initial fw ++ beqz(jx, RECOMP_FW_UPDATED); ++ fmadd_d(v26, v7, v5, v26); ++ li(SCR2, 1); ++ beq(jx, SCR2, RECOMP_FW_UPDATED); ++ fmadd_d(v26, v3, v4, v26); ++ bind(RECOMP_FW_UPDATED); ++ alsl_d(SCR2, i, qBase, 3 - 1); ++ fst_d(v26, SCR2, 0); // q[i] = fw; ++ addi_w(i, i, 1); ++ bge(jz, i, RECOMP_FOR2); // jz here is "old jz" + k ++ } ++ b(RECOMPUTE); ++ } ++ } ++ } ++ bind(RECOMP_CHECK_DONE); ++ // chop off zero terms ++ vxor_v(vt, vt, vt); ++ fcmp_ceq_d(FCC0, v18, vt); ++ bcnez(FCC0, Z_IS_ZERO); ++ ++ block_comment("else block of if(z==0.0) {"); { ++ bind(RECOMP_CHECK_DONE_NOT_ZERO); ++ fmul_d(v18, v18, v22); ++ fcmp_clt_d(FCC0, v18, v24); // v24 is sltill two24A ++ bcnez(FCC0, Z_IS_LESS_THAN_TWO24B); ++ fmul_d(v1, v18, v17); // twon24*z ++ vfrintrz_d(v1, v1); // v1 = (double)(int)(v1) ++ fnmsub_d(v2, v24, v1, v18); ++ ftintrz_w_d(vt, v1); // (int)fw ++ movfr2gr_s(tmp3, vt); ++ ftintrz_w_d(vt, v2); // double to int ++ movfr2gr_s(tmp2, vt); ++ alsl_d(SCR2, jz, iqBase, 2 - 1); ++ st_w(tmp2, SCR2, 0); ++ addi_w(SCR1, SCR1, 24); ++ addi_w(jz, jz, 1); ++ st_w(tmp3, SCR2, 0); // iq[jz] = (int) fw ++ b(Z_ZERO_CHECK_DONE); ++ 
bind(Z_IS_LESS_THAN_TWO24B); ++ ftintrz_w_d(vt, v18); // (int)z ++ movfr2gr_s(tmp3, vt); ++ alsl_d(SCR2, jz, iqBase, 2 - 1); ++ st_w(tmp3, SCR2, 0); // iq[jz] = (int) z ++ b(Z_ZERO_CHECK_DONE); ++ } ++ ++ block_comment("if(z==0.0) {"); { ++ bind(Z_IS_ZERO); ++ addi_w(jz, jz, -1); ++ alsl_d(SCR2, jz, iqBase, 2 - 1); ++ ld_w(tmp1, SCR2, 0); ++ addi_w(SCR1, SCR1, -24); ++ beqz(tmp1, Z_IS_ZERO); ++ } ++ bind(Z_ZERO_CHECK_DONE); ++ // convert integer "bit" chunk to floating-point value ++ // v17 = twon24 ++ // update v30, which was scalbnA(1.0, ); ++ addi_w(tmp2, SCR1, 1023); // biased exponent ++ slli_d(tmp2, tmp2, 52); // put at correct position ++ move(i, jz); ++ movgr2fr_d(v30, tmp2); ++ ++ block_comment("for(i=jz;i>=0;i--) {q[i] = fw*(double)iq[i]; fw*=twon24;}"); { ++ bind(CONVERTION_FOR); ++ alsl_d(SCR2, i, iqBase, 2 - 1); ++ fld_s(v31, SCR2, 0); ++ vffintl_d_w(v31, v31); ++ fmul_d(v31, v31, v30); ++ alsl_d(SCR2, i, qBase, 3 - 1); ++ fst_d(v31, SCR2, 0); ++ fmul_d(v30, v30, v17); ++ addi_w(i, i, -1); ++ bge(i, R0, CONVERTION_FOR); ++ } ++ addi_d(SCR2, SP, 160); // base for fq ++ // reusing twoOverPiBase ++ li(twoOverPiBase, pio2); ++ ++ block_comment("compute PIo2[0,...,jp]*q[jz,...,0]. for(i=jz;i>=0;i--) {...}"); { ++ move(i, jz); ++ move(tmp2, R0); // tmp2 will keep jz - i == 0 at start ++ bind(COMP_FOR); ++ // for(fw=0.0,k=0;k<=jp&&k<=jz-i;k++) fw += PIo2[k]*q[i+k]; ++ vxor_v(v30, v30, v30); ++ alsl_d(tmp5, i, qBase, 3 - 1); // address of q[i+k] for k==0 ++ li(tmp3, 4); ++ slti(tmp4, tmp2, 5); ++ alsl_d(tmp1, i, qBase, 3 - 1); // used as q[i] address ++ masknez(tmp3, tmp3, tmp4); // min(jz - i, jp); ++ maskeqz(tmp4, tmp2, tmp4); ++ orr(tmp3, tmp3, tmp4); ++ move(tmp4, R0); // used as k ++ ++ block_comment("for(fw=0.0,k=0;k<=jp&&k<=jz-i;k++) fw += PIo2[k]*q[i+k];"); { ++ bind(COMP_INNER_LOOP); ++ alsl_d(tmp5, tmp4, tmp1, 3 - 1); ++ fld_d(v18, tmp5, 0); // q[i+k] ++ alsl_d(tmp5, tmp4, twoOverPiBase, 3 - 1); ++ fld_d(v19, tmp5, 0); // PIo2[k] ++ fmadd_d(v30, v18, v19, v30); // fw += PIo2[k]*q[i+k]; ++ addi_w(tmp4, tmp4, 1); // k++ ++ bge(tmp3, tmp4, COMP_INNER_LOOP); ++ } ++ alsl_d(tmp5, tmp2, SCR2, 3 - 1); ++ fst_d(v30, tmp5, 0); // fq[jz-i] ++ addi_d(tmp2, tmp2, 1); ++ addi_w(i, i, -1); ++ bge(i, R0, COMP_FOR); ++ } ++ ++ block_comment("switch(prec) {...}. case 2:"); { ++ // compress fq into y[] ++ // remember prec == 2 ++ ++ block_comment("for (i=jz;i>=0;i--) fw += fq[i];"); { ++ vxor_v(v4, v4, v4); ++ move(i, jz); ++ bind(FW_FOR1); ++ alsl_d(tmp5, i, SCR2, 3 - 1); ++ fld_d(v1, tmp5, 0); ++ addi_w(i, i, -1); ++ fadd_d(v4, v4, v1); ++ bge(i, R0, FW_FOR1); ++ } ++ bind(FW_FOR1_DONE); ++ // v1 contains fq[0]. so, keep it so far ++ fsub_d(v5, v1, v4); // fw = fq[0] - fw ++ beqz(ih, FW_Y0_NO_NEGATION); ++ fneg_d(v4, v4); ++ bind(FW_Y0_NO_NEGATION); ++ ++ block_comment("for (i=1;i<=jz;i++) fw += fq[i];"); { ++ li(i, 1); ++ blt(jz, i, FW_FOR2_DONE); ++ bind(FW_FOR2); ++ alsl_d(tmp5, i, SCR2, 3 - 1); ++ fld_d(v1, tmp5, 0); ++ addi_w(i, i, 1); ++ fadd_d(v5, v5, v1); ++ bge(jz, i, FW_FOR2); ++ } ++ bind(FW_FOR2_DONE); ++ beqz(ih, FW_Y1_NO_NEGATION); ++ fneg_d(v5, v5); ++ bind(FW_Y1_NO_NEGATION); ++ addi_d(SP, SP, 560); ++ } ++} ++ ++///* __kernel_sin( x, y, iy) ++// * kernel sin function on [-pi/4, pi/4], pi/4 ~ 0.7854 ++// * Input x is assumed to be bounded by ~pi/4 in magnitude. ++// * Input y is the tail of x. ++// * Input iy indicates whether y is 0. (if iy=0, y assume to be 0). ++// * ++// * Algorithm ++// * 1. Since sin(-x) = -sin(x), we need only to consider positive x. ++// * 2. 
if x < 2^-27 (hx<0x3e400000 0), return x with inexact if x!=0. ++// * 3. sin(x) is approximated by a polynomial of degree 13 on ++// * [0,pi/4] ++// * 3 13 ++// * sin(x) ~ x + S1*x + ... + S6*x ++// * where ++// * ++// * |sin(x) 2 4 6 8 10 12 | -58 ++// * |----- - (1+S1*x +S2*x +S3*x +S4*x +S5*x +S6*x )| <= 2 ++// * | x | ++// * ++// * 4. sin(x+y) = sin(x) + sin'(x')*y ++// * ~ sin(x) + (1-x*x/2)*y ++// * For better accuracy, let ++// * 3 2 2 2 2 ++// * r = x *(S2+x *(S3+x *(S4+x *(S5+x *S6)))) ++// * then 3 2 ++// * sin(x) = x + (S1*x + (x *(r-y/2)+y)) ++// */ ++//static const double ++//S1 = -1.66666666666666324348e-01, /* 0xBFC55555, 0x55555549 */ ++//S2 = 8.33333333332248946124e-03, /* 0x3F811111, 0x1110F8A6 */ ++//S3 = -1.98412698298579493134e-04, /* 0xBF2A01A0, 0x19C161D5 */ ++//S4 = 2.75573137070700676789e-06, /* 0x3EC71DE3, 0x57B1FE7D */ ++//S5 = -2.50507602534068634195e-08, /* 0xBE5AE5E6, 0x8A2B9CEB */ ++//S6 = 1.58969099521155010221e-10; /* 0x3DE5D93A, 0x5ACFD57C */ ++// ++// NOTE: S1..S6 were moved into a table: StubRoutines::la::_dsin_coef ++// ++// BEGIN __kernel_sin PSEUDO CODE ++// ++//static double __kernel_sin(double x, double y, bool iy) ++//{ ++// double z,r,v; ++// ++// // NOTE: not needed. moved to dsin/dcos ++// //int ix; ++// //ix = high(x)&0x7fffffff; /* high word of x */ ++// ++// // NOTE: moved to dsin/dcos ++// //if(ix<0x3e400000) /* |x| < 2**-27 */ ++// // {if((int)x==0) return x;} /* generate inexact */ ++// ++// z = x*x; ++// v = z*x; ++// r = S2+z*(S3+z*(S4+z*(S5+z*S6))); ++// if(iy==0) return x+v*(S1+z*r); ++// else return x-((z*(half*y-v*r)-y)-v*S1); ++//} ++// ++// END __kernel_sin PSEUDO CODE ++// ++// Changes between fdlibm and intrinsic: ++// 1. Removed |x| < 2**-27 check, because if was done earlier in dsin/dcos ++// 2. Constants are now loaded from table dsin_coef ++// 3. C code parameter "int iy" was modified to "bool iyIsOne", because ++// iy is always 0 or 1. Also, iyIsOne branch was moved into ++// generation phase instead of taking it during code execution ++// Input and output: ++// 1. Input for generated function: X argument = x ++// 2. Input for generator: x = register to read argument from, iyIsOne ++// = flag to use low argument low part or not, dsin_coef = coefficients ++// table address ++// 3. Return sin(x) value in FA0 ++void MacroAssembler::generate_kernel_sin(FloatRegister x, bool iyIsOne, address dsin_coef) { ++ FloatRegister y = FA5, z = FA6, v = FA7, r = FT0, s1 = FT1, s2 = FT2, ++ s3 = FT3, s4 = FT4, s5 = FT5, s6 = FT6, half = FT7; ++ li(SCR2, dsin_coef); ++ fld_d(s5, SCR2, 32); ++ fld_d(s6, SCR2, 40); ++ fmul_d(z, x, x); // z = x*x; ++ fld_d(s1, SCR2, 0); ++ fld_d(s2, SCR2, 8); ++ fld_d(s3, SCR2, 16); ++ fld_d(s4, SCR2, 24); ++ fmul_d(v, z, x); // v = z*x; ++ ++ block_comment("calculate r = S2+z*(S3+z*(S4+z*(S5+z*S6)))"); { ++ fmadd_d(r, z, s6, s5); ++ // initialize "half" in current block to utilize 2nd FPU. 
However, it's ++ // not a part of this block ++ vldi(half, -928); // 0.5 (0x3fe0000000000000) ++ fmadd_d(r, z, r, s4); ++ fmadd_d(r, z, r, s3); ++ fmadd_d(r, z, r, s2); ++ } ++ ++ if (!iyIsOne) { ++ // return x+v*(S1+z*r); ++ fmadd_d(s1, z, r, s1); ++ fmadd_d(FA0, v, s1, x); ++ } else { ++ // return x-((z*(half*y-v*r)-y)-v*S1); ++ fmul_d(s6, half, y); // half*y ++ fnmsub_d(s6, v, r, s6); // half*y-v*r ++ fnmsub_d(s6, z, s6, y); // y - z*(half*y-v*r) = - (z*(half*y-v*r)-y) ++ fmadd_d(s6, v, s1, s6); // - (z*(half*y-v*r)-y) + v*S1 == -((z*(half*y-v*r)-y)-v*S1) ++ fadd_d(FA0, x, s6); ++ } ++} ++ ++///* ++// * __kernel_cos( x, y ) ++// * kernel cos function on [-pi/4, pi/4], pi/4 ~ 0.785398164 ++// * Input x is assumed to be bounded by ~pi/4 in magnitude. ++// * Input y is the tail of x. ++// * ++// * Algorithm ++// * 1. Since cos(-x) = cos(x), we need only to consider positive x. ++// * 2. if x < 2^-27 (hx<0x3e400000 0), return 1 with inexact if x!=0. ++// * 3. cos(x) is approximated by a polynomial of degree 14 on ++// * [0,pi/4] ++// * 4 14 ++// * cos(x) ~ 1 - x*x/2 + C1*x + ... + C6*x ++// * where the remez error is ++// * ++// * | 2 4 6 8 10 12 14 | -58 ++// * |cos(x)-(1-.5*x +C1*x +C2*x +C3*x +C4*x +C5*x +C6*x )| <= 2 ++// * | | ++// * ++// * 4 6 8 10 12 14 ++// * 4. let r = C1*x +C2*x +C3*x +C4*x +C5*x +C6*x , then ++// * cos(x) = 1 - x*x/2 + r ++// * since cos(x+y) ~ cos(x) - sin(x)*y ++// * ~ cos(x) - x*y, ++// * a correction term is necessary in cos(x) and hence ++// * cos(x+y) = 1 - (x*x/2 - (r - x*y)) ++// * For better accuracy when x > 0.3, let qx = |x|/4 with ++// * the last 32 bits mask off, and if x > 0.78125, let qx = 0.28125. ++// * Then ++// * cos(x+y) = (1-qx) - ((x*x/2-qx) - (r-x*y)). ++// * Note that 1-qx and (x*x/2-qx) is EXACT here, and the ++// * magnitude of the latter is at least a quarter of x*x/2, ++// * thus, reducing the rounding error in the subtraction. ++// */ ++// ++//static const double ++//C1 = 4.16666666666666019037e-02, /* 0x3FA55555, 0x5555554C */ ++//C2 = -1.38888888888741095749e-03, /* 0xBF56C16C, 0x16C15177 */ ++//C3 = 2.48015872894767294178e-05, /* 0x3EFA01A0, 0x19CB1590 */ ++//C4 = -2.75573143513906633035e-07, /* 0xBE927E4F, 0x809C52AD */ ++//C5 = 2.08757232129817482790e-09, /* 0x3E21EE9E, 0xBDB4B1C4 */ ++//C6 = -1.13596475577881948265e-11; /* 0xBDA8FAE9, 0xBE8838D4 */ ++// ++// NOTE: C1..C6 were moved into a table: StubRoutines::la::_dcos_coef ++// ++// BEGIN __kernel_cos PSEUDO CODE ++// ++//static double __kernel_cos(double x, double y) ++//{ ++// double a,h,z,r,qx=0; ++// ++// // NOTE: ix is already initialized in dsin/dcos. Reuse value from register ++// //int ix; ++// //ix = high(x)&0x7fffffff; /* ix = |x|'s high word*/ ++// ++// // NOTE: moved to dsin/dcos ++// //if(ix<0x3e400000) { /* if x < 2**27 */ ++// // if(((int)x)==0) return one; /* generate inexact */ ++// //} ++// ++// z = x*x; ++// r = z*(C1+z*(C2+z*(C3+z*(C4+z*(C5+z*C6))))); ++// if(ix < 0x3FD33333) /* if |x| < 0.3 */ ++// return one - (0.5*z - (z*r - x*y)); ++// else { ++// if(ix > 0x3fe90000) { /* x > 0.78125 */ ++// qx = 0.28125; ++// } else { ++// set_high(&qx, ix-0x00200000); /* x/4 */ ++// set_low(&qx, 0); ++// } ++// h = 0.5*z-qx; ++// a = one-qx; ++// return a - (h - (z*r-x*y)); ++// } ++//} ++// ++// END __kernel_cos PSEUDO CODE ++// ++// Changes between fdlibm and intrinsic: ++// 1. Removed |x| < 2**-27 check, because if was done earlier in dsin/dcos ++// 2. Constants are now loaded from table dcos_coef ++// Input and output: ++// 1. 
Input for generated function: X argument = x ++// 2. Input for generator: x = register to read argument from, dcos_coef ++// = coefficients table address ++// 3. Return cos(x) value in FA0 ++void MacroAssembler::generate_kernel_cos(FloatRegister x, address dcos_coef) { ++ Register ix = A3; ++ FloatRegister qx = FA1, h = FA2, a = FA3, y = FA5, z = FA6, r = FA7, C1 = FT0, ++ C2 = FT1, C3 = FT2, C4 = FT3, C5 = FT4, C6 = FT5, one = FT6, half = FT7; ++ Label IX_IS_LARGE, SET_QX_CONST, DONE, QX_SET; ++ li(SCR2, dcos_coef); ++ fld_d(C1, SCR2, 0); ++ fld_d(C2, SCR2, 8); ++ fld_d(C3, SCR2, 16); ++ fld_d(C4, SCR2, 24); ++ fld_d(C5, SCR2, 32); ++ fld_d(C6, SCR2, 40); ++ fmul_d(z, x, x); // z=x^2 ++ block_comment("calculate r = z*(C1+z*(C2+z*(C3+z*(C4+z*(C5+z*C6)))))"); { ++ fmadd_d(r, z, C6, C5); ++ vldi(half, -928); // 0.5 (0x3fe0000000000000) ++ fmadd_d(r, z, r, C4); ++ fmul_d(y, x, y); ++ fmadd_d(r, z, r, C3); ++ li(SCR1, 0x3FD33333); ++ fmadd_d(r, z, r, C2); ++ fmul_d(x, z, z); // x = z^2 ++ fmadd_d(r, z, r, C1); // r = C1+z(C2+z(C4+z(C5+z*C6))) ++ } ++ // need to multiply r by z to have "final" r value ++ vldi(one, -912); // 1.0 (0x3ff0000000000000) ++ bge(ix, SCR1, IX_IS_LARGE); ++ block_comment("if(ix < 0x3FD33333) return one - (0.5*z - (z*r - x*y))"); { ++ // return 1.0 - (0.5*z - (z*r - x*y)) = 1.0 - (0.5*z + (x*y - z*r)) ++ fnmsub_d(FA0, x, r, y); ++ fmadd_d(FA0, half, z, FA0); ++ fsub_d(FA0, one, FA0); ++ b(DONE); ++ } ++ block_comment("if(ix >= 0x3FD33333)"); { ++ bind(IX_IS_LARGE); ++ li(SCR2, 0x3FE90000); ++ blt(SCR2, ix, SET_QX_CONST); ++ block_comment("set_high(&qx, ix-0x00200000); set_low(&qx, 0);"); { ++ li(SCR2, 0x00200000); ++ sub_w(SCR2, ix, SCR2); ++ slli_d(SCR2, SCR2, 32); ++ movgr2fr_d(qx, SCR2); ++ } ++ b(QX_SET); ++ bind(SET_QX_CONST); ++ block_comment("if(ix > 0x3fe90000) qx = 0.28125;"); { ++ vldi(qx, -942); // 0.28125 (0x3fd2000000000000) ++ } ++ bind(QX_SET); ++ fmsub_d(C6, x, r, y); // z*r - xy ++ fmsub_d(h, half, z, qx); // h = 0.5*z - qx ++ fsub_d(a, one, qx); // a = 1-qx ++ fsub_d(C6, h, C6); // = h - (z*r - x*y) ++ fsub_d(FA0, a, C6); ++ } ++ bind(DONE); ++} ++ ++// generate_dsin_dcos creates stub for dsin and dcos ++// Generation is done via single call because dsin and dcos code is almost the ++// same(see C code below). These functions work as follows: ++// 1) handle corner cases: |x| ~< pi/4, x is NaN or INF, |x| < 2**-27 ++// 2) perform argument reduction if required ++// 3) call kernel_sin or kernel_cos which approximate sin/cos via polynomial ++// ++// BEGIN dsin/dcos PSEUDO CODE ++// ++//dsin_dcos(jdouble x, bool isCos) { ++// double y[2],z=0.0; ++// int n, ix; ++// ++// /* High word of x. */ ++// ix = high(x); ++// ++// /* |x| ~< pi/4 */ ++// ix &= 0x7fffffff; ++// if(ix <= 0x3fe921fb) return isCos ? __kernel_cos : __kernel_sin(x,z,0); ++// ++// /* sin/cos(Inf or NaN) is NaN */ ++// else if (ix>=0x7ff00000) return x-x; ++// else if (ix<0x3e400000) { /* if ix < 2**27 */ ++// if(((int)x)==0) return isCos ? one : x; /* generate inexact */ ++// } ++// /* argument reduction needed */ ++// else { ++// n = __ieee754_rem_pio2(x,y); ++// switch(n&3) { ++// case 0: return isCos ? __kernel_cos(y[0],y[1]) : __kernel_sin(y[0],y[1], true); ++// case 1: return isCos ? -__kernel_sin(y[0],y[1],true) : __kernel_cos(y[0],y[1]); ++// case 2: return isCos ? -__kernel_cos(y[0],y[1]) : -__kernel_sin(y[0],y[1], true); ++// default: ++// return isCos ? 
__kernel_sin(y[0],y[1],1) : -__kernel_cos(y[0],y[1]); ++// } ++// } ++//} ++// END dsin/dcos PSEUDO CODE ++// ++// Changes between fdlibm and intrinsic: ++// 1. Moved ix < 2**27 from kernel_sin/kernel_cos into dsin/dcos ++// 2. Final switch use equivalent bit checks(tbz/tbnz) ++// Input and output: ++// 1. Input for generated function: X = A0 ++// 2. Input for generator: isCos = generate sin or cos, npio2_hw = address ++// of npio2_hw table, two_over_pi = address of two_over_pi table, ++// pio2 = address if pio2 table, dsin_coef = address if dsin_coef table, ++// dcos_coef = address of dcos_coef table ++// 3. Return result in FA0 ++// NOTE: general purpose register names match local variable names in C code ++void MacroAssembler::generate_dsin_dcos(bool isCos, address npio2_hw, ++ address two_over_pi, address pio2, ++ address dsin_coef, address dcos_coef) { ++ Label DONE, ARG_REDUCTION, TINY_X, RETURN_SIN, EARLY_CASE; ++ Register X = A0, absX = A1, n = A2, ix = A3; ++ FloatRegister y0 = FA4, y1 = FA5; ++ ++ block_comment("check |x| ~< pi/4, NaN, Inf and |x| < 2**-27 cases"); { ++ movfr2gr_d(X, FA0); ++ li(SCR2, 0x3e400000); ++ li(SCR1, 0x3fe921fb); // high word of pi/4. ++ bstrpick_d(absX, X, 62, 0); // absX ++ li(T0, 0x7ff0000000000000); ++ srli_d(ix, absX, 32); // set ix ++ blt(ix, SCR2, TINY_X); // handle tiny x (|x| < 2^-27) ++ bge(SCR1, ix, EARLY_CASE); // if(ix <= 0x3fe921fb) return ++ blt(absX, T0, ARG_REDUCTION); ++ // X is NaN or INF(i.e. 0x7FF* or 0xFFF*). Return NaN (mantissa != 0). ++ // Set last bit unconditionally to make it NaN ++ ori(T0, T0, 1); ++ movgr2fr_d(FA0, T0); ++ jr(RA); ++ } ++ block_comment("kernel_sin/kernel_cos: if(ix<0x3e400000) {}"); { ++ bind(TINY_X); ++ if (isCos) { ++ vldi(FA0, -912); // 1.0 (0x3ff0000000000000) ++ } ++ jr(RA); ++ } ++ bind(ARG_REDUCTION); /* argument reduction needed */ ++ block_comment("n = __ieee754_rem_pio2(x,y);"); { ++ generate__ieee754_rem_pio2(npio2_hw, two_over_pi, pio2); ++ } ++ block_comment("switch(n&3) {case ... }"); { ++ if (isCos) { ++ srli_w(T0, n, 1); ++ xorr(absX, n, T0); ++ andi(T0, n, 1); ++ bnez(T0, RETURN_SIN); ++ } else { ++ andi(T0, n, 1); ++ beqz(T0, RETURN_SIN); ++ } ++ generate_kernel_cos(y0, dcos_coef); ++ if (isCos) { ++ andi(T0, absX, 1); ++ beqz(T0, DONE); ++ } else { ++ andi(T0, n, 2); ++ beqz(T0, DONE); ++ } ++ fneg_d(FA0, FA0); ++ jr(RA); ++ bind(RETURN_SIN); ++ generate_kernel_sin(y0, true, dsin_coef); ++ if (isCos) { ++ andi(T0, absX, 1); ++ beqz(T0, DONE); ++ } else { ++ andi(T0, n, 2); ++ beqz(T0, DONE); ++ } ++ fneg_d(FA0, FA0); ++ jr(RA); ++ } ++ bind(EARLY_CASE); ++ vxor_v(y1, y1, y1); ++ if (isCos) { ++ generate_kernel_cos(FA0, dcos_coef); ++ } else { ++ generate_kernel_sin(FA0, false, dsin_coef); ++ } ++ bind(DONE); ++ jr(RA); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/matcher_loongarch.hpp b/src/hotspot/cpu/loongarch/matcher_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/matcher_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/matcher_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,181 @@ ++/* ++ * Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_MATCHER_LOONGARCH_HPP ++#define CPU_LOONGARCH_MATCHER_LOONGARCH_HPP ++ ++ // Defined within class Matcher ++ ++ // false => size gets scaled to BytesPerLong, ok. ++ static const bool init_array_count_is_in_bytes = false; ++ ++ // Whether this platform implements the scalable vector feature ++ static const bool implements_scalable_vector = false; ++ ++ static const bool supports_scalable_vector() { ++ return false; ++ } ++ ++ // LoongArch support misaligned vectors store/load ++ static constexpr bool misaligned_vectors_ok() { ++ return true; ++ } ++ ++ // Whether code generation need accurate ConvI2L types. ++ static const bool convi2l_type_required = false; ++ ++ // Does the CPU require late expand (see block.cpp for description of late expand)? ++ static const bool require_postalloc_expand = false; ++ ++ // Do we need to mask the count passed to shift instructions or does ++ // the cpu only look at the lower 5/6 bits anyway? ++ static const bool need_masked_shift_count = false; ++ ++ // LA supports generic vector operands: vReg. ++ static const bool supports_generic_vector_operands = true; ++ ++ static constexpr bool isSimpleConstant64(jlong value) { ++ // Will one (StoreL ConL) be cheaper than two (StoreI ConI)?. ++ // Probably always true, even if a temp register is required. ++ return true; ++ } ++ ++ // No additional cost for CMOVL. ++ static constexpr int long_cmove_cost() { return 0; } ++ ++ // No CMOVF/CMOVD with SSE2 ++ static int float_cmove_cost() { return ConditionalMoveLimit; } ++ ++ static bool narrow_oop_use_complex_address() { ++ assert(UseCompressedOops, "only for compressed oops code"); ++ return false; ++ } ++ ++ static bool narrow_klass_use_complex_address() { ++ assert(UseCompressedClassPointers, "only for compressed klass code"); ++ return false; ++ } ++ ++ static bool const_oop_prefer_decode() { ++ // Prefer ConN+DecodeN over ConP. ++ return true; ++ } ++ ++ static bool const_klass_prefer_decode() { ++ // TODO: Either support matching DecodeNKlass (heap-based) in operand ++ // or condisider the following: ++ // Prefer ConNKlass+DecodeNKlass over ConP in simple compressed klass mode. ++ //return CompressedKlassPointers::base() == nullptr; ++ return true; ++ } ++ ++ // Is it better to copy float constants, or load them directly from memory? ++ // Intel can load a float constant from a direct address, requiring no ++ // extra registers. Most RISCs will have to materialize an address into a ++ // register first, so they would do better to copy the constant from stack. 
++ static const bool rematerialize_float_constants = false; ++ ++ // If CPU can load and store mis-aligned doubles directly then no fixup is ++ // needed. Else we split the double into 2 integer pieces and move it ++ // piece-by-piece. Only happens when passing doubles into C code as the ++ // Java calling convention forces doubles to be aligned. ++ static const bool misaligned_doubles_ok = false; ++ ++ // Advertise here if the CPU requires explicit rounding operations to implement strictfp mode. ++ static const bool strict_fp_requires_explicit_rounding = false; ++ ++ // Are floats converted to double when stored to stack during ++ // deoptimization? ++ static constexpr bool float_in_double() { return false; } ++ ++ // Do ints take an entire long register or just half? ++ static const bool int_in_long = true; ++ ++ // Does the CPU supports vector variable shift instructions? ++ static constexpr bool supports_vector_variable_shifts(void) { ++ return true; ++ } ++ ++ // Does the CPU supports vector variable rotate instructions? ++ static constexpr bool supports_vector_variable_rotates(void) { ++ return true; ++ } ++ ++ // Does the CPU supports vector constant rotate instructions? ++ static constexpr bool supports_vector_constant_rotates(int shift) { ++ return true; ++ } ++ ++ // Does the CPU supports vector unsigned comparison instructions? ++ static constexpr bool supports_vector_comparison_unsigned(int vlen, BasicType bt) { ++ return true; ++ } ++ ++ // Some microarchitectures have mask registers used on vectors ++ static const bool has_predicated_vectors(void) { ++ return false; ++ } ++ ++ // true means we have fast l2f conversion ++ // false means that conversion is done by runtime call ++ static constexpr bool convL2FSupported(void) { ++ return true; ++ } ++ ++ // Implements a variant of EncodeISOArrayNode that encode ASCII only ++ static const bool supports_encode_ascii_array = true; ++ ++ // No mask is used for the vector test ++ static constexpr bool vectortest_needs_second_argument(bool is_alltrue, bool is_predicate) { ++ return false; ++ } ++ ++ // BoolTest mask for vector test intrinsics ++ static constexpr BoolTest::mask vectortest_mask(bool is_alltrue, bool is_predicate, int vlen) { ++ return is_alltrue ? BoolTest::eq : BoolTest::ne; ++ } ++ ++ // Returns pre-selection estimated size of a vector operation. ++ static int vector_op_pre_select_sz_estimate(int vopc, BasicType ety, int vlen) { ++ switch(vopc) { ++ default: return 0; ++ case Op_RoundVF: // fall through ++ case Op_RoundVD: { ++ return 30; ++ } ++ } ++ } ++ // Returns pre-selection estimated size of a scalar operation. ++ static int scalar_op_pre_select_sz_estimate(int vopc, BasicType ety) { ++ switch(vopc) { ++ default: return 0; ++ case Op_RoundF: // fall through ++ case Op_RoundD: { ++ return 30; ++ } ++ } ++ } ++ ++#endif // CPU_LOONGARCH_MATCHER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/methodHandles_loongarch.cpp b/src/hotspot/cpu/loongarch/methodHandles_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/methodHandles_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/methodHandles_loongarch.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,568 @@ ++/* ++ * Copyright (c) 1997, 2014, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "classfile/javaClasses.inline.hpp" ++#include "classfile/vmClasses.hpp" ++#include "interpreter/interpreter.hpp" ++#include "interpreter/interpreterRuntime.hpp" ++#include "memory/allocation.inline.hpp" ++#include "prims/jvmtiExport.hpp" ++#include "prims/methodHandles.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "utilities/preserveException.hpp" ++ ++#define __ _masm-> ++ ++#ifdef PRODUCT ++#define BLOCK_COMMENT(str) // nothing ++#define STOP(error) stop(error) ++#else ++#define BLOCK_COMMENT(str) __ block_comment(str) ++#define STOP(error) block_comment(error); __ stop(error) ++#endif ++ ++#define BIND(label) bind(label); BLOCK_COMMENT(#label ":") ++ ++void MethodHandles::load_klass_from_Class(MacroAssembler* _masm, Register klass_reg) { ++ if (VerifyMethodHandles) ++ verify_klass(_masm, klass_reg, VM_CLASS_ID(java_lang_Class), ++ "MH argument is a Class"); ++ __ ld_d(klass_reg, Address(klass_reg, java_lang_Class::klass_offset())); ++} ++ ++#ifdef ASSERT ++static int check_nonzero(const char* xname, int x) { ++ assert(x != 0, "%s should be nonzero", xname); ++ return x; ++} ++#define NONZERO(x) check_nonzero(#x, x) ++#else //ASSERT ++#define NONZERO(x) (x) ++#endif //ASSERT ++ ++#ifdef ASSERT ++void MethodHandles::verify_klass(MacroAssembler* _masm, ++ Register obj_reg, vmClassID klass_id, ++ const char* error_message) { ++} ++ ++void MethodHandles::verify_ref_kind(MacroAssembler* _masm, int ref_kind, Register member_reg, Register temp) { ++ Label L; ++ BLOCK_COMMENT("verify_ref_kind {"); ++ __ ld_w(temp, Address(member_reg, NONZERO(java_lang_invoke_MemberName::flags_offset()))); ++ __ srai_w(temp, temp, java_lang_invoke_MemberName::MN_REFERENCE_KIND_SHIFT); ++ __ li(AT, java_lang_invoke_MemberName::MN_REFERENCE_KIND_MASK); ++ __ andr(temp, temp, AT); ++ __ li(AT, ref_kind); ++ __ beq(temp, AT, L); ++ { char* buf = NEW_C_HEAP_ARRAY(char, 100, mtInternal); ++ jio_snprintf(buf, 100, "verify_ref_kind expected %x", ref_kind); ++ if (ref_kind == JVM_REF_invokeVirtual || ++ ref_kind == JVM_REF_invokeSpecial) ++ // could do this for all ref_kinds, but would explode assembly code size ++ trace_method_handle(_masm, buf); ++ __ STOP(buf); ++ } ++ BLOCK_COMMENT("} verify_ref_kind"); ++ __ bind(L); ++} ++ ++#endif //ASSERT ++ ++void MethodHandles::jump_from_method_handle(MacroAssembler* _masm, Register method, Register temp, ++ bool for_compiler_entry) { ++ assert(method == Rmethod, "interpreter calling convention"); ++ ++ Label 
L_no_such_method; ++ __ beq(method, R0, L_no_such_method); ++ ++ __ verify_method_ptr(method); ++ ++ if (!for_compiler_entry && JvmtiExport::can_post_interpreter_events()) { ++ Label run_compiled_code; ++ // JVMTI events, such as single-stepping, are implemented partly by avoiding running ++ // compiled code in threads for which the event is enabled. Check here for ++ // interp_only_mode if these events CAN be enabled. ++ ++ // interp_only is an int, on little endian it is sufficient to test the byte only ++ // Is a cmpl faster? ++ __ ld_bu(AT, TREG, in_bytes(JavaThread::interp_only_mode_offset())); ++ __ beq(AT, R0, run_compiled_code); ++ __ ld_d(T4, method, in_bytes(Method::interpreter_entry_offset())); ++ __ jr(T4); ++ __ BIND(run_compiled_code); ++ } ++ ++ const ByteSize entry_offset = for_compiler_entry ? Method::from_compiled_offset() : ++ Method::from_interpreted_offset(); ++ __ ld_d(T4, method, in_bytes(entry_offset)); ++ __ jr(T4); ++ ++ __ bind(L_no_such_method); ++ address wrong_method = StubRoutines::throw_AbstractMethodError_entry(); ++ __ jmp(wrong_method, relocInfo::runtime_call_type); ++} ++ ++void MethodHandles::jump_to_lambda_form(MacroAssembler* _masm, ++ Register recv, Register method_temp, ++ Register temp2, ++ bool for_compiler_entry) { ++ BLOCK_COMMENT("jump_to_lambda_form {"); ++ // This is the initial entry point of a lazy method handle. ++ // After type checking, it picks up the invoker from the LambdaForm. ++ assert_different_registers(recv, method_temp, temp2); ++ assert(recv != noreg, "required register"); ++ assert(method_temp == Rmethod, "required register for loading method"); ++ ++ //NOT_PRODUCT({ FlagSetting fs(TraceMethodHandles, true); trace_method_handle(_masm, "LZMH"); }); ++ ++ // Load the invoker, as MH -> MH.form -> LF.vmentry ++ __ verify_oop(recv); ++ __ load_heap_oop(method_temp, Address(recv, NONZERO(java_lang_invoke_MethodHandle::form_offset())), temp2, SCR1); ++ __ verify_oop(method_temp); ++ __ load_heap_oop(method_temp, Address(method_temp, NONZERO(java_lang_invoke_LambdaForm::vmentry_offset())), temp2, SCR1); ++ __ verify_oop(method_temp); ++ __ load_heap_oop(method_temp, Address(method_temp, NONZERO(java_lang_invoke_MemberName::method_offset())), temp2, SCR1); ++ __ verify_oop(method_temp); ++ __ access_load_at(T_ADDRESS, IN_HEAP, method_temp, Address(method_temp, NONZERO(java_lang_invoke_ResolvedMethodName::vmtarget_offset())), noreg, noreg); ++ ++ if (VerifyMethodHandles && !for_compiler_entry) { ++ // make sure recv is already on stack ++ __ ld_d(temp2, Address(method_temp, Method::const_offset())); ++ __ load_sized_value(temp2, ++ Address(temp2, ConstMethod::size_of_parameters_offset()), ++ sizeof(u2), false); ++ Label L; ++ __ ld_d(AT, __ argument_address(temp2, -1)); ++ __ beq(recv, AT, L); ++ __ ld_d(V0, __ argument_address(temp2, -1)); ++ __ STOP("receiver not on stack"); ++ __ BIND(L); ++ } ++ ++ jump_from_method_handle(_masm, method_temp, temp2, for_compiler_entry); ++ BLOCK_COMMENT("} jump_to_lambda_form"); ++} ++ ++ ++// Code generation ++address MethodHandles::generate_method_handle_interpreter_entry(MacroAssembler* _masm, ++ vmIntrinsics::ID iid) { ++ const bool not_for_compiler_entry = false; // this is the interpreter entry ++ assert(is_signature_polymorphic(iid), "expected invoke iid"); ++ if (iid == vmIntrinsics::_invokeGeneric || ++ iid == vmIntrinsics::_compiledLambdaForm) { ++ // Perhaps surprisingly, the symbolic references visible to Java are not directly used. 
++ // They are linked to Java-generated adapters via MethodHandleNatives.linkMethod. ++ // They all allow an appendix argument. ++ __ stop("empty stubs make SG sick"); ++ return nullptr; ++ } ++ ++ // No need in interpreter entry for linkToNative for now. ++ // Interpreter calls compiled entry through i2c. ++ if (iid == vmIntrinsics::_linkToNative) { ++ __ stop("Should not reach here"); // empty stubs make SG sick ++ return nullptr; ++ } ++ ++ // Rmethod: Method* ++ // T4: argument locator (parameter slot count, added to sp) ++ // S7: used as temp to hold mh or receiver ++ Register t4_argp = T4; // argument list ptr, live on error paths ++ Register s7_mh = S7; // MH receiver; dies quickly and is recycled ++ Register rm_method = Rmethod; // eventual target of this invocation ++ ++ // here's where control starts out: ++ __ align(CodeEntryAlignment); ++ address entry_point = __ pc(); ++ ++ if (VerifyMethodHandles) { ++ assert(Method::intrinsic_id_size_in_bytes() == 2, "assuming Method::_intrinsic_id is u2"); ++ ++ Label L; ++ BLOCK_COMMENT("verify_intrinsic_id {"); ++ __ ld_hu(AT, Address(rm_method, Method::intrinsic_id_offset())); ++ guarantee(Assembler::is_simm(vmIntrinsics::as_int(iid), 12), "Oops, iid is not simm12! Change the instructions."); ++ __ addi_d(AT, AT, -1 * (int) iid); ++ __ beq(AT, R0, L); ++ if (iid == vmIntrinsics::_linkToVirtual || ++ iid == vmIntrinsics::_linkToSpecial) { ++ // could do this for all kinds, but would explode assembly code size ++ trace_method_handle(_masm, "bad Method*::intrinsic_id"); ++ } ++ __ STOP("bad Method*::intrinsic_id"); ++ __ bind(L); ++ BLOCK_COMMENT("} verify_intrinsic_id"); ++ } ++ ++ // First task: Find out how big the argument list is. ++ Address t4_first_arg_addr; ++ int ref_kind = signature_polymorphic_intrinsic_ref_kind(iid); ++ assert(ref_kind != 0 || iid == vmIntrinsics::_invokeBasic, "must be _invokeBasic or a linkTo intrinsic"); ++ if (ref_kind == 0 || MethodHandles::ref_kind_has_receiver(ref_kind)) { ++ __ ld_d(t4_argp, Address(rm_method, Method::const_offset())); ++ __ load_sized_value(t4_argp, ++ Address(t4_argp, ConstMethod::size_of_parameters_offset()), ++ sizeof(u2), false); ++ // assert(sizeof(u2) == sizeof(Method::_size_of_parameters), ""); ++ t4_first_arg_addr = __ argument_address(t4_argp, -1); ++ } else { ++ DEBUG_ONLY(t4_argp = noreg); ++ } ++ ++ if (!is_signature_polymorphic_static(iid)) { ++ __ ld_d(s7_mh, t4_first_arg_addr); ++ DEBUG_ONLY(t4_argp = noreg); ++ } ++ ++ // t4_first_arg_addr is live! ++ ++ trace_method_handle_interpreter_entry(_masm, iid); ++ ++ if (iid == vmIntrinsics::_invokeBasic) { ++ generate_method_handle_dispatch(_masm, iid, s7_mh, noreg, not_for_compiler_entry); ++ ++ } else { ++ // Adjust argument list by popping the trailing MemberName argument. ++ Register r_recv = noreg; ++ if (MethodHandles::ref_kind_has_receiver(ref_kind)) { ++ // Load the receiver (not the MH; the actual MemberName's receiver) up from the interpreter stack. 
++ __ ld_d(r_recv = T2, t4_first_arg_addr); ++ } ++ DEBUG_ONLY(t4_argp = noreg); ++ Register rm_member = rm_method; // MemberName ptr; incoming method ptr is dead now ++ __ pop(rm_member); // extract last argument ++ generate_method_handle_dispatch(_masm, iid, r_recv, rm_member, not_for_compiler_entry); ++ } ++ ++ return entry_point; ++} ++ ++void MethodHandles::jump_to_native_invoker(MacroAssembler* _masm, Register nep_reg, Register temp_target) { ++ BLOCK_COMMENT("jump_to_native_invoker {"); ++ assert_different_registers(nep_reg, temp_target); ++ assert(nep_reg != noreg, "required register"); ++ ++ // Load the invoker, as NEP -> .invoker ++ __ verify_oop(nep_reg); ++ __ access_load_at(T_ADDRESS, IN_HEAP, temp_target, ++ Address(nep_reg, NONZERO(jdk_internal_foreign_abi_NativeEntryPoint::downcall_stub_address_offset_in_bytes())), ++ noreg, noreg); ++ ++ __ jr(temp_target); ++ BLOCK_COMMENT("} jump_to_native_invoker"); ++} ++ ++void MethodHandles::generate_method_handle_dispatch(MacroAssembler* _masm, ++ vmIntrinsics::ID iid, ++ Register receiver_reg, ++ Register member_reg, ++ bool for_compiler_entry) { ++ assert(is_signature_polymorphic(iid), "expected invoke iid"); ++ Register rm_method = Rmethod; // eventual target of this invocation ++ // temps used in this code are not used in *either* compiled or interpreted calling sequences ++ Register temp1 = T8; ++ Register temp2 = T3; ++ Register temp3 = T5; ++ if (for_compiler_entry) { ++ assert(receiver_reg == (iid == vmIntrinsics::_linkToStatic || iid == vmIntrinsics::_linkToNative ? noreg : RECEIVER), "only valid assignment"); ++ } ++ else { ++ assert_different_registers(temp1, temp2, temp3, saved_last_sp_register()); // don't trash lastSP ++ } ++ assert_different_registers(temp1, temp2, temp3, receiver_reg); ++ assert_different_registers(temp1, temp2, temp3, member_reg); ++ ++ if (iid == vmIntrinsics::_invokeBasic) { ++ // indirect through MH.form.vmentry.vmtarget ++ jump_to_lambda_form(_masm, receiver_reg, rm_method, temp1, for_compiler_entry); ++ } else if (iid == vmIntrinsics::_linkToNative) { ++ assert(for_compiler_entry, "only compiler entry is supported"); ++ jump_to_native_invoker(_masm, member_reg, temp1); ++ } else { ++ // The method is a member invoker used by direct method handles. ++ if (VerifyMethodHandles) { ++ // make sure the trailing argument really is a MemberName (caller responsibility) ++ verify_klass(_masm, member_reg, VM_CLASS_ID(java_lang_invoke_MemberName), ++ "MemberName required for invokeVirtual etc."); ++ } ++ ++ Address member_clazz( member_reg, NONZERO(java_lang_invoke_MemberName::clazz_offset())); ++ Address member_vmindex( member_reg, NONZERO(java_lang_invoke_MemberName::vmindex_offset())); ++ Address member_vmtarget( member_reg, NONZERO(java_lang_invoke_MemberName::method_offset())); ++ Address vmtarget_method( rm_method, NONZERO(java_lang_invoke_ResolvedMethodName::vmtarget_offset())); ++ ++ Register temp1_recv_klass = temp1; ++ if (iid != vmIntrinsics::_linkToStatic) { ++ __ verify_oop(receiver_reg); ++ if (iid == vmIntrinsics::_linkToSpecial) { ++ // Don't actually load the klass; just null-check the receiver. ++ __ null_check(receiver_reg); ++ } else { ++ // load receiver klass itself ++ __ load_klass(temp1_recv_klass, receiver_reg); ++ __ verify_klass_ptr(temp1_recv_klass); ++ } ++ BLOCK_COMMENT("check_receiver {"); ++ // The receiver for the MemberName must be in receiver_reg. 
++ // Check the receiver against the MemberName.clazz ++ if (VerifyMethodHandles && iid == vmIntrinsics::_linkToSpecial) { ++ // Did not load it above... ++ __ load_klass(temp1_recv_klass, receiver_reg); ++ __ verify_klass_ptr(temp1_recv_klass); ++ } ++ if (VerifyMethodHandles && iid != vmIntrinsics::_linkToInterface) { ++ Label L_ok; ++ Register temp2_defc = temp2; ++ __ load_heap_oop(temp2_defc, member_clazz, temp3, SCR1); ++ load_klass_from_Class(_masm, temp2_defc); ++ __ verify_klass_ptr(temp2_defc); ++ __ check_klass_subtype(temp1_recv_klass, temp2_defc, temp3, L_ok); ++ // If we get here, the type check failed! ++ __ STOP("receiver class disagrees with MemberName.clazz"); ++ __ bind(L_ok); ++ } ++ BLOCK_COMMENT("} check_receiver"); ++ } ++ if (iid == vmIntrinsics::_linkToSpecial || ++ iid == vmIntrinsics::_linkToStatic) { ++ DEBUG_ONLY(temp1_recv_klass = noreg); // these guys didn't load the recv_klass ++ } ++ ++ // Live registers at this point: ++ // member_reg - MemberName that was the trailing argument ++ // temp1_recv_klass - klass of stacked receiver, if needed ++ ++ Label L_incompatible_class_change_error; ++ switch (iid) { ++ case vmIntrinsics::_linkToSpecial: ++ if (VerifyMethodHandles) { ++ verify_ref_kind(_masm, JVM_REF_invokeSpecial, member_reg, temp3); ++ } ++ __ load_heap_oop(rm_method, member_vmtarget, temp3, SCR1); ++ __ access_load_at(T_ADDRESS, IN_HEAP, rm_method, vmtarget_method, noreg, noreg); ++ break; ++ ++ case vmIntrinsics::_linkToStatic: ++ if (VerifyMethodHandles) { ++ verify_ref_kind(_masm, JVM_REF_invokeStatic, member_reg, temp3); ++ } ++ __ load_heap_oop(rm_method, member_vmtarget, temp3, SCR1); ++ __ access_load_at(T_ADDRESS, IN_HEAP, rm_method, vmtarget_method, noreg, noreg); ++ break; ++ ++ case vmIntrinsics::_linkToVirtual: ++ { ++ // same as TemplateTable::invokevirtual, ++ // minus the CP setup and profiling: ++ ++ if (VerifyMethodHandles) { ++ verify_ref_kind(_masm, JVM_REF_invokeVirtual, member_reg, temp3); ++ } ++ ++ // pick out the vtable index from the MemberName, and then we can discard it: ++ Register temp2_index = temp2; ++ __ access_load_at(T_ADDRESS, IN_HEAP, temp2_index, member_vmindex, noreg, noreg); ++ if (VerifyMethodHandles) { ++ Label L_index_ok; ++ __ blt(R0, temp2_index, L_index_ok); ++ __ STOP("no virtual index"); ++ __ BIND(L_index_ok); ++ } ++ ++ // Note: The verifier invariants allow us to ignore MemberName.clazz and vmtarget ++ // at this point. And VerifyMethodHandles has already checked clazz, if needed. 
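++ // For reference (a sketch, not the generated code): the dispatch emitted by
++ // lookup_virtual_method below is conceptually
++ //   Method* target = recv_klass->method_at_vtable(vtable_index);
++ // i.e. MemberName.vmindex selects a slot in the receiver klass' vtable.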
++ ++ // get target Method* & entry point ++ __ lookup_virtual_method(temp1_recv_klass, temp2_index, rm_method); ++ break; ++ } ++ ++ case vmIntrinsics::_linkToInterface: ++ { ++ // same as TemplateTable::invokeinterface ++ // (minus the CP setup and profiling, with different argument motion) ++ if (VerifyMethodHandles) { ++ verify_ref_kind(_masm, JVM_REF_invokeInterface, member_reg, temp3); ++ } ++ ++ Register temp3_intf = temp3; ++ __ load_heap_oop(temp3_intf, member_clazz, temp2, SCR1); ++ load_klass_from_Class(_masm, temp3_intf); ++ __ verify_klass_ptr(temp3_intf); ++ ++ Register rm_index = rm_method; ++ __ access_load_at(T_ADDRESS, IN_HEAP, rm_index, member_vmindex, noreg, noreg); ++ if (VerifyMethodHandles) { ++ Label L; ++ __ bge(rm_index, R0, L); ++ __ STOP("invalid vtable index for MH.invokeInterface"); ++ __ bind(L); ++ } ++ ++ // given intf, index, and recv klass, dispatch to the implementation method ++ __ lookup_interface_method(temp1_recv_klass, temp3_intf, ++ // note: next two args must be the same: ++ rm_index, rm_method, ++ temp2, ++ L_incompatible_class_change_error); ++ break; ++ } ++ ++ default: ++ fatal("unexpected intrinsic %d: %s", vmIntrinsics::as_int(iid), vmIntrinsics::name_at(iid)); ++ break; ++ } ++ ++ // Live at this point: ++ // rm_method ++ ++ // After figuring out which concrete method to call, jump into it. ++ // Note that this works in the interpreter with no data motion. ++ // But the compiled version will require that r_recv be shifted out. ++ __ verify_method_ptr(rm_method); ++ jump_from_method_handle(_masm, rm_method, temp1, for_compiler_entry); ++ ++ if (iid == vmIntrinsics::_linkToInterface) { ++ __ bind(L_incompatible_class_change_error); ++ address icce_entry= StubRoutines::throw_IncompatibleClassChangeError_entry(); ++ __ jmp(icce_entry, relocInfo::runtime_call_type); ++ } ++ } ++} ++ ++#ifndef PRODUCT ++void trace_method_handle_stub(const char* adaptername, ++ oop mh, ++ intptr_t* saved_regs, ++ intptr_t* entry_sp) { ++ // called as a leaf from native code: do not block the JVM! ++ bool has_mh = (strstr(adaptername, "/static") == nullptr && ++ strstr(adaptername, "linkTo") == nullptr); // static linkers don't have MH ++ const char* mh_reg_name = has_mh ? "s7_mh" : "s7"; ++ tty->print_cr("MH %s %s=" PTR_FORMAT " sp=" PTR_FORMAT, ++ adaptername, mh_reg_name, ++ p2i(mh), p2i(entry_sp)); ++ ++ if (Verbose) { ++ tty->print_cr("Registers:"); ++ const int saved_regs_count = Register::number_of_registers; ++ for (int i = 0; i < saved_regs_count; i++) { ++ Register r = as_Register(i); ++ // The registers are stored in reverse order on the stack (by pusha). ++ tty->print("%3s=" PTR_FORMAT, r->name(), saved_regs[((saved_regs_count - 1) - i)]); ++ if ((i + 1) % 4 == 0) { ++ tty->cr(); ++ } else { ++ tty->print(", "); ++ } ++ } ++ tty->cr(); ++ ++ { ++ // dumping last frame with frame::describe ++ ++ JavaThread* p = JavaThread::active(); ++ ++ ResourceMark rm; ++ // may not be needed by safer and unexpensive here ++ PreserveExceptionMark pem(Thread::current()); ++ FrameValues values; ++ ++ // Note: We want to allow trace_method_handle from any call site. ++ // While trace_method_handle creates a frame, it may be entered ++ // without a PC on the stack top (e.g. not just after a call). ++ // Walking that frame could lead to failures due to that invalid PC. ++ // => carefully detect that frame when doing the stack walking ++ ++ // Current C frame ++ frame cur_frame = os::current_frame(); ++ ++ // Robust search of trace_calling_frame (independent of inlining). 
++ // Assumes saved_regs comes from a pusha in the trace_calling_frame. ++ assert(cur_frame.sp() < saved_regs, "registers not saved on stack ?"); ++ frame trace_calling_frame = os::get_sender_for_C_frame(&cur_frame); ++ while (trace_calling_frame.fp() < saved_regs) { ++ trace_calling_frame = os::get_sender_for_C_frame(&trace_calling_frame); ++ } ++ ++ // safely create a frame and call frame::describe ++ intptr_t *dump_sp = trace_calling_frame.sender_sp(); ++ intptr_t *dump_fp = trace_calling_frame.link(); ++ ++ bool walkable = has_mh; // whether the traced frame should be walkable ++ ++ if (walkable) { ++ // The previous definition of walkable may have to be refined ++ // if new call sites cause the next frame constructor to start ++ // failing. Alternatively, frame constructors could be ++ // modified to support the current or future non walkable ++ // frames (but this is more intrusive and is not considered as ++ // part of this RFE, which will instead use a simpler output). ++ frame dump_frame = frame(dump_sp, dump_fp); ++ dump_frame.describe(values, 1); ++ } else { ++ // Stack may not be walkable (invalid PC above FP): ++ // Add descriptions without building a Java frame to avoid issues ++ values.describe(-1, dump_fp, "fp for #1 "); ++ values.describe(-1, dump_sp, "sp for #1"); ++ } ++ values.describe(-1, entry_sp, "raw top of stack"); ++ ++ tty->print_cr("Stack layout:"); ++ values.print(p); ++ } ++ if (has_mh && oopDesc::is_oop(mh)) { ++ mh->print(); ++ if (java_lang_invoke_MethodHandle::is_instance(mh)) { ++ java_lang_invoke_MethodHandle::form(mh)->print(); ++ } ++ } ++ } ++} ++ ++// The stub wraps the arguments in a struct on the stack to avoid ++// dealing with the different calling conventions for passing 6 ++// arguments. ++struct MethodHandleStubArguments { ++ const char* adaptername; ++ oopDesc* mh; ++ intptr_t* saved_regs; ++ intptr_t* entry_sp; ++}; ++void trace_method_handle_stub_wrapper(MethodHandleStubArguments* args) { ++ trace_method_handle_stub(args->adaptername, ++ args->mh, ++ args->saved_regs, ++ args->entry_sp); ++} ++ ++void MethodHandles::trace_method_handle(MacroAssembler* _masm, const char* adaptername) { ++} ++#endif //PRODUCT +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/methodHandles_loongarch.hpp b/src/hotspot/cpu/loongarch/methodHandles_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/methodHandles_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/methodHandles_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,65 @@ ++/* ++ * Copyright (c) 2010, 2012, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++// Platform-specific definitions for method handles. ++// These definitions are inlined into class MethodHandles. ++ ++// Adapters ++enum /* platform_dependent_constants */ { ++ adapter_code_size = 32000 DEBUG_ONLY(+ 150000) ++}; ++ ++// Additional helper methods for MethodHandles code generation: ++public: ++ static void load_klass_from_Class(MacroAssembler* _masm, Register klass_reg); ++ ++ static void verify_klass(MacroAssembler* _masm, ++ Register obj, vmClassID klass_id, ++ const char* error_message = "wrong klass") NOT_DEBUG_RETURN; ++ ++ static void verify_method_handle(MacroAssembler* _masm, Register mh_reg) { ++ verify_klass(_masm, mh_reg, VM_CLASS_ID(MethodHandle_klass), ++ "reference is a MH"); ++ } ++ ++ static void verify_ref_kind(MacroAssembler* _masm, int ref_kind, Register member_reg, Register temp) NOT_DEBUG_RETURN; ++ ++ // Similar to InterpreterMacroAssembler::jump_from_interpreted. ++ // Takes care of special dispatch from single stepping too. ++ static void jump_from_method_handle(MacroAssembler* _masm, Register method, Register temp, ++ bool for_compiler_entry); ++ ++ static void jump_to_lambda_form(MacroAssembler* _masm, ++ Register recv, Register method_temp, ++ Register temp2, ++ bool for_compiler_entry); ++ ++ static void jump_to_native_invoker(MacroAssembler* _masm, ++ Register nep_reg, Register temp); ++ ++ static Register saved_last_sp_register() { ++ // Should be in sharedRuntime, not here. ++ return R3; ++ } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/nativeInst_loongarch.cpp b/src/hotspot/cpu/loongarch/nativeInst_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/nativeInst_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/nativeInst_loongarch.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,537 @@ ++/* ++ * Copyright (c) 1997, 2014, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "code/codeCache.hpp" ++#include "code/compiledIC.hpp" ++#include "memory/resourceArea.hpp" ++#include "nativeInst_loongarch.hpp" ++#include "oops/oop.inline.hpp" ++#include "runtime/handles.hpp" ++#include "runtime/safepoint.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "utilities/ostream.hpp" ++#ifndef PRODUCT ++#include "compiler/disassembler.hpp" ++#endif ++ ++void NativeInstruction::wrote(int offset) { ++ ICache::invalidate_word(addr_at(offset)); ++} ++ ++void NativeInstruction::set_long_at(int offset, long i) { ++ address addr = addr_at(offset); ++ *(long*)addr = i; ++ ICache::invalidate_range(addr, 8); ++} ++ ++bool NativeInstruction::is_int_branch() { ++ int op = Assembler::high(insn_word(), 6); ++ return op == Assembler::beqz_op || op == Assembler::bnez_op || ++ op == Assembler::beq_op || op == Assembler::bne_op || ++ op == Assembler::blt_op || op == Assembler::bge_op || ++ op == Assembler::bltu_op || op == Assembler::bgeu_op; ++} ++ ++bool NativeInstruction::is_float_branch() { ++ return Assembler::high(insn_word(), 6) == Assembler::bccondz_op; ++} ++ ++bool NativeInstruction::is_lu12iw_lu32id() const { ++ return Assembler::high(int_at(0), 7) == Assembler::lu12i_w_op && ++ Assembler::high(int_at(4), 7) == Assembler::lu32i_d_op; ++} ++ ++bool NativeInstruction::is_pcaddu12i_add() const { ++ return Assembler::high(int_at(0), 7) == Assembler::pcaddu12i_op && ++ Assembler::high(int_at(4), 10) == Assembler::addi_d_op; ++} ++ ++bool NativeCall::is_bl() const { ++ return Assembler::high(int_at(0), 6) == Assembler::bl_op; ++} ++ ++void NativeCall::verify() { ++ assert(is_bl(), "not a NativeCall"); ++} ++ ++address NativeCall::target_addr_for_bl(address orig_addr) const { ++ address addr = orig_addr ? orig_addr : addr_at(0); ++ ++ // bl ++ if (is_bl()) { ++ return addr + (Assembler::simm26(((int_at(0) & 0x3ff) << 16) | ++ ((int_at(0) >> 10) & 0xffff)) << 2); ++ } ++ ++ fatal("not a NativeCall"); ++ return nullptr; ++} ++ ++address NativeCall::destination() const { ++ address addr = (address)this; ++ address destination = target_addr_for_bl(); ++ // Do we use a trampoline stub for this call? ++ // Trampoline stubs are located behind the main code. ++ if (destination > addr) { ++ // Filter out recursive method invocation (call to verified/unverified entry point). ++ CodeBlob* cb = CodeCache::find_blob(addr); ++ assert(cb && cb->is_nmethod(), "sanity"); ++ nmethod *nm = (nmethod *)cb; ++ NativeInstruction* ni = nativeInstruction_at(destination); ++ if (nm->stub_contains(destination) && ni->is_NativeCallTrampolineStub_at()) { ++ // Yes we do, so get the destination from the trampoline stub. ++ const address trampoline_stub_addr = destination; ++ destination = nativeCallTrampolineStub_at(trampoline_stub_addr)->destination(); ++ } ++ } ++ return destination; ++} ++ ++// Similar to replace_mt_safe, but just changes the destination. The ++// important thing is that free-running threads are able to execute this ++// call instruction at all times. ++// ++// Used in the runtime linkage of calls; see class CompiledIC. ++// ++// Add parameter assert_lock to switch off assertion ++// during code generation, where no patching lock is needed. 
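A hedged aside on the bl encoding decoded by target_addr_for_bl() above: a LoongArch bl stores its signed 26-bit word offset split across two fields, offs[15:0] in instruction bits 25..10 and offs[25:16] in bits 9..0, which is exactly what the bit-gathering expression reassembles. The free-standing helper below is only an illustrative sketch (its name is mine, not part of the patch); it reuses Assembler::simm26 the same way the code above does.

static inline address example_decode_bl_target(address bl_pc, uint32_t insn) {
  // Reassemble offs[25:16] (bits 9..0) and offs[15:0] (bits 25..10),
  // sign-extend to 26 bits, then scale by the 4-byte instruction size.
  int offs26 = Assembler::simm26(((insn & 0x3ff) << 16) | ((insn >> 10) & 0xffff));
  return bl_pc + (offs26 << 2);  // reach: +/- 2^27 bytes (128 MB), which is why far targets go through a trampoline stub
}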
++void NativeCall::set_destination_mt_safe(address dest, bool assert_lock) { ++ assert(!assert_lock || ++ (Patching_lock->is_locked() || SafepointSynchronize::is_at_safepoint()) || ++ CompiledICLocker::is_safe(addr_at(0)), ++ "concurrent code patching"); ++ ++ ResourceMark rm; ++ address addr_call = addr_at(0); ++ bool reachable = MacroAssembler::reachable_from_branch_short(dest - addr_call); ++ assert(NativeCall::is_call_at(addr_call), "unexpected code at call site"); ++ ++ // Patch the call. ++ if (!reachable) { ++ address trampoline_stub_addr = get_trampoline(); ++ assert (trampoline_stub_addr != nullptr, "we need a trampoline"); ++ guarantee(Assembler::is_simm((trampoline_stub_addr - addr_call) >> 2, 26), "cannot reach trampoline stub"); ++ ++ // Patch the constant in the call's trampoline stub. ++ NativeInstruction* ni = nativeInstruction_at(dest); ++ assert (! ni->is_NativeCallTrampolineStub_at(), "chained trampolines"); ++ nativeCallTrampolineStub_at(trampoline_stub_addr)->set_destination(dest); ++ dest = trampoline_stub_addr; ++ } ++ set_destination(dest); ++} ++ ++address NativeCall::get_trampoline() { ++ address call_addr = addr_at(0); ++ ++ CodeBlob *code = CodeCache::find_blob(call_addr); ++ assert(code != nullptr, "Could not find the containing code blob"); ++ ++ address bl_destination ++ = nativeCall_at(call_addr)->target_addr_for_bl(); ++ NativeInstruction* ni = nativeInstruction_at(bl_destination); ++ if (code->contains(bl_destination) && ++ ni->is_NativeCallTrampolineStub_at()) ++ return bl_destination; ++ ++ if (code->is_nmethod()) { ++ return trampoline_stub_Relocation::get_trampoline_for(call_addr, (nmethod*)code); ++ } ++ ++ return nullptr; ++} ++ ++void NativeCall::set_destination(address dest) { ++ address addr_call = addr_at(0); ++ CodeBuffer cb(addr_call, instruction_size); ++ MacroAssembler masm(&cb); ++ assert(is_call_at(addr_call), "unexpected call type"); ++ jlong offs = dest - addr_call; ++ masm.bl(offs >> 2); ++ ICache::invalidate_range(addr_call, instruction_size); ++} ++ ++// Generate a trampoline for a branch to dest. If there's no need for a ++// trampoline, simply patch the call directly to dest. ++address NativeCall::trampoline_jump(CodeBuffer &cbuf, address dest) { ++ MacroAssembler a(&cbuf); ++ address stub = nullptr; ++ ++ if (a.far_branches() ++ && ! is_NativeCallTrampolineStub_at()) { ++ stub = a.emit_trampoline_stub(instruction_address() - cbuf.insts()->start(), dest); ++ } ++ ++ if (stub == nullptr) { ++ // If we generated no stub, patch this call directly to dest. ++ // This will happen if we don't need far branches or if there ++ // already was a trampoline. ++ set_destination(dest); ++ } ++ ++ return stub; ++} ++ ++void NativeCall::print() { ++ tty->print_cr(PTR_FORMAT ": call " PTR_FORMAT, ++ p2i(instruction_address()), p2i(destination())); ++} ++ ++// Inserts a native call instruction at a given pc ++void NativeCall::insert(address code_pos, address entry) { ++ //TODO: LA ++ guarantee(0, "LA not implemented yet"); ++} ++ ++// MT-safe patching of a call instruction. ++// First patches first word of instruction to two jmp's that jmps to themselves ++// (spinlock). Then patches the last byte, and then atomically replaces ++// the jmp's with the first 4 byte of the new instruction. 
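For context, a minimal usage sketch of the routine above (the helper name is hypothetical; the real callers sit in the shared inline-cache code, see class CompiledIC):

static void example_rebind_call_site(address call_pc, address new_target) {
  NativeCall* call = nativeCall_at(call_pc);   // verify()s in debug builds that this really is a bl
  call->set_destination_mt_safe(new_target);   // patches the bl directly, or re-points its trampoline stub when out of range
}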
++void NativeCall::replace_mt_safe(address instr_addr, address code_buffer) { ++ Unimplemented(); ++} ++ ++bool NativeFarCall::is_short() const { ++ return Assembler::high(int_at(0), 10) == Assembler::andi_op && ++ Assembler::low(int_at(0), 22) == 0 && ++ Assembler::high(int_at(4), 6) == Assembler::bl_op; ++} ++ ++bool NativeFarCall::is_far() const { ++ return Assembler::high(int_at(0), 7) == Assembler::pcaddu18i_op && ++ Assembler::high(int_at(4), 6) == Assembler::jirl_op && ++ Assembler::low(int_at(4), 5) == RA->encoding(); ++} ++ ++address NativeFarCall::destination(address orig_addr) const { ++ address addr = orig_addr ? orig_addr : addr_at(0); ++ ++ if (is_short()) { ++ // short ++ return addr + BytesPerInstWord + ++ (Assembler::simm26(((int_at(4) & 0x3ff) << 16) | ++ ((int_at(4) >> 10) & 0xffff)) << 2); ++ } ++ ++ if (is_far()) { ++ // far ++ return addr + ((intptr_t)Assembler::simm20(int_at(0) >> 5 & 0xfffff) << 18) + ++ (Assembler::simm16(int_at(4) >> 10 & 0xffff) << 2); ++ } ++ ++ fatal("not a NativeFarCall"); ++ return nullptr; ++} ++ ++void NativeFarCall::set_destination(address dest) { ++ address addr_call = addr_at(0); ++ CodeBuffer cb(addr_call, instruction_size); ++ MacroAssembler masm(&cb); ++ assert(is_far_call_at(addr_call), "unexpected call type"); ++ masm.patchable_call(dest, addr_call); ++ ICache::invalidate_range(addr_call, instruction_size); ++} ++ ++void NativeFarCall::verify() { ++ assert(is_short() || is_far(), "not a NativeFarcall"); ++} ++ ++//------------------------------------------------------------------- ++ ++bool NativeMovConstReg::is_lu12iw_ori_lu32id() const { ++ return Assembler::high(int_at(0), 7) == Assembler::lu12i_w_op && ++ Assembler::high(int_at(4), 10) == Assembler::ori_op && ++ Assembler::high(int_at(8), 7) == Assembler::lu32i_d_op; ++} ++ ++bool NativeMovConstReg::is_lu12iw_lu32id_nop() const { ++ return Assembler::high(int_at(0), 7) == Assembler::lu12i_w_op && ++ Assembler::high(int_at(4), 7) == Assembler::lu32i_d_op && ++ Assembler::high(int_at(8), 10) == Assembler::andi_op; ++} ++ ++bool NativeMovConstReg::is_lu12iw_2nop() const { ++ return Assembler::high(int_at(0), 7) == Assembler::lu12i_w_op && ++ Assembler::high(int_at(4), 10) == Assembler::andi_op && ++ Assembler::high(int_at(8), 10) == Assembler::andi_op; ++} ++ ++bool NativeMovConstReg::is_lu12iw_ori_nop() const { ++ return Assembler::high(int_at(0), 7) == Assembler::lu12i_w_op && ++ Assembler::high(int_at(4), 10) == Assembler::ori_op && ++ Assembler::high(int_at(8), 10) == Assembler::andi_op; ++} ++ ++bool NativeMovConstReg::is_ori_2nop() const { ++ return Assembler::high(int_at(0), 10) == Assembler::ori_op && ++ Assembler::high(int_at(4), 10) == Assembler::andi_op && ++ Assembler::high(int_at(8), 10) == Assembler::andi_op; ++} ++ ++bool NativeMovConstReg::is_addid_2nop() const { ++ return Assembler::high(int_at(0), 10) == Assembler::addi_d_op && ++ Assembler::high(int_at(4), 10) == Assembler::andi_op && ++ Assembler::high(int_at(8), 10) == Assembler::andi_op; ++} ++ ++void NativeMovConstReg::verify() { ++ assert(is_li52(), "not a mov reg, imm52"); ++} ++ ++void NativeMovConstReg::print() { ++ tty->print_cr(PTR_FORMAT ": mov reg, " INTPTR_FORMAT, ++ p2i(instruction_address()), data()); ++} ++ ++intptr_t NativeMovConstReg::data() const { ++ if (is_lu12iw_ori_lu32id()) { ++ return Assembler::merge((intptr_t)((int_at(4) >> 10) & 0xfff), ++ (intptr_t)((int_at(0) >> 5) & 0xfffff), ++ (intptr_t)((int_at(8) >> 5) & 0xfffff)); ++ } ++ ++ if (is_lu12iw_lu32id_nop()) { ++ return 
Assembler::merge((intptr_t)0, ++ (intptr_t)((int_at(0) >> 5) & 0xfffff), ++ (intptr_t)((int_at(4) >> 5) & 0xfffff)); ++ } ++ ++ if (is_lu12iw_2nop()) { ++ return Assembler::merge((intptr_t)0, ++ (intptr_t)((int_at(0) >> 5) & 0xfffff)); ++ } ++ ++ if (is_lu12iw_ori_nop()) { ++ return Assembler::merge((intptr_t)((int_at(4) >> 10) & 0xfff), ++ (intptr_t)((int_at(0) >> 5) & 0xfffff)); ++ } ++ ++ if (is_ori_2nop()) { ++ return (int_at(0) >> 10) & 0xfff; ++ } ++ ++ if (is_addid_2nop()) { ++ return Assembler::simm12((int_at(0) >> 10) & 0xfff); ++ } ++ ++#ifndef PRODUCT ++ Disassembler::decode(addr_at(0), addr_at(0) + 16, tty); ++#endif ++ fatal("not a mov reg, imm52"); ++ return 0; // unreachable ++} ++ ++void NativeMovConstReg::set_data(intptr_t x, intptr_t o) { ++ CodeBuffer cb(addr_at(0), instruction_size); ++ MacroAssembler masm(&cb); ++ masm.patchable_li52(as_Register(int_at(0) & 0x1f), x); ++ ICache::invalidate_range(addr_at(0), instruction_size); ++ ++ // Find and replace the oop/metadata corresponding to this ++ // instruction in oops section. ++ CodeBlob* blob = CodeCache::find_blob(instruction_address()); ++ nmethod* nm = blob->as_nmethod_or_null(); ++ if (nm != nullptr) { ++ o = o ? o : x; ++ RelocIterator iter(nm, instruction_address(), next_instruction_address()); ++ while (iter.next()) { ++ if (iter.type() == relocInfo::oop_type) { ++ oop* oop_addr = iter.oop_reloc()->oop_addr(); ++ *oop_addr = cast_to_oop(o); ++ break; ++ } else if (iter.type() == relocInfo::metadata_type) { ++ Metadata** metadata_addr = iter.metadata_reloc()->metadata_addr(); ++ *metadata_addr = (Metadata*)o; ++ break; ++ } ++ } ++ } ++} ++ ++//------------------------------------------------------------------- ++ ++int NativeMovRegMem::offset() const{ ++ //TODO: LA ++ guarantee(0, "LA not implemented yet"); ++ return 0; // mute compiler ++} ++ ++void NativeMovRegMem::set_offset(int x) { ++ //TODO: LA ++ guarantee(0, "LA not implemented yet"); ++} ++ ++void NativeMovRegMem::verify() { ++ //TODO: LA ++ guarantee(0, "LA not implemented yet"); ++} ++ ++ ++void NativeMovRegMem::print() { ++ //TODO: LA ++ guarantee(0, "LA not implemented yet"); ++} ++ ++bool NativeInstruction::is_sigill_not_entrant() { ++ return uint_at(0) == NativeIllegalInstruction::instruction_code; ++} ++ ++bool NativeInstruction::is_stop() { ++ return uint_at(0) == 0x04000000; // csrrd R0 0 ++} ++ ++void NativeIllegalInstruction::insert(address code_pos) { ++ *(juint*)code_pos = instruction_code; ++ ICache::invalidate_range(code_pos, instruction_size); ++} ++ ++void NativeJump::verify() { ++ assert(is_short() || is_far(), "not a general jump instruction"); ++} ++ ++bool NativeJump::is_short() { ++ return Assembler::high(insn_word(), 6) == Assembler::b_op; ++} ++ ++bool NativeJump::is_far() { ++ return Assembler::high(int_at(0), 7) == Assembler::pcaddu18i_op && ++ Assembler::high(int_at(4), 6) == Assembler::jirl_op && ++ Assembler::low(int_at(4), 5) == R0->encoding(); ++} ++ ++address NativeJump::jump_destination(address orig_addr) { ++ address addr = orig_addr ? orig_addr : addr_at(0); ++ address ret = (address)-1; ++ ++ // short ++ if (is_short()) { ++ ret = addr + (Assembler::simm26(((int_at(0) & 0x3ff) << 16) | ++ ((int_at(0) >> 10) & 0xffff)) << 2); ++ return ret == instruction_address() ? (address)-1 : ret; ++ } ++ ++ // far ++ if (is_far()) { ++ ret = addr + ((intptr_t)Assembler::simm20(int_at(0) >> 5 & 0xfffff) << 18) + ++ (Assembler::simm16(int_at(4) >> 10 & 0xffff) << 2); ++ return ret == instruction_address() ? 
(address)-1 : ret; ++ } ++ ++ fatal("not a jump"); ++ return nullptr; ++} ++ ++void NativeJump::set_jump_destination(address dest) { ++ OrderAccess::fence(); ++ ++ CodeBuffer cb(addr_at(0), instruction_size); ++ MacroAssembler masm(&cb); ++ masm.patchable_jump(dest); ++ ICache::invalidate_range(addr_at(0), instruction_size); ++} ++ ++void NativeGeneralJump::insert_unconditional(address code_pos, address entry) { ++ //TODO: LA ++ guarantee(0, "LA not implemented yet"); ++} ++ ++// MT-safe patching of a long jump instruction. ++// First patches first word of instruction to two jmp's that jmps to themselves ++// (spinlock). Then patches the last byte, and then atomically replaces ++// the jmp's with the first 4 byte of the new instruction. ++void NativeGeneralJump::replace_mt_safe(address instr_addr, address code_buffer) { ++ //TODO: LA ++ guarantee(0, "LA not implemented yet"); ++} ++ ++// Must ensure atomicity ++void NativeJump::patch_verified_entry(address entry, address verified_entry, address dest) { ++ assert(dest == SharedRuntime::get_handle_wrong_method_stub(), "expected fixed destination of patch"); ++ jlong offs = dest - verified_entry; ++ ++ if (MacroAssembler::reachable_from_branch_short(offs)) { ++ CodeBuffer cb(verified_entry, 1 * BytesPerInstWord); ++ MacroAssembler masm(&cb); ++ masm.b(dest); ++ } else { ++ // We use an illegal instruction for marking a method as ++ // not_entrant. ++ NativeIllegalInstruction::insert(verified_entry); ++ } ++ ICache::invalidate_range(verified_entry, 1 * BytesPerInstWord); ++} ++ ++bool NativeInstruction::is_safepoint_poll() { ++ // ++ // 390 li T2, 0x0000000000400000 #@loadConP ++ // 394 st_w [SP + #12], V1 # spill 9 ++ // 398 Safepoint @ [T2] : poll for GC @ safePoint_poll # spec.benchmarks.compress.Decompressor::decompress @ bci:224 L[0]=A6 L[1]=_ L[2]=sp + #28 L[3]=_ L[4]=V1 ++ // ++ // 0x000000ffe5815130: lu12i_w t2, 0x40 ++ // 0x000000ffe5815134: st_w v1, 0xc(sp) ; OopMap{a6=Oop off=920} ++ // ;*goto ++ // ; - spec.benchmarks.compress.Decompressor::decompress@224 (line 584) ++ // ++ // 0x000000ffe5815138: ld_w at, 0x0(t2) ;*goto <--- PC ++ // ; - spec.benchmarks.compress.Decompressor::decompress@224 (line 584) ++ // ++ ++ // Since there may be some spill instructions between the safePoint_poll and loadConP, ++ // we check the safepoint instruction like this. 
++ return Assembler::high(insn_word(), 10) == Assembler::ld_w_op && ++ Assembler::low(insn_word(), 5) == AT->encoding(); ++} ++ ++void NativePostCallNop::make_deopt() { ++ NativeDeoptInstruction::insert(addr_at(0)); ++} ++ ++void NativePostCallNop::patch(jint diff) { ++ assert(diff != 0, "must be"); ++ assert(check(), "must be"); ++ ++ int lo = (diff & 0xffff); ++ int hi = ((diff >> 16) & 0xffff); ++ ++ uint32_t *code_pos_first = (uint32_t *) addr_at(4); ++ uint32_t *code_pos_second = (uint32_t *) addr_at(8); ++ ++ int opcode = (Assembler::ori_op << 17); ++ ++ *((uint32_t *)(code_pos_first)) = (uint32_t) ((opcode | lo) << 5); ++ *((uint32_t *)(code_pos_second)) = (uint32_t) ((opcode | hi) << 5); ++} ++ ++void NativeDeoptInstruction::verify() { ++} ++ ++// Inserts an undefined instruction at a given pc ++void NativeDeoptInstruction::insert(address code_pos) { ++ *(uint32_t *)code_pos = instruction_code; ++ ICache::invalidate_range(code_pos, instruction_size); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/nativeInst_loongarch.hpp b/src/hotspot/cpu/loongarch/nativeInst_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/nativeInst_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/nativeInst_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,595 @@ ++/* ++ * Copyright (c) 1997, 2011, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_NATIVEINST_LOONGARCH_HPP ++#define CPU_LOONGARCH_NATIVEINST_LOONGARCH_HPP ++ ++#include "asm/assembler.hpp" ++#include "asm/macroAssembler.hpp" ++#include "runtime/continuation.hpp" ++#include "runtime/icache.hpp" ++#include "runtime/os.hpp" ++#include "runtime/safepointMechanism.hpp" ++ ++// We have interfaces for the following instructions: ++// - NativeInstruction ++// - - NativeCall ++// - - NativeMovConstReg ++// - - NativeMovConstRegPatching ++// - - NativeMovRegMem ++// - - NativeMovRegMemPatching ++// - - NativeIllegalOpCode ++// - - NativeGeneralJump ++// - - NativePushConst ++// - - NativeTstRegMem ++// - - NativePostCallNop ++// - - NativeDeoptInstruction ++ ++// The base class for different kinds of native instruction abstractions. ++// Provides the primitive operations to manipulate code relative to this. 
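To make the NativePostCallNop encoding concrete: patch() at the end of the .cpp above hides each 16-bit half of the displacement in bits 20..5 of an ori whose destination register is r0 (so it still behaves as a nop), and displacement(), declared later in this header, reassembles them. The stand-alone round-trip below is only a sketch; the 0x00E major opcode is inferred from the 0x03800000 pattern tested by check(), not quoted from the port's Assembler.

#include <cassert>
#include <cstdint>

int main() {
  const uint32_t ori_op = 0x00E;             // "ori" major opcode, bits 31..22 (assumption; matches check()'s 0x03800000)
  int32_t diff = 0x12345678;                 // example displacement
  // Encode, as NativePostCallNop::patch() does.
  uint32_t lo = diff & 0xffff, hi = (diff >> 16) & 0xffff;
  uint32_t w1 = ((ori_op << 17) | lo) << 5;  // == (ori_op << 22) | (lo << 5), rd == r0
  uint32_t w2 = ((ori_op << 17) | hi) << 5;
  // Decode, as NativePostCallNop::displacement() does.
  int32_t decoded = (int32_t)((((w2 >> 5) & 0xffff) << 16) | ((w1 >> 5) & 0xffff));
  assert(decoded == diff);                   // round-trips; w1 == 0x038ACF00 for this diff
  return 0;
}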
++ ++class NativeInstruction { ++ friend class Relocation; ++ ++ public: ++ enum loongarch_specific_constants { ++ nop_instruction_code = 0, ++ nop_instruction_size = 4, ++ sync_instruction_code = 0xf, ++ sync_instruction_size = 4 ++ }; ++ ++ inline bool is_nop() const { ++ uint32_t insn = *(uint32_t*)addr_at(0); ++ return insn == 0b00000011010000000000000000000000; // andi r0, r0, 0 ++ } ++ bool is_sync() { return Assembler::high(insn_word(), 17) == Assembler::dbar_op; } ++ inline bool is_call(); ++ inline bool is_far_call(); ++ inline bool is_illegal(); ++ bool is_jump(); ++ bool is_safepoint_poll(); ++ ++ // Helper func for jvmci ++ bool is_lu12iw_lu32id() const; ++ bool is_pcaddu12i_add() const; ++ ++ // LoongArch has no instruction to generate a illegal instruction exception? ++ // But `break 11` is not illegal instruction for LoongArch. ++ static int illegal_instruction(); ++ ++ bool is_int_branch(); ++ bool is_float_branch(); ++ ++ inline bool is_NativeCallTrampolineStub_at(); ++ //We use an illegal instruction for marking a method as not_entrant or zombie. ++ bool is_sigill_not_entrant(); ++ bool is_stop(); ++ ++ protected: ++ address addr_at(int offset) const { return address(this) + offset; } ++ address instruction_address() const { return addr_at(0); } ++ address next_instruction_address() const { return addr_at(BytesPerInstWord); } ++ address prev_instruction_address() const { return addr_at(-BytesPerInstWord); } ++ ++ s_char sbyte_at(int offset) const { return *(s_char*) addr_at(offset); } ++ u_char ubyte_at(int offset) const { return *(u_char*) addr_at(offset); } ++ ++ jint int_at(int offset) const { return *(jint*) addr_at(offset); } ++ juint uint_at(int offset) const { return *(juint*) addr_at(offset); } ++ ++ intptr_t ptr_at(int offset) const { return *(intptr_t*) addr_at(offset); } ++ ++ oop oop_at (int offset) const { return *(oop*) addr_at(offset); } ++ int long_at(int offset) const { return *(jint*)addr_at(offset); } ++ ++ ++ void set_char_at(int offset, char c) { *addr_at(offset) = (u_char)c; wrote(offset); } ++ void set_int_at(int offset, jint i) { *(jint*)addr_at(offset) = i; wrote(offset); } ++ void set_ptr_at (int offset, intptr_t ptr) { *(intptr_t*) addr_at(offset) = ptr; wrote(offset); } ++ void set_oop_at (int offset, oop o) { *(oop*) addr_at(offset) = o; wrote(offset); } ++ void set_long_at(int offset, long i); ++ ++ int insn_word() const { return long_at(0); } ++ ++ void wrote(int offset); ++ ++ public: ++ ++ // unit test stuff ++ static void test() {} // override for testing ++ ++ inline friend NativeInstruction* nativeInstruction_at(address address); ++}; ++ ++inline NativeInstruction* nativeInstruction_at(address address) { ++ NativeInstruction* inst = (NativeInstruction*)address; ++#ifdef ASSERT ++ //inst->verify(); ++#endif ++ return inst; ++} ++ ++class NativeCall; ++inline NativeCall* nativeCall_at(address address); ++ ++// The NativeCall is an abstraction for accessing/manipulating native call ++// instructions (used to manipulate inline caches, primitive & dll calls, etc.). ++class NativeCall: public NativeInstruction { ++ public: ++ enum loongarch_specific_constants { ++ instruction_offset = 0, ++ instruction_size = 1 * BytesPerInstWord, ++ return_address_offset = 1 * BytesPerInstWord, ++ displacement_offset = 0 ++ }; ++ ++ // We have only bl. 
++ bool is_bl() const; ++ ++ address instruction_address() const { return addr_at(instruction_offset); } ++ ++ address next_instruction_address() const { ++ return addr_at(return_address_offset); ++ } ++ ++ address return_address() const { ++ return next_instruction_address(); ++ } ++ ++ address target_addr_for_bl(address orig_addr = 0) const; ++ address destination() const; ++ void set_destination(address dest); ++ ++ void verify_alignment() {} ++ void verify(); ++ void print(); ++ ++ // Creation ++ inline friend NativeCall* nativeCall_at(address address); ++ inline friend NativeCall* nativeCall_before(address return_address); ++ ++ static bool is_call_at(address instr) { ++ return nativeInstruction_at(instr)->is_call(); ++ } ++ ++ static bool is_call_before(address return_address) { ++ return is_call_at(return_address - return_address_offset); ++ } ++ ++ // MT-safe patching of a call instruction. ++ static void insert(address code_pos, address entry); ++ static void replace_mt_safe(address instr_addr, address code_buffer); ++ ++ // Similar to replace_mt_safe, but just changes the destination. The ++ // important thing is that free-running threads are able to execute ++ // this call instruction at all times. If the call is an immediate bl ++ // instruction we can simply rely on atomicity of 32-bit writes to ++ // make sure other threads will see no intermediate states. ++ ++ // We cannot rely on locks here, since the free-running threads must run at ++ // full speed. ++ // ++ // Used in the runtime linkage of calls; see class CompiledIC. ++ ++ // The parameter assert_lock disables the assertion during code generation. ++ void set_destination_mt_safe(address dest, bool assert_lock = true); ++ ++ address get_trampoline(); ++ address trampoline_jump(CodeBuffer &cbuf, address dest); ++}; ++ ++inline NativeCall* nativeCall_at(address address) { ++ NativeCall* call = (NativeCall*)(address - NativeCall::instruction_offset); ++#ifdef ASSERT ++ call->verify(); ++#endif ++ return call; ++} ++ ++inline NativeCall* nativeCall_before(address return_address) { ++ NativeCall* call = (NativeCall*)(return_address - NativeCall::return_address_offset); ++#ifdef ASSERT ++ call->verify(); ++#endif ++ return call; ++} ++ ++// The NativeFarCall is an abstraction for accessing/manipulating native ++// call-anywhere instructions. ++// Used to call native methods which may be loaded anywhere in the address ++// space, possibly out of reach of a call instruction. ++class NativeFarCall: public NativeInstruction { ++ public: ++ enum loongarch_specific_constants { ++ instruction_offset = 0, ++ instruction_size = 2 * BytesPerInstWord ++ }; ++ ++ address instruction_address() const { return addr_at(instruction_offset); } ++ ++ // We use MacroAssembler::patchable_call() for implementing a ++ // call-anywhere instruction. ++ bool is_short() const; ++ bool is_far() const; ++ ++ // Checks whether instr points at a NativeFarCall instruction. ++ static bool is_far_call_at(address address) { ++ return nativeInstruction_at(address)->is_far_call(); ++ } ++ ++ // Returns the NativeFarCall's destination. ++ address destination(address orig_addr = 0) const; ++ ++ // Sets the NativeFarCall's destination, not necessarily mt-safe. ++ // Used when relocating code. ++ void set_destination(address dest); ++ ++ void verify(); ++}; ++ ++// Instantiates a NativeFarCall object starting at the given instruction ++// address and returns the NativeFarCall object. 
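The "far" form's target arithmetic (see NativeFarCall::destination() in the .cpp) may be easier to read written out on its own. A hedged sketch with a hypothetical helper name, using the same Assembler::simm20/simm16 helpers as the real code:

static inline address example_far_call_target(address pc, uint32_t pcaddu18i_insn, uint32_t jirl_insn) {
  intptr_t hi = (intptr_t)Assembler::simm20((pcaddu18i_insn >> 5) & 0xfffff) << 18;  // pcaddu18i: si20 << 18
  intptr_t lo = Assembler::simm16((jirl_insn >> 10) & 0xffff) << 2;                  // jirl: si16 << 2
  return pc + hi + lo;  // roughly +/- 2^37 bytes of reach, versus +/- 128 MB for a plain bl
}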
++inline NativeFarCall* nativeFarCall_at(address address) { ++ NativeFarCall* call = (NativeFarCall*)address; ++#ifdef ASSERT ++ call->verify(); ++#endif ++ return call; ++} ++ ++// An interface for accessing/manipulating native set_oop imm, reg instructions ++// (used to manipulate inlined data references, etc.). ++class NativeMovConstReg: public NativeInstruction { ++ public: ++ enum loongarch_specific_constants { ++ instruction_offset = 0, ++ instruction_size = 3 * BytesPerInstWord, ++ next_instruction_offset = 3 * BytesPerInstWord, ++ }; ++ ++ int insn_word() const { return long_at(instruction_offset); } ++ address instruction_address() const { return addr_at(0); } ++ address next_instruction_address() const { return addr_at(next_instruction_offset); } ++ intptr_t data() const; ++ void set_data(intptr_t x, intptr_t o = 0); ++ ++ bool is_li52() const { ++ return is_lu12iw_ori_lu32id() || ++ is_lu12iw_lu32id_nop() || ++ is_lu12iw_2nop() || ++ is_lu12iw_ori_nop() || ++ is_ori_2nop() || ++ is_addid_2nop(); ++ } ++ bool is_lu12iw_ori_lu32id() const; ++ bool is_lu12iw_lu32id_nop() const; ++ bool is_lu12iw_2nop() const; ++ bool is_lu12iw_ori_nop() const; ++ bool is_ori_2nop() const; ++ bool is_addid_2nop() const; ++ void verify(); ++ void print(); ++ ++ // unit test stuff ++ static void test() {} ++ ++ // Creation ++ inline friend NativeMovConstReg* nativeMovConstReg_at(address address); ++ inline friend NativeMovConstReg* nativeMovConstReg_before(address address); ++}; ++ ++inline NativeMovConstReg* nativeMovConstReg_at(address address) { ++ NativeMovConstReg* test = (NativeMovConstReg*)(address - NativeMovConstReg::instruction_offset); ++#ifdef ASSERT ++ test->verify(); ++#endif ++ return test; ++} ++ ++inline NativeMovConstReg* nativeMovConstReg_before(address address) { ++ NativeMovConstReg* test = (NativeMovConstReg*)(address - NativeMovConstReg::instruction_size - NativeMovConstReg::instruction_offset); ++#ifdef ASSERT ++ test->verify(); ++#endif ++ return test; ++} ++ ++class NativeMovConstRegPatching: public NativeMovConstReg { ++ private: ++ friend NativeMovConstRegPatching* nativeMovConstRegPatching_at(address address) { ++ NativeMovConstRegPatching* test = (NativeMovConstRegPatching*)(address - instruction_offset); ++ #ifdef ASSERT ++ test->verify(); ++ #endif ++ return test; ++ } ++}; ++ ++class NativeMovRegMem: public NativeInstruction { ++ public: ++ enum loongarch_specific_constants { ++ instruction_offset = 0, ++ instruction_size = 4, ++ hiword_offset = 4, ++ ldst_offset = 12, ++ immediate_size = 4, ++ ldst_size = 16 ++ }; ++ ++ address instruction_address() const { return addr_at(instruction_offset); } ++ ++ int num_bytes_to_end_of_patch() const { return instruction_offset + instruction_size; } ++ ++ int offset() const; ++ ++ void set_offset(int x); ++ ++ void add_offset_in_bytes(int add_offset) { set_offset ( ( offset() + add_offset ) ); } ++ ++ void verify(); ++ void print (); ++ ++ // unit test stuff ++ static void test() {} ++ ++ private: ++ inline friend NativeMovRegMem* nativeMovRegMem_at (address address); ++}; ++ ++inline NativeMovRegMem* nativeMovRegMem_at (address address) { ++ NativeMovRegMem* test = (NativeMovRegMem*)(address - NativeMovRegMem::instruction_offset); ++#ifdef ASSERT ++ test->verify(); ++#endif ++ return test; ++} ++ ++class NativeMovRegMemPatching: public NativeMovRegMem { ++ private: ++ friend NativeMovRegMemPatching* nativeMovRegMemPatching_at (address address) { ++ NativeMovRegMemPatching* test = (NativeMovRegMemPatching*)(address - 
instruction_offset); ++ #ifdef ASSERT ++ test->verify(); ++ #endif ++ return test; ++ } ++}; ++ ++ ++// Handles all kinds of jump on Loongson. ++// short: ++// b offs26 ++// nop ++// ++// far: ++// pcaddu18i reg, si20 ++// jirl r0, reg, si18 ++// ++class NativeJump: public NativeInstruction { ++ public: ++ enum loongarch_specific_constants { ++ instruction_offset = 0, ++ instruction_size = 2 * BytesPerInstWord ++ }; ++ ++ bool is_short(); ++ bool is_far(); ++ ++ address instruction_address() const { return addr_at(instruction_offset); } ++ address jump_destination(address orig_addr = 0); ++ void set_jump_destination(address dest); ++ ++ // Creation ++ inline friend NativeJump* nativeJump_at(address address); ++ ++ // Insertion of native jump instruction ++ static void insert(address code_pos, address entry) { Unimplemented(); } ++ // MT-safe insertion of native jump at verified method entry ++ static void check_verified_entry_alignment(address entry, address verified_entry){} ++ static void patch_verified_entry(address entry, address verified_entry, address dest); ++ ++ void verify(); ++}; ++ ++inline NativeJump* nativeJump_at(address address) { ++ NativeJump* jump = (NativeJump*)(address - NativeJump::instruction_offset); ++ debug_only(jump->verify();) ++ return jump; ++} ++ ++class NativeGeneralJump: public NativeJump { ++ public: ++ // Creation ++ inline friend NativeGeneralJump* nativeGeneralJump_at(address address); ++ ++ // Insertion of native general jump instruction ++ static void insert_unconditional(address code_pos, address entry); ++ static void replace_mt_safe(address instr_addr, address code_buffer); ++}; ++ ++inline NativeGeneralJump* nativeGeneralJump_at(address address) { ++ NativeGeneralJump* jump = (NativeGeneralJump*)(address); ++ debug_only(jump->verify();) ++ return jump; ++} ++ ++class NativeIllegalInstruction: public NativeInstruction { ++public: ++ enum loongarch_specific_constants { ++ instruction_code = 0xbadc0de0, // TODO: LA ++ // Temporary LoongArch reserved instruction ++ instruction_size = 4, ++ instruction_offset = 0, ++ next_instruction_offset = 4 ++ }; ++ ++ // Insert illegal opcode as specific address ++ static void insert(address code_pos); ++}; ++ ++inline bool NativeInstruction::is_illegal() { return insn_word() == illegal_instruction(); } ++ ++inline bool NativeInstruction::is_call() { ++ NativeCall *call = (NativeCall*)instruction_address(); ++ return call->is_bl(); ++} ++ ++inline bool NativeInstruction::is_far_call() { ++ NativeFarCall *call = (NativeFarCall*)instruction_address(); ++ ++ // short ++ if (call->is_short()) { ++ return true; ++ } ++ ++ // far ++ if (call->is_far()) { ++ return true; ++ } ++ ++ return false; ++} ++ ++inline bool NativeInstruction::is_jump() ++{ ++ NativeGeneralJump *jump = (NativeGeneralJump*)instruction_address(); ++ ++ // short ++ if (jump->is_short()) { ++ return true; ++ } ++ ++ // far ++ if (jump->is_far()) { ++ return true; ++ } ++ ++ return false; ++} ++ ++// Call trampoline stubs. ++class NativeCallTrampolineStub : public NativeInstruction { ++ public: ++ ++ enum la_specific_constants { ++ instruction_size = 6 * 4, ++ instruction_offset = 0, ++ data_offset = 4 * 4, ++ next_instruction_offset = 6 * 4 ++ }; ++ ++ address destination() const { ++ return (address)ptr_at(data_offset); ++ } ++ ++ void set_destination(address new_destination) { ++ set_ptr_at(data_offset, (intptr_t)new_destination); ++ OrderAccess::fence(); ++ } ++}; ++ ++// Note: Other stubs must not begin with this pattern. 
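A brief usage sketch (hypothetical helper): because the stub keeps its destination as an 8-byte literal at data_offset rather than encoding it in instructions, retargeting is a pointer store plus a fence, and the pcaddi/ld_d/jirl code of the stub itself is never patched:

static inline void example_retarget_trampoline(NativeCallTrampolineStub* stub, address new_dest) {
  stub->set_destination(new_dest);  // set_ptr_at() on the data slot, then OrderAccess::fence()
}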
++inline bool NativeInstruction::is_NativeCallTrampolineStub_at() { ++ // pcaddi ++ // ld_d ++ // jirl ++ return Assembler::high(int_at(0), 7) == Assembler::pcaddi_op && ++ Assembler::high(int_at(4), 10) == Assembler::ld_d_op && ++ Assembler::high(int_at(8), 6) == Assembler::jirl_op && ++ Assembler::low(int_at(8), 5) == R0->encoding(); ++} ++ ++inline NativeCallTrampolineStub* nativeCallTrampolineStub_at(address addr) { ++ NativeInstruction* ni = nativeInstruction_at(addr); ++ assert(ni->is_NativeCallTrampolineStub_at(), "no call trampoline found"); ++ return (NativeCallTrampolineStub*)addr; ++} ++ ++class NativePostCallNop: public NativeInstruction { ++public: ++ ++ bool check() const { ++ // nop; ori R0, xx, xx; ori R0, xx, xx; ++ return is_nop() && ((uint_at(4) & 0xffc0001f) == 0x03800000); ++ } ++ ++ jint displacement() const { ++ uint32_t first_ori = uint_at(4); ++ uint32_t second_ori = uint_at(8); ++ int lo = ((first_ori >> 5) & 0xffff); ++ int hi = ((second_ori >> 5) & 0xffff); ++ return (jint) ((hi << 16) | lo); ++ } ++ ++ void patch(jint diff); ++ void make_deopt(); ++}; ++ ++inline NativePostCallNop* nativePostCallNop_at(address address) { ++ NativePostCallNop* nop = (NativePostCallNop*) address; ++ if (nop->check()) { ++ return nop; ++ } ++ return nullptr; ++} ++ ++inline NativePostCallNop* nativePostCallNop_unsafe_at(address address) { ++ NativePostCallNop* nop = (NativePostCallNop*) address; ++ assert(nop->check(), ""); ++ return nop; ++} ++ ++class NativeDeoptInstruction: public NativeInstruction { ++ public: ++ enum { ++ // deopt instruction code should never be the same as NativeIllegalInstruction ++ instruction_code = 0xbadcdead, ++ instruction_size = 4, ++ instruction_offset = 0, ++ }; ++ ++ address instruction_address() const { return addr_at(instruction_offset); } ++ address next_instruction_address() const { return addr_at(instruction_size); } ++ ++ void verify(); ++ ++ static bool is_deopt_at(address instr) { ++ assert(instr != nullptr, ""); ++ uint32_t value = *(uint32_t *) instr; ++ return value == instruction_code; ++ } ++ ++ // MT-safe patching ++ static void insert(address code_pos); ++}; ++ ++class NativeMembar : public NativeInstruction { ++public: ++ unsigned int get_hint() { return Assembler::low(insn_word(), 4); } ++ void set_hint(int hint) { Assembler::patch(addr_at(0), 4, hint); } ++}; ++ ++#endif // CPU_LOONGARCH_NATIVEINST_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/register_loongarch.cpp b/src/hotspot/cpu/loongarch/register_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/register_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/register_loongarch.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,61 @@ ++/* ++ * Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "register_loongarch.hpp" ++ ++Register::RegisterImpl \ ++ all_RegisterImpls [Register::number_of_registers + 1]; ++FloatRegister::FloatRegisterImpl \ ++ all_FloatRegisterImpls [FloatRegister::number_of_registers + 1]; ++ConditionalFlagRegister::ConditionalFlagRegisterImpl \ ++ all_ConditionalFlagRegisterImpls[ConditionalFlagRegister::number_of_registers + 1]; ++ ++const char* Register::RegisterImpl::name() const { ++ static const char *const names[number_of_registers] = { ++ "zero", "ra", "tp", "sp", "a0/v0", "a1/v1", "a2", "a3", ++ "a4", "a5", "a6", "a7", "t0", "t1", "t2", "t3", ++ "t4", "t5", "t6", "t7", "t8", "x", "fp", "s0", ++ "s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8" ++ }; ++ return is_valid() ? names[encoding()] : "noreg"; ++} ++ ++const char* FloatRegister::FloatRegisterImpl::name() const { ++ static const char *const names[number_of_registers] = { ++ "f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7", ++ "f8", "f9", "f10", "f11", "f12", "f13", "f14", "f15", ++ "f16", "f17", "f18", "f19", "f20", "f21", "f22", "f23", ++ "f24", "f25", "f26", "f27", "f28", "f29", "f30", "f31", ++ }; ++ return is_valid() ? names[encoding()] : "fnoreg"; ++} ++ ++const char* ConditionalFlagRegister::ConditionalFlagRegisterImpl::name() const { ++ static const char *const names[number_of_registers] = { ++ "fcc0", "fcc1", "fcc2", "fcc3", "fcc4", "fcc5", "fcc6", "fcc7", ++ }; ++ return is_valid() ? names[encoding()] : "fccnoreg"; ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/register_loongarch.hpp b/src/hotspot/cpu/loongarch/register_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/register_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/register_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,479 @@ ++/* ++ * Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_REGISTER_LOONGARCH_HPP ++#define CPU_LOONGARCH_REGISTER_LOONGARCH_HPP ++ ++#include "asm/register.hpp" ++#include "utilities/powerOfTwo.hpp" ++#include "logging/log.hpp" ++#include "utilities/bitMap.hpp" ++#include "utilities/formatBuffer.hpp" ++#include "utilities/ticks.hpp" ++ ++class VMRegImpl; ++typedef VMRegImpl* VMReg; ++ ++class Register { ++ private: ++ int _encoding; ++ ++ constexpr explicit Register(int encoding) : _encoding(encoding) {} ++ ++ public: ++ enum { ++ number_of_registers = 32, ++ max_slots_per_register = 2, ++ }; ++ ++ class RegisterImpl: public AbstractRegisterImpl { ++ friend class Register; ++ ++ static constexpr const RegisterImpl* first(); ++ ++ public: ++ // accessors ++ constexpr int raw_encoding() const { return this - first(); } ++ constexpr int encoding() const { assert(is_valid(), "invalid register"); return raw_encoding(); } ++ constexpr bool is_valid() const { return 0 <= raw_encoding() && raw_encoding() < number_of_registers; } ++ ++ // derived registers, offsets, and addresses ++ inline Register successor() const; ++ ++ VMReg as_VMReg() const; ++ ++ const char* name() const; ++ }; ++ ++ inline friend constexpr Register as_Register(int encoding); ++ ++ constexpr Register() : _encoding(-1) {} // noreg ++ ++ int operator==(const Register r) const { return _encoding == r._encoding; } ++ int operator!=(const Register r) const { return _encoding != r._encoding; } ++ ++ constexpr const RegisterImpl* operator->() const { return RegisterImpl::first() + _encoding; } ++}; ++ ++extern Register::RegisterImpl all_RegisterImpls[Register::number_of_registers + 1] INTERNAL_VISIBILITY; ++ ++inline constexpr const Register::RegisterImpl* Register::RegisterImpl::first() { ++ return all_RegisterImpls + 1; ++} ++ ++constexpr Register noreg = Register(); ++ ++inline constexpr Register as_Register(int encoding) { ++ if (0 <= encoding && encoding < Register::number_of_registers) { ++ return Register(encoding); ++ } ++ return noreg; ++} ++ ++inline Register Register::RegisterImpl::successor() const { ++ assert(is_valid(), "sanity"); ++ return as_Register(encoding() + 1); ++} ++ ++// The integer registers of the LoongArch architecture ++constexpr Register r0 = as_Register( 0); ++constexpr Register r1 = as_Register( 1); ++constexpr Register r2 = as_Register( 2); ++constexpr Register r3 = as_Register( 3); ++constexpr Register r4 = as_Register( 4); ++constexpr Register r5 = as_Register( 5); ++constexpr Register r6 = as_Register( 6); ++constexpr Register r7 = as_Register( 7); ++constexpr Register r8 = as_Register( 8); ++constexpr Register r9 = as_Register( 9); ++constexpr Register r10 = as_Register(10); ++constexpr Register r11 = as_Register(11); ++constexpr Register r12 = as_Register(12); ++constexpr Register r13 = as_Register(13); ++constexpr Register r14 = as_Register(14); ++constexpr Register r15 = as_Register(15); ++constexpr Register r16 = as_Register(16); ++constexpr Register r17 = as_Register(17); ++constexpr Register r18 = as_Register(18); ++constexpr Register r19 = as_Register(19); ++constexpr Register r20 = as_Register(20); ++constexpr Register r21 = as_Register(21); ++constexpr Register r22 = as_Register(22); ++constexpr Register r23 = as_Register(23); ++constexpr Register r24 = as_Register(24); ++constexpr Register r25 = as_Register(25); ++constexpr Register r26 
= as_Register(26); ++constexpr Register r27 = as_Register(27); ++constexpr Register r28 = as_Register(28); ++constexpr Register r29 = as_Register(29); ++constexpr Register r30 = as_Register(30); ++constexpr Register r31 = as_Register(31); ++ ++ ++constexpr Register NOREG = noreg; ++constexpr Register R0 = r0; ++constexpr Register R1 = r1; ++constexpr Register R2 = r2; ++constexpr Register R3 = r3; ++constexpr Register R4 = r4; ++constexpr Register R5 = r5; ++constexpr Register R6 = r6; ++constexpr Register R7 = r7; ++constexpr Register R8 = r8; ++constexpr Register R9 = r9; ++constexpr Register R10 = r10; ++constexpr Register R11 = r11; ++constexpr Register R12 = r12; ++constexpr Register R13 = r13; ++constexpr Register R14 = r14; ++constexpr Register R15 = r15; ++constexpr Register R16 = r16; ++constexpr Register R17 = r17; ++constexpr Register R18 = r18; ++constexpr Register R19 = r19; ++constexpr Register R20 = r20; ++constexpr Register R21 = r21; ++constexpr Register R22 = r22; ++constexpr Register R23 = r23; ++constexpr Register R24 = r24; ++constexpr Register R25 = r25; ++constexpr Register R26 = r26; ++constexpr Register R27 = r27; ++constexpr Register R28 = r28; ++constexpr Register R29 = r29; ++constexpr Register R30 = r30; ++constexpr Register R31 = r31; ++ ++ ++constexpr Register RA = R1; ++constexpr Register TP = R2; ++constexpr Register SP = R3; ++constexpr Register A0 = R4; ++constexpr Register A1 = R5; ++constexpr Register A2 = R6; ++constexpr Register A3 = R7; ++constexpr Register A4 = R8; ++constexpr Register A5 = R9; ++constexpr Register A6 = R10; ++constexpr Register A7 = R11; ++constexpr Register T0 = R12; ++constexpr Register T1 = R13; ++constexpr Register T2 = R14; ++constexpr Register T3 = R15; ++constexpr Register T4 = R16; ++constexpr Register T5 = R17; ++constexpr Register T6 = R18; ++constexpr Register T7 = R19; ++constexpr Register T8 = R20; ++constexpr Register RX = R21; ++constexpr Register FP = R22; ++constexpr Register S0 = R23; ++constexpr Register S1 = R24; ++constexpr Register S2 = R25; ++constexpr Register S3 = R26; ++constexpr Register S4 = R27; ++constexpr Register S5 = R28; ++constexpr Register S6 = R29; ++constexpr Register S7 = R30; ++constexpr Register S8 = R31; ++ ++ ++// Use FloatRegister as shortcut ++class FloatRegister { ++ private: ++ int _encoding; ++ ++ constexpr explicit FloatRegister(int encoding) : _encoding(encoding) {} ++ ++ public: ++ inline friend constexpr FloatRegister as_FloatRegister(int encoding); ++ ++ enum { ++ number_of_registers = 32, ++ save_slots_per_register = 2, ++ slots_per_lsx_register = 4, ++ slots_per_lasx_register = 8, ++ max_slots_per_register = 8 ++ }; ++ ++ class FloatRegisterImpl: public AbstractRegisterImpl { ++ friend class FloatRegister; ++ ++ static constexpr const FloatRegisterImpl* first(); ++ ++ public: ++ // accessors ++ constexpr int raw_encoding() const { return this - first(); } ++ constexpr int encoding() const { assert(is_valid(), "invalid register"); return raw_encoding(); } ++ constexpr bool is_valid() const { return 0 <= raw_encoding() && raw_encoding() < number_of_registers; } ++ ++ // derived registers, offsets, and addresses ++ inline FloatRegister successor() const; ++ ++ VMReg as_VMReg() const; ++ ++ const char* name() const; ++ }; ++ ++ constexpr FloatRegister() : _encoding(-1) {} // fnoreg ++ ++ int operator==(const FloatRegister r) const { return _encoding == r._encoding; } ++ int operator!=(const FloatRegister r) const { return _encoding != r._encoding; } ++ ++ constexpr const 
FloatRegisterImpl* operator->() const { return FloatRegisterImpl::first() + _encoding; } ++}; ++ ++extern FloatRegister::FloatRegisterImpl all_FloatRegisterImpls[FloatRegister::number_of_registers + 1] INTERNAL_VISIBILITY; ++ ++inline constexpr const FloatRegister::FloatRegisterImpl* FloatRegister::FloatRegisterImpl::first() { ++ return all_FloatRegisterImpls + 1; ++} ++ ++constexpr FloatRegister fnoreg = FloatRegister(); ++ ++inline constexpr FloatRegister as_FloatRegister(int encoding) { ++ if (0 <= encoding && encoding < FloatRegister::number_of_registers) { ++ return FloatRegister(encoding); ++ } ++ return fnoreg; ++} ++ ++inline FloatRegister FloatRegister::FloatRegisterImpl::successor() const { ++ assert(is_valid(), "sanity"); ++ return as_FloatRegister(encoding() + 1); ++} ++ ++constexpr FloatRegister f0 = as_FloatRegister( 0); ++constexpr FloatRegister f1 = as_FloatRegister( 1); ++constexpr FloatRegister f2 = as_FloatRegister( 2); ++constexpr FloatRegister f3 = as_FloatRegister( 3); ++constexpr FloatRegister f4 = as_FloatRegister( 4); ++constexpr FloatRegister f5 = as_FloatRegister( 5); ++constexpr FloatRegister f6 = as_FloatRegister( 6); ++constexpr FloatRegister f7 = as_FloatRegister( 7); ++constexpr FloatRegister f8 = as_FloatRegister( 8); ++constexpr FloatRegister f9 = as_FloatRegister( 9); ++constexpr FloatRegister f10 = as_FloatRegister(10); ++constexpr FloatRegister f11 = as_FloatRegister(11); ++constexpr FloatRegister f12 = as_FloatRegister(12); ++constexpr FloatRegister f13 = as_FloatRegister(13); ++constexpr FloatRegister f14 = as_FloatRegister(14); ++constexpr FloatRegister f15 = as_FloatRegister(15); ++constexpr FloatRegister f16 = as_FloatRegister(16); ++constexpr FloatRegister f17 = as_FloatRegister(17); ++constexpr FloatRegister f18 = as_FloatRegister(18); ++constexpr FloatRegister f19 = as_FloatRegister(19); ++constexpr FloatRegister f20 = as_FloatRegister(20); ++constexpr FloatRegister f21 = as_FloatRegister(21); ++constexpr FloatRegister f22 = as_FloatRegister(22); ++constexpr FloatRegister f23 = as_FloatRegister(23); ++constexpr FloatRegister f24 = as_FloatRegister(24); ++constexpr FloatRegister f25 = as_FloatRegister(25); ++constexpr FloatRegister f26 = as_FloatRegister(26); ++constexpr FloatRegister f27 = as_FloatRegister(27); ++constexpr FloatRegister f28 = as_FloatRegister(28); ++constexpr FloatRegister f29 = as_FloatRegister(29); ++constexpr FloatRegister f30 = as_FloatRegister(30); ++constexpr FloatRegister f31 = as_FloatRegister(31); ++ ++ ++constexpr FloatRegister FNOREG = fnoreg; ++constexpr FloatRegister F0 = f0; ++constexpr FloatRegister F1 = f1; ++constexpr FloatRegister F2 = f2; ++constexpr FloatRegister F3 = f3; ++constexpr FloatRegister F4 = f4; ++constexpr FloatRegister F5 = f5; ++constexpr FloatRegister F6 = f6; ++constexpr FloatRegister F7 = f7; ++constexpr FloatRegister F8 = f8; ++constexpr FloatRegister F9 = f9; ++constexpr FloatRegister F10 = f10; ++constexpr FloatRegister F11 = f11; ++constexpr FloatRegister F12 = f12; ++constexpr FloatRegister F13 = f13; ++constexpr FloatRegister F14 = f14; ++constexpr FloatRegister F15 = f15; ++constexpr FloatRegister F16 = f16; ++constexpr FloatRegister F17 = f17; ++constexpr FloatRegister F18 = f18; ++constexpr FloatRegister F19 = f19; ++constexpr FloatRegister F20 = f20; ++constexpr FloatRegister F21 = f21; ++constexpr FloatRegister F22 = f22; ++constexpr FloatRegister F23 = f23; ++constexpr FloatRegister F24 = f24; ++constexpr FloatRegister F25 = f25; ++constexpr FloatRegister F26 = f26; ++constexpr 
FloatRegister F27 = f27; ++constexpr FloatRegister F28 = f28; ++constexpr FloatRegister F29 = f29; ++constexpr FloatRegister F30 = f30; ++constexpr FloatRegister F31 = f31; ++ ++constexpr FloatRegister FA0 = F0; ++constexpr FloatRegister FA1 = F1; ++constexpr FloatRegister FA2 = F2; ++constexpr FloatRegister FA3 = F3; ++constexpr FloatRegister FA4 = F4; ++constexpr FloatRegister FA5 = F5; ++constexpr FloatRegister FA6 = F6; ++constexpr FloatRegister FA7 = F7; ++constexpr FloatRegister FT0 = F8; ++constexpr FloatRegister FT1 = F9; ++constexpr FloatRegister FT2 = F10; ++constexpr FloatRegister FT3 = F11; ++constexpr FloatRegister FT4 = F12; ++constexpr FloatRegister FT5 = F13; ++constexpr FloatRegister FT6 = F14; ++constexpr FloatRegister FT7 = F15; ++constexpr FloatRegister FT8 = F16; ++constexpr FloatRegister FT9 = F17; ++constexpr FloatRegister FT10 = F18; ++constexpr FloatRegister FT11 = F19; ++constexpr FloatRegister FT12 = F20; ++constexpr FloatRegister FT13 = F21; ++constexpr FloatRegister FT14 = F22; ++constexpr FloatRegister FT15 = F23; ++constexpr FloatRegister FS0 = F24; ++constexpr FloatRegister FS1 = F25; ++constexpr FloatRegister FS2 = F26; ++constexpr FloatRegister FS3 = F27; ++constexpr FloatRegister FS4 = F28; ++constexpr FloatRegister FS5 = F29; ++constexpr FloatRegister FS6 = F30; ++constexpr FloatRegister FS7 = F31; ++ ++ ++class ConditionalFlagRegister { ++ int _encoding; ++ ++ constexpr explicit ConditionalFlagRegister(int encoding) : _encoding(encoding) {} ++ ++ public: ++ inline friend constexpr ConditionalFlagRegister as_ConditionalFlagRegister(int encoding); ++ ++ enum { ++ number_of_registers = 8 ++ }; ++ ++ class ConditionalFlagRegisterImpl: public AbstractRegisterImpl { ++ friend class ConditionalFlagRegister; ++ ++ static constexpr const ConditionalFlagRegisterImpl* first(); ++ ++ public: ++ // accessors ++ int raw_encoding() const { return this - first(); } ++ int encoding() const { assert(is_valid(), "invalid register"); return raw_encoding(); } ++ bool is_valid() const { return 0 <= raw_encoding() && raw_encoding() < number_of_registers; } ++ ++ // derived registers, offsets, and addresses ++ inline ConditionalFlagRegister successor() const; ++ ++ VMReg as_VMReg() const; ++ ++ const char* name() const; ++ }; ++ ++ constexpr ConditionalFlagRegister() : _encoding(-1) {} // vnoreg ++ ++ int operator==(const ConditionalFlagRegister r) const { return _encoding == r._encoding; } ++ int operator!=(const ConditionalFlagRegister r) const { return _encoding != r._encoding; } ++ ++ const ConditionalFlagRegisterImpl* operator->() const { return ConditionalFlagRegisterImpl::first() + _encoding; } ++}; ++ ++extern ConditionalFlagRegister::ConditionalFlagRegisterImpl all_ConditionalFlagRegisterImpls[ConditionalFlagRegister::number_of_registers + 1] INTERNAL_VISIBILITY; ++ ++inline constexpr const ConditionalFlagRegister::ConditionalFlagRegisterImpl* ConditionalFlagRegister::ConditionalFlagRegisterImpl::first() { ++ return all_ConditionalFlagRegisterImpls + 1; ++} ++ ++constexpr ConditionalFlagRegister cfnoreg = ConditionalFlagRegister(); ++ ++inline constexpr ConditionalFlagRegister as_ConditionalFlagRegister(int encoding) { ++ if (0 <= encoding && encoding < ConditionalFlagRegister::number_of_registers) { ++ return ConditionalFlagRegister(encoding); ++ } ++ return cfnoreg; ++} ++ ++inline ConditionalFlagRegister ConditionalFlagRegister::ConditionalFlagRegisterImpl::successor() const { ++ assert(is_valid(), "sanity"); ++ return as_ConditionalFlagRegister(encoding() + 1); 
++} ++ ++constexpr ConditionalFlagRegister fcc0 = as_ConditionalFlagRegister(0); ++constexpr ConditionalFlagRegister fcc1 = as_ConditionalFlagRegister(1); ++constexpr ConditionalFlagRegister fcc2 = as_ConditionalFlagRegister(2); ++constexpr ConditionalFlagRegister fcc3 = as_ConditionalFlagRegister(3); ++constexpr ConditionalFlagRegister fcc4 = as_ConditionalFlagRegister(4); ++constexpr ConditionalFlagRegister fcc5 = as_ConditionalFlagRegister(5); ++constexpr ConditionalFlagRegister fcc6 = as_ConditionalFlagRegister(6); ++constexpr ConditionalFlagRegister fcc7 = as_ConditionalFlagRegister(7); ++ ++constexpr ConditionalFlagRegister FCC0 = fcc0; ++constexpr ConditionalFlagRegister FCC1 = fcc1; ++constexpr ConditionalFlagRegister FCC2 = fcc2; ++constexpr ConditionalFlagRegister FCC3 = fcc3; ++constexpr ConditionalFlagRegister FCC4 = fcc4; ++constexpr ConditionalFlagRegister FCC5 = fcc5; ++constexpr ConditionalFlagRegister FCC6 = fcc6; ++constexpr ConditionalFlagRegister FCC7 = fcc7; ++ ++// Need to know the total number of registers of all sorts for SharedInfo. ++// Define a class that exports it. ++class ConcreteRegisterImpl : public AbstractRegisterImpl { ++ public: ++ enum { ++ max_gpr = Register::number_of_registers * Register::max_slots_per_register, ++ max_fpr = max_gpr + FloatRegister::number_of_registers * FloatRegister::max_slots_per_register, ++ ++ // A big enough number for C2: all the registers plus flags ++ // This number must be large enough to cover REG_COUNT (defined by c2) registers. ++ // There is no requirement that any ordering here matches any ordering c2 gives ++ // it's optoregs. ++ number_of_registers = max_fpr // gpr/fpr/vpr ++ }; ++}; ++ ++typedef AbstractRegSet RegSet; ++typedef AbstractRegSet FloatRegSet; ++ ++ ++template <> ++inline Register AbstractRegSet::first() { ++ uint32_t first = _bitset & -_bitset; ++ return first ? as_Register(exact_log2(first)) : noreg; ++} ++ ++template <> ++inline FloatRegister AbstractRegSet::first() { ++ uint32_t first = _bitset & -_bitset; ++ return first ? as_FloatRegister(exact_log2(first)) : fnoreg; ++} ++ ++#endif //CPU_LOONGARCH_REGISTER_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/registerMap_loongarch.hpp b/src/hotspot/cpu/loongarch/registerMap_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/registerMap_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/registerMap_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,59 @@ ++/* ++ * Copyright (c) 1998, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_REGISTERMAP_LOONGARCH_HPP ++#define CPU_LOONGARCH_REGISTERMAP_LOONGARCH_HPP ++ ++// machine-dependent implementation for register maps ++ friend class frame; ++ ++ private: ++ // This is the hook for finding a register in an "well-known" location, ++ // such as a register block of a predetermined format. ++ // Since there is none, we just return null. ++ // See registerMap_sparc.hpp for an example of grabbing registers ++ // from register save areas of a standard layout. ++ address pd_location(VMReg reg) const {return nullptr;} ++ address pd_location(VMReg base_reg, int slot_idx) const { ++ if (base_reg->is_FloatRegister()) { ++ assert(base_reg->is_concrete(), "must pass base reg"); ++ intptr_t offset_in_bytes = slot_idx * VMRegImpl::stack_slot_size; ++ address base_location = location(base_reg, nullptr); ++ if (base_location != nullptr) { ++ return base_location + offset_in_bytes; ++ } else { ++ return nullptr; ++ } ++ } else { ++ return location(base_reg->next(slot_idx), nullptr); ++ } ++ } ++ ++ // no PD state to clear or copy: ++ void pd_clear() {} ++ void pd_initialize() {} ++ void pd_initialize_from(const RegisterMap* map) {} ++ ++#endif // CPU_LOONGARCH_REGISTERMAP_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/relocInfo_loongarch.cpp b/src/hotspot/cpu/loongarch/relocInfo_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/relocInfo_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/relocInfo_loongarch.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,133 @@ ++/* ++ * Copyright (c) 1998, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "code/relocInfo.hpp" ++#include "compiler/disassembler.hpp" ++#include "nativeInst_loongarch.hpp" ++#include "oops/compressedOops.inline.hpp" ++#include "oops/oop.hpp" ++#include "oops/oop.inline.hpp" ++#include "runtime/safepoint.hpp" ++ ++ ++void Relocation::pd_set_data_value(address x, intptr_t o, bool verify_only) { ++ x += o; ++ typedef Assembler::WhichOperand WhichOperand; ++ WhichOperand which = (WhichOperand) format(); // that is, disp32 or imm, call32, narrow oop ++ assert(which == Assembler::disp32_operand || ++ which == Assembler::narrow_oop_operand || ++ which == Assembler::imm_operand, "format unpacks ok"); ++ if (type() == relocInfo::internal_word_type || ++ type() == relocInfo::section_word_type || ++ type() == relocInfo::external_word_type) { ++ MacroAssembler::pd_patch_instruction(addr(), x); ++ } else if (which == Assembler::imm_operand) { ++ if (verify_only) { ++ assert(nativeMovConstReg_at(addr())->data() == (long)x, "instructions must match"); ++ } else { ++ nativeMovConstReg_at(addr())->set_data((intptr_t)(x)); ++ } ++ } else if (which == Assembler::narrow_oop_operand) { ++ // both compressed oops and compressed classes look the same ++ if (CompressedOops::is_in((void*)x)) { ++ if (verify_only) { ++ assert(nativeMovConstReg_at(addr())->data() == (long)CompressedOops::encode(cast_to_oop(x)), "instructions must match"); ++ } else { ++ nativeMovConstReg_at(addr())->set_data((intptr_t)(CompressedOops::encode(cast_to_oop(x))), (intptr_t)(x)); ++ } ++ } else { ++ if (verify_only) { ++ assert(nativeMovConstReg_at(addr())->data() == (long)CompressedKlassPointers::encode((Klass*)x), "instructions must match"); ++ } else { ++ nativeMovConstReg_at(addr())->set_data((intptr_t)(CompressedKlassPointers::encode((Klass*)x)), (intptr_t)(x)); ++ } ++ } ++ } else { ++ // Note: Use runtime_call_type relocations for call32_operand. ++ assert(0, "call32_operand not supported in LoongArch64"); ++ } ++} ++ ++ ++address Relocation::pd_call_destination(address orig_addr) { ++ NativeInstruction* ni = nativeInstruction_at(addr()); ++ if (ni->is_far_call()) { ++ return nativeFarCall_at(addr())->destination(orig_addr); ++ } else if (ni->is_call()) { ++ address trampoline = nativeCall_at(addr())->get_trampoline(); ++ if (trampoline) { ++ return nativeCallTrampolineStub_at(trampoline)->destination(); ++ } else { ++ address new_addr = nativeCall_at(addr())->target_addr_for_bl(orig_addr); ++ // If call is branch to self, don't try to relocate it, just leave it ++ // as branch to self. This happens during code generation if the code ++ // buffer expands. It will be relocated to the trampoline above once ++ // code generation is complete. ++ return (new_addr == orig_addr) ? 
addr() : new_addr; ++ } ++ } else if (ni->is_jump()) { ++ return nativeGeneralJump_at(addr())->jump_destination(orig_addr); ++ } else { ++ tty->print_cr("\nError!\ncall destination: " INTPTR_FORMAT, p2i(addr())); ++ Disassembler::decode(addr() - 10 * BytesPerInstWord, addr() + 10 * BytesPerInstWord, tty); ++ ShouldNotReachHere(); ++ return nullptr; ++ } ++} ++ ++void Relocation::pd_set_call_destination(address x) { ++ NativeInstruction* ni = nativeInstruction_at(addr()); ++ if (ni->is_far_call()) { ++ nativeFarCall_at(addr())->set_destination(x); ++ } else if (ni->is_call()) { ++ address trampoline = nativeCall_at(addr())->get_trampoline(); ++ if (trampoline) { ++ nativeCall_at(addr())->set_destination_mt_safe(x, false); ++ } else { ++ nativeCall_at(addr())->set_destination(x); ++ } ++ } else if (ni->is_jump()) { ++ nativeGeneralJump_at(addr())->set_jump_destination(x); ++ } else { ++ ShouldNotReachHere(); ++ } ++} ++ ++address* Relocation::pd_address_in_code() { ++ return (address*)addr(); ++} ++ ++address Relocation::pd_get_address_from_code() { ++ NativeMovConstReg* ni = nativeMovConstReg_at(addr()); ++ return (address)ni->data(); ++} ++ ++void poll_Relocation::fix_relocation_after_move(const CodeBuffer* src, CodeBuffer* dest) { ++} ++ ++void metadata_Relocation::pd_fix_value(address x) { ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/relocInfo_loongarch.hpp b/src/hotspot/cpu/loongarch/relocInfo_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/relocInfo_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/relocInfo_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,44 @@ ++/* ++ * Copyright (c) 1997, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_RELOCINFO_LOONGARCH_HPP ++#define CPU_LOONGARCH_RELOCINFO_LOONGARCH_HPP ++ ++ // machine-dependent parts of class relocInfo ++ private: ++ enum { ++ // Since LoongArch instructions are whole words, ++ // the two low-order offset bits can always be discarded. ++ offset_unit = 4, ++ ++ // imm_oop_operand vs. 
narrow_oop_operand ++ format_width = 2 ++ }; ++ ++ public: ++ ++ static bool mustIterateImmediateOopsInCode() { return false; } ++ ++#endif // CPU_LOONGARCH_RELOCINFO_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/sharedRuntime_loongarch_64.cpp b/src/hotspot/cpu/loongarch/sharedRuntime_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/sharedRuntime_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/sharedRuntime_loongarch_64.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,2975 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "code/compiledIC.hpp" ++#include "code/debugInfoRec.hpp" ++#include "code/icBuffer.hpp" ++#include "code/nativeInst.hpp" ++#include "code/vtableStubs.hpp" ++#include "compiler/oopMap.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "interpreter/interpreter.hpp" ++#include "oops/compiledICHolder.hpp" ++#include "oops/klass.inline.hpp" ++#include "oops/method.inline.hpp" ++#include "prims/methodHandles.hpp" ++#include "runtime/continuation.hpp" ++#include "runtime/continuationEntry.inline.hpp" ++#include "runtime/globals.hpp" ++#include "runtime/jniHandles.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/signature.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "runtime/vframeArray.hpp" ++#include "vmreg_loongarch.inline.hpp" ++#ifdef COMPILER2 ++#include "opto/runtime.hpp" ++#endif ++#if INCLUDE_JVMCI ++#include "jvmci/jvmciJavaClasses.hpp" ++#endif ++ ++#define __ masm-> ++ ++const int StackAlignmentInSlots = StackAlignmentInBytes / VMRegImpl::stack_slot_size; ++ ++class RegisterSaver { ++ // Capture info about frame layout ++ enum layout { ++ fpr0_off = 0, ++ fpr1_off, ++ fpr2_off, ++ fpr3_off, ++ fpr4_off, ++ fpr5_off, ++ fpr6_off, ++ fpr7_off, ++ fpr8_off, ++ fpr9_off, ++ fpr10_off, ++ fpr11_off, ++ fpr12_off, ++ fpr13_off, ++ fpr14_off, ++ fpr15_off, ++ fpr16_off, ++ fpr17_off, ++ fpr18_off, ++ fpr19_off, ++ fpr20_off, ++ fpr21_off, ++ fpr22_off, ++ fpr23_off, ++ fpr24_off, ++ fpr25_off, ++ fpr26_off, ++ fpr27_off, ++ fpr28_off, ++ fpr29_off, ++ fpr30_off, ++ fpr31_off, ++ a0_off, ++ a1_off, ++ a2_off, ++ a3_off, ++ a4_off, ++ a5_off, ++ a6_off, ++ a7_off, ++ t0_off, ++ t1_off, ++ t2_off, ++ 
t3_off, ++ t4_off, ++ t5_off, ++ t6_off, ++ t7_off, ++ t8_off, ++ s0_off, ++ s1_off, ++ s2_off, ++ s3_off, ++ s4_off, ++ s5_off, ++ s6_off, ++ s7_off, ++ s8_off, ++ fp_off, ++ ra_off, ++ fpr_size = fpr31_off - fpr0_off + 1, ++ gpr_size = ra_off - a0_off + 1, ++ }; ++ ++ const bool _save_vectors; ++ public: ++ RegisterSaver(bool save_vectors) : _save_vectors(save_vectors) {} ++ ++ OopMap* save_live_registers(MacroAssembler* masm, int additional_frame_words, int* total_frame_words); ++ void restore_live_registers(MacroAssembler* masm); ++ ++ int slots_save() { ++ int slots = gpr_size * VMRegImpl::slots_per_word; ++ ++ if (_save_vectors && UseLASX) ++ slots += FloatRegister::slots_per_lasx_register * fpr_size; ++ else if (_save_vectors && UseLSX) ++ slots += FloatRegister::slots_per_lsx_register * fpr_size; ++ else ++ slots += FloatRegister::save_slots_per_register * fpr_size; ++ ++ return slots; ++ } ++ ++ int gpr_offset(int off) { ++ int slots_per_fpr = FloatRegister::save_slots_per_register; ++ int slots_per_gpr = VMRegImpl::slots_per_word; ++ ++ if (_save_vectors && UseLASX) ++ slots_per_fpr = FloatRegister::slots_per_lasx_register; ++ else if (_save_vectors && UseLSX) ++ slots_per_fpr = FloatRegister::slots_per_lsx_register; ++ ++ return (fpr_size * slots_per_fpr + (off - a0_off) * slots_per_gpr) * VMRegImpl::stack_slot_size; ++ } ++ ++ int fpr_offset(int off) { ++ int slots_per_fpr = FloatRegister::save_slots_per_register; ++ ++ if (_save_vectors && UseLASX) ++ slots_per_fpr = FloatRegister::slots_per_lasx_register; ++ else if (_save_vectors && UseLSX) ++ slots_per_fpr = FloatRegister::slots_per_lsx_register; ++ ++ return off * slots_per_fpr * VMRegImpl::stack_slot_size; ++ } ++ ++ int ra_offset() { return gpr_offset(ra_off); } ++ int t5_offset() { return gpr_offset(t5_off); } ++ int s3_offset() { return gpr_offset(s3_off); } ++ int v0_offset() { return gpr_offset(a0_off); } ++ int v1_offset() { return gpr_offset(a1_off); } ++ ++ int fpr0_offset() { return fpr_offset(fpr0_off); } ++ int fpr1_offset() { return fpr_offset(fpr1_off); } ++ ++ // During deoptimization only the result register need to be restored ++ // all the other values have already been extracted. ++ void restore_result_registers(MacroAssembler* masm); ++}; ++ ++OopMap* RegisterSaver::save_live_registers(MacroAssembler* masm, int additional_frame_words, int* total_frame_words) { ++ // Always make the frame size 16-byte aligned ++ int frame_size_in_bytes = align_up(additional_frame_words * wordSize + slots_save() * VMRegImpl::stack_slot_size, StackAlignmentInBytes); ++ // OopMap frame size is in compiler stack slots (jint's) not bytes or words ++ int frame_size_in_slots = frame_size_in_bytes / VMRegImpl::stack_slot_size; ++ // The caller will allocate additional_frame_words ++ int additional_frame_slots = additional_frame_words * wordSize / VMRegImpl::stack_slot_size; ++ // CodeBlob frame size is in words. 
++ int frame_size_in_words = frame_size_in_bytes / wordSize; ++ ++ *total_frame_words = frame_size_in_words; ++ ++ OopMapSet *oop_maps = new OopMapSet(); ++ OopMap* map = new OopMap(frame_size_in_slots, 0); ++ ++ // save registers ++ __ addi_d(SP, SP, -slots_save() * VMRegImpl::stack_slot_size); ++ ++ for (int i = 0; i < fpr_size; i++) { ++ FloatRegister fpr = as_FloatRegister(i); ++ int off = fpr_offset(i); ++ ++ if (_save_vectors && UseLASX) ++ __ xvst(fpr, SP, off); ++ else if (_save_vectors && UseLSX) ++ __ vst(fpr, SP, off); ++ else ++ __ fst_d(fpr, SP, off); ++ map->set_callee_saved(VMRegImpl::stack2reg(off / VMRegImpl::stack_slot_size + additional_frame_slots), fpr->as_VMReg()); ++ } ++ ++ for (int i = a0_off; i <= a7_off; i++) { ++ Register gpr = as_Register(A0->encoding() + (i - a0_off)); ++ int off = gpr_offset(i); ++ ++ __ st_d(gpr, SP, gpr_offset(i)); ++ map->set_callee_saved(VMRegImpl::stack2reg(off / VMRegImpl::stack_slot_size + additional_frame_slots), gpr->as_VMReg()); ++ } ++ ++ for (int i = t0_off; i <= t6_off; i++) { ++ Register gpr = as_Register(T0->encoding() + (i - t0_off)); ++ int off = gpr_offset(i); ++ ++ __ st_d(gpr, SP, gpr_offset(i)); ++ map->set_callee_saved(VMRegImpl::stack2reg(off / VMRegImpl::stack_slot_size + additional_frame_slots), gpr->as_VMReg()); ++ } ++ __ st_d(T8, SP, gpr_offset(t8_off)); ++ map->set_callee_saved(VMRegImpl::stack2reg(gpr_offset(t8_off) / VMRegImpl::stack_slot_size + additional_frame_slots), T8->as_VMReg()); ++ ++ for (int i = s0_off; i <= s8_off; i++) { ++ Register gpr = as_Register(S0->encoding() + (i - s0_off)); ++ int off = gpr_offset(i); ++ ++ __ st_d(gpr, SP, gpr_offset(i)); ++ map->set_callee_saved(VMRegImpl::stack2reg(off / VMRegImpl::stack_slot_size + additional_frame_slots), gpr->as_VMReg()); ++ } ++ ++ __ st_d(FP, SP, gpr_offset(fp_off)); ++ map->set_callee_saved(VMRegImpl::stack2reg(gpr_offset(fp_off) / VMRegImpl::stack_slot_size + additional_frame_slots), FP->as_VMReg()); ++ __ st_d(RA, SP, gpr_offset(ra_off)); ++ map->set_callee_saved(VMRegImpl::stack2reg(gpr_offset(ra_off) / VMRegImpl::stack_slot_size + additional_frame_slots), RA->as_VMReg()); ++ ++ __ addi_d(FP, SP, slots_save() * VMRegImpl::stack_slot_size); ++ ++ return map; ++} ++ ++ ++// Pop the current frame and restore all the registers that we ++// saved. ++void RegisterSaver::restore_live_registers(MacroAssembler* masm) { ++ for (int i = 0; i < fpr_size; i++) { ++ FloatRegister fpr = as_FloatRegister(i); ++ int off = fpr_offset(i); ++ ++ if (_save_vectors && UseLASX) ++ __ xvld(fpr, SP, off); ++ else if (_save_vectors && UseLSX) ++ __ vld(fpr, SP, off); ++ else ++ __ fld_d(fpr, SP, off); ++ } ++ ++ for (int i = a0_off; i <= a7_off; i++) { ++ Register gpr = as_Register(A0->encoding() + (i - a0_off)); ++ int off = gpr_offset(i); ++ ++ __ ld_d(gpr, SP, gpr_offset(i)); ++ } ++ ++ for (int i = t0_off; i <= t6_off; i++) { ++ Register gpr = as_Register(T0->encoding() + (i - t0_off)); ++ int off = gpr_offset(i); ++ ++ __ ld_d(gpr, SP, gpr_offset(i)); ++ } ++ __ ld_d(T8, SP, gpr_offset(t8_off)); ++ ++ for (int i = s0_off; i <= s8_off; i++) { ++ Register gpr = as_Register(S0->encoding() + (i - s0_off)); ++ int off = gpr_offset(i); ++ ++ __ ld_d(gpr, SP, gpr_offset(i)); ++ } ++ ++ __ ld_d(FP, SP, gpr_offset(fp_off)); ++ __ ld_d(RA, SP, gpr_offset(ra_off)); ++ ++ __ addi_d(SP, SP, slots_save() * VMRegImpl::stack_slot_size); ++} ++ ++// Pop the current frame and restore the registers that might be holding ++// a result. 
++void RegisterSaver::restore_result_registers(MacroAssembler* masm) {
++  // Just restore the result registers. Only used by deoptimization. By
++  // now any callee-save register that needs to be restored to a c2
++  // caller of the deoptee has been extracted into the vframeArray
++  // and will be stuffed into the c2i adapter we create for later
++  // restoration, so only the result registers need to be restored here.
++
++  __ ld_d(V0, SP, gpr_offset(a0_off));
++  __ ld_d(V1, SP, gpr_offset(a1_off));
++
++  __ fld_d(F0, SP, fpr_offset(fpr0_off));
++  __ fld_d(F1, SP, fpr_offset(fpr1_off));
++
++  __ addi_d(SP, SP, gpr_offset(ra_off));
++}
++
++// Is the vector's size (in bytes) bigger than the size saved by default?
++// 8-byte registers are saved by default using fld/fst instructions.
++bool SharedRuntime::is_wide_vector(int size) {
++  return size > 8;
++}
++
++// ---------------------------------------------------------------------------
++// Read the array of BasicTypes from a signature, and compute where the
++// arguments should go. Values in the VMRegPair regs array refer to 4-byte
++// quantities. Values less than SharedInfo::stack0 are registers, those above
++// refer to 4-byte stack slots. All stack slots are based off of the stack pointer
++// as framesizes are fixed.
++// VMRegImpl::stack0 refers to the first slot 0(sp),
++// and VMRegImpl::stack0+1 refers to the memory word 4 bytes higher. Registers
++// (up to Register::number_of_registers) are the 32-bit
++// integer registers.
++
++// Note: the INPUTS in sig_bt are in units of Java argument words, which are
++// either 32-bit or 64-bit depending on the build. The OUTPUTS are in 32-bit
++// units regardless of build.
++
++int SharedRuntime::java_calling_convention(const BasicType *sig_bt,
++                                           VMRegPair *regs,
++                                           int total_args_passed) {
++
++  // Create the mapping between argument positions and registers.
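As an illustration of the mapping built here (a sketch derived only from the switch statement that follows; it is not text from the patch itself): with the j_rarg/j_farg register tables defined below, a signature such as (int, long, double, Object) would be assigned roughly as

  // int    -> INT_ArgReg[0] via regs[i].set1()
  // long   -> INT_ArgReg[1] via regs[i].set2(); its trailing T_VOID half gets set_bad()
  // double -> FP_ArgReg[0]  via regs[i].set2(); trailing T_VOID half likewise
  // Object -> INT_ArgReg[2] via regs[i].set2()
  // Once the int or fp register tables are exhausted, each remaining value
  // takes two 32-bit stack slots (stk_args += 2), and the function returns
  // align_up(stk_args, 2).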
++ static const Register INT_ArgReg[Argument::n_int_register_parameters_j] = { ++ j_rarg0, j_rarg1, j_rarg2, j_rarg3, ++ j_rarg4, j_rarg5, j_rarg6, j_rarg7, j_rarg8 ++ }; ++ static const FloatRegister FP_ArgReg[Argument::n_float_register_parameters_j] = { ++ j_farg0, j_farg1, j_farg2, j_farg3, ++ j_farg4, j_farg5, j_farg6, j_farg7 ++ }; ++ ++ uint int_args = 0; ++ uint fp_args = 0; ++ uint stk_args = 0; // inc by 2 each time ++ ++ for (int i = 0; i < total_args_passed; i++) { ++ switch (sig_bt[i]) { ++ case T_VOID: ++ // halves of T_LONG or T_DOUBLE ++ assert(i != 0 && (sig_bt[i - 1] == T_LONG || sig_bt[i - 1] == T_DOUBLE), "expecting half"); ++ regs[i].set_bad(); ++ break; ++ case T_BOOLEAN: ++ case T_CHAR: ++ case T_BYTE: ++ case T_SHORT: ++ case T_INT: ++ if (int_args < Argument::n_int_register_parameters_j) { ++ regs[i].set1(INT_ArgReg[int_args++]->as_VMReg()); ++ } else { ++ regs[i].set1(VMRegImpl::stack2reg(stk_args)); ++ stk_args += 2; ++ } ++ break; ++ case T_LONG: ++ assert(sig_bt[i + 1] == T_VOID, "expecting half"); ++ // fall through ++ case T_OBJECT: ++ case T_ARRAY: ++ case T_ADDRESS: ++ if (int_args < Argument::n_int_register_parameters_j) { ++ regs[i].set2(INT_ArgReg[int_args++]->as_VMReg()); ++ } else { ++ regs[i].set2(VMRegImpl::stack2reg(stk_args)); ++ stk_args += 2; ++ } ++ break; ++ case T_FLOAT: ++ if (fp_args < Argument::n_float_register_parameters_j) { ++ regs[i].set1(FP_ArgReg[fp_args++]->as_VMReg()); ++ } else { ++ regs[i].set1(VMRegImpl::stack2reg(stk_args)); ++ stk_args += 2; ++ } ++ break; ++ case T_DOUBLE: ++ assert(sig_bt[i + 1] == T_VOID, "expecting half"); ++ if (fp_args < Argument::n_float_register_parameters_j) { ++ regs[i].set2(FP_ArgReg[fp_args++]->as_VMReg()); ++ } else { ++ regs[i].set2(VMRegImpl::stack2reg(stk_args)); ++ stk_args += 2; ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ break; ++ } ++ } ++ ++ return align_up(stk_args, 2); ++} ++ ++// Patch the callers callsite with entry to compiled code if it exists. ++static void patch_callers_callsite(MacroAssembler *masm) { ++ Label L; ++ __ ld_d(AT, Address(Rmethod, Method::code_offset())); ++ __ beqz(AT, L); ++ ++ __ enter(); ++ __ bstrins_d(SP, R0, 3, 0); // align the stack ++ __ push_call_clobbered_registers(); ++ ++ // VM needs caller's callsite ++ // VM needs target method ++ // This needs to be a long call since we will relocate this adapter to ++ // the codeBuffer and it may not reach ++ ++#ifndef PRODUCT ++ assert(frame::arg_reg_save_area_bytes == 0, "not expecting frame reg save area"); ++#endif ++ ++ __ move(c_rarg0, Rmethod); ++ __ move(c_rarg1, RA); ++ __ call(CAST_FROM_FN_PTR(address, SharedRuntime::fixup_callers_callsite), ++ relocInfo::runtime_call_type); ++ ++ __ pop_call_clobbered_registers(); ++ __ leave(); ++ __ bind(L); ++} ++ ++static void gen_c2i_adapter(MacroAssembler *masm, ++ int total_args_passed, ++ int comp_args_on_stack, ++ const BasicType *sig_bt, ++ const VMRegPair *regs, ++ Label& skip_fixup) { ++ // Before we get into the guts of the C2I adapter, see if we should be here ++ // at all. We've come from compiled code and are attempting to jump to the ++ // interpreter, which means the caller made a static call to get here ++ // (vcalls always get a compiled target if there is one). Check for a ++ // compiled target. If there is one, we need to patch the caller's call. ++ patch_callers_callsite(masm); ++ ++ __ bind(skip_fixup); ++ ++ // Since all args are passed on the stack, total_args_passed * ++ // Interpreter::stackElementSize is the space we need. 
++ int extraspace = total_args_passed * Interpreter::stackElementSize; ++ ++ __ move(Rsender, SP); ++ ++ // stack is aligned, keep it that way ++ extraspace = align_up(extraspace, 2 * wordSize); ++ ++ __ addi_d(SP, SP, -extraspace); ++ ++ // Now write the args into the outgoing interpreter space ++ for (int i = 0; i < total_args_passed; i++) { ++ if (sig_bt[i] == T_VOID) { ++ assert(i > 0 && (sig_bt[i-1] == T_LONG || sig_bt[i-1] == T_DOUBLE), "missing half"); ++ continue; ++ } ++ ++ // offset to start parameters ++ int st_off = (total_args_passed - i - 1) * Interpreter::stackElementSize; ++ int next_off = st_off - Interpreter::stackElementSize; ++ ++ // Say 4 args: ++ // i st_off ++ // 0 32 T_LONG ++ // 1 24 T_VOID ++ // 2 16 T_OBJECT ++ // 3 8 T_BOOL ++ // - 0 return address ++ // ++ // However to make thing extra confusing. Because we can fit a Java long/double in ++ // a single slot on a 64 bt vm and it would be silly to break them up, the interpreter ++ // leaves one slot empty and only stores to a single slot. In this case the ++ // slot that is occupied is the T_VOID slot. See I said it was confusing. ++ ++ VMReg r_1 = regs[i].first(); ++ VMReg r_2 = regs[i].second(); ++ if (!r_1->is_valid()) { ++ assert(!r_2->is_valid(), ""); ++ continue; ++ } ++ if (r_1->is_stack()) { ++ // memory to memory ++ int ld_off = r_1->reg2stack() * VMRegImpl::stack_slot_size + extraspace; ++ if (!r_2->is_valid()) { ++ __ ld_wu(AT, Address(SP, ld_off)); ++ __ st_d(AT, Address(SP, st_off)); ++ } else { ++ __ ld_d(AT, Address(SP, ld_off)); ++ ++ // Two VMREgs|OptoRegs can be T_OBJECT, T_ADDRESS, T_DOUBLE, T_LONG ++ // T_DOUBLE and T_LONG use two slots in the interpreter ++ if (sig_bt[i] == T_LONG || sig_bt[i] == T_DOUBLE) { ++ __ st_d(AT, Address(SP, next_off)); ++ } else { ++ __ st_d(AT, Address(SP, st_off)); ++ } ++ } ++ } else if (r_1->is_Register()) { ++ Register r = r_1->as_Register(); ++ if (!r_2->is_valid()) { ++ // must be only an int (or less ) so move only 32bits to slot ++ __ st_d(r, Address(SP, st_off)); ++ } else { ++ // Two VMREgs|OptoRegs can be T_OBJECT, T_ADDRESS, T_DOUBLE, T_LONG ++ // T_DOUBLE and T_LONG use two slots in the interpreter ++ if (sig_bt[i] == T_LONG || sig_bt[i] == T_DOUBLE) { ++ __ st_d(r, Address(SP, next_off)); ++ } else { ++ __ st_d(r, Address(SP, st_off)); ++ } ++ } ++ } else { ++ assert(r_1->is_FloatRegister(), ""); ++ FloatRegister fr = r_1->as_FloatRegister(); ++ if (!r_2->is_valid()) { ++ // only a float use just part of the slot ++ __ fst_s(fr, Address(SP, st_off)); ++ } else { ++ __ fst_d(fr, Address(SP, next_off)); ++ } ++ } ++ } ++ ++ __ ld_d(AT, Address(Rmethod, Method::interpreter_entry_offset())); ++ __ jr(AT); ++} ++ ++void SharedRuntime::gen_i2c_adapter(MacroAssembler *masm, ++ int total_args_passed, ++ int comp_args_on_stack, ++ const BasicType *sig_bt, ++ const VMRegPair *regs) { ++ // Note: Rsender contains the senderSP on entry. We must preserve ++ // it since we may do a i2c -> c2i transition if we lose a race ++ // where compiled code goes non-entrant while we get args ready. ++ const Register saved_sp = T5; ++ __ move(saved_sp, SP); ++ ++ // Cut-out for having no stack args. 
++ int comp_words_on_stack = align_up(comp_args_on_stack * VMRegImpl::stack_slot_size, wordSize) >> LogBytesPerWord; ++ if (comp_args_on_stack != 0) { ++ __ addi_d(SP, SP, -1 * comp_words_on_stack * wordSize); ++ } ++ ++ // Align the outgoing SP ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); ++ ++ // Will jump to the compiled code just as if compiled code was doing it. ++ // Pre-load the register-jump target early, to schedule it better. ++ const Register comp_code_target = TSR; ++ __ ld_d(comp_code_target, Rmethod, in_bytes(Method::from_compiled_offset())); ++ ++#if INCLUDE_JVMCI ++ if (EnableJVMCI) { ++ // check if this call should be routed towards a specific entry point ++ __ ld_d(AT, Address(TREG, in_bytes(JavaThread::jvmci_alternate_call_target_offset()))); ++ Label no_alternative_target; ++ __ beqz(AT, no_alternative_target); ++ __ move(comp_code_target, AT); ++ __ st_d(R0, Address(TREG, in_bytes(JavaThread::jvmci_alternate_call_target_offset()))); ++ __ bind(no_alternative_target); ++ } ++#endif // INCLUDE_JVMCI ++ ++ // Now generate the shuffle code. ++ for (int i = 0; i < total_args_passed; i++) { ++ if (sig_bt[i] == T_VOID) { ++ assert(i > 0 && (sig_bt[i - 1] == T_LONG || sig_bt[i - 1] == T_DOUBLE), "missing half"); ++ continue; ++ } ++ ++ // Pick up 0, 1 or 2 words from SP+offset. ++ ++ assert(!regs[i].second()->is_valid() || regs[i].first()->next() == regs[i].second(), "scrambled load targets?"); ++ // Load in argument order going down. ++ int ld_off = (total_args_passed - i - 1) * Interpreter::stackElementSize; ++ // Point to interpreter value (vs. tag) ++ int next_off = ld_off - Interpreter::stackElementSize; ++ ++ VMReg r_1 = regs[i].first(); ++ VMReg r_2 = regs[i].second(); ++ if (!r_1->is_valid()) { ++ assert(!r_2->is_valid(), ""); ++ continue; ++ } ++ if (r_1->is_stack()) { ++ // Convert stack slot to an SP offset (+ wordSize to account for return address ) ++ int st_off = regs[i].first()->reg2stack() * VMRegImpl::stack_slot_size; ++ if (!r_2->is_valid()) { ++ __ ld_w(AT, Address(saved_sp, ld_off)); ++ __ st_d(AT, Address(SP, st_off)); ++ } else { ++ // We are using two optoregs. This can be either T_OBJECT, ++ // T_ADDRESS, T_LONG, or T_DOUBLE the interpreter allocates ++ // two slots but only uses one for thr T_LONG or T_DOUBLE case ++ // So we must adjust where to pick up the data to match the ++ // interpreter. ++ if (sig_bt[i] == T_LONG || sig_bt[i] == T_DOUBLE) { ++ __ ld_d(AT, Address(saved_sp, next_off)); ++ } else { ++ __ ld_d(AT, Address(saved_sp, ld_off)); ++ } ++ __ st_d(AT, Address(SP, st_off)); ++ } ++ } else if (r_1->is_Register()) { // Register argument ++ Register r = r_1->as_Register(); ++ if (r_2->is_valid()) { ++ // We are using two VMRegs. This can be either T_OBJECT, ++ // T_ADDRESS, T_LONG, or T_DOUBLE the interpreter allocates ++ // two slots but only uses one for thr T_LONG or T_DOUBLE case ++ // So we must adjust where to pick up the data to match the ++ // interpreter. 
++ if (sig_bt[i] == T_LONG) { ++ __ ld_d(r, Address(saved_sp, next_off)); ++ } else { ++ __ ld_d(r, Address(saved_sp, ld_off)); ++ } ++ } else { ++ __ ld_w(r, Address(saved_sp, ld_off)); ++ } ++ } else { ++ assert(sig_bt[i] == T_FLOAT || sig_bt[i] == T_DOUBLE, "Must be float regs"); ++ FloatRegister fr = r_1->as_FloatRegister(); ++ if (!r_2->is_valid()) { ++ __ fld_s(fr, Address(saved_sp, ld_off)); ++ } else { ++ __ fld_d(fr, Address(saved_sp, next_off)); ++ } ++ } ++ } ++ ++ __ push_cont_fastpath(TREG); // Set JavaThread::_cont_fastpath to the sp of the oldest interpreted frame we know about ++ ++ // 6243940 We might end up in handle_wrong_method if ++ // the callee is deoptimized as we race thru here. If that ++ // happens we don't want to take a safepoint because the ++ // caller frame will look interpreted and arguments are now ++ // "compiled" so it is much better to make this transition ++ // invisible to the stack walking code. Unfortunately if ++ // we try and find the callee by normal means a safepoint ++ // is possible. So we stash the desired callee in the thread ++ // and the vm will find there should this case occur. ++ ++ __ st_d(Rmethod, Address(TREG, JavaThread::callee_target_offset())); ++ ++ // Jump to the compiled code just as if compiled code was doing it. ++ __ jr(comp_code_target); ++} ++ ++// --------------------------------------------------------------- ++AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler *masm, ++ int total_args_passed, ++ int comp_args_on_stack, ++ const BasicType *sig_bt, ++ const VMRegPair *regs, ++ AdapterFingerPrint* fingerprint) { ++ address i2c_entry = __ pc(); ++ ++ __ block_comment("gen_i2c_adapter"); ++ gen_i2c_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs); ++ ++ // ------------------------------------------------------------------------- ++ // Generate a C2I adapter. On entry we know G5 holds the Method*. The ++ // args start out packed in the compiled layout. They need to be unpacked ++ // into the interpreter layout. This will almost always require some stack ++ // space. We grow the current (compiled) stack, then repack the args. We ++ // finally end in a jump to the generic interpreter entry point. On exit ++ // from the interpreter, the interpreter will restore our SP (lest the ++ // compiled code, which relies solely on SP and not FP, get sick). ++ ++ address c2i_unverified_entry = __ pc(); ++ Label skip_fixup; ++ { ++ __ block_comment("c2i_unverified_entry {"); ++ Register holder = IC_Klass; ++ Register receiver = T0; ++ Register temp = T8; ++ address ic_miss = SharedRuntime::get_ic_miss_stub(); ++ ++ Label missed; ++ ++ //add for compressedoops ++ __ load_klass(temp, receiver); ++ ++ __ ld_d(AT, Address(holder, CompiledICHolder::holder_klass_offset())); ++ __ ld_d(Rmethod, Address(holder, CompiledICHolder::holder_metadata_offset())); ++ __ bne(AT, temp, missed); ++ // Method might have been compiled since the call site was patched to ++ // interpreted if that is the case treat it as a miss so we can get ++ // the call site corrected. 
++ __ ld_d(AT, Address(Rmethod, Method::code_offset())); ++ __ beq(AT, R0, skip_fixup); ++ __ bind(missed); ++ ++ __ jmp(ic_miss, relocInfo::runtime_call_type); ++ __ block_comment("} c2i_unverified_entry"); ++ } ++ address c2i_entry = __ pc(); ++ ++ // Class initialization barrier for static methods ++ address c2i_no_clinit_check_entry = nullptr; ++ if (VM_Version::supports_fast_class_init_checks()) { ++ Label L_skip_barrier; ++ address handle_wrong_method = SharedRuntime::get_handle_wrong_method_stub(); ++ ++ { // Bypass the barrier for non-static methods ++ __ ld_w(AT, Address(Rmethod, Method::access_flags_offset())); ++ __ andi(AT, AT, JVM_ACC_STATIC); ++ __ beqz(AT, L_skip_barrier); // non-static ++ } ++ ++ __ load_method_holder(T4, Rmethod); ++ __ clinit_barrier(T4, AT, &L_skip_barrier); ++ __ jmp(handle_wrong_method, relocInfo::runtime_call_type); ++ ++ __ bind(L_skip_barrier); ++ c2i_no_clinit_check_entry = __ pc(); ++ } ++ ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ __ block_comment("c2i_entry_barrier"); ++ bs->c2i_entry_barrier(masm); ++ ++ __ block_comment("gen_c2i_adapter"); ++ gen_c2i_adapter(masm, total_args_passed, comp_args_on_stack, sig_bt, regs, skip_fixup); ++ ++ return AdapterHandlerLibrary::new_entry(fingerprint, i2c_entry, c2i_entry, c2i_unverified_entry, c2i_no_clinit_check_entry); ++} ++ ++int SharedRuntime::vector_calling_convention(VMRegPair *regs, ++ uint num_bits, ++ uint total_args_passed) { ++ Unimplemented(); ++ return 0; ++} ++ ++int SharedRuntime::c_calling_convention(const BasicType *sig_bt, ++ VMRegPair *regs, ++ VMRegPair *regs2, ++ int total_args_passed) { ++ assert(regs2 == nullptr, "not needed on LA"); ++ ++ // We return the amount of VMRegImpl stack slots we need to reserve for all ++ // the arguments NOT counting out_preserve_stack_slots. 
++ ++ static const Register INT_ArgReg[Argument::n_int_register_parameters_c] = { ++ c_rarg0, c_rarg1, c_rarg2, c_rarg3, ++ c_rarg4, c_rarg5, c_rarg6, c_rarg7 ++ }; ++ static const FloatRegister FP_ArgReg[Argument::n_float_register_parameters_c] = { ++ c_farg0, c_farg1, c_farg2, c_farg3, ++ c_farg4, c_farg5, c_farg6, c_farg7 ++ }; ++ uint int_args = 0; ++ uint fp_args = 0; ++ uint stk_args = 0; // inc by 2 each time ++ ++// Example: ++// n java.lang.UNIXProcess::forkAndExec ++// private native int forkAndExec(byte[] prog, ++// byte[] argBlock, int argc, ++// byte[] envBlock, int envc, ++// byte[] dir, ++// boolean redirectErrorStream, ++// FileDescriptor stdin_fd, ++// FileDescriptor stdout_fd, ++// FileDescriptor stderr_fd) ++// JNIEXPORT jint JNICALL ++// Java_java_lang_UNIXProcess_forkAndExec(JNIEnv *env, ++// jobject process, ++// jbyteArray prog, ++// jbyteArray argBlock, jint argc, ++// jbyteArray envBlock, jint envc, ++// jbyteArray dir, ++// jboolean redirectErrorStream, ++// jobject stdin_fd, ++// jobject stdout_fd, ++// jobject stderr_fd) ++// ++// ::c_calling_convention ++// 0: // env <-- a0 ++// 1: L // klass/obj <-- t0 => a1 ++// 2: [ // prog[] <-- a0 => a2 ++// 3: [ // argBlock[] <-- a1 => a3 ++// 4: I // argc <-- a2 => a4 ++// 5: [ // envBlock[] <-- a3 => a5 ++// 6: I // envc <-- a4 => a5 ++// 7: [ // dir[] <-- a5 => a7 ++// 8: Z // redirectErrorStream <-- a6 => sp[0] ++// 9: L // stdin <-- a7 => sp[8] ++// 10: L // stdout fp[16] => sp[16] ++// 11: L // stderr fp[24] => sp[24] ++// ++ for (int i = 0; i < total_args_passed; i++) { ++ switch (sig_bt[i]) { ++ case T_VOID: // Halves of longs and doubles ++ assert(i != 0 && (sig_bt[i - 1] == T_LONG || sig_bt[i - 1] == T_DOUBLE), "expecting half"); ++ regs[i].set_bad(); ++ break; ++ case T_BOOLEAN: ++ case T_CHAR: ++ case T_BYTE: ++ case T_SHORT: ++ case T_INT: ++ if (int_args < Argument::n_int_register_parameters_c) { ++ regs[i].set1(INT_ArgReg[int_args++]->as_VMReg()); ++ } else { ++ regs[i].set1(VMRegImpl::stack2reg(stk_args)); ++ stk_args += 2; ++ } ++ break; ++ case T_LONG: ++ assert(sig_bt[i + 1] == T_VOID, "expecting half"); ++ // fall through ++ case T_OBJECT: ++ case T_ARRAY: ++ case T_ADDRESS: ++ case T_METADATA: ++ if (int_args < Argument::n_int_register_parameters_c) { ++ regs[i].set2(INT_ArgReg[int_args++]->as_VMReg()); ++ } else { ++ regs[i].set2(VMRegImpl::stack2reg(stk_args)); ++ stk_args += 2; ++ } ++ break; ++ case T_FLOAT: ++ if (fp_args < Argument::n_float_register_parameters_c) { ++ regs[i].set1(FP_ArgReg[fp_args++]->as_VMReg()); ++ } else if (int_args < Argument::n_int_register_parameters_c) { ++ regs[i].set1(INT_ArgReg[int_args++]->as_VMReg()); ++ } else { ++ regs[i].set1(VMRegImpl::stack2reg(stk_args)); ++ stk_args += 2; ++ } ++ break; ++ case T_DOUBLE: ++ assert(sig_bt[i + 1] == T_VOID, "expecting half"); ++ if (fp_args < Argument::n_float_register_parameters_c) { ++ regs[i].set2(FP_ArgReg[fp_args++]->as_VMReg()); ++ } else if (int_args < Argument::n_int_register_parameters_c) { ++ regs[i].set2(INT_ArgReg[int_args++]->as_VMReg()); ++ } else { ++ regs[i].set2(VMRegImpl::stack2reg(stk_args)); ++ stk_args += 2; ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ break; ++ } ++ } ++ ++ return align_up(stk_args, 2); ++} ++ ++// --------------------------------------------------------------------------- ++void SharedRuntime::save_native_result(MacroAssembler *masm, BasicType ret_type, int frame_slots) { ++ // We always ignore the frame_slots arg and just use the space just below frame pointer ++ // which by 
this time is free to use ++ switch (ret_type) { ++ case T_FLOAT: ++ __ fst_s(FSF, FP, -3 * wordSize); ++ break; ++ case T_DOUBLE: ++ __ fst_d(FSF, FP, -3 * wordSize); ++ break; ++ case T_VOID: break; ++ default: { ++ __ st_d(V0, FP, -3 * wordSize); ++ } ++ } ++} ++ ++void SharedRuntime::restore_native_result(MacroAssembler *masm, BasicType ret_type, int frame_slots) { ++ // We always ignore the frame_slots arg and just use the space just below frame pointer ++ // which by this time is free to use ++ switch (ret_type) { ++ case T_FLOAT: ++ __ fld_s(FSF, FP, -3 * wordSize); ++ break; ++ case T_DOUBLE: ++ __ fld_d(FSF, FP, -3 * wordSize); ++ break; ++ case T_VOID: break; ++ default: { ++ __ ld_d(V0, FP, -3 * wordSize); ++ } ++ } ++} ++ ++static void save_args(MacroAssembler *masm, int arg_count, int first_arg, VMRegPair *args) { ++ for ( int i = first_arg ; i < arg_count ; i++ ) { ++ if (args[i].first()->is_Register()) { ++ __ push(args[i].first()->as_Register()); ++ } else if (args[i].first()->is_FloatRegister()) { ++ __ push(args[i].first()->as_FloatRegister()); ++ } ++ } ++} ++ ++static void restore_args(MacroAssembler *masm, int arg_count, int first_arg, VMRegPair *args) { ++ for ( int i = arg_count - 1 ; i >= first_arg ; i-- ) { ++ if (args[i].first()->is_Register()) { ++ __ pop(args[i].first()->as_Register()); ++ } else if (args[i].first()->is_FloatRegister()) { ++ __ pop(args[i].first()->as_FloatRegister()); ++ } ++ } ++} ++ ++static void verify_oop_args(MacroAssembler* masm, ++ const methodHandle& method, ++ const BasicType* sig_bt, ++ const VMRegPair* regs) { ++ if (VerifyOops) { ++ // verify too many args may overflow the code buffer ++ int arg_size = MIN2(64, (int)(method->size_of_parameters())); ++ ++ for (int i = 0; i < arg_size; i++) { ++ if (is_reference_type(sig_bt[i])) { ++ VMReg r = regs[i].first(); ++ assert(r->is_valid(), "bad oop arg"); ++ if (r->is_stack()) { ++ __ verify_oop_addr(Address(SP, r->reg2stack() * VMRegImpl::stack_slot_size)); ++ } else { ++ __ verify_oop(r->as_Register()); ++ } ++ } ++ } ++ } ++} ++ ++// on exit, sp points to the ContinuationEntry ++OopMap* continuation_enter_setup(MacroAssembler* masm, int& stack_slots) { ++ assert(ContinuationEntry::size() % VMRegImpl::stack_slot_size == 0, ""); ++ assert(in_bytes(ContinuationEntry::cont_offset()) % VMRegImpl::stack_slot_size == 0, ""); ++ assert(in_bytes(ContinuationEntry::chunk_offset()) % VMRegImpl::stack_slot_size == 0, ""); ++ ++ stack_slots += checked_cast(ContinuationEntry::size()) / wordSize; ++ __ li(AT, checked_cast(ContinuationEntry::size())); ++ __ sub_d(SP, SP, AT); ++ ++ OopMap* map = new OopMap(((int)ContinuationEntry::size() + wordSize) / VMRegImpl::stack_slot_size, 0 /* arg_slots*/); ++ ++ __ ld_d(AT, Address(TREG, JavaThread::cont_entry_offset())); ++ __ st_d(AT, Address(SP, ContinuationEntry::parent_offset())); ++ __ st_d(SP, Address(TREG, JavaThread::cont_entry_offset())); ++ ++ return map; ++} ++ ++// on entry j_rarg0 points to the continuation ++// SP points to ContinuationEntry ++// j_rarg2 -- isVirtualThread ++void fill_continuation_entry(MacroAssembler* masm) { ++#ifdef ASSERT ++ __ li(AT, ContinuationEntry::cookie_value()); ++ __ st_w(AT, Address(SP, ContinuationEntry::cookie_offset())); ++#endif ++ ++ __ st_d(j_rarg0, Address(SP, ContinuationEntry::cont_offset())); ++ __ st_w(j_rarg2, Address(SP, ContinuationEntry::flags_offset())); ++ __ st_d(R0, Address(SP, ContinuationEntry::chunk_offset())); ++ __ st_w(R0, Address(SP, ContinuationEntry::argsize_offset())); ++ __ st_w(R0, 
Address(SP, ContinuationEntry::pin_count_offset())); ++ ++ __ ld_d(AT, Address(TREG, JavaThread::cont_fastpath_offset())); ++ __ st_d(AT, Address(SP, ContinuationEntry::parent_cont_fastpath_offset())); ++ __ ld_d(AT, Address(TREG, JavaThread::held_monitor_count_offset())); ++ __ st_d(AT, Address(SP, ContinuationEntry::parent_held_monitor_count_offset())); ++ ++ __ st_d(R0, Address(TREG, JavaThread::cont_fastpath_offset())); ++ __ st_d(R0, Address(TREG, JavaThread::held_monitor_count_offset())); ++} ++ ++// on entry, sp points to the ContinuationEntry ++// on exit, fp points to the spilled fp + 2 * wordSize in the entry frame ++void continuation_enter_cleanup(MacroAssembler* masm) { ++#ifndef PRODUCT ++ Label OK; ++ __ ld_d(AT, Address(TREG, JavaThread::cont_entry_offset())); ++ __ beq(SP, AT, OK); ++ __ stop("incorrect sp for cleanup"); ++ __ bind(OK); ++#endif ++ ++ __ ld_d(AT, Address(SP, ContinuationEntry::parent_cont_fastpath_offset())); ++ __ st_d(AT, Address(TREG, JavaThread::cont_fastpath_offset())); ++ __ ld_d(AT, Address(SP, ContinuationEntry::parent_held_monitor_count_offset())); ++ __ st_d(AT, Address(TREG, JavaThread::held_monitor_count_offset())); ++ ++ __ ld_d(AT, Address(SP, ContinuationEntry::parent_offset())); ++ __ st_d(AT, Address(TREG, JavaThread::cont_entry_offset())); ++ ++ // add 2 extra words to match up with leave() ++ __ li(AT, (int)ContinuationEntry::size() + 2 * wordSize); ++ __ add_d(FP, SP, AT); ++} ++ ++// enterSpecial(Continuation c, boolean isContinue, boolean isVirtualThread) ++// On entry: j_rarg0 (T0) -- the continuation object ++// j_rarg1 (A0) -- isContinue ++// j_rarg2 (A1) -- isVirtualThread ++static void gen_continuation_enter(MacroAssembler* masm, ++ const methodHandle& method, ++ const BasicType* sig_bt, ++ const VMRegPair* regs, ++ int& exception_offset, ++ OopMapSet*oop_maps, ++ int& frame_complete, ++ int& stack_slots, ++ int& interpreted_entry_offset, ++ int& compiled_entry_offset) { ++ AddressLiteral resolve(SharedRuntime::get_resolve_static_call_stub(), ++ relocInfo::static_call_type); ++ ++ address start = __ pc(); ++ ++ Label call_thaw, exit; ++ ++ // i2i entry used at interp_only_mode only ++ interpreted_entry_offset = __ pc() - start; ++ { ++#ifdef ASSERT ++ Label is_interp_only; ++ __ ld_w(AT, Address(TREG, JavaThread::interp_only_mode_offset())); ++ __ bnez(AT, is_interp_only); ++ __ stop("enterSpecial interpreter entry called when not in interp_only_mode"); ++ __ bind(is_interp_only); ++#endif ++ ++ // Read interpreter arguments into registers (this is an ad-hoc i2c adapter) ++ __ ld_d(j_rarg0, Address(SP, Interpreter::stackElementSize * 2)); ++ __ ld_d(j_rarg1, Address(SP, Interpreter::stackElementSize * 1)); ++ __ ld_d(j_rarg2, Address(SP, Interpreter::stackElementSize * 0)); ++ __ push_cont_fastpath(TREG); ++ ++ __ enter(); ++ stack_slots = 2; // will be adjusted in setup ++ OopMap* map = continuation_enter_setup(masm, stack_slots); ++ // The frame is complete here, but we only record it for the compiled entry, so the frame would appear unsafe, ++ // but that's okay because at the very worst we'll miss an async sample, but we're in interp_only_mode anyway. 
++ ++ fill_continuation_entry(masm); ++ ++ __ bnez(j_rarg1, call_thaw); ++ ++ address mark = __ pc(); ++ __ trampoline_call(resolve); ++ ++ oop_maps->add_gc_map(__ pc() - start, map); ++ __ post_call_nop(); ++ ++ __ b(exit); ++ ++ CodeBuffer* cbuf = masm->code_section()->outer(); ++ CompiledStaticCall::emit_to_interp_stub(*cbuf, mark); ++ } ++ ++ // compiled entry ++ __ align(CodeEntryAlignment); ++ compiled_entry_offset = __ pc() - start; ++ ++ __ enter(); ++ stack_slots = 2; // will be adjusted in setup ++ OopMap* map = continuation_enter_setup(masm, stack_slots); ++ frame_complete = __ pc() - start; ++ ++ fill_continuation_entry(masm); ++ ++ __ bnez(j_rarg1, call_thaw); ++ ++ address mark = __ pc(); ++ __ trampoline_call(resolve); ++ ++ oop_maps->add_gc_map(__ pc() - start, map); ++ __ post_call_nop(); ++ ++ __ b(exit); ++ ++ __ bind(call_thaw); ++ ++ __ call(CAST_FROM_FN_PTR(address, StubRoutines::cont_thaw()), relocInfo::runtime_call_type); ++ oop_maps->add_gc_map(__ pc() - start, map->deep_copy()); ++ ContinuationEntry::_return_pc_offset = __ pc() - start; ++ __ post_call_nop(); ++ ++ __ bind(exit); ++ ++ // We've succeeded, set sp to the ContinuationEntry ++ __ ld_d(SP, Address(TREG, JavaThread::cont_entry_offset())); ++ continuation_enter_cleanup(masm); ++ __ leave(); ++ __ jr(RA); ++ ++ // exception handling ++ exception_offset = __ pc() - start; ++ { ++ __ move(TSR, A0); // save return value contaning the exception oop in callee-saved TSR ++ ++ // We've succeeded, set sp to the ContinuationEntry ++ __ ld_d(SP, Address(TREG, JavaThread::cont_entry_offset())); ++ continuation_enter_cleanup(masm); ++ ++ __ ld_d(c_rarg1, Address(FP, -1 * wordSize)); // return address ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::exception_handler_for_return_address), TREG, c_rarg1); ++ ++ // Continue at exception handler: ++ // A0: exception oop ++ // T4: exception handler ++ // A1: exception pc ++ __ move(T4, A0); ++ __ move(A0, TSR); ++ __ verify_oop(A0); ++ ++ __ leave(); ++ __ move(A1, RA); ++ __ jr(T4); ++ } ++ ++ CodeBuffer* cbuf = masm->code_section()->outer(); ++ CompiledStaticCall::emit_to_interp_stub(*cbuf, mark); ++} ++ ++static void gen_continuation_yield(MacroAssembler* masm, ++ const methodHandle& method, ++ const BasicType* sig_bt, ++ const VMRegPair* regs, ++ int& exception_offset, ++ OopMapSet* oop_maps, ++ int& frame_complete, ++ int& stack_slots, ++ int& interpreted_entry_offset, ++ int& compiled_entry_offset) { ++ enum layout { ++ fp_off, ++ fp_off2, ++ return_off, ++ return_off2, ++ framesize // inclusive of return address ++ }; ++ ++ stack_slots = framesize / VMRegImpl::slots_per_word; ++ assert(stack_slots == 2, "recheck layout"); ++ ++ address start = __ pc(); ++ ++ compiled_entry_offset = __ pc() - start; ++ __ enter(); ++ ++ __ move(c_rarg1, SP); ++ ++ frame_complete = __ pc() - start; ++ address the_pc = __ pc(); ++ ++ Label L; ++ __ bind(L); ++ ++ __ post_call_nop(); // this must be exactly after the pc value that is pushed into the frame info, we use this nop for fast CodeBlob lookup ++ ++ __ move(c_rarg0, TREG); ++ __ set_last_Java_frame(TREG, SP, FP, L); ++ __ call_VM_leaf(Continuation::freeze_entry(), 2); ++ __ reset_last_Java_frame(true); ++ ++ Label pinned; ++ ++ __ bnez(A0, pinned); ++ ++ // We've succeeded, set sp to the ContinuationEntry ++ __ ld_d(SP, Address(TREG, JavaThread::cont_entry_offset())); ++ continuation_enter_cleanup(masm); ++ ++ __ bind(pinned); // pinned -- return to caller ++ ++ // handle pending exception thrown by freeze ++ __ 
ld_d(AT, Address(TREG, in_bytes(Thread::pending_exception_offset()))); ++ Label ok; ++ __ beqz(AT, ok); ++ __ leave(); ++ __ jmp(StubRoutines::forward_exception_entry(), relocInfo::runtime_call_type); ++ __ bind(ok); ++ ++ __ leave(); ++ __ jr(RA); ++ ++ OopMap* map = new OopMap(framesize, 1); ++ oop_maps->add_gc_map(the_pc - start, map); ++} ++ ++static void gen_special_dispatch(MacroAssembler* masm, ++ const methodHandle& method, ++ const BasicType* sig_bt, ++ const VMRegPair* regs) { ++ verify_oop_args(masm, method, sig_bt, regs); ++ vmIntrinsics::ID iid = method->intrinsic_id(); ++ ++ // Now write the args into the outgoing interpreter space ++ bool has_receiver = false; ++ Register receiver_reg = noreg; ++ int member_arg_pos = -1; ++ Register member_reg = noreg; ++ int ref_kind = MethodHandles::signature_polymorphic_intrinsic_ref_kind(iid); ++ if (ref_kind != 0) { ++ member_arg_pos = method->size_of_parameters() - 1; // trailing MemberName argument ++ member_reg = S3; // known to be free at this point ++ has_receiver = MethodHandles::ref_kind_has_receiver(ref_kind); ++ } else if (iid == vmIntrinsics::_invokeBasic) { ++ has_receiver = true; ++ } else if (iid == vmIntrinsics::_linkToNative) { ++ member_arg_pos = method->size_of_parameters() - 1; // trailing NativeEntryPoint argument ++ member_reg = S3; // known to be free at this point ++ } else { ++ fatal("unexpected intrinsic id %d", vmIntrinsics::as_int(iid)); ++ } ++ ++ if (member_reg != noreg) { ++ // Load the member_arg into register, if necessary. ++ SharedRuntime::check_member_name_argument_is_last_argument(method, sig_bt, regs); ++ VMReg r = regs[member_arg_pos].first(); ++ if (r->is_stack()) { ++ __ ld_d(member_reg, Address(SP, r->reg2stack() * VMRegImpl::stack_slot_size)); ++ } else { ++ // no data motion is needed ++ member_reg = r->as_Register(); ++ } ++ } ++ ++ if (has_receiver) { ++ // Make sure the receiver is loaded into a register. ++ assert(method->size_of_parameters() > 0, "oob"); ++ assert(sig_bt[0] == T_OBJECT, "receiver argument must be an object"); ++ VMReg r = regs[0].first(); ++ assert(r->is_valid(), "bad receiver arg"); ++ if (r->is_stack()) { ++ // Porting note: This assumes that compiled calling conventions always ++ // pass the receiver oop in a register. If this is not true on some ++ // platform, pick a temp and load the receiver from stack. ++ fatal("receiver always in a register"); ++ receiver_reg = T6; // known to be free at this point ++ __ ld_d(receiver_reg, Address(SP, r->reg2stack() * VMRegImpl::stack_slot_size)); ++ } else { ++ // no data motion is needed ++ receiver_reg = r->as_Register(); ++ } ++ } ++ ++ // Figure out which address we are really jumping to: ++ MethodHandles::generate_method_handle_dispatch(masm, iid, ++ receiver_reg, member_reg, /*for_compiler_entry:*/ true); ++} ++ ++// --------------------------------------------------------------------------- ++// Generate a native wrapper for a given method. The method takes arguments ++// in the Java compiled code convention, marshals them to the native ++// convention (handlizes oops, etc), transitions to native, makes the call, ++// returns to java state (possibly blocking), unhandlizes any result and ++// returns. 
++nmethod *SharedRuntime::generate_native_wrapper(MacroAssembler* masm, ++ const methodHandle& method, ++ int compile_id, ++ BasicType* in_sig_bt, ++ VMRegPair* in_regs, ++ BasicType ret_type) { ++ if (method->is_continuation_native_intrinsic()) { ++ int vep_offset = 0; ++ int exception_offset = 0; ++ int frame_complete = 0; ++ int stack_slots = 0; ++ OopMapSet* oop_maps = new OopMapSet(); ++ int interpreted_entry_offset = -1; ++ if (method->is_continuation_enter_intrinsic()) { ++ gen_continuation_enter(masm, ++ method, ++ in_sig_bt, ++ in_regs, ++ exception_offset, ++ oop_maps, ++ frame_complete, ++ stack_slots, ++ interpreted_entry_offset, ++ vep_offset); ++ } else if (method->is_continuation_yield_intrinsic()) { ++ gen_continuation_yield(masm, ++ method, ++ in_sig_bt, ++ in_regs, ++ exception_offset, ++ oop_maps, ++ frame_complete, ++ stack_slots, ++ interpreted_entry_offset, ++ vep_offset); ++ } else { ++ guarantee(false, "Unknown Continuation native intrinsic"); ++ } ++ ++ __ flush(); ++ nmethod* nm = nmethod::new_native_nmethod(method, ++ compile_id, ++ masm->code(), ++ vep_offset, ++ frame_complete, ++ stack_slots, ++ in_ByteSize(-1), ++ in_ByteSize(-1), ++ oop_maps, ++ exception_offset); ++ if (method->is_continuation_enter_intrinsic()) { ++ ContinuationEntry::set_enter_code(nm, interpreted_entry_offset); ++ } else if (method->is_continuation_yield_intrinsic()) { ++ _cont_doYield_stub = nm; ++ } else { ++ guarantee(false, "Unknown Continuation native intrinsic"); ++ } ++ return nm; ++ } ++ ++ if (method->is_method_handle_intrinsic()) { ++ vmIntrinsics::ID iid = method->intrinsic_id(); ++ intptr_t start = (intptr_t)__ pc(); ++ int vep_offset = ((intptr_t)__ pc()) - start; ++ gen_special_dispatch(masm, ++ method, ++ in_sig_bt, ++ in_regs); ++ assert(((intptr_t)__ pc() - start - vep_offset) >= 1 * BytesPerInstWord, ++ "valid size for make_non_entrant"); ++ int frame_complete = ((intptr_t)__ pc()) - start; // not complete, period ++ __ flush(); ++ int stack_slots = SharedRuntime::out_preserve_stack_slots(); // no out slots at all, actually ++ return nmethod::new_native_nmethod(method, ++ compile_id, ++ masm->code(), ++ vep_offset, ++ frame_complete, ++ stack_slots / VMRegImpl::slots_per_word, ++ in_ByteSize(-1), ++ in_ByteSize(-1), ++ nullptr); ++ } ++ ++ address native_func = method->native_function(); ++ assert(native_func != nullptr, "must have function"); ++ ++ // Native nmethod wrappers never take possession of the oop arguments. ++ // So the caller will gc the arguments. The only thing we need an ++ // oopMap for is if the call is static ++ // ++ // An OopMap for lock (and class if static), and one for the VM call itself ++ OopMapSet *oop_maps = new OopMapSet(); ++ ++ // We have received a description of where all the java arg are located ++ // on entry to the wrapper. We need to convert these args to where ++ // the jni function will expect them. To figure out where they go ++ // we convert the java signature to a C signature by inserting ++ // the hidden arguments as arg[0] and possibly arg[1] (static method) ++ ++ const int total_in_args = method->size_of_parameters(); ++ int total_c_args = total_in_args + (method->is_static() ? 
2 : 1); ++ ++ BasicType* out_sig_bt = NEW_RESOURCE_ARRAY(BasicType, total_c_args); ++ VMRegPair* out_regs = NEW_RESOURCE_ARRAY(VMRegPair, total_c_args); ++ BasicType* in_elem_bt = nullptr; ++ ++ int argc = 0; ++ out_sig_bt[argc++] = T_ADDRESS; ++ if (method->is_static()) { ++ out_sig_bt[argc++] = T_OBJECT; ++ } ++ ++ for (int i = 0; i < total_in_args ; i++ ) { ++ out_sig_bt[argc++] = in_sig_bt[i]; ++ } ++ ++ // Now figure out where the args must be stored and how much stack space ++ // they require (neglecting out_preserve_stack_slots but space for storing ++ // the 1st six register arguments). It's weird see int_stk_helper. ++ // ++ int out_arg_slots; ++ out_arg_slots = c_calling_convention(out_sig_bt, out_regs, nullptr, total_c_args); ++ ++ // Compute framesize for the wrapper. We need to handlize all oops in ++ // registers. We must create space for them here that is disjoint from ++ // the windowed save area because we have no control over when we might ++ // flush the window again and overwrite values that gc has since modified. ++ // (The live window race) ++ // ++ // We always just allocate 6 word for storing down these object. This allow ++ // us to simply record the base and use the Ireg number to decide which ++ // slot to use. (Note that the reg number is the inbound number not the ++ // outbound number). ++ // We must shuffle args to match the native convention, and include var-args space. ++ ++ // Calculate the total number of stack slots we will need. ++ ++ // First count the abi requirement plus all of the outgoing args ++ int stack_slots = SharedRuntime::out_preserve_stack_slots() + out_arg_slots; ++ ++ // Now the space for the inbound oop handle area ++ int total_save_slots = Argument::n_int_register_parameters_j * VMRegImpl::slots_per_word; ++ ++ int oop_handle_offset = stack_slots; ++ stack_slots += total_save_slots; ++ ++ // Now any space we need for handlizing a klass if static method ++ ++ int klass_slot_offset = 0; ++ int klass_offset = -1; ++ int lock_slot_offset = 0; ++ bool is_static = false; ++ ++ if (method->is_static()) { ++ klass_slot_offset = stack_slots; ++ stack_slots += VMRegImpl::slots_per_word; ++ klass_offset = klass_slot_offset * VMRegImpl::stack_slot_size; ++ is_static = true; ++ } ++ ++ // Plus a lock if needed ++ ++ if (method->is_synchronized()) { ++ lock_slot_offset = stack_slots; ++ stack_slots += VMRegImpl::slots_per_word; ++ } ++ ++ // Now a place (+2) to save return value or as a temporary for any gpr -> fpr moves ++ // + 4 for return address (which we own) and saved fp ++ stack_slots += 6; ++ ++ // Ok The space we have allocated will look like: ++ // ++ // ++ // FP-> | | ++ // | 2 slots (ra) | ++ // | 2 slots (fp) | ++ // |---------------------| ++ // | 2 slots for moves | ++ // |---------------------| ++ // | lock box (if sync) | ++ // |---------------------| <- lock_slot_offset ++ // | klass (if static) | ++ // |---------------------| <- klass_slot_offset ++ // | oopHandle area | ++ // |---------------------| <- oop_handle_offset ++ // | outbound memory | ++ // | based arguments | ++ // | | ++ // |---------------------| ++ // | vararg area | ++ // |---------------------| ++ // | | ++ // SP-> | out_preserved_slots | ++ // ++ // ++ ++ ++ // Now compute actual number of stack words we need rounding to make ++ // stack properly aligned. 
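To make the rounding below concrete: VMRegImpl stack slots are 4 bytes wide and this wrapper asserts StackAlignmentInBytes == 16 further down, so StackAlignmentInSlots works out to 4. A minimal sketch of the arithmetic, with an invented helper name:

    // align_up over slots, assuming 4-byte slots and 16-byte stack alignment.
    static inline int align_up_slots(int slots, int alignment_in_slots) {
      return ((slots + alignment_in_slots - 1) / alignment_in_slots) * alignment_in_slots;
    }
    // Example: 38 accumulated slots -> align_up_slots(38, 4) == 40 slots == 160 bytes,
    // so SP stays 16-byte aligned after the wrapper's prolog.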
++ stack_slots = align_up(stack_slots, StackAlignmentInSlots); ++ ++ int stack_size = stack_slots * VMRegImpl::stack_slot_size; ++ ++ intptr_t start = (intptr_t)__ pc(); ++ ++ ++ ++ // First thing make an ic check to see if we should even be here ++ address ic_miss = SharedRuntime::get_ic_miss_stub(); ++ ++ // We are free to use all registers as temps without saving them and ++ // restoring them except fp. fp is the only callee save register ++ // as far as the interpreter and the compiler(s) are concerned. ++ ++ const Register ic_reg = IC_Klass; ++ const Register receiver = T0; ++ ++ Label hit; ++ Label exception_pending; ++ ++ __ verify_oop(receiver); ++ //add for compressedoops ++ __ load_klass(T4, receiver); ++ __ beq(T4, ic_reg, hit); ++ __ jmp(ic_miss, relocInfo::runtime_call_type); ++ __ bind(hit); ++ ++ int vep_offset = ((intptr_t)__ pc()) - start; ++ ++ if (VM_Version::supports_fast_class_init_checks() && method->needs_clinit_barrier()) { ++ Label L_skip_barrier; ++ address handle_wrong_method = SharedRuntime::get_handle_wrong_method_stub(); ++ __ mov_metadata(T4, method->method_holder()); // InstanceKlass* ++ __ clinit_barrier(T4, AT, &L_skip_barrier); ++ __ jmp(handle_wrong_method, relocInfo::runtime_call_type); ++ ++ __ bind(L_skip_barrier); ++ } ++ ++ // Generate stack overflow check ++ __ bang_stack_with_offset((int)StackOverflow::stack_shadow_zone_size()); ++ ++ // The instruction at the verified entry point must be 4 bytes or longer ++ // because it can be patched on the fly by make_non_entrant. ++ if (((intptr_t)__ pc() - start - vep_offset) < 1 * BytesPerInstWord) { ++ __ nop(); ++ } ++ ++ // Generate a new frame for the wrapper. ++ // do LA need this ? ++ __ st_d(SP, Address(TREG, JavaThread::last_Java_sp_offset())); ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); ++ ++ __ enter(); ++ // -2 because return address is already present and so is saved fp ++ __ addi_d(SP, SP, -1 * (stack_size - 2*wordSize)); ++ ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ bs->nmethod_entry_barrier(masm, nullptr /* slow_path */, nullptr /* continuation */, nullptr /* guard */); ++ ++ // Frame is now completed as far a size and linkage. ++ ++ int frame_complete = ((intptr_t)__ pc()) - start; ++ ++ // Calculate the difference between sp and fp. We need to know it ++ // after the native call because on windows Java Natives will pop ++ // the arguments and it is painful to do sp relative addressing ++ // in a platform independent way. So after the call we switch to ++ // fp relative addressing. ++ //FIXME actually , the fp_adjustment may not be the right, because andr(sp, sp, at) may change ++ //the SP ++ int fp_adjustment = stack_size; ++ ++ // Compute the fp offset for any slots used after the jni call ++ ++ int lock_slot_fp_offset = (lock_slot_offset*VMRegImpl::stack_slot_size) - fp_adjustment; ++ ++ // We use S4 as the oop handle for the receiver/klass ++ // It is callee save so it survives the call to native ++ ++ const Register oop_handle_reg = S4; ++ ++ // Move arguments from register/stack to register/stack. ++ // -------------------------------------------------------------------------- ++ // ++ // We immediately shuffle the arguments so that for any vm call we have ++ // to make from here on out (sync slow path, jvmti, etc.) we will have ++ // captured the oops from our caller and have a valid oopMap for them. 
++ ++ // ----------------- ++ // The Grand Shuffle ++ // ++ // Natives require 1 or 2 extra arguments over the normal ones: the JNIEnv* ++ // and, if static, the class mirror instead of a receiver. This pretty much ++ // guarantees that register layout will not match (and LA doesn't use reg ++ // parms though amd does). Since the native abi doesn't use register args ++ // and the java conventions does we don't have to worry about collisions. ++ // All of our moved are reg->stack or stack->stack. ++ // We ignore the extra arguments during the shuffle and handle them at the ++ // last moment. The shuffle is described by the two calling convention ++ // vectors we have in our possession. We simply walk the java vector to ++ // get the source locations and the c vector to get the destinations. ++ ++ // Record sp-based slot for receiver on stack for non-static methods ++ int receiver_offset = -1; ++ ++ // This is a trick. We double the stack slots so we can claim ++ // the oops in the caller's frame. Since we are sure to have ++ // more args than the caller doubling is enough to make sure ++ // we can capture all the incoming oop args from the caller. ++ OopMap* map = new OopMap(stack_slots * 2, 0 /* arg_slots*/); ++ ++#ifdef ASSERT ++ bool reg_destroyed[Register::number_of_registers]; ++ bool freg_destroyed[FloatRegister::number_of_registers]; ++ for ( int r = 0 ; r < Register::number_of_registers ; r++ ) { ++ reg_destroyed[r] = false; ++ } ++ for ( int f = 0 ; f < FloatRegister::number_of_registers ; f++ ) { ++ freg_destroyed[f] = false; ++ } ++ ++#endif /* ASSERT */ ++ ++ // We move the arguments backward because the floating point registers ++ // destination will always be to a register with a greater or equal ++ // register number or the stack. ++ // in is the index of the incoming Java arguments ++ // out is the index of the outgoing C arguments ++ ++ for (int in = total_in_args - 1, out = total_c_args - 1; in >= 0; in--, out--) { ++ __ block_comment(err_msg("move %d -> %d", in, out)); ++#ifdef ASSERT ++ if (in_regs[in].first()->is_Register()) { ++ assert(!reg_destroyed[in_regs[in].first()->as_Register()->encoding()], "destroyed reg!"); ++ } else if (in_regs[in].first()->is_FloatRegister()) { ++ assert(!freg_destroyed[in_regs[in].first()->as_FloatRegister()->encoding()], "destroyed reg!"); ++ } ++ if (out_regs[out].first()->is_Register()) { ++ reg_destroyed[out_regs[out].first()->as_Register()->encoding()] = true; ++ } else if (out_regs[out].first()->is_FloatRegister()) { ++ freg_destroyed[out_regs[out].first()->as_FloatRegister()->encoding()] = true; ++ } ++#endif /* ASSERT */ ++ switch (in_sig_bt[in]) { ++ case T_BOOLEAN: ++ case T_CHAR: ++ case T_BYTE: ++ case T_SHORT: ++ case T_INT: ++ __ simple_move32(in_regs[in], out_regs[out]); ++ break; ++ case T_ARRAY: ++ case T_OBJECT: ++ __ object_move(map, oop_handle_offset, stack_slots, ++ in_regs[in], out_regs[out], ++ ((in == 0) && (!is_static)), &receiver_offset); ++ break; ++ case T_VOID: ++ break; ++ case T_FLOAT: ++ __ float_move(in_regs[in], out_regs[out]); ++ break; ++ case T_DOUBLE: ++ assert(in + 1 < total_in_args && ++ in_sig_bt[in + 1] == T_VOID && ++ out_sig_bt[out + 1] == T_VOID, "bad arg list"); ++ __ double_move(in_regs[in], out_regs[out]); ++ break; ++ case T_LONG : ++ __ long_move(in_regs[in], out_regs[out]); ++ break; ++ case T_ADDRESS: ++ fatal("found T_ADDRESS in java args"); ++ break; ++ default: ++ ShouldNotReachHere(); ++ break; ++ } ++ } ++ ++ // point c_arg at the first arg that is already loaded in case we ++ // 
need to spill before we call out ++ int c_arg = total_c_args - total_in_args; ++ ++ // Pre-load a static method's oop into c_rarg1. ++ // Used both by locking code and the normal JNI call code. ++ if (method->is_static()) { ++ ++ // load oop into a register ++ __ movoop(c_rarg1, ++ JNIHandles::make_local(method->method_holder()->java_mirror())); ++ ++ // Now handlize the static class mirror it's known not-null. ++ __ st_d(c_rarg1, SP, klass_offset); ++ map->set_oop(VMRegImpl::stack2reg(klass_slot_offset)); ++ ++ // Now get the handle ++ __ lea(c_rarg1, Address(SP, klass_offset)); ++ // and protect the arg if we must spill ++ c_arg--; ++ } ++ ++ // Change state to native (we save the return address in the thread, since it might not ++ // be pushed on the stack when we do a a stack traversal). It is enough that the pc() ++ // points into the right code segment. It does not have to be the correct return pc. ++ // We use the same pc/oopMap repeatedly when we call out ++ ++ Label native_return; ++ __ set_last_Java_frame(SP, noreg, native_return); ++ ++ // We have all of the arguments setup at this point. We must not touch any register ++ // argument registers at this point (what if we save/restore them there are no oop? ++ { ++ SkipIfEqual skip_if(masm, &DTraceMethodProbes, 0); ++ save_args(masm, total_c_args, c_arg, out_regs); ++ __ mov_metadata(c_rarg1, method()); ++ __ call_VM_leaf( ++ CAST_FROM_FN_PTR(address, SharedRuntime::dtrace_method_entry), ++ TREG, c_rarg1); ++ restore_args(masm, total_c_args, c_arg, out_regs); ++ } ++ ++ // RedefineClasses() tracing support for obsolete method entry ++ if (log_is_enabled(Trace, redefine, class, obsolete)) { ++ // protect the args we've loaded ++ save_args(masm, total_c_args, c_arg, out_regs); ++ __ mov_metadata(c_rarg1, method()); ++ __ call_VM_leaf( ++ CAST_FROM_FN_PTR(address, SharedRuntime::rc_trace_method_entry), ++ TREG, c_rarg1); ++ restore_args(masm, total_c_args, c_arg, out_regs); ++ } ++ ++ // These are register definitions we need for locking/unlocking ++ const Register swap_reg = T8; // Must use T8 for cmpxchg instruction ++ const Register obj_reg = T4; // Will contain the oop ++ const Register lock_reg = T0; // Address of compiler lock object (BasicLock) ++ ++ Label slow_path_lock; ++ Label lock_done; ++ ++ // Lock a synchronized method ++ if (method->is_synchronized()) { ++ Label count; ++ const int mark_word_offset = BasicLock::displaced_header_offset_in_bytes(); ++ ++ // Get the handle (the 2nd argument) ++ __ move(oop_handle_reg, A1); ++ ++ // Get address of the box ++ __ lea(lock_reg, Address(FP, lock_slot_fp_offset)); ++ ++ // Load the oop from the handle ++ __ ld_d(obj_reg, oop_handle_reg, 0); ++ ++ if (LockingMode == LM_MONITOR) { ++ __ b(slow_path_lock); ++ } else if (LockingMode == LM_LEGACY) { ++ // Load immediate 1 into swap_reg %T8 ++ __ li(swap_reg, 1); ++ ++ __ ld_d(AT, obj_reg, 0); ++ __ orr(swap_reg, swap_reg, AT); ++ ++ __ st_d(swap_reg, lock_reg, mark_word_offset); ++ __ cmpxchg(Address(obj_reg, 0), swap_reg, lock_reg, AT, true, true /* acquire */, count); ++ // Test if the oopMark is an obvious stack pointer, i.e., ++ // 1) (mark & 3) == 0, and ++ // 2) sp <= mark < mark + os::pagesize() ++ // These 3 tests can be done by evaluating the following ++ // expression: ((mark - sp) & (3 - os::vm_page_size())), ++ // assuming both stack pointer and pagesize have their ++ // least significant 2 bits clear. 
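Written out in plain C++, the bit test described in the comment above looks like the sketch below (the function name is invented for this note; the patch itself only emits the sub/and sequence that follows). With a power-of-two page size and a 16-byte aligned sp, the expression is zero exactly when the mark word has clear tag bits and points into the current stack page, i.e. the recursive locking case handled here.

    #include <cstdint>
    // Sketch only: the recursive stack-lock test from the comment above.
    // Result is zero iff (mark & 3) == 0 and sp <= mark < sp + page_size.
    static bool mark_points_into_this_frame(uintptr_t mark, uintptr_t sp,
                                            uintptr_t page_size) {
      return ((mark - sp) & (3 - page_size)) == 0;
    }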
++ // NOTE: the oopMark is in swap_reg %T8 as the result of cmpxchg ++ ++ __ sub_d(swap_reg, swap_reg, SP); ++ __ li(AT, 3 - (int)os::vm_page_size()); ++ __ andr(swap_reg , swap_reg, AT); ++ // Save the test result, for recursive case, the result is zero ++ __ st_d(swap_reg, lock_reg, mark_word_offset); ++ __ bne(swap_reg, R0, slow_path_lock); ++ } else { ++ assert(LockingMode == LM_LIGHTWEIGHT, "must be"); ++ __ ld_d(swap_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes())); ++ // FIXME ++ Register tmp = T1; ++ __ lightweight_lock(obj_reg, swap_reg, tmp, SCR1, slow_path_lock); ++ } ++ ++ __ bind(count); ++ __ increment(Address(TREG, JavaThread::held_monitor_count_offset()), 1); ++ ++ // Slow path will re-enter here ++ __ bind(lock_done); ++ } ++ ++ ++ // Finally just about ready to make the JNI call ++ ++ ++ // get JNIEnv* which is first argument to native ++ __ addi_d(A0, TREG, in_bytes(JavaThread::jni_environment_offset())); ++ ++ // Now set thread in native ++ __ addi_d(AT, R0, _thread_in_native); ++ if (os::is_MP()) { ++ __ addi_d(T4, TREG, in_bytes(JavaThread::thread_state_offset())); ++ __ amswap_db_w(R0, AT, T4); ++ } else { ++ __ st_w(AT, TREG, in_bytes(JavaThread::thread_state_offset())); ++ } ++ ++ // do the call ++ __ call(native_func, relocInfo::runtime_call_type); ++ __ bind(native_return); ++ ++ oop_maps->add_gc_map(((intptr_t)__ pc()) - start, map); ++ ++ // WARNING - on Windows Java Natives use pascal calling convention and pop the ++ // arguments off of the stack. We could just re-adjust the stack pointer here ++ // and continue to do SP relative addressing but we instead switch to FP ++ // relative addressing. ++ ++ // Unpack native results. ++ if (ret_type != T_OBJECT && ret_type != T_ARRAY) { ++ __ cast_primitive_type(ret_type, V0); ++ } ++ ++ Label after_transition; ++ ++ // Switch thread to "native transition" state before reading the synchronization state. ++ // This additional state is necessary because reading and testing the synchronization ++ // state is not atomic w.r.t. GC, as this scenario demonstrates: ++ // Java thread A, in _thread_in_native state, loads _not_synchronized and is preempted. ++ // VM thread changes sync state to synchronizing and suspends threads for GC. ++ // Thread A is resumed to finish this native method, but doesn't block here since it ++ // didn't see any synchronization is progress, and escapes. ++ __ addi_d(AT, R0, _thread_in_native_trans); ++ ++ // Force this write out before the read below ++ if (os::is_MP() && UseSystemMemoryBarrier) { ++ __ addi_d(T4, TREG, in_bytes(JavaThread::thread_state_offset())); ++ __ amswap_db_w(R0, AT, T4); // AnyAny ++ } else { ++ __ st_w(AT, TREG, in_bytes(JavaThread::thread_state_offset())); ++ } ++ ++ // check for safepoint operation in progress and/or pending suspend requests ++ { ++ Label Continue; ++ Label slow_path; ++ ++ // We need an acquire here to ensure that any subsequent load of the ++ // global SafepointSynchronize::_state flag is ordered after this load ++ // of the thread-local polling word. We don't want this poll to ++ // return false (i.e. not safepointing) and a later poll of the global ++ // SafepointSynchronize::_state spuriously to return true. ++ // ++ // This is to avoid a race when we're in a native->Java transition ++ // racing the code which wakes up from a safepoint. 
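The ordering requirement spelled out in the comment above can also be written in ordinary C++ memory-model terms. A minimal sketch, assuming hypothetical std::atomic stand-ins for the per-thread polling word and the global SafepointSynchronize::_state (none of these names are from the patch):

    #include <atomic>
    #include <cstdint>

    std::atomic<uintptr_t> polling_word;        // per-thread word armed by the VM thread
    std::atomic<int>       global_safepoint_state;

    bool needs_slow_path() {
      // The acquire load of the thread-local polling word orders any later load
      // of the global safepoint state after it, so the two reads cannot observe
      // contradictory (stale) values during the native->Java transition.
      uintptr_t poll = polling_word.load(std::memory_order_acquire);
      int global     = global_safepoint_state.load(std::memory_order_relaxed);
      return poll != 0 || global != 0;
    }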
++ ++ __ safepoint_poll(slow_path, TREG, true /* at_return */, true /* acquire */, false /* in_nmethod */); ++ __ ld_w(AT, TREG, in_bytes(JavaThread::suspend_flags_offset())); ++ __ beq(AT, R0, Continue); ++ __ bind(slow_path); ++ ++ // Don't use call_VM as it will see a possible pending exception and forward it ++ // and never return here preventing us from clearing _last_native_pc down below. ++ // ++ save_native_result(masm, ret_type, stack_slots); ++ __ move(A0, TREG); ++ __ addi_d(SP, SP, -wordSize); ++ __ push(S2); ++ __ move(S2, SP); // use S2 as a sender SP holder ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); // align stack as required by ABI ++ __ call(CAST_FROM_FN_PTR(address, JavaThread::check_special_condition_for_native_trans), relocInfo::runtime_call_type); ++ __ move(SP, S2); // use S2 as a sender SP holder ++ __ pop(S2); ++ __ addi_d(SP, SP, wordSize); ++ // Restore any method result value ++ restore_native_result(masm, ret_type, stack_slots); ++ ++ __ bind(Continue); ++ } ++ ++ // change thread state ++ __ addi_d(AT, R0, _thread_in_Java); ++ if (os::is_MP()) { ++ __ addi_d(T4, TREG, in_bytes(JavaThread::thread_state_offset())); ++ __ amswap_db_w(R0, AT, T4); ++ } else { ++ __ st_w(AT, TREG, in_bytes(JavaThread::thread_state_offset())); ++ } ++ __ bind(after_transition); ++ Label reguard; ++ Label reguard_done; ++ __ ld_w(AT, TREG, in_bytes(JavaThread::stack_guard_state_offset())); ++ __ addi_d(AT, AT, -StackOverflow::stack_guard_yellow_reserved_disabled); ++ __ beq(AT, R0, reguard); ++ // slow path reguard re-enters here ++ __ bind(reguard_done); ++ ++ // Handle possible exception (will unlock if necessary) ++ ++ // native result if any is live ++ ++ // Unlock ++ Label slow_path_unlock; ++ Label unlock_done; ++ if (method->is_synchronized()) { ++ ++ // Get locked oop from the handle we passed to jni ++ __ ld_d( obj_reg, oop_handle_reg, 0); ++ ++ Label done, not_recursive; ++ ++ if (LockingMode == LM_LEGACY) { ++ // Simple recursive lock? 
++ __ ld_d(AT, FP, lock_slot_fp_offset); ++ __ bnez(AT, not_recursive); ++ __ decrement(Address(TREG, JavaThread::held_monitor_count_offset()), 1); ++ __ b(done); ++ } ++ ++ __ bind(not_recursive); ++ ++ // Must save FSF if if it is live now because cmpxchg must use it ++ if (ret_type != T_FLOAT && ret_type != T_DOUBLE && ret_type != T_VOID) { ++ save_native_result(masm, ret_type, stack_slots); ++ } ++ ++ if (LockingMode == LM_MONITOR) { ++ __ b(slow_path_unlock); ++ } else if (LockingMode == LM_LEGACY) { ++ // get old displaced header ++ __ ld_d(T8, FP, lock_slot_fp_offset); ++ // get address of the stack lock ++ __ addi_d(lock_reg, FP, lock_slot_fp_offset); ++ // Atomic swap old header if oop still contains the stack lock ++ Label count; ++ __ cmpxchg(Address(obj_reg, 0), lock_reg, T8, AT, false, true /* acquire */, count, &slow_path_unlock); ++ __ bind(count); ++ __ decrement(Address(TREG, JavaThread::held_monitor_count_offset()), 1); ++ } else { ++ assert(LockingMode == LM_LIGHTWEIGHT, ""); ++ __ ld_d(lock_reg, Address(obj_reg, oopDesc::mark_offset_in_bytes())); ++ __ andi(AT, lock_reg, markWord::monitor_value); ++ __ bnez(AT, slow_path_unlock); ++ __ lightweight_unlock(obj_reg, lock_reg, swap_reg, SCR1, slow_path_unlock); ++ __ decrement(Address(TREG, JavaThread::held_monitor_count_offset())); ++ } ++ ++ // slow path re-enters here ++ __ bind(unlock_done); ++ if (ret_type != T_FLOAT && ret_type != T_DOUBLE && ret_type != T_VOID) { ++ restore_native_result(masm, ret_type, stack_slots); ++ } ++ ++ __ bind(done); ++ } ++ { ++ SkipIfEqual skip_if(masm, &DTraceMethodProbes, 0); ++ // Tell dtrace about this method exit ++ save_native_result(masm, ret_type, stack_slots); ++ int metadata_index = __ oop_recorder()->find_index( (method())); ++ RelocationHolder rspec = metadata_Relocation::spec(metadata_index); ++ __ relocate(rspec); ++ __ patchable_li52(AT, (long)(method())); ++ ++ __ call_VM_leaf( ++ CAST_FROM_FN_PTR(address, SharedRuntime::dtrace_method_exit), ++ TREG, AT); ++ restore_native_result(masm, ret_type, stack_slots); ++ } ++ ++ // We can finally stop using that last_Java_frame we setup ages ago ++ ++ __ reset_last_Java_frame(false); ++ ++ // Unpack oop result, e.g. JNIHandles::resolve value. ++ if (is_reference_type(ret_type)) { ++ __ resolve_jobject(V0, SCR2, SCR1); ++ } ++ ++ if (CheckJNICalls) { ++ // clear_pending_jni_exception_check ++ __ st_d(R0, TREG, in_bytes(JavaThread::pending_jni_exception_check_fn_offset())); ++ } ++ ++ // reset handle block ++ __ ld_d(AT, TREG, in_bytes(JavaThread::active_handles_offset())); ++ __ st_w(R0, AT, in_bytes(JNIHandleBlock::top_offset())); ++ ++ __ leave(); ++ ++ // Any exception pending? ++ __ ld_d(AT, TREG, in_bytes(Thread::pending_exception_offset())); ++ __ bne(AT, R0, exception_pending); ++ ++ // We're done ++ __ jr(RA); ++ ++ // Unexpected paths are out of line and go here ++ ++ // forward the exception ++ __ bind(exception_pending); ++ ++ __ jmp(StubRoutines::forward_exception_entry(), relocInfo::runtime_call_type); ++ ++ // Slow path locking & unlocking ++ if (method->is_synchronized()) { ++ ++ // BEGIN Slow path lock ++ __ bind(slow_path_lock); ++ ++ // protect the args we've loaded ++ save_args(masm, total_c_args, c_arg, out_regs); ++ ++ // has last_Java_frame setup. 
No exceptions so do vanilla call not call_VM ++ // args are (oop obj, BasicLock* lock, JavaThread* thread) ++ ++ __ move(A0, obj_reg); ++ __ move(A1, lock_reg); ++ __ move(A2, TREG); ++ __ addi_d(SP, SP, - 3*wordSize); ++ ++ __ move(S2, SP); // use S2 as a sender SP holder ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); // align stack as required by ABI ++ ++ __ call(CAST_FROM_FN_PTR(address, SharedRuntime::complete_monitor_locking_C), relocInfo::runtime_call_type); ++ __ move(SP, S2); ++ __ addi_d(SP, SP, 3*wordSize); ++ ++ restore_args(masm, total_c_args, c_arg, out_regs); ++ ++#ifdef ASSERT ++ { Label L; ++ __ ld_d(AT, TREG, in_bytes(Thread::pending_exception_offset())); ++ __ beq(AT, R0, L); ++ __ stop("no pending exception allowed on exit from monitorenter"); ++ __ bind(L); ++ } ++#endif ++ __ b(lock_done); ++ // END Slow path lock ++ ++ // BEGIN Slow path unlock ++ __ bind(slow_path_unlock); ++ ++ // Slow path unlock ++ ++ if (ret_type == T_FLOAT || ret_type == T_DOUBLE ) { ++ save_native_result(masm, ret_type, stack_slots); ++ } ++ // Save pending exception around call to VM (which contains an EXCEPTION_MARK) ++ ++ __ ld_d(AT, TREG, in_bytes(Thread::pending_exception_offset())); ++ __ push(AT); ++ __ st_d(R0, TREG, in_bytes(Thread::pending_exception_offset())); ++ ++ __ move(S2, SP); // use S2 as a sender SP holder ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); // align stack as required by ABI ++ ++ // should be a peal ++ // +wordSize because of the push above ++ __ addi_d(A1, FP, lock_slot_fp_offset); ++ ++ __ move(A0, obj_reg); ++ __ move(A2, TREG); ++ __ addi_d(SP, SP, -2*wordSize); ++ __ call(CAST_FROM_FN_PTR(address, SharedRuntime::complete_monitor_unlocking_C), ++ relocInfo::runtime_call_type); ++ __ addi_d(SP, SP, 2*wordSize); ++ __ move(SP, S2); ++#ifdef ASSERT ++ { ++ Label L; ++ __ ld_d( AT, TREG, in_bytes(Thread::pending_exception_offset())); ++ __ beq(AT, R0, L); ++ __ stop("no pending exception allowed on exit complete_monitor_unlocking_C"); ++ __ bind(L); ++ } ++#endif /* ASSERT */ ++ ++ __ pop(AT); ++ __ st_d(AT, TREG, in_bytes(Thread::pending_exception_offset())); ++ if (ret_type == T_FLOAT || ret_type == T_DOUBLE ) { ++ restore_native_result(masm, ret_type, stack_slots); ++ } ++ __ b(unlock_done); ++ // END Slow path unlock ++ ++ } ++ ++ // SLOW PATH Reguard the stack if needed ++ ++ __ bind(reguard); ++ save_native_result(masm, ret_type, stack_slots); ++ __ call(CAST_FROM_FN_PTR(address, SharedRuntime::reguard_yellow_pages), ++ relocInfo::runtime_call_type); ++ restore_native_result(masm, ret_type, stack_slots); ++ __ b(reguard_done); ++ ++ __ flush(); ++ ++ nmethod *nm = nmethod::new_native_nmethod(method, ++ compile_id, ++ masm->code(), ++ vep_offset, ++ frame_complete, ++ stack_slots / VMRegImpl::slots_per_word, ++ (is_static ? in_ByteSize(klass_offset) : in_ByteSize(receiver_offset)), ++ in_ByteSize(lock_slot_offset*VMRegImpl::stack_slot_size), ++ oop_maps); ++ ++ return nm; ++} ++ ++// this function returns the adjust size (in number of words) to a c2i adapter ++// activation for use during deoptimization ++int Deoptimization::last_frame_adjust(int callee_parameters, int callee_locals) { ++ return (callee_locals - callee_parameters) * Interpreter::stackElementWords; ++} ++ ++// Number of stack slots between incoming argument block and the start of ++// a new frame. The PROLOG must add this many slots to the stack. The ++// EPILOG must remove this many slots. 
LA needs two slots for ++// return address and fp. ++// TODO think this is correct but check ++uint SharedRuntime::in_preserve_stack_slots() { ++ return 4; ++} ++ ++// "Top of Stack" slots that may be unused by the calling convention but must ++// otherwise be preserved. ++// On Intel these are not necessary and the value can be zero. ++// On Sparc this describes the words reserved for storing a register window ++// when an interrupt occurs. ++uint SharedRuntime::out_preserve_stack_slots() { ++ return 0; ++} ++ ++//------------------------------generate_deopt_blob---------------------------- ++// Ought to generate an ideal graph & compile, but here's some SPARC ASM ++// instead. ++void SharedRuntime::generate_deopt_blob() { ++ // allocate space for the code ++ ResourceMark rm; ++ // setup code generation tools ++ int pad = 0; ++#if INCLUDE_JVMCI ++ if (EnableJVMCI) { ++ pad += 512; // Increase the buffer size when compiling for JVMCI ++ } ++#endif ++ CodeBuffer buffer ("deopt_blob", 2048+pad, 1024); ++ MacroAssembler* masm = new MacroAssembler( & buffer); ++ int frame_size_in_words; ++ OopMap* map = nullptr; ++ // Account for the extra args we place on the stack ++ // by the time we call fetch_unroll_info ++ const int additional_words = 2; // deopt kind, thread ++ ++ OopMapSet *oop_maps = new OopMapSet(); ++ RegisterSaver reg_save(COMPILER2_OR_JVMCI != 0); ++ ++ address start = __ pc(); ++ Label cont; ++ // we use S3 for DeOpt reason register ++ Register reason = S3; ++ // use S7 for fetch_unroll_info returned UnrollBlock ++ Register unroll = S7; ++ // Prolog for non exception case! ++ ++ // We have been called from the deopt handler of the deoptee. ++ // ++ // deoptee: ++ // ... ++ // call X ++ // ... ++ // deopt_handler: call_deopt_stub ++ // cur. return pc --> ... ++ // ++ // So currently RA points behind the call in the deopt handler. ++ // We adjust it such that it points to the start of the deopt handler. ++ // The return_pc has been stored in the frame of the deoptee and ++ // will replace the address of the deopt_handler in the call ++ // to Deoptimization::fetch_unroll_info below. ++ ++ // HandlerImpl::size_deopt_handler() ++ __ addi_d(RA, RA, - NativeFarCall::instruction_size); ++ // Save everything in sight. ++ map = reg_save.save_live_registers(masm, additional_words, &frame_size_in_words); ++ // Normal deoptimization ++ __ li(reason, Deoptimization::Unpack_deopt); ++ __ b(cont); ++ ++ int reexecute_offset = __ pc() - start; ++#if INCLUDE_JVMCI && !defined(COMPILER1) ++ if (EnableJVMCI && UseJVMCICompiler) { ++ // JVMCI does not use this kind of deoptimization ++ __ should_not_reach_here(); ++ } ++#endif ++ ++ // Reexecute case ++ // return address is the pc describes what bci to do re-execute at ++ ++ // No need to update map as each call to save_live_registers will produce identical oopmap ++ (void) reg_save.save_live_registers(masm, additional_words, &frame_size_in_words); ++ __ li(reason, Deoptimization::Unpack_reexecute); ++ __ b(cont); ++ ++#if INCLUDE_JVMCI ++ Label after_fetch_unroll_info_call; ++ int implicit_exception_uncommon_trap_offset = 0; ++ int uncommon_trap_offset = 0; ++ ++ if (EnableJVMCI) { ++ implicit_exception_uncommon_trap_offset = __ pc() - start; ++ ++ __ ld_d(RA, Address(TREG, in_bytes(JavaThread::jvmci_implicit_exception_pc_offset()))); ++ __ st_d(R0, Address(TREG, in_bytes(JavaThread::jvmci_implicit_exception_pc_offset()))); ++ ++ uncommon_trap_offset = __ pc() - start; ++ ++ // Save everything in sight. 
++ (void) reg_save.save_live_registers(masm, additional_words, &frame_size_in_words); ++ __ addi_d(SP, SP, -additional_words * wordSize); ++ // fetch_unroll_info needs to call last_java_frame() ++ Label retaddr; ++ __ set_last_Java_frame(NOREG, NOREG, retaddr); ++ ++ __ ld_w(c_rarg1, Address(TREG, in_bytes(JavaThread::pending_deoptimization_offset()))); ++ __ li(AT, -1); ++ __ st_w(AT, Address(TREG, in_bytes(JavaThread::pending_deoptimization_offset()))); ++ ++ __ li(reason, (int32_t)Deoptimization::Unpack_reexecute); ++ __ move(c_rarg0, TREG); ++ __ move(c_rarg2, reason); // exec mode ++ __ call((address)Deoptimization::uncommon_trap, relocInfo::runtime_call_type); ++ __ bind(retaddr); ++ oop_maps->add_gc_map( __ pc()-start, map->deep_copy()); ++ __ addi_d(SP, SP, additional_words * wordSize); ++ ++ __ reset_last_Java_frame(false); ++ ++ __ b(after_fetch_unroll_info_call); ++ } // EnableJVMCI ++#endif // INCLUDE_JVMCI ++ ++ int exception_offset = __ pc() - start; ++ // Prolog for exception case ++ ++ // all registers are dead at this entry point, except for V0 and ++ // V1 which contain the exception oop and exception pc ++ // respectively. Set them in TLS and fall thru to the ++ // unpack_with_exception_in_tls entry point. ++ ++ __ st_d(V1, Address(TREG, JavaThread::exception_pc_offset())); ++ __ st_d(V0, Address(TREG, JavaThread::exception_oop_offset())); ++ int exception_in_tls_offset = __ pc() - start; ++ // new implementation because exception oop is now passed in JavaThread ++ ++ // Prolog for exception case ++ // All registers must be preserved because they might be used by LinearScan ++ // Exceptiop oop and throwing PC are passed in JavaThread ++ // tos: stack at point of call to method that threw the exception (i.e. only ++ // args are on the stack, no return address) ++ ++ // Return address will be patched later with the throwing pc. The correct value is not ++ // available now because loading it from memory would destroy registers. ++ // Save everything in sight. ++ // No need to update map as each call to save_live_registers will produce identical oopmap ++ (void) reg_save.save_live_registers(masm, additional_words, &frame_size_in_words); ++ ++ // Now it is safe to overwrite any register ++ // store the correct deoptimization type ++ __ li(reason, Deoptimization::Unpack_exception); ++ // load throwing pc from JavaThread and patch it as the return address ++ // of the current frame. Then clear the field in JavaThread ++ ++ __ ld_d(V1, Address(TREG, JavaThread::exception_pc_offset())); ++ __ st_d(V1, SP, reg_save.ra_offset()); //save ra ++ __ st_d(R0, Address(TREG, JavaThread::exception_pc_offset())); ++ ++#ifdef ASSERT ++ // verify that there is really an exception oop in JavaThread ++ __ ld_d(AT, TREG, in_bytes(JavaThread::exception_oop_offset())); ++ __ verify_oop(AT); ++ // verify that there is no pending exception ++ Label no_pending_exception; ++ __ ld_d(AT, TREG, in_bytes(Thread::pending_exception_offset())); ++ __ beq(AT, R0, no_pending_exception); ++ __ stop("must not have pending exception here"); ++ __ bind(no_pending_exception); ++#endif ++ __ bind(cont); ++ ++ // Call C code. Need thread and this frame, but NOT official VM entry ++ // crud. We cannot block on this call, no GC can happen. ++ ++ __ move(c_rarg0, TREG); ++ __ move(c_rarg1, reason); // exec_mode ++ __ addi_d(SP, SP, -additional_words * wordSize); ++ ++ Label retaddr; ++ __ set_last_Java_frame(NOREG, NOREG, retaddr); ++ ++ // Call fetch_unroll_info(). 
Need thread and this frame, but NOT official VM entry - cannot block on ++ // this call, no GC can happen. Call should capture return values. ++ ++ // TODO: confirm reloc ++ __ call(CAST_FROM_FN_PTR(address, Deoptimization::fetch_unroll_info), relocInfo::runtime_call_type); ++ __ bind(retaddr); ++ oop_maps->add_gc_map(__ pc() - start, map); ++ __ addi_d(SP, SP, additional_words * wordSize); ++ ++ __ reset_last_Java_frame(false); ++ ++#if INCLUDE_JVMCI ++ if (EnableJVMCI) { ++ __ bind(after_fetch_unroll_info_call); ++ } ++#endif ++ ++ // Load UnrollBlock into S7 ++ __ move(unroll, V0); ++ ++ ++ // Move the unpack kind to a safe place in the UnrollBlock because ++ // we are very short of registers ++ ++ Address unpack_kind(unroll, Deoptimization::UnrollBlock::unpack_kind_offset()); ++ __ st_w(reason, unpack_kind); ++ // save the unpack_kind value ++ // Retrieve the possible live values (return values) ++ // All callee save registers representing jvm state ++ // are now in the vframeArray. ++ ++ Label noException; ++ __ li(AT, Deoptimization::Unpack_exception); ++ __ bne(AT, reason, noException);// Was exception pending? ++ __ ld_d(V0, Address(TREG, JavaThread::exception_oop_offset())); ++ __ ld_d(V1, Address(TREG, JavaThread::exception_pc_offset())); ++ __ st_d(R0, Address(TREG, JavaThread::exception_pc_offset())); ++ __ st_d(R0, Address(TREG, JavaThread::exception_oop_offset())); ++ ++ __ verify_oop(V0); ++ ++ // Overwrite the result registers with the exception results. ++ __ st_d(V0, SP, reg_save.v0_offset()); ++ __ st_d(V1, SP, reg_save.v1_offset()); ++ ++ __ bind(noException); ++ ++ ++ // Stack is back to only having register save data on the stack. ++ // Now restore the result registers. Everything else is either dead or captured ++ // in the vframeArray. ++ ++ reg_save.restore_result_registers(masm); ++ // All of the register save area has been popped of the stack. Only the ++ // return address remains. ++ // Pop all the frames we must move/replace. ++ // Frame picture (youngest to oldest) ++ // 1: self-frame (no frame link) ++ // 2: deopting frame (no frame link) ++ // 3: caller of deopting frame (could be compiled/interpreted). ++ // ++ // Note: by leaving the return address of self-frame on the stack ++ // and using the size of frame 2 to adjust the stack ++ // when we are done the return to frame 3 will still be on the stack. ++ ++ // register for the sender's sp ++ Register sender_sp = Rsender; ++ // register for frame pcs ++ Register pcs = T0; ++ // register for frame sizes ++ Register sizes = T1; ++ // register for frame count ++ Register count = T3; ++ ++ // Pop deoptimized frame ++ __ ld_w(T8, Address(unroll, Deoptimization::UnrollBlock::size_of_deoptimized_frame_offset())); ++ __ add_d(SP, SP, T8); ++ // sp should be pointing at the return address to the caller (3) ++ ++ // Load array of frame pcs into pcs ++ __ ld_d(pcs, Address(unroll, Deoptimization::UnrollBlock::frame_pcs_offset())); ++ __ addi_d(SP, SP, wordSize); // trash the old pc ++ // Load array of frame sizes into T6 ++ __ ld_d(sizes, Address(unroll, Deoptimization::UnrollBlock::frame_sizes_offset())); ++ ++#ifdef ASSERT ++ // Compilers generate code that bang the stack by as much as the ++ // interpreter would need. So this stack banging should never ++ // trigger a fault. Verify that it does not on non product builds. 
++ __ ld_w(TSR, Address(unroll, Deoptimization::UnrollBlock::total_frame_sizes_offset())); ++ __ bang_stack_size(TSR, T8); ++#endif ++ ++ // Load count of frams into T3 ++ __ ld_w(count, Address(unroll, Deoptimization::UnrollBlock::number_of_frames_offset())); ++ // Pick up the initial fp we should save ++ __ ld_d(FP, Address(unroll, Deoptimization::UnrollBlock::initial_info_offset())); ++ // Now adjust the caller's stack to make up for the extra locals ++ // but record the original sp so that we can save it in the skeletal interpreter ++ // frame and the stack walking of interpreter_sender will get the unextended sp ++ // value and not the "real" sp value. ++ __ move(sender_sp, SP); ++ __ ld_w(AT, Address(unroll, Deoptimization::UnrollBlock::caller_adjustment_offset())); ++ __ sub_d(SP, SP, AT); ++ ++ Label loop; ++ __ bind(loop); ++ __ ld_d(T2, sizes, 0); // Load frame size ++ __ ld_d(AT, pcs, 0); // save return address ++ __ addi_d(T2, T2, -2 * wordSize); // we'll push pc and fp, by hand ++ __ push2(AT, FP); ++ __ addi_d(FP, SP, 2 * wordSize); ++ __ sub_d(SP, SP, T2); // Prolog! ++ // This value is corrected by layout_activation_impl ++ __ st_d(R0, FP, frame::interpreter_frame_last_sp_offset * wordSize); ++ __ st_d(sender_sp, FP, frame::interpreter_frame_sender_sp_offset * wordSize);// Make it walkable ++ __ move(sender_sp, SP); // pass to next frame ++ __ addi_d(count, count, -1); // decrement counter ++ __ addi_d(sizes, sizes, wordSize); // Bump array pointer (sizes) ++ __ addi_d(pcs, pcs, wordSize); // Bump array pointer (pcs) ++ __ bne(count, R0, loop); ++ ++ // Re-push self-frame ++ __ ld_d(AT, pcs, 0); // frame_pcs[number_of_frames] = Interpreter::deopt_entry(vtos, 0); ++ __ push2(AT, FP); ++ __ addi_d(FP, SP, 2 * wordSize); ++ __ addi_d(SP, SP, -(frame_size_in_words - 2 - additional_words) * wordSize); ++ ++ // Restore frame locals after moving the frame ++ __ st_d(V0, SP, reg_save.v0_offset()); ++ __ st_d(V1, SP, reg_save.v1_offset()); ++ __ fst_d(F0, SP, reg_save.fpr0_offset()); ++ __ fst_d(F1, SP, reg_save.fpr1_offset()); ++ ++ // Call unpack_frames(). Need thread and this frame, but NOT official VM entry - cannot block on ++ // this call, no GC can happen. ++ __ move(A1, reason); // exec_mode ++ __ move(A0, TREG); // thread ++ __ addi_d(SP, SP, (-additional_words) *wordSize); ++ ++ // set last_Java_sp, last_Java_fp ++ Label L; ++ address the_pc = __ pc(); ++ __ bind(L); ++ __ set_last_Java_frame(NOREG, FP, L); ++ ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); // Fix stack alignment as required by ABI ++ ++ __ call(CAST_FROM_FN_PTR(address, Deoptimization::unpack_frames), relocInfo::runtime_call_type); ++ // Revert SP alignment after call since we're going to do some SP relative addressing below ++ __ ld_d(SP, TREG, in_bytes(JavaThread::last_Java_sp_offset())); ++ // Set an oopmap for the call site ++ oop_maps->add_gc_map(the_pc - start, new OopMap(frame_size_in_words, 0)); ++ ++ __ push(V0); ++ ++ __ reset_last_Java_frame(true); ++ ++ // Collect return values ++ __ ld_d(V0, SP, reg_save.v0_offset() + (additional_words + 1) * wordSize); ++ __ ld_d(V1, SP, reg_save.v1_offset() + (additional_words + 1) * wordSize); ++ // Pop float stack and store in local ++ __ fld_d(F0, SP, reg_save.fpr0_offset() + (additional_words + 1) * wordSize); ++ __ fld_d(F1, SP, reg_save.fpr1_offset() + (additional_words + 1) * wordSize); ++ ++ // Push a float or double return value if necessary. 
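Stepping back, the frame-building loop above (and the nearly identical one in the uncommon-trap blob later in this file) simply walks the arrays returned in the UnrollBlock. A C-style sketch of that walk, with invented names in place of the registers used by the assembly; the last_sp/sender_sp bookkeeping done by the real loop is only noted in a comment here:

    #include <cstdint>
    // Sketch only: skeletal interpreter frames pushed from the UnrollBlock's
    // frame_sizes[] / frame_pcs[] / number_of_frames, assuming 8-byte words.
    static const intptr_t kWordSize = 8;

    void push_skeletal_frames(const intptr_t* frame_sizes, const intptr_t* frame_pcs,
                              int number_of_frames, intptr_t& sp, intptr_t& fp) {
      for (int i = 0; i < number_of_frames; i++) {
        intptr_t body = frame_sizes[i] - 2 * kWordSize;   // pc and fp are pushed by hand
        sp -= kWordSize; *(intptr_t*)sp = frame_pcs[i];   // return pc of this frame
        sp -= kWordSize; *(intptr_t*)sp = fp;             // link to the previous fp
        fp  = sp + 2 * kWordSize;                         // new frame pointer
        sp -= body;                                       // room for the skeletal frame body
        // The real loop also zeroes interpreter_frame_last_sp and records the
        // previous unextended sp (sender_sp) so the frame is walkable;
        // layout_activation_impl() corrects those values later.
      }
    }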
++ __ leave(); ++ ++ // Jump to interpreter ++ __ jr(RA); ++ ++ masm->flush(); ++ _deopt_blob = DeoptimizationBlob::create(&buffer, oop_maps, 0, exception_offset, reexecute_offset, frame_size_in_words); ++ _deopt_blob->set_unpack_with_exception_in_tls_offset(exception_in_tls_offset); ++#if INCLUDE_JVMCI ++ if (EnableJVMCI) { ++ _deopt_blob->set_uncommon_trap_offset(uncommon_trap_offset); ++ _deopt_blob->set_implicit_exception_uncommon_trap_offset(implicit_exception_uncommon_trap_offset); ++ } ++#endif ++} ++ ++#ifdef COMPILER2 ++ ++//------------------------------generate_uncommon_trap_blob-------------------- ++// Ought to generate an ideal graph & compile, but here's some SPARC ASM ++// instead. ++void SharedRuntime::generate_uncommon_trap_blob() { ++ // allocate space for the code ++ ResourceMark rm; ++ // setup code generation tools ++ CodeBuffer buffer ("uncommon_trap_blob", 512*80 , 512*40 ); ++ MacroAssembler* masm = new MacroAssembler(&buffer); ++ ++ enum frame_layout { ++ fp_off, fp_off2, ++ return_off, return_off2, ++ framesize ++ }; ++ assert(framesize % 4 == 0, "sp not 16-byte aligned"); ++ address start = __ pc(); ++ ++ // Push self-frame. ++ __ addi_d(SP, SP, -framesize * BytesPerInt); ++ ++ __ st_d(RA, SP, return_off * BytesPerInt); ++ __ st_d(FP, SP, fp_off * BytesPerInt); ++ ++ __ addi_d(FP, SP, framesize * BytesPerInt); ++ ++ // set last_Java_sp ++ Label retaddr; ++ __ set_last_Java_frame(NOREG, FP, retaddr); ++ // Call C code. Need thread but NOT official VM entry ++ // crud. We cannot block on this call, no GC can happen. Call should ++ // capture callee-saved registers as well as return values. ++ __ move(A0, TREG); ++ // argument already in T0 ++ __ move(A1, T0); ++ __ addi_d(A2, R0, Deoptimization::Unpack_uncommon_trap); ++ __ call((address)Deoptimization::uncommon_trap, relocInfo::runtime_call_type); ++ __ bind(retaddr); ++ ++ // Set an oopmap for the call site ++ OopMapSet *oop_maps = new OopMapSet(); ++ OopMap* map = new OopMap( framesize, 0 ); ++ ++ oop_maps->add_gc_map(__ pc() - start, map); ++ ++ __ reset_last_Java_frame(false); ++ ++ // Load UnrollBlock into S7 ++ Register unroll = S7; ++ __ move(unroll, V0); ++ ++#ifdef ASSERT ++ { Label L; ++ __ ld_d(AT, Address(unroll, Deoptimization::UnrollBlock::unpack_kind_offset())); ++ __ li(T4, Deoptimization::Unpack_uncommon_trap); ++ __ beq(AT, T4, L); ++ __ stop("SharedRuntime::generate_uncommon_trap_blob: expected Unpack_uncommon_trap"); ++ __ bind(L); ++ } ++#endif ++ ++ // Pop all the frames we must move/replace. ++ // ++ // Frame picture (youngest to oldest) ++ // 1: self-frame (no frame link) ++ // 2: deopting frame (no frame link) ++ // 3: possible-i2c-adapter-frame ++ // 4: caller of deopting frame (could be compiled/interpreted. If interpreted we will create an ++ // and c2i here) ++ ++ __ addi_d(SP, SP, framesize * BytesPerInt); ++ ++ // Pop deoptimized frame ++ __ ld_w(T8, Address(unroll, Deoptimization::UnrollBlock::size_of_deoptimized_frame_offset())); ++ __ add_d(SP, SP, T8); ++ ++#ifdef ASSERT ++ // Compilers generate code that bang the stack by as much as the ++ // interpreter would need. So this stack banging should never ++ // trigger a fault. Verify that it does not on non product builds. 
++ __ ld_w(TSR, Address(unroll, Deoptimization::UnrollBlock::total_frame_sizes_offset())); ++ __ bang_stack_size(TSR, T8); ++#endif ++ ++ // register for frame pcs ++ Register pcs = T8; ++ // register for frame sizes ++ Register sizes = T4; ++ // register for frame count ++ Register count = T3; ++ // register for the sender's sp ++ Register sender_sp = T1; ++ ++ // sp should be pointing at the return address to the caller (4) ++ // Load array of frame pcs ++ __ ld_d(pcs, Address(unroll, Deoptimization::UnrollBlock::frame_pcs_offset())); ++ ++ // Load array of frame sizes ++ __ ld_d(sizes, Address(unroll, Deoptimization::UnrollBlock::frame_sizes_offset())); ++ __ ld_wu(count, Address(unroll, Deoptimization::UnrollBlock::number_of_frames_offset())); ++ ++ // Pick up the initial fp we should save ++ __ ld_d(FP, Address(unroll, Deoptimization::UnrollBlock::initial_info_offset())); ++ ++ // Now adjust the caller's stack to make up for the extra locals ++ // but record the original sp so that we can save it in the skeletal interpreter ++ // frame and the stack walking of interpreter_sender will get the unextended sp ++ // value and not the "real" sp value. ++ __ move(sender_sp, SP); ++ __ ld_w(AT, Address(unroll, Deoptimization::UnrollBlock::caller_adjustment_offset())); ++ __ sub_d(SP, SP, AT); ++ ++ // Push interpreter frames in a loop ++ Label loop; ++ __ bind(loop); ++ __ ld_d(T2, sizes, 0); // Load frame size ++ __ ld_d(RA, pcs, 0); // save return address ++ __ addi_d(T2, T2, -2*wordSize); // we'll push pc and fp, by hand ++ __ enter(); ++ __ sub_d(SP, SP, T2); // Prolog! ++ // This value is corrected by layout_activation_impl ++ __ st_d(R0, FP, frame::interpreter_frame_last_sp_offset * wordSize); ++ __ st_d(sender_sp, FP, frame::interpreter_frame_sender_sp_offset * wordSize);// Make it walkable ++ __ move(sender_sp, SP); // pass to next frame ++ __ addi_d(count, count, -1); // decrement counter ++ __ addi_d(sizes, sizes, wordSize); // Bump array pointer (sizes) ++ __ addi_d(pcs, pcs, wordSize); // Bump array pointer (pcs) ++ __ bne(count, R0, loop); ++ ++ __ ld_d(RA, pcs, 0); ++ ++ // Re-push self-frame ++ // save old & set new FP ++ // save final return address ++ __ enter(); ++ ++ // Use FP because the frames look interpreted now ++ // Save "the_pc" since it cannot easily be retrieved using the last_java_SP after we aligned SP. ++ // Don't need the precise return PC here, just precise enough to point into this code blob. ++ Label L; ++ address the_pc = __ pc(); ++ __ bind(L); ++ __ set_last_Java_frame(NOREG, FP, L); ++ ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); // Fix stack alignment as required by ABI ++ ++ // Call C code. Need thread but NOT official VM entry ++ // crud. We cannot block on this call, no GC can happen. Call should ++ // restore return values to their stack-slots with the new SP. ++ __ move(A0, TREG); ++ __ li(A1, Deoptimization::Unpack_uncommon_trap); ++ __ call((address)Deoptimization::unpack_frames, relocInfo::runtime_call_type); ++ // Set an oopmap for the call site ++ oop_maps->add_gc_map(the_pc - start, new OopMap(framesize, 0)); ++ ++ __ reset_last_Java_frame(true); ++ ++ // Pop self-frame. ++ __ leave(); // Epilog! 
++ ++ // Jump to interpreter ++ __ jr(RA); ++ // ------------- ++ // make sure all code is generated ++ masm->flush(); ++ _uncommon_trap_blob = UncommonTrapBlob::create(&buffer, oop_maps, framesize / 2); ++} ++ ++#endif // COMPILER2 ++ ++//------------------------------generate_handler_blob------------------- ++// ++// Generate a special Compile2Runtime blob that saves all registers, and sets ++// up an OopMap and calls safepoint code to stop the compiled code for ++// a safepoint. ++// ++// This blob is jumped to (via a breakpoint and the signal handler) from a ++// safepoint in compiled code. ++ ++SafepointBlob* SharedRuntime::generate_handler_blob(address call_ptr, int poll_type) { ++ ++ // Account for thread arg in our frame ++ const int additional_words = 0; ++ int frame_size_in_words; ++ ++ assert (StubRoutines::forward_exception_entry() != nullptr, "must be generated before"); ++ ++ ResourceMark rm; ++ OopMapSet *oop_maps = new OopMapSet(); ++ OopMap* map; ++ ++ // allocate space for the code ++ // setup code generation tools ++ CodeBuffer buffer ("handler_blob", 2048, 512); ++ MacroAssembler* masm = new MacroAssembler( &buffer); ++ ++ address start = __ pc(); ++ bool cause_return = (poll_type == POLL_AT_RETURN); ++ RegisterSaver reg_save(poll_type == POLL_AT_VECTOR_LOOP /* save_vectors */); ++ ++ map = reg_save.save_live_registers(masm, additional_words, &frame_size_in_words); ++ ++ // The following is basically a call_VM. However, we need the precise ++ // address of the call in order to generate an oopmap. Hence, we do all the ++ // work outselvs. ++ ++ Label retaddr; ++ __ set_last_Java_frame(NOREG, NOREG, retaddr); ++ ++ if (!cause_return) { ++ // overwrite the return address pushed by save_live_registers ++ // Additionally, TSR is a callee-saved register so we can look at ++ // it later to determine if someone changed the return address for ++ // us! ++ __ ld_d(TSR, Address(TREG, JavaThread::saved_exception_pc_offset())); ++ __ st_d(TSR, SP, reg_save.ra_offset()); ++ } ++ ++ // Do the call ++ __ move(A0, TREG); ++ // TODO: confirm reloc ++ __ call(call_ptr, relocInfo::runtime_call_type); ++ __ bind(retaddr); ++ ++ // Set an oopmap for the call site. This oopmap will map all ++ // oop-registers and debug-info registers as callee-saved. This ++ // will allow deoptimization at this safepoint to find all possible ++ // debug-info recordings, as well as let GC find all oops. ++ oop_maps->add_gc_map(__ pc() - start, map); ++ ++ Label noException; ++ ++ // Clear last_Java_sp again ++ __ reset_last_Java_frame(false); ++ ++ __ ld_d(AT, Address(TREG, Thread::pending_exception_offset())); ++ __ beq(AT, R0, noException); ++ ++ // Exception pending ++ ++ reg_save.restore_live_registers(masm); ++ ++ __ jmp(StubRoutines::forward_exception_entry(), relocInfo::runtime_call_type); ++ ++ // No exception case ++ __ bind(noException); ++ ++ Label no_adjust, bail; ++ if (!cause_return) { ++ // If our stashed return pc was modified by the runtime we avoid touching it ++ __ ld_d(AT, SP, reg_save.ra_offset()); ++ __ bne(AT, TSR, no_adjust); ++ ++#ifdef ASSERT ++ // Verify the correct encoding of the poll we're about to skip. 
++ // See NativeInstruction::is_safepoint_poll() ++ __ ld_wu(AT, TSR, 0); ++ __ push(T5); ++ __ li(T5, 0xffc0001f); ++ __ andr(AT, AT, T5); ++ __ li(T5, 0x28800013); ++ __ xorr(AT, AT, T5); ++ __ pop(T5); ++ __ bne(AT, R0, bail); ++#endif ++ // Adjust return pc forward to step over the safepoint poll instruction ++ __ addi_d(RA, TSR, 4); // NativeInstruction::instruction_size=4 ++ __ st_d(RA, SP, reg_save.ra_offset()); ++ } ++ ++ __ bind(no_adjust); ++ // Normal exit, register restoring and exit ++ reg_save.restore_live_registers(masm); ++ __ jr(RA); ++ ++#ifdef ASSERT ++ __ bind(bail); ++ __ stop("Attempting to adjust pc to skip safepoint poll but the return point is not what we expected"); ++#endif ++ ++ // Make sure all code is generated ++ masm->flush(); ++ // Fill-out other meta info ++ return SafepointBlob::create(&buffer, oop_maps, frame_size_in_words); ++} ++ ++// ++// generate_resolve_blob - call resolution (static/virtual/opt-virtual/ic-miss ++// ++// Generate a stub that calls into vm to find out the proper destination ++// of a java call. All the argument registers are live at this point ++// but since this is generic code we don't know what they are and the caller ++// must do any gc of the args. ++// ++RuntimeStub* SharedRuntime::generate_resolve_blob(address destination, const char* name) { ++ assert (StubRoutines::forward_exception_entry() != nullptr, "must be generated before"); ++ ++ // allocate space for the code ++ ResourceMark rm; ++ ++ CodeBuffer buffer(name, 1000, 512); ++ MacroAssembler* masm = new MacroAssembler(&buffer); ++ ++ int frame_size_words; ++ RegisterSaver reg_save(false /* save_vectors */); ++ //we put the thread in A0 ++ ++ OopMapSet *oop_maps = new OopMapSet(); ++ OopMap* map = nullptr; ++ ++ address start = __ pc(); ++ map = reg_save.save_live_registers(masm, 0, &frame_size_words); ++ ++ int frame_complete = __ offset(); ++ ++ __ move(A0, TREG); ++ Label retaddr; ++ __ set_last_Java_frame(noreg, FP, retaddr); ++ // align the stack before invoke native ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); ++ ++ // TODO: confirm reloc ++ __ call(destination, relocInfo::runtime_call_type); ++ __ bind(retaddr); ++ ++ // Set an oopmap for the call site. ++ // We need this not only for callee-saved registers, but also for volatile ++ // registers that the compiler might be keeping live across a safepoint. ++ oop_maps->add_gc_map(__ pc() - start, map); ++ // V0 contains the address we are going to jump to assuming no exception got installed ++ __ ld_d(SP, Address(TREG, JavaThread::last_Java_sp_offset())); ++ // clear last_Java_sp ++ __ reset_last_Java_frame(true); ++ // check for pending exceptions ++ Label pending; ++ __ ld_d(AT, Address(TREG, Thread::pending_exception_offset())); ++ __ bne(AT, R0, pending); ++ // get the returned Method* ++ __ get_vm_result_2(Rmethod, TREG); ++ __ st_d(Rmethod, SP, reg_save.s3_offset()); ++ __ st_d(V0, SP, reg_save.t5_offset()); ++ reg_save.restore_live_registers(masm); ++ ++ // We are back the original state on entry and ready to go the callee method. 
++ __ jr(T5); ++ // Pending exception after the safepoint ++ ++ __ bind(pending); ++ ++ reg_save.restore_live_registers(masm); ++ ++ // exception pending => remove activation and forward to exception handler ++ ++ __ st_d(R0, Address(TREG, JavaThread::vm_result_offset())); ++ __ ld_d(V0, Address(TREG, Thread::pending_exception_offset())); ++ __ jmp(StubRoutines::forward_exception_entry(), relocInfo::runtime_call_type); ++ // ++ // make sure all code is generated ++ masm->flush(); ++ RuntimeStub* tmp= RuntimeStub::new_runtime_stub(name, &buffer, frame_complete, frame_size_words, oop_maps, true); ++ return tmp; ++} ++ ++#ifdef COMPILER2 ++//-------------- generate_exception_blob ----------- ++// creates exception blob at the end ++// Using exception blob, this code is jumped from a compiled method. ++// (see emit_exception_handler in loongarch.ad file) ++// ++// Given an exception pc at a call we call into the runtime for the ++// handler in this method. This handler might merely restore state ++// (i.e. callee save registers) unwind the frame and jump to the ++// exception handler for the nmethod if there is no Java level handler ++// for the nmethod. ++// ++// This code is entered with a jump, and left with a jump. ++// ++// Arguments: ++// A0: exception oop ++// A1: exception pc ++// ++// Results: ++// A0: exception oop ++// A1: exception pc in caller ++// destination: exception handler of caller ++// ++// Note: the exception pc MUST be at a call (precise debug information) ++// ++// [stubGenerator_loongarch_64.cpp] generate_forward_exception() ++// |- A0, A1 are created ++// |- T4 <= SharedRuntime::exception_handler_for_return_address ++// `- jr T4 ++// `- the caller's exception_handler ++// `- jr OptoRuntime::exception_blob ++// `- here ++// ++void OptoRuntime::generate_exception_blob() { ++ enum frame_layout { ++ fp_off, fp_off2, ++ return_off, return_off2, ++ framesize ++ }; ++ assert(framesize % 4 == 0, "sp not 16-byte aligned"); ++ ++ // Allocate space for the code ++ ResourceMark rm; ++ // Setup code generation tools ++ CodeBuffer buffer("exception_blob", 2048, 1024); ++ MacroAssembler* masm = new MacroAssembler(&buffer); ++ ++ address start = __ pc(); ++ ++ // Exception pc is 'return address' for stack walker ++ __ push2(A1 /* return address */, FP); ++ ++ // there are no callee save registers and we don't expect an ++ // arg reg save area ++#ifndef PRODUCT ++ assert(frame::arg_reg_save_area_bytes == 0, "not expecting frame reg save area"); ++#endif ++ // Store exception in Thread object. We cannot pass any arguments to the ++ // handle_exception call, since we do not want to make any assumption ++ // about the size of the frame where the exception happened in. ++ __ st_d(A0, Address(TREG, JavaThread::exception_oop_offset())); ++ __ st_d(A1, Address(TREG, JavaThread::exception_pc_offset())); ++ ++ // This call does all the hard work. It checks if an exception handler ++ // exists in the method. ++ // If so, it returns the handler address. ++ // If not, it prepares for stack-unwinding, restoring the callee-save ++ // registers of the frame being removed. 
++ // ++ // address OptoRuntime::handle_exception_C(JavaThread* thread) ++ // ++ Label L; ++ address the_pc = __ pc(); ++ __ bind(L); ++ __ set_last_Java_frame(TREG, SP, NOREG, L); ++ ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); // Fix stack alignment as required by ABI ++ ++ __ move(A0, TREG); ++ __ call(CAST_FROM_FN_PTR(address, OptoRuntime::handle_exception_C), ++ relocInfo::runtime_call_type); ++ ++ // handle_exception_C is a special VM call which does not require an explicit ++ // instruction sync afterwards. ++ ++ // Set an oopmap for the call site. This oopmap will only be used if we ++ // are unwinding the stack. Hence, all locations will be dead. ++ // Callee-saved registers will be the same as the frame above (i.e., ++ // handle_exception_stub), since they were restored when we got the ++ // exception. ++ ++ OopMapSet *oop_maps = new OopMapSet(); ++ ++ oop_maps->add_gc_map(the_pc - start, new OopMap(framesize, 0)); ++ ++ __ reset_last_Java_frame(TREG, false); ++ ++ // Restore callee-saved registers ++ ++ // FP is an implicitly saved callee saved register (i.e. the calling ++ // convention will save restore it in prolog/epilog) Other than that ++ // there are no callee save registers now that adapter frames are gone. ++ // and we dont' expect an arg reg save area ++ __ pop2(RA, FP); ++ ++ const Register exception_handler = T4; ++ ++ // We have a handler in A0, (could be deopt blob) ++ __ move(exception_handler, A0); ++ ++ // Get the exception ++ __ ld_d(A0, Address(TREG, JavaThread::exception_oop_offset())); ++ // Get the exception pc in case we are deoptimized ++ __ ld_d(A1, Address(TREG, JavaThread::exception_pc_offset())); ++#ifdef ASSERT ++ __ st_d(R0, Address(TREG, JavaThread::exception_handler_pc_offset())); ++ __ st_d(R0, Address(TREG, JavaThread::exception_pc_offset())); ++#endif ++ // Clear the exception oop so GC no longer processes it as a root. ++ __ st_d(R0, Address(TREG, JavaThread::exception_oop_offset())); ++ ++ // A0: exception oop ++ // A1: exception pc ++ __ jr(exception_handler); ++ ++ // make sure all code is generated ++ masm->flush(); ++ ++ // Set exception blob ++ _exception_blob = ExceptionBlob::create(&buffer, oop_maps, framesize >> 1); ++} ++#endif // COMPILER2 ++ ++extern "C" int SpinPause() {return 0;} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/smallRegisterMap_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/smallRegisterMap_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/smallRegisterMap_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/smallRegisterMap_loongarch.inline.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,92 @@ ++/* ++ * Copyright (c) 2019, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_SMALLREGISTERMAP_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_SMALLREGISTERMAP_LOONGARCH_INLINE_HPP ++ ++#include "runtime/frame.inline.hpp" ++#include "runtime/registerMap.hpp" ++ ++// Java frames don't have callee saved registers (except for FP), so we can use a smaller RegisterMap ++class SmallRegisterMap { ++public: ++ static constexpr SmallRegisterMap* instance = nullptr; ++private: ++ static void assert_is_fp(VMReg r) NOT_DEBUG_RETURN ++ DEBUG_ONLY({ assert (r == FP->as_VMReg() || r == FP->as_VMReg()->next(), "Reg: %s", r->name()); }) ++public: ++ // as_RegisterMap is used when we didn't want to templatize and abstract over RegisterMap type to support SmallRegisterMap ++ // Consider enhancing SmallRegisterMap to support those cases ++ const RegisterMap* as_RegisterMap() const { return nullptr; } ++ RegisterMap* as_RegisterMap() { return nullptr; } ++ ++ RegisterMap* copy_to_RegisterMap(RegisterMap* map, intptr_t* sp) const { ++ map->clear(); ++ map->set_include_argument_oops(this->include_argument_oops()); ++ frame::update_map_with_saved_link(map, (intptr_t**)sp - 2); ++ return map; ++ } ++ ++ SmallRegisterMap() {} ++ ++ SmallRegisterMap(const RegisterMap* map) { ++ #ifdef ASSERT ++ for(int i = 0; i < RegisterMap::reg_count; i++) { ++ VMReg r = VMRegImpl::as_VMReg(i); ++ if (map->location(r, (intptr_t*)nullptr) != nullptr) assert_is_fp(r); ++ } ++ #endif ++ } ++ ++ inline address location(VMReg reg, intptr_t* sp) const { ++ assert_is_fp(reg); ++ return (address)(sp - 2); ++ } ++ ++ inline void set_location(VMReg reg, address loc) { assert_is_fp(reg); } ++ ++ JavaThread* thread() const { ++ #ifndef ASSERT ++ guarantee (false, ""); ++ #endif ++ return nullptr; ++ } ++ ++ bool update_map() const { return false; } ++ bool walk_cont() const { return false; } ++ bool include_argument_oops() const { return false; } ++ void set_include_argument_oops(bool f) {} ++ bool in_cont() const { return false; } ++ stackChunkHandle stack_chunk() const { return stackChunkHandle(); } ++ ++#ifdef ASSERT ++ bool should_skip_missing() const { return false; } ++ VMReg find_register_spilled_here(void* p, intptr_t* sp) { return FP->as_VMReg(); } ++ void print() const { print_on(tty); } ++ void print_on(outputStream* st) const { st->print_cr("Small register map"); } ++#endif ++}; ++ ++#endif // CPU_LOONGARCH_SMALLREGISTERMAP_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/stackChunkFrameStream_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/stackChunkFrameStream_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/stackChunkFrameStream_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/stackChunkFrameStream_loongarch.inline.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,142 @@ ++/* ++ * Copyright (c) 2019, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_STACKCHUNKFRAMESTREAM_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_STACKCHUNKFRAMESTREAM_LOONGARCH_INLINE_HPP ++ ++#include "interpreter/oopMapCache.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/registerMap.hpp" ++ ++#ifdef ASSERT ++template ++inline bool StackChunkFrameStream::is_in_frame(void* p0) const { ++ assert(!is_done(), ""); ++ intptr_t* p = (intptr_t*)p0; ++ int argsize = is_compiled() ? (_cb->as_compiled_method()->method()->num_stack_arg_slots() * VMRegImpl::stack_slot_size) >> LogBytesPerWord : 0; ++ int frame_size = _cb->frame_size() + argsize; ++ return p == sp() - 2 || ((p - unextended_sp()) >= 0 && (p - unextended_sp()) < frame_size); ++} ++#endif ++ ++template ++inline frame StackChunkFrameStream::to_frame() const { ++ if (is_done()) { ++ return frame(_sp, _sp, nullptr, nullptr, nullptr, nullptr, true); ++ } else { ++ return frame(sp(), unextended_sp(), fp(), pc(), cb(), _oopmap, true); ++ } ++} ++ ++template ++inline address StackChunkFrameStream::get_pc() const { ++ assert(!is_done(), ""); ++ return *(address*)(_sp - 1); ++} ++ ++template ++inline intptr_t* StackChunkFrameStream::fp() const { ++ intptr_t* fp_addr = _sp - 2; ++ return (frame_kind == ChunkFrames::Mixed && is_interpreted()) ++ ? 
fp_addr + *fp_addr // derelativize ++ : *(intptr_t**)fp_addr; ++} ++ ++template ++inline intptr_t* StackChunkFrameStream::derelativize(int offset) const { ++ intptr_t* fp = this->fp(); ++ assert(fp != nullptr, ""); ++ return fp + fp[offset]; ++} ++ ++template ++inline intptr_t* StackChunkFrameStream::unextended_sp_for_interpreter_frame() const { ++ assert_is_interpreted_and_frame_type_mixed(); ++ return derelativize(frame::interpreter_frame_last_sp_offset); ++} ++ ++template ++inline void StackChunkFrameStream::next_for_interpreter_frame() { ++ assert_is_interpreted_and_frame_type_mixed(); ++ if (derelativize(frame::interpreter_frame_locals_offset) + 1 >= _end) { ++ _unextended_sp = _end; ++ _sp = _end; ++ } else { ++ intptr_t* fp = this->fp(); ++ _unextended_sp = fp + fp[frame::interpreter_frame_sender_sp_offset]; ++ _sp = fp + frame::sender_sp_offset; ++ } ++} ++ ++template ++inline int StackChunkFrameStream::interpreter_frame_size() const { ++ assert_is_interpreted_and_frame_type_mixed(); ++ ++ intptr_t* top = unextended_sp(); // later subtract argsize if callee is interpreted ++ intptr_t* bottom = derelativize(frame::interpreter_frame_locals_offset) + 1; // the sender's unextended sp: derelativize(frame::interpreter_frame_sender_sp_offset); ++ return (int)(bottom - top); ++} ++ ++template ++inline int StackChunkFrameStream::interpreter_frame_stack_argsize() const { ++ assert_is_interpreted_and_frame_type_mixed(); ++ int diff = (int)(derelativize(frame::interpreter_frame_locals_offset) - derelativize(frame::interpreter_frame_sender_sp_offset) + 1); ++ return diff; ++} ++ ++template ++inline int StackChunkFrameStream::interpreter_frame_num_oops() const { ++ assert_is_interpreted_and_frame_type_mixed(); ++ ResourceMark rm; ++ InterpreterOopMap mask; ++ frame f = to_frame(); ++ f.interpreted_frame_oop_map(&mask); ++ return mask.num_oops() ++ + 1 // for the mirror oop ++ + ((intptr_t*)f.interpreter_frame_monitor_begin() ++ - (intptr_t*)f.interpreter_frame_monitor_end()) / BasicObjectLock::size(); ++} ++ ++template<> ++template<> ++inline void StackChunkFrameStream::update_reg_map_pd(RegisterMap* map) { ++ if (map->update_map()) { ++ frame::update_map_with_saved_link(map, map->in_cont() ? (intptr_t**)2 : (intptr_t**)(_sp - 2)); ++ } ++} ++ ++template<> ++template<> ++inline void StackChunkFrameStream::update_reg_map_pd(RegisterMap* map) { ++ if (map->update_map()) { ++ frame::update_map_with_saved_link(map, map->in_cont() ? (intptr_t**)2 : (intptr_t**)(_sp - 2)); ++ } ++} ++ ++template ++template ++inline void StackChunkFrameStream::update_reg_map_pd(RegisterMapT* map) {} ++ ++#endif // CPU_LOONGARCH_STACKCHUNKFRAMESTREAM_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/stackChunkOop_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/stackChunkOop_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/stackChunkOop_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/stackChunkOop_loongarch.inline.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,43 @@ ++/* ++ * Copyright (c) 2019, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. 
++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_STACKCHUNKOOP_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_STACKCHUNKOOP_LOONGARCH_INLINE_HPP ++ ++#include "runtime/frame.inline.hpp" ++ ++inline void stackChunkOopDesc::relativize_frame_pd(frame& fr) const { ++ if (fr.is_interpreted_frame()) { ++ fr.set_offset_fp(relativize_address(fr.fp())); ++ } ++} ++ ++inline void stackChunkOopDesc::derelativize_frame_pd(frame& fr) const { ++ if (fr.is_interpreted_frame()) { ++ fr.set_fp(derelativize_address(fr.offset_fp())); ++ } ++} ++ ++#endif // CPU_LOONGARCH_STACKCHUNKOOP_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/stubGenerator_loongarch_64.cpp b/src/hotspot/cpu/loongarch/stubGenerator_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/stubGenerator_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/stubGenerator_loongarch_64.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,5721 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "compiler/oopMap.hpp" ++#include "gc/shared/barrierSet.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "interpreter/interpreter.hpp" ++#include "nativeInst_loongarch.hpp" ++#include "oops/instanceOop.hpp" ++#include "oops/method.hpp" ++#include "oops/objArrayKlass.hpp" ++#include "oops/oop.inline.hpp" ++#include "prims/methodHandles.hpp" ++#include "runtime/continuation.hpp" ++#include "runtime/continuationEntry.inline.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/handles.inline.hpp" ++#include "runtime/javaThread.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubCodeGenerator.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "utilities/copy.hpp" ++ ++#ifdef COMPILER2 ++#include "opto/runtime.hpp" ++#endif ++#if INCLUDE_JFR ++#include "jfr/support/jfrIntrinsics.hpp" ++#endif ++#if INCLUDE_ZGC ++#include "gc/z/zBarrierSetAssembler.hpp" ++#endif ++ ++#if INCLUDE_ZGC ++#include "gc/z/zThreadLocalData.hpp" ++#endif ++ ++// Declaration and definition of StubGenerator (no .hpp file). ++// For a more detailed description of the stub routine structure ++// see the comment in stubRoutines.hpp ++ ++#define __ _masm-> ++ ++#ifdef PRODUCT ++#define BLOCK_COMMENT(str) /* nothing */ ++#else ++#define BLOCK_COMMENT(str) __ block_comment(str) ++#endif ++ ++#define BIND(label) bind(label); BLOCK_COMMENT(#label ":") ++ ++// Stub Code definitions ++ ++class StubGenerator: public StubCodeGenerator { ++ private: ++ ++ // Call stubs are used to call Java from C ++ // ++ // Arguments: ++ // c_rarg0: call wrapper address address ++ // c_rarg1: result address ++ // c_rarg2: result type BasicType ++ // c_rarg3: method Method* ++ // c_rarg4: (interpreter) entry point address ++ // c_rarg5: parameters intptr_t* ++ // c_rarg6: parameter size (in words) int ++ // c_rarg7: thread Thread* ++ // ++ // we don't need to save all arguments, since both C and Java treat ++ // them as volatile registers. ++ // ++ // we only need to keep call wrapper address (c_rarg0) for Java frame, ++ // and restore the stub result via c_rarg1 and c_rarg2. ++ // ++ // we save RA as the return PC at the base of the frame and link FP ++ // below it as the frame pointer. ++ // ++ // we save S0-S8 and F24-F31 which are expected to be callee-saved. ++ // ++ // so the stub frame looks like this when we enter Java code ++ // ++ // [ return_from_Java ] <--- sp ++ // [ argument word n ] ++ // ... ++ // -23 [ argument word 1 ] ++ // -22 [ F31 ] <--- sp_after_call ++ // ... ++ // -15 [ F24 ] ++ // -14 [ S8 ] ++ // ... 
++ // -6 [ S0 ] ++ // -5 [ result_type ] <--- c_rarg2 ++ // -4 [ result ] <--- c_rarg1 ++ // -3 [ call wrapper ] <--- c_rarg0 ++ // -2 [ saved FP ] ++ // -1 [ saved RA ] ++ // 0 [ ] <--- fp ++ ++ // Call stub stack layout word offsets from fp ++ enum call_stub_layout { ++ sp_after_call_off = -22, ++ ++ F31_off = -22, ++ F30_off = -21, ++ F29_off = -20, ++ F28_off = -19, ++ F27_off = -18, ++ F26_off = -17, ++ F25_off = -16, ++ F24_off = -15, ++ ++ S8_off = -14, ++ S7_off = -13, ++ S6_off = -12, ++ S5_off = -11, ++ S4_off = -10, ++ S3_off = -9, ++ S2_off = -8, ++ S1_off = -7, ++ S0_off = -6, ++ ++ result_type_off = -5, ++ result_off = -4, ++ call_wrapper_off = -3, ++ FP_off = -2, ++ RA_off = -1, ++ }; ++ ++ address generate_call_stub(address& return_address) { ++ assert((int)frame::entry_frame_after_call_words == -(int)sp_after_call_off + 1 && ++ (int)frame::entry_frame_call_wrapper_offset == (int)call_wrapper_off, ++ "adjust this code"); ++ ++ StubCodeMark mark(this, "StubRoutines", "call_stub"); ++ ++ // stub code ++ ++ address start = __ pc(); ++ ++ // set up frame and move sp to end of save area ++ __ enter(); ++ __ addi_d(SP, FP, sp_after_call_off * wordSize); ++ ++ // save register parameters and Java temporary/global registers ++ __ st_d(A0, FP, call_wrapper_off * wordSize); ++ __ st_d(A1, FP, result_off * wordSize); ++ __ st_d(A2, FP, result_type_off * wordSize); ++ ++ __ st_d(S0, FP, S0_off * wordSize); ++ __ st_d(S1, FP, S1_off * wordSize); ++ __ st_d(S2, FP, S2_off * wordSize); ++ __ st_d(S3, FP, S3_off * wordSize); ++ __ st_d(S4, FP, S4_off * wordSize); ++ __ st_d(S5, FP, S5_off * wordSize); ++ __ st_d(S6, FP, S6_off * wordSize); ++ __ st_d(S7, FP, S7_off * wordSize); ++ __ st_d(S8, FP, S8_off * wordSize); ++ ++ __ fst_d(F24, FP, F24_off * wordSize); ++ __ fst_d(F25, FP, F25_off * wordSize); ++ __ fst_d(F26, FP, F26_off * wordSize); ++ __ fst_d(F27, FP, F27_off * wordSize); ++ __ fst_d(F28, FP, F28_off * wordSize); ++ __ fst_d(F29, FP, F29_off * wordSize); ++ __ fst_d(F30, FP, F30_off * wordSize); ++ __ fst_d(F31, FP, F31_off * wordSize); ++ ++ // install Java thread in global register now we have saved ++ // whatever value it held ++ __ move(TREG, A7); ++ ++ // init Method* ++ __ move(Rmethod, A3); ++ ++ // set up the heapbase register ++ __ reinit_heapbase(); ++ ++#ifdef ASSERT ++ // make sure we have no pending exceptions ++ { ++ Label L; ++ __ ld_d(AT, TREG, in_bytes(Thread::pending_exception_offset())); ++ __ beqz(AT, L); ++ __ stop("StubRoutines::call_stub: entered with pending exception"); ++ __ bind(L); ++ } ++#endif ++ ++ // pass parameters if any ++ // c_rarg5: parameter_pointer ++ // c_rarg6: parameter_size ++ Label parameters_done; ++ __ beqz(c_rarg6, parameters_done); ++ ++ __ slli_d(c_rarg6, c_rarg6, LogBytesPerWord); ++ __ sub_d(SP, SP, c_rarg6); ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); ++ ++ address loop = __ pc(); ++ __ ld_d(AT, c_rarg5, 0); ++ __ addi_d(c_rarg5, c_rarg5, wordSize); ++ __ addi_d(c_rarg6, c_rarg6, -wordSize); ++ __ stx_d(AT, SP, c_rarg6); ++ __ blt(R0, c_rarg6, loop); ++ ++ __ bind(parameters_done); ++ ++ // call Java entry -- passing methdoOop, and current sp ++ // Rmethod: Method* ++ // Rsender: sender sp ++ BLOCK_COMMENT("call Java function"); ++ __ move(Rsender, SP); ++ __ jalr(c_rarg4); ++ ++ // save current address for use by exception handling code ++ ++ return_address = __ pc(); ++ ++ // store result depending on type (everything that is not ++ // T_OBJECT, T_LONG, T_FLOAT or T_DOUBLE is treated as 
T_INT) ++ // n.b. this assumes Java returns an integral result in A0 ++ // and a floating result in FA0 ++ __ ld_d(c_rarg1, FP, result_off * wordSize); ++ __ ld_d(c_rarg2, FP, result_type_off * wordSize); ++ ++ Label is_long, is_float, is_double, exit; ++ ++ __ addi_d(AT, c_rarg2, (-1) * T_OBJECT); ++ __ beqz(AT, is_long); ++ __ addi_d(AT, c_rarg2, (-1) * T_LONG); ++ __ beqz(AT, is_long); ++ __ addi_d(AT, c_rarg2, (-1) * T_FLOAT); ++ __ beqz(AT, is_float); ++ __ addi_d(AT, c_rarg2, (-1) * T_DOUBLE); ++ __ beqz(AT, is_double); ++ ++ // handle T_INT case ++ __ st_w(A0, c_rarg1, 0); ++ ++ __ bind(exit); ++ ++ __ pop_cont_fastpath(TREG); ++ ++ // restore callee-save registers ++ ++ __ ld_d(S0, FP, S0_off * wordSize); ++ __ ld_d(S1, FP, S1_off * wordSize); ++ __ ld_d(S2, FP, S2_off * wordSize); ++ __ ld_d(S3, FP, S3_off * wordSize); ++ __ ld_d(S4, FP, S4_off * wordSize); ++ __ ld_d(S5, FP, S5_off * wordSize); ++ __ ld_d(S6, FP, S6_off * wordSize); ++ __ ld_d(S7, FP, S7_off * wordSize); ++ __ ld_d(S8, FP, S8_off * wordSize); ++ ++ __ fld_d(F24, FP, F24_off * wordSize); ++ __ fld_d(F25, FP, F25_off * wordSize); ++ __ fld_d(F26, FP, F26_off * wordSize); ++ __ fld_d(F27, FP, F27_off * wordSize); ++ __ fld_d(F28, FP, F28_off * wordSize); ++ __ fld_d(F29, FP, F29_off * wordSize); ++ __ fld_d(F30, FP, F30_off * wordSize); ++ __ fld_d(F31, FP, F31_off * wordSize); ++ ++ // leave frame and return to caller ++ __ leave(); ++ __ jr(RA); ++ ++ // handle return types different from T_INT ++ __ bind(is_long); ++ __ st_d(A0, c_rarg1, 0); ++ __ b(exit); ++ ++ __ bind(is_float); ++ __ fst_s(FA0, c_rarg1, 0); ++ __ b(exit); ++ ++ __ bind(is_double); ++ __ fst_d(FA0, c_rarg1, 0); ++ __ b(exit); ++ ++ return start; ++ } ++ ++ // Return point for a Java call if there's an exception thrown in ++ // Java code. The exception is caught and transformed into a ++ // pending exception stored in JavaThread that can be tested from ++ // within the VM. ++ // ++ // Note: Usually the parameters are removed by the callee. In case ++ // of an exception crossing an activation frame boundary, that is ++ // not the case if the callee is compiled code => need to setup the ++ // sp. ++ // ++ // V0: exception oop ++ ++ address generate_catch_exception() { ++ StubCodeMark mark(this, "StubRoutines", "catch_exception"); ++ address start = __ pc(); ++ ++#ifdef ASSERT ++ // verify that threads correspond ++ { Label L; ++ __ get_thread(T8); ++ __ beq(T8, TREG, L); ++ __ stop("StubRoutines::catch_exception: threads must correspond"); ++ __ bind(L); ++ } ++#endif ++ // set pending exception ++ __ verify_oop(V0); ++ __ st_d(V0, TREG, in_bytes(Thread::pending_exception_offset())); ++ __ li(AT, (long)__FILE__); ++ __ st_d(AT, TREG, in_bytes(Thread::exception_file_offset ())); ++ __ li(AT, (long)__LINE__); ++ __ st_d(AT, TREG, in_bytes(Thread::exception_line_offset ())); ++ ++ // complete return to VM ++ assert(StubRoutines::_call_stub_return_address != nullptr, "_call_stub_return_address must have been generated before"); ++ __ jmp(StubRoutines::_call_stub_return_address, relocInfo::none); ++ return start; ++ } ++ ++ // Continuation point for runtime calls returning with a pending ++ // exception. The pending exception check happened in the runtime ++ // or native call stub. The pending exception in Thread is ++ // converted into a Java-level exception. ++ // ++ // Contract with Java-level exception handlers: ++ // A0: exception ++ // A1: throwing pc ++ // ++ // NOTE: At entry of this stub, exception-pc must be in RA !! 
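++  //
++  // In rough pseudo-C (a sketch of what the generated code below does, not
++  // code that is emitted anywhere):
++  //
++  //   address handler = SharedRuntime::exception_handler_for_return_address(thread, RA);
++  //   A0 = thread->pending_exception();      // exception oop
++  //   thread->set_pending_exception(nullptr);
++  //   A1 = RA;                               // throwing pc
++  //   goto handler;                          // A0/A1 carry the contract above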
++ ++ address generate_forward_exception() { ++ StubCodeMark mark(this, "StubRoutines", "forward exception"); ++ address start = __ pc(); ++ ++ // Upon entry, RA points to the return address returning into ++ // Java (interpreted or compiled) code; i.e., the return address ++ // becomes the throwing pc. ++ // ++ // Arguments pushed before the runtime call are still on the stack ++ // but the exception handler will reset the stack pointer -> ++ // ignore them. A potential result in registers can be ignored as ++ // well. ++ ++#ifdef ASSERT ++ // make sure this code is only executed if there is a pending exception ++ { ++ Label L; ++ __ ld_d(AT, Address(TREG, Thread::pending_exception_offset())); ++ __ bnez(AT, L); ++ __ stop("StubRoutines::forward exception: no pending exception (1)"); ++ __ bind(L); ++ } ++#endif ++ ++ const Register exception_handler = T4; ++ ++ __ move(TSR, RA); // keep return address in callee-saved register ++ __ call_VM_leaf( ++ CAST_FROM_FN_PTR(address, SharedRuntime::exception_handler_for_return_address), ++ TREG, RA); ++ __ move(RA, TSR); // restore ++ ++ __ move(exception_handler, A0); ++ __ move(A1, RA); // save throwing pc ++ ++ __ ld_d(A0, TREG, in_bytes(Thread::pending_exception_offset())); ++ __ st_d(R0, TREG, in_bytes(Thread::pending_exception_offset())); ++ ++#ifdef ASSERT ++ // make sure exception is set ++ { ++ Label L; ++ __ bnez(A0, L); ++ __ stop("StubRoutines::forward exception: no pending exception (2)"); ++ __ bind(L); ++ } ++#endif ++ ++ // continue at exception handler (return address removed) ++ // A0: exception ++ // A1: throwing pc ++ __ verify_oop(A0); ++ __ jr(exception_handler); ++ ++ return start; ++ } ++ ++ // Non-destructive plausibility checks for oops ++ // ++ // Arguments: ++ // c_rarg0: error message ++ // c_rarg1: oop to verify ++ // ++ // Stack after saving c_rarg3: ++ // [tos + 0]: saved c_rarg3 ++ // [tos + 1]: saved c_rarg2 ++ // [tos + 2]: saved c_rarg1 ++ // [tos + 3]: saved c_rarg0 ++ // [tos + 4]: saved AT ++ // [tos + 5]: saved RA ++ address generate_verify_oop() { ++ ++ StubCodeMark mark(this, "StubRoutines", "verify_oop"); ++ address start = __ pc(); ++ ++ Label exit, error; ++ ++ const Register msg = c_rarg0; ++ const Register oop = c_rarg1; ++ ++ __ push(RegSet::of(c_rarg2, c_rarg3)); ++ ++ __ li(c_rarg2, (address) StubRoutines::verify_oop_count_addr()); ++ __ ld_d(c_rarg3, Address(c_rarg2)); ++ __ addi_d(c_rarg3, c_rarg3, 1); ++ __ st_d(c_rarg3, Address(c_rarg2)); ++ ++ // make sure object is 'reasonable' ++ __ beqz(oop, exit); // if obj is null it is OK ++ ++ BarrierSetAssembler* bs_asm = BarrierSet::barrier_set()->barrier_set_assembler(); ++ bs_asm->check_oop(_masm, oop, c_rarg2, c_rarg3, error); ++ ++ // return if everything seems ok ++ __ bind(exit); ++ __ pop(RegSet::of(c_rarg2, c_rarg3)); ++ __ jr(RA); ++ ++ // handle errors ++ __ bind(error); ++ __ pop(RegSet::of(c_rarg2, c_rarg3)); ++ // error message already in c_rarg0, pass it to debug ++ __ call(CAST_FROM_FN_PTR(address, MacroAssembler::debug), relocInfo::runtime_call_type); ++ __ brk(5); ++ ++ return start; ++ } ++ ++ // Generate indices for iota vector. 
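++  // The stub below only emits a table of constants: 32-byte blocks of lane
++  // indices 0, 1, 2, ... for byte (B), halfword (H), word (W) and doubleword (D)
++  // lanes, followed by the same ascending sequence as single- and
++  // double-precision floats (e.g. 0x3F800000 is 1.0f, 0x3FF0000000000000 is
++  // 1.0d). It is presumably consumed by vector code that needs an index vector.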
++ address generate_iota_indices(const char *stub_name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", stub_name); ++ address start = __ pc(); ++ // B ++ __ emit_data64(0x0706050403020100, relocInfo::none); ++ __ emit_data64(0x0F0E0D0C0B0A0908, relocInfo::none); ++ __ emit_data64(0x1716151413121110, relocInfo::none); ++ __ emit_data64(0x1F1E1D1C1B1A1918, relocInfo::none); ++ // H ++ __ emit_data64(0x0003000200010000, relocInfo::none); ++ __ emit_data64(0x0007000600050004, relocInfo::none); ++ __ emit_data64(0x000B000A00090008, relocInfo::none); ++ __ emit_data64(0x000F000E000D000C, relocInfo::none); ++ // W ++ __ emit_data64(0x0000000100000000, relocInfo::none); ++ __ emit_data64(0x0000000300000002, relocInfo::none); ++ __ emit_data64(0x0000000500000004, relocInfo::none); ++ __ emit_data64(0x0000000700000006, relocInfo::none); ++ // D ++ __ emit_data64(0x0000000000000000, relocInfo::none); ++ __ emit_data64(0x0000000000000001, relocInfo::none); ++ __ emit_data64(0x0000000000000002, relocInfo::none); ++ __ emit_data64(0x0000000000000003, relocInfo::none); ++ // S - FP ++ __ emit_data64(0x3F80000000000000, relocInfo::none); // 0.0f, 1.0f ++ __ emit_data64(0x4040000040000000, relocInfo::none); // 2.0f, 3.0f ++ __ emit_data64(0x40A0000040800000, relocInfo::none); // 4.0f, 5.0f ++ __ emit_data64(0x40E0000040C00000, relocInfo::none); // 6.0f, 7.0f ++ // D - FP ++ __ emit_data64(0x0000000000000000, relocInfo::none); // 0.0d ++ __ emit_data64(0x3FF0000000000000, relocInfo::none); // 1.0d ++ __ emit_data64(0x4000000000000000, relocInfo::none); // 2.0d ++ __ emit_data64(0x4008000000000000, relocInfo::none); // 3.0d ++ return start; ++ } ++ ++ // ++ // Generate stub for array fill. If "aligned" is true, the ++ // "to" address is assumed to be heapword aligned. 
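++  // Depending on UseLASX/UseLSX the stub simply delegates to the
++  // MacroAssembler helpers array_fill_lasx, array_fill_lsx or the plain
++  // array_fill (widest available vector unit wins); only the scalar variant
++  // receives the "aligned" hint.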
++ // ++ // Arguments for generated stub: ++ // to: A0 ++ // value: A1 ++ // count: A2 treated as signed ++ // ++ address generate_fill(BasicType t, bool aligned, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ const Register to = A0; // source array address ++ const Register value = A1; // value ++ const Register count = A2; // elements count ++ ++ if (UseLASX) { ++ __ array_fill_lasx(t, to, value, count); ++ } else if (UseLSX) { ++ __ array_fill_lsx(t, to, value, count); ++ } else { ++ __ array_fill(t, to, value, count, aligned); ++ } ++ ++ return start; ++ } ++ ++ // ++ // Generate overlap test for array copy stubs ++ // ++ // Input: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - element count ++ // ++ // Temp: ++ // AT - destination array address - source array address ++ // T4 - element count * element size ++ // ++ void array_overlap_test(address no_overlap_target, int log2_elem_size) { ++ __ slli_d(T4, A2, log2_elem_size); ++ __ sub_d(AT, A1, A0); ++ __ bgeu(AT, T4, no_overlap_target); ++ } ++ ++ // disjoint large copy ++ void generate_disjoint_large_copy(DecoratorSet decorators, BasicType type, Label &entry, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ Register gct1 = S0; ++ Register gct2 = S1; ++ Register gct3 = S2; ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ ++ { ++ UnsafeCopyMemoryMark ucmm(this, true, true); ++ Label loop, le32, le16, le8, lt8; ++ ++ __ bind(entry); ++#if INCLUDE_ZGC ++ if (UseZGC && ZGenerational && is_reference_type(type)) { ++ __ push(RegSet::of(gct1, gct2, gct3)); ++ } ++#endif ++ __ add_d(A3, A1, A2); ++ __ add_d(A2, A0, A2); ++ bs->copy_load_at(_masm, decorators, type, 8, A6, Address(A0, 0), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, A7, Address(A2, -8), gct1); ++ ++ __ andi(T1, A1, 7); ++ __ sub_d(T0, R0, T1); ++ __ addi_d(T0, T0, 8); ++ ++ __ add_d(A0, A0, T0); ++ __ add_d(A5, A1, T0); ++ ++ __ addi_d(A4, A2, -64); ++ __ bgeu(A0, A4, le32); ++ ++ __ bind(loop); ++ bs->copy_load_at(_masm, decorators, type, 8, T0, Address(A0, 0), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T1, Address(A0, 8), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T2, Address(A0, 16), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T3, Address(A0, 24), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T4, Address(A0, 32), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T5, Address(A0, 40), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T6, Address(A0, 48), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T8, Address(A0, 56), gct1); ++ __ addi_d(A0, A0, 64); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 0), T0, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 8), T1, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 16), T2, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 24), T3, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 32), T4, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 40), T5, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 48), T6, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 56), T8, gct1, gct2, gct3); ++ __ addi_d(A5, A5, 64); ++ __ bltu(A0, 
A4, loop); ++ ++ __ bind(le32); ++ __ addi_d(A4, A2, -32); ++ __ bgeu(A0, A4, le16); ++ bs->copy_load_at(_masm, decorators, type, 8, T0, Address(A0, 0), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T1, Address(A0, 8), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T2, Address(A0, 16), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T3, Address(A0, 24), gct1); ++ __ addi_d(A0, A0, 32); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 0), T0, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 8), T1, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 16), T2, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 24), T3, gct1, gct2, gct3); ++ __ addi_d(A5, A5, 32); ++ ++ __ bind(le16); ++ __ addi_d(A4, A2, -16); ++ __ bgeu(A0, A4, le8); ++ bs->copy_load_at(_masm, decorators, type, 8, T0, Address(A0, 0), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T1, Address(A0, 8), gct1); ++ __ addi_d(A0, A0, 16); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 0), T0, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 8), T1, gct1, gct2, gct3); ++ __ addi_d(A5, A5, 16); ++ ++ __ bind(le8); ++ __ addi_d(A4, A2, -8); ++ __ bgeu(A0, A4, lt8); ++ bs->copy_load_at(_masm, decorators, type, 8, T0, Address(A0, 0), gct1); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, 0), T0, gct1, gct2, gct3); ++ ++ __ bind(lt8); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 0), A6, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A3, -8), A7, gct1, gct2, gct3); ++#if INCLUDE_ZGC ++ if (UseZGC && ZGenerational && is_reference_type(type)) { ++ __ pop(RegSet::of(gct1, gct2, gct3)); ++ } ++#endif ++ } ++ ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // disjoint large copy lsx ++ void generate_disjoint_large_copy_lsx(DecoratorSet decorators, BasicType type, Label &entry, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ Register gct1 = T2; ++ Register gct2 = T3; ++ Register gct3 = T4; ++ Register gct4 = T5; ++ FloatRegister gcvt1 = FT8; ++ FloatRegister gcvt2 = FT9; ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ ++ { ++ UnsafeCopyMemoryMark ucmm(this, true, true); ++ Label loop, le64, le32, le16, lt16; ++ ++ __ bind(entry); ++ __ add_d(A3, A1, A2); ++ __ add_d(A2, A0, A2); ++ bs->copy_load_at(_masm, decorators, type, 16, F0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, F1, Address(A2, -16), gct1, gct2, gcvt1); ++ ++ __ andi(T1, A1, 15); ++ __ sub_d(T0, R0, T1); ++ __ addi_d(T0, T0, 16); ++ ++ __ add_d(A0, A0, T0); ++ __ add_d(A5, A1, T0); ++ ++ __ addi_d(A4, A2, -128); ++ __ bgeu(A0, A4, le64); ++ ++ __ bind(loop); ++ bs->copy_load_at(_masm, decorators, type, 16, FT0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT1, Address(A0, 16), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT2, Address(A0, 32), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT3, Address(A0, 48), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT4, Address(A0, 64), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT5, Address(A0, 80), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT6, Address(A0, 96), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, 
type, 16, FT7, Address(A0, 112), gct1, gct2, gcvt1); ++ __ addi_d(A0, A0, 128); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 0), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 16), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 32), FT2, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 48), FT3, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 64), FT4, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 80), FT5, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 96), FT6, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 112), FT7, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, 128); ++ __ bltu(A0, A4, loop); ++ ++ __ bind(le64); ++ __ addi_d(A4, A2, -64); ++ __ bgeu(A0, A4, le32); ++ bs->copy_load_at(_masm, decorators, type, 16, FT0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT1, Address(A0, 16), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT2, Address(A0, 32), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT3, Address(A0, 48), gct1, gct2, gcvt1); ++ __ addi_d(A0, A0, 64); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 0), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 16), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 32), FT2, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 48), FT3, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, 64); ++ ++ __ bind(le32); ++ __ addi_d(A4, A2, -32); ++ __ bgeu(A0, A4, le16); ++ bs->copy_load_at(_masm, decorators, type, 16, FT0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT1, Address(A0, 16), gct1, gct2, gcvt1); ++ __ addi_d(A0, A0, 32); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 0), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 16), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, 32); ++ ++ __ bind(le16); ++ __ addi_d(A4, A2, -16); ++ __ bgeu(A0, A4, lt16); ++ bs->copy_load_at(_masm, decorators, type, 16, FT0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, 0), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ ++ __ bind(lt16); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A1, 0), F0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A3, -16), F1, gct1, gct2, gct3, gct4, gcvt1, gcvt2, false /* need_save_restore */); ++ } ++ ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // disjoint large copy lasx ++ void generate_disjoint_large_copy_lasx(DecoratorSet decorators, BasicType type, Label &entry, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ Register gct1 = T2; ++ Register gct2 = T3; ++ Register gct3 = T4; ++ Register gct4 = T5; ++ FloatRegister gcvt1 = FT8; ++ FloatRegister gcvt2 = FT9; ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ ++ { ++ 
UnsafeCopyMemoryMark ucmm(this, true, true); ++ Label loop, le128, le64, le32, lt32; ++ ++ __ bind(entry); ++ __ add_d(A3, A1, A2); ++ __ add_d(A2, A0, A2); ++ bs->copy_load_at(_masm, decorators, type, 32, F0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, F1, Address(A2, -32), gct1, gct2, gcvt1); ++ ++ __ andi(T1, A1, 31); ++ __ sub_d(T0, R0, T1); ++ __ addi_d(T0, T0, 32); ++ ++ __ add_d(A0, A0, T0); ++ __ add_d(A5, A1, T0); ++ ++ __ addi_d(A4, A2, -256); ++ __ bgeu(A0, A4, le128); ++ ++ __ bind(loop); ++ bs->copy_load_at(_masm, decorators, type, 32, FT0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT1, Address(A0, 32), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT2, Address(A0, 64), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT3, Address(A0, 96), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT4, Address(A0, 128), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT5, Address(A0, 160), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT6, Address(A0, 192), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT7, Address(A0, 224), gct1, gct2, gcvt1); ++ __ addi_d(A0, A0, 256); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 0), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 32), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 64), FT2, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 96), FT3, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 128), FT4, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 160), FT5, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 192), FT6, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 224), FT7, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, 256); ++ __ bltu(A0, A4, loop); ++ ++ __ bind(le128); ++ __ addi_d(A4, A2, -128); ++ __ bgeu(A0, A4, le64); ++ bs->copy_load_at(_masm, decorators, type, 32, FT0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT1, Address(A0, 32), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT2, Address(A0, 64), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT3, Address(A0, 96), gct1, gct2, gcvt1); ++ __ addi_d(A0, A0, 128); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 0), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 32), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 64), FT2, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 96), FT3, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, 128); ++ ++ __ bind(le64); ++ __ addi_d(A4, A2, -64); ++ __ bgeu(A0, A4, le32); ++ bs->copy_load_at(_masm, decorators, type, 32, FT0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT1, Address(A0, 32), gct1, gct2, gcvt1); ++ __ addi_d(A0, A0, 64); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 0), FT0, gct1, gct2, gct3, 
gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 32), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, 64); ++ ++ __ bind(le32); ++ __ addi_d(A4, A2, -32); ++ __ bgeu(A0, A4, lt32); ++ bs->copy_load_at(_masm, decorators, type, 32, FT0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, 0), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ ++ __ bind(lt32); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A1, 0), F0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A3, -32), F1, gct1, gct2, gct3, gct4, gcvt1, gcvt2, false /* need_save_restore */); ++ } ++ ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // conjoint large copy ++ void generate_conjoint_large_copy(DecoratorSet decorators, BasicType type, Label &entry, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ Register gct1 = S0; ++ Register gct2 = S1; ++ Register gct3 = S2; ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ ++ { ++ UnsafeCopyMemoryMark ucmm(this, true, true); ++ Label loop, le32, le16, le8, lt8; ++ ++ __ bind(entry); ++#if INCLUDE_ZGC ++ if (UseZGC && ZGenerational && is_reference_type(type)) { ++ __ push(RegSet::of(gct1, gct2, gct3)); ++ } ++#endif ++ __ add_d(A3, A1, A2); ++ __ add_d(A2, A0, A2); ++ bs->copy_load_at(_masm, decorators, type, 8, A6, Address(A0, 0), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, A7, Address(A2, -8), gct1); ++ ++ __ andi(T1, A3, 7); ++ __ sub_d(A2, A2, T1); ++ __ sub_d(A5, A3, T1); ++ ++ __ addi_d(A4, A0, 64); ++ __ bgeu(A4, A2, le32); ++ ++ __ bind(loop); ++ bs->copy_load_at(_masm, decorators, type, 8, T0, Address(A2, -8), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T1, Address(A2, -16), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T2, Address(A2, -24), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T3, Address(A2, -32), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T4, Address(A2, -40), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T5, Address(A2, -48), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T6, Address(A2, -56), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T8, Address(A2, -64), gct1); ++ __ addi_d(A2, A2, -64); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -8), T0, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -16), T1, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -24), T2, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -32), T3, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -40), T4, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -48), T5, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -56), T6, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -64), T8, gct1, gct2, gct3); ++ __ addi_d(A5, A5, -64); ++ __ bltu(A4, A2, loop); ++ ++ __ bind(le32); ++ __ addi_d(A4, A0, 32); ++ __ bgeu(A4, A2, le16); ++ bs->copy_load_at(_masm, decorators, type, 8, T0, Address(A2, -8), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T1, Address(A2, -16), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T2, Address(A2, -24), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T3, Address(A2, -32), gct1); ++ __ addi_d(A2, A2, -32); ++ 
bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -8), T0, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -16), T1, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -24), T2, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -32), T3, gct1, gct2, gct3); ++ __ addi_d(A5, A5, -32); ++ ++ __ bind(le16); ++ __ addi_d(A4, A0, 16); ++ __ bgeu(A4, A2, le8); ++ bs->copy_load_at(_masm, decorators, type, 8, T0, Address(A2, -8), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, T1, Address(A2, -16), gct1); ++ __ addi_d(A2, A2, -16); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -8), T0, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -16), T1, gct1, gct2, gct3); ++ __ addi_d(A5, A5, -16); ++ ++ __ bind(le8); ++ __ addi_d(A4, A0, 8); ++ __ bgeu(A4, A2, lt8); ++ bs->copy_load_at(_masm, decorators, type, 8, T0, Address(A2, -8), gct1); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A5, -8), T0, gct1, gct2, gct3); ++ ++ __ bind(lt8); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 0), A6, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A3, -8), A7, gct1, gct2, gct3); ++#if INCLUDE_ZGC ++ if (UseZGC && ZGenerational && is_reference_type(type)) { ++ __ pop(RegSet::of(gct1, gct2, gct3)); ++ } ++#endif ++ } ++ ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // conjoint large copy lsx ++ void generate_conjoint_large_copy_lsx(DecoratorSet decorators, BasicType type, Label &entry, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ Register gct1 = T2; ++ Register gct2 = T3; ++ Register gct3 = T4; ++ Register gct4 = T5; ++ FloatRegister gcvt1 = FT8; ++ FloatRegister gcvt2 = FT9; ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ ++ { ++ UnsafeCopyMemoryMark ucmm(this, true, true); ++ Label loop, le64, le32, le16, lt16; ++ ++ __ bind(entry); ++ __ add_d(A3, A1, A2); ++ __ add_d(A2, A0, A2); ++ bs->copy_load_at(_masm, decorators, type, 16, F0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, F1, Address(A2, -16), gct1, gct2, gcvt1); ++ ++ __ andi(T1, A3, 15); ++ __ sub_d(A2, A2, T1); ++ __ sub_d(A5, A3, T1); ++ ++ __ addi_d(A4, A0, 128); ++ __ bgeu(A4, A2, le64); ++ ++ __ bind(loop); ++ bs->copy_load_at(_masm, decorators, type, 16, FT0, Address(A2, -16), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT1, Address(A2, -32), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT2, Address(A2, -48), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT3, Address(A2, -64), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT4, Address(A2, -80), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT5, Address(A2, -96), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT6, Address(A2, -112), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT7, Address(A2, -128), gct1, gct2, gcvt1); ++ __ addi_d(A2, A2, -128); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -16), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -32), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -48), FT2, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, 
decorators, type, 16, Address(A5, -64), FT3, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -80), FT4, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -96), FT5, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -112), FT6, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -128), FT7, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, -128); ++ __ bltu(A4, A2, loop); ++ ++ __ bind(le64); ++ __ addi_d(A4, A0, 64); ++ __ bgeu(A4, A2, le32); ++ bs->copy_load_at(_masm, decorators, type, 16, FT0, Address(A2, -16), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT1, Address(A2, -32), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT2, Address(A2, -48), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT3, Address(A2, -64), gct1, gct2, gcvt1); ++ __ addi_d(A2, A2, -64); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -16), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -32), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -48), FT2, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -64), FT3, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, -64); ++ ++ __ bind(le32); ++ __ addi_d(A4, A0, 32); ++ __ bgeu(A4, A2, le16); ++ bs->copy_load_at(_masm, decorators, type, 16, FT0, Address(A2, -16), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 16, FT1, Address(A2, -32), gct1, gct2, gcvt1); ++ __ addi_d(A2, A2, -32); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -16), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -32), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, -32); ++ ++ __ bind(le16); ++ __ addi_d(A4, A0, 16); ++ __ bgeu(A4, A2, lt16); ++ bs->copy_load_at(_masm, decorators, type, 16, FT0, Address(A2, -16), gct1, gct2, gcvt1); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A5, -16), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ ++ __ bind(lt16); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A1, 0), F0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A3, -16), F1, gct1, gct2, gct3, gct4, gcvt1, gcvt2, false /* need_save_restore */); ++ } ++ ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // conjoint large copy lasx ++ void generate_conjoint_large_copy_lasx(DecoratorSet decorators, BasicType type, Label &entry, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ Register gct1 = T2; ++ Register gct2 = T3; ++ Register gct3 = T4; ++ Register gct4 = T5; ++ FloatRegister gcvt1 = FT8; ++ FloatRegister gcvt2 = FT9; ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ ++ { ++ UnsafeCopyMemoryMark ucmm(this, true, true); ++ Label loop, le128, le64, le32, lt32; ++ ++ __ bind(entry); ++ __ add_d(A3, A1, A2); ++ __ add_d(A2, A0, A2); ++ bs->copy_load_at(_masm, decorators, type, 32, F0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, F1, Address(A2, -32), gct1, gct2, gcvt1); ++ ++ __ andi(T1, A3, 31); ++ __ sub_d(A2, A2, T1); ++ __ sub_d(A5, A3, 
T1); ++ ++ __ addi_d(A4, A0, 256); ++ __ bgeu(A4, A2, le128); ++ ++ __ bind(loop); ++ bs->copy_load_at(_masm, decorators, type, 32, FT0, Address(A2, -32), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT1, Address(A2, -64), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT2, Address(A2, -96), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT3, Address(A2, -128), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT4, Address(A2, -160), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT5, Address(A2, -192), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT6, Address(A2, -224), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT7, Address(A2, -256), gct1, gct2, gcvt1); ++ __ addi_d(A2, A2, -256); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -32), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -64), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -96), FT2, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -128), FT3, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -160), FT4, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -192), FT5, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -224), FT6, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -256), FT7, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, -256); ++ __ bltu(A4, A2, loop); ++ ++ __ bind(le128); ++ __ addi_d(A4, A0, 128); ++ __ bgeu(A4, A2, le64); ++ bs->copy_load_at(_masm, decorators, type, 32, FT0, Address(A2, -32), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT1, Address(A2, -64), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT2, Address(A2, -96), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT3, Address(A2, -128), gct1, gct2, gcvt1); ++ __ addi_d(A2, A2, -128); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -32), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -64), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -96), FT2, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -128), FT3, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, -128); ++ ++ __ bind(le64); ++ __ addi_d(A4, A0, 64); ++ __ bgeu(A4, A2, le32); ++ bs->copy_load_at(_masm, decorators, type, 32, FT0, Address(A2, -32), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 32, FT1, Address(A2, -64), gct1, gct2, gcvt1); ++ __ addi_d(A2, A2, -64); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -32), FT0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -64), FT1, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ addi_d(A5, A5, -64); ++ ++ __ bind(le32); ++ __ addi_d(A4, A0, 32); ++ __ bgeu(A4, A2, lt32); ++ bs->copy_load_at(_masm, decorators, type, 32, FT0, Address(A2, -32), gct1, gct2, gcvt1); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A5, -32), FT0, gct1, gct2, gct3, gct4, 
gcvt1, gcvt2); ++ ++ __ bind(lt32); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A1, 0), F0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A3, -32), F1, gct1, gct2, gct3, gct4, gcvt1, gcvt2, false /* need_save_restore */); ++ } ++ ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // Byte small copy: less than { int:9, lsx:17, lasx:33 } elements. ++ void generate_byte_small_copy(Label &entry, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ Label L; ++ __ bind(entry); ++ __ lipc(AT, L); ++ __ slli_d(A2, A2, 5); ++ __ add_d(AT, AT, A2); ++ __ jr(AT); ++ ++ __ bind(L); ++ // 0: ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 1: ++ __ ld_b(AT, A0, 0); ++ __ st_b(AT, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 2: ++ __ ld_h(AT, A0, 0); ++ __ st_h(AT, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 3: ++ __ ld_h(AT, A0, 0); ++ __ ld_b(A2, A0, 2); ++ __ st_h(AT, A1, 0); ++ __ st_b(A2, A1, 2); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 4: ++ __ ld_w(AT, A0, 0); ++ __ st_w(AT, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 5: ++ __ ld_w(AT, A0, 0); ++ __ ld_b(A2, A0, 4); ++ __ st_w(AT, A1, 0); ++ __ st_b(A2, A1, 4); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 6: ++ __ ld_w(AT, A0, 0); ++ __ ld_h(A2, A0, 4); ++ __ st_w(AT, A1, 0); ++ __ st_h(A2, A1, 4); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 7: ++ __ ld_w(AT, A0, 0); ++ __ ld_w(A2, A0, 3); ++ __ st_w(AT, A1, 0); ++ __ st_w(A2, A1, 3); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 8: ++ __ ld_d(AT, A0, 0); ++ __ st_d(AT, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ ++ if (!UseLSX) ++ return; ++ ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 9: ++ __ ld_d(AT, A0, 0); ++ __ ld_b(A2, A0, 8); ++ __ st_d(AT, A1, 0); ++ __ st_b(A2, A1, 8); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 10: ++ __ ld_d(AT, A0, 0); ++ __ ld_h(A2, A0, 8); ++ __ st_d(AT, A1, 0); ++ __ st_h(A2, A1, 8); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 11: ++ __ ld_d(AT, A0, 0); ++ __ ld_w(A2, A0, 7); ++ __ st_d(AT, A1, 0); ++ __ st_w(A2, A1, 7); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 12: ++ __ ld_d(AT, A0, 0); ++ __ ld_w(A2, A0, 8); ++ __ st_d(AT, A1, 0); ++ __ st_w(A2, A1, 8); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 13: ++ __ ld_d(AT, A0, 0); ++ __ ld_d(A2, A0, 5); ++ __ st_d(AT, A1, 0); ++ __ st_d(A2, A1, 5); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 14: ++ __ ld_d(AT, A0, 0); ++ __ ld_d(A2, A0, 6); ++ __ st_d(AT, A1, 0); ++ __ st_d(A2, A1, 6); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 15: ++ __ ld_d(AT, A0, 0); ++ __ ld_d(A2, A0, 7); ++ __ st_d(AT, A1, 0); ++ __ st_d(A2, A1, 7); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 16: ++ __ vld(F0, A0, 0); ++ __ vst(F0, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ if (!UseLASX) ++ return; ++ ++ // 17: ++ __ vld(F0, A0, 0); ++ __ ld_b(AT, A0, 16); ++ __ vst(F0, A1, 0); ++ __ st_b(AT, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ 
nop(); ++ __ nop(); ++ ++ // 18: ++ __ vld(F0, A0, 0); ++ __ ld_h(AT, A0, 16); ++ __ vst(F0, A1, 0); ++ __ st_h(AT, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 19: ++ __ vld(F0, A0, 0); ++ __ ld_w(AT, A0, 15); ++ __ vst(F0, A1, 0); ++ __ st_w(AT, A1, 15); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 20: ++ __ vld(F0, A0, 0); ++ __ ld_w(AT, A0, 16); ++ __ vst(F0, A1, 0); ++ __ st_w(AT, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 21: ++ __ vld(F0, A0, 0); ++ __ ld_d(AT, A0, 13); ++ __ vst(F0, A1, 0); ++ __ st_d(AT, A1, 13); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 22: ++ __ vld(F0, A0, 0); ++ __ ld_d(AT, A0, 14); ++ __ vst(F0, A1, 0); ++ __ st_d(AT, A1, 14); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 23: ++ __ vld(F0, A0, 0); ++ __ ld_d(AT, A0, 15); ++ __ vst(F0, A1, 0); ++ __ st_d(AT, A1, 15); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 24: ++ __ vld(F0, A0, 0); ++ __ ld_d(AT, A0, 16); ++ __ vst(F0, A1, 0); ++ __ st_d(AT, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 25: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 9); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 9); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 26: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 10); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 10); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 27: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 11); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 11); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 28: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 12); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 12); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 29: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 13); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 13); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 30: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 14); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 14); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 31: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 15); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 15); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 32: ++ __ xvld(F0, A0, 0); ++ __ xvst(F0, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // Arguments: ++ // aligned - true => Input and output aligned on a HeapWord == 8-byte boundary ++ // ignored ++ // name - stub name string ++ // ++ // Inputs: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - element count, treated as ssize_t, can be zero ++ // ++ // If 'from' and/or 'to' are aligned on 4-, 2-, or 1-byte boundaries, ++ // we let the hardware handle it. The one to eight bytes within words, ++ // dwords or qwords that span cache line boundaries will still be loaded ++ // and stored atomically. ++ // ++ // Side Effects: ++ // disjoint_byte_copy_entry is set to the no-overlap entry point ++ // used by generate_conjoint_byte_copy(). 
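//
// In outline, this entry only selects between the two paths (limits as in
// the byte small-copy comment above):
//
//   sltui  T0, A2, limit      // T0 = (element count < limit)
//   bnez   T0, small          // small counts: per-count jump table
//   b      large              // everything else: shared bulk copy loop
//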
++ // ++ address generate_disjoint_byte_copy(bool aligned, Label &small, Label &large, ++ const char * name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ if (UseLASX) ++ __ sltui(T0, A2, 33); ++ else if (UseLSX) ++ __ sltui(T0, A2, 17); ++ else ++ __ sltui(T0, A2, 9); ++ __ bnez(T0, small); ++ ++ __ b(large); ++ ++ return start; ++ } ++ ++ // Arguments: ++ // aligned - true => Input and output aligned on a HeapWord == 8-byte boundary ++ // ignored ++ // name - stub name string ++ // ++ // Inputs: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - element count, treated as ssize_t, can be zero ++ // ++ // If 'from' and/or 'to' are aligned on 4-, 2-, or 1-byte boundaries, ++ // we let the hardware handle it. The one to eight bytes within words, ++ // dwords or qwords that span cache line boundaries will still be loaded ++ // and stored atomically. ++ // ++ address generate_conjoint_byte_copy(bool aligned, Label &small, Label &large, ++ const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ array_overlap_test(StubRoutines::jbyte_disjoint_arraycopy(), 0); ++ ++ if (UseLASX) ++ __ sltui(T0, A2, 33); ++ else if (UseLSX) ++ __ sltui(T0, A2, 17); ++ else ++ __ sltui(T0, A2, 9); ++ __ bnez(T0, small); ++ ++ __ b(large); ++ ++ return start; ++ } ++ ++ // Short small copy: less than { int:9, lsx:9, lasx:17 } elements. ++ void generate_short_small_copy(Label &entry, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ Label L; ++ __ bind(entry); ++ __ lipc(AT, L); ++ __ slli_d(A2, A2, 5); ++ __ add_d(AT, AT, A2); ++ __ jr(AT); ++ ++ __ bind(L); ++ // 0: ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 1: ++ __ ld_h(AT, A0, 0); ++ __ st_h(AT, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 2: ++ __ ld_w(AT, A0, 0); ++ __ st_w(AT, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 3: ++ __ ld_w(AT, A0, 0); ++ __ ld_h(A2, A0, 4); ++ __ st_w(AT, A1, 0); ++ __ st_h(A2, A1, 4); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 4: ++ __ ld_d(AT, A0, 0); ++ __ st_d(AT, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 5: ++ __ ld_d(AT, A0, 0); ++ __ ld_h(A2, A0, 8); ++ __ st_d(AT, A1, 0); ++ __ st_h(A2, A1, 8); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 6: ++ __ ld_d(AT, A0, 0); ++ __ ld_w(A2, A0, 8); ++ __ st_d(AT, A1, 0); ++ __ st_w(A2, A1, 8); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 7: ++ __ ld_d(AT, A0, 0); ++ __ ld_d(A2, A0, 6); ++ __ st_d(AT, A1, 0); ++ __ st_d(A2, A1, 6); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 8: ++ if (UseLSX) { ++ __ vld(F0, A0, 0); ++ __ vst(F0, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ } else { ++ __ ld_d(AT, A0, 0); ++ __ ld_d(A2, A0, 8); ++ __ st_d(AT, A1, 0); ++ __ st_d(A2, A1, 8); ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ if (!UseLASX) ++ return; ++ ++ __ nop(); ++ __ nop(); ++ ++ // 9: ++ __ vld(F0, A0, 0); ++ __ ld_h(AT, A0, 16); ++ __ vst(F0, A1, 0); ++ __ st_h(AT, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 10: ++ __ vld(F0, A0, 0); ++ __ 
ld_w(AT, A0, 16); ++ __ vst(F0, A1, 0); ++ __ st_w(AT, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 11: ++ __ vld(F0, A0, 0); ++ __ ld_d(AT, A0, 14); ++ __ vst(F0, A1, 0); ++ __ st_d(AT, A1, 14); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 12: ++ __ vld(F0, A0, 0); ++ __ ld_d(AT, A0, 16); ++ __ vst(F0, A1, 0); ++ __ st_d(AT, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 13: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 10); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 10); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 14: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 12); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 12); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 15: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 14); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 14); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 16: ++ __ xvld(F0, A0, 0); ++ __ xvst(F0, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // Arguments: ++ // aligned - true => Input and output aligned on a HeapWord == 8-byte boundary ++ // ignored ++ // name - stub name string ++ // ++ // Inputs: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - element count, treated as ssize_t, can be zero ++ // ++ // If 'from' and/or 'to' are aligned on 4-, 2-, or 1-byte boundaries, ++ // we let the hardware handle it. The one to eight bytes within words, ++ // dwords or qwords that span cache line boundaries will still be loaded ++ // and stored atomically. ++ // ++ // Side Effects: ++ // disjoint_short_copy_entry is set to the no-overlap entry point ++ // used by generate_conjoint_short_copy(). ++ // ++ address generate_disjoint_short_copy(bool aligned, Label &small, Label &large, ++ const char * name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ if (UseLASX) ++ __ sltui(T0, A2, 17); ++ else ++ __ sltui(T0, A2, 9); ++ __ bnez(T0, small); ++ ++ __ slli_d(A2, A2, 1); ++ ++ __ b(large); ++ ++ return start; ++ } ++ ++ // Arguments: ++ // aligned - true => Input and output aligned on a HeapWord == 8-byte boundary ++ // ignored ++ // name - stub name string ++ // ++ // Inputs: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - element count, treated as ssize_t, can be zero ++ // ++ // If 'from' and/or 'to' are aligned on 4- or 2-byte boundaries, we ++ // let the hardware handle it. The two or four words within dwords ++ // or qwords that span cache line boundaries will still be loaded ++ // and stored atomically. ++ // ++ address generate_conjoint_short_copy(bool aligned, Label &small, Label &large, ++ const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ array_overlap_test(StubRoutines::jshort_disjoint_arraycopy(), 1); ++ ++ if (UseLASX) ++ __ sltui(T0, A2, 17); ++ else ++ __ sltui(T0, A2, 9); ++ __ bnez(T0, small); ++ ++ __ slli_d(A2, A2, 1); ++ ++ __ b(large); ++ ++ return start; ++ } ++ ++ // Int small copy: less than { int:7, lsx:7, lasx:9 } elements. 
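// Like the byte and short variants above, the small-copy stubs are computed
// jump tables: every count-specific case is padded with nops to exactly
// eight instructions (32 bytes), so the shared entry can branch straight to
// "case base + count * 32" via the lipc / slli_d(A2, A2, 5) / jr sequence.
// A rough C++ analogue of that dispatch follows; it is illustrative only and
// not part of the stub generator, with a copy_n template standing in for the
// hand-written 32-byte cases.

#include <cstddef>
#include <cstdint>
#include <cstring>

using SmallCopyFn = void (*)(const uint32_t*, uint32_t*);

template <size_t N>
static void copy_n(const uint32_t* src, uint32_t* dst) {
  std::memcpy(dst, src, N * sizeof(uint32_t));  // stands in for one padded case
}

static void int_small_copy_model(const uint32_t* src, uint32_t* dst, size_t count) {
  // Indexing by count replaces "lipc + (count << 5) + jr" in the stub.
  static const SmallCopyFn table[] = {
    copy_n<0>, copy_n<1>, copy_n<2>, copy_n<3>, copy_n<4>,
    copy_n<5>, copy_n<6>, copy_n<7>, copy_n<8>,
  };
  table[count](src, dst);  // caller guarantees count is below the small limit
}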
++ void generate_int_small_copy(Label &entry, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ Label L; ++ __ bind(entry); ++ __ lipc(AT, L); ++ __ slli_d(A2, A2, 5); ++ __ add_d(AT, AT, A2); ++ __ jr(AT); ++ ++ __ bind(L); ++ // 0: ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 1: ++ __ ld_w(AT, A0, 0); ++ __ st_w(AT, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 2: ++ __ ld_d(AT, A0, 0); ++ __ st_d(AT, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ __ nop(); ++ ++ // 3: ++ __ ld_d(AT, A0, 0); ++ __ ld_w(A2, A0, 8); ++ __ st_d(AT, A1, 0); ++ __ st_w(A2, A1, 8); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 4: ++ if (UseLSX) { ++ __ vld(F0, A0, 0); ++ __ vst(F0, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ } else { ++ __ ld_d(AT, A0, 0); ++ __ ld_d(A2, A0, 8); ++ __ st_d(AT, A1, 0); ++ __ st_d(A2, A1, 8); ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ __ nop(); ++ __ nop(); ++ ++ // 5: ++ if (UseLSX) { ++ __ vld(F0, A0, 0); ++ __ ld_w(AT, A0, 16); ++ __ vst(F0, A1, 0); ++ __ st_w(AT, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ } else { ++ __ ld_d(AT, A0, 0); ++ __ ld_d(A2, A0, 8); ++ __ ld_w(A3, A0, 16); ++ __ st_d(AT, A1, 0); ++ __ st_d(A2, A1, 8); ++ __ st_w(A3, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // 6: ++ if (UseLSX) { ++ __ vld(F0, A0, 0); ++ __ ld_d(AT, A0, 16); ++ __ vst(F0, A1, 0); ++ __ st_d(AT, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ } else { ++ __ ld_d(AT, A0, 0); ++ __ ld_d(A2, A0, 8); ++ __ ld_d(A3, A0, 16); ++ __ st_d(AT, A1, 0); ++ __ st_d(A2, A1, 8); ++ __ st_d(A3, A1, 16); ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ if (!UseLASX) ++ return; ++ ++ // 7: ++ __ vld(F0, A0, 0); ++ __ vld(F1, A0, 12); ++ __ vst(F0, A1, 0); ++ __ vst(F1, A1, 12); ++ __ move(A0, R0); ++ __ jr(RA); ++ __ nop(); ++ __ nop(); ++ ++ // 8: ++ __ xvld(F0, A0, 0); ++ __ xvst(F0, A1, 0); ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // Generate maybe oop copy ++ void gen_maybe_oop_copy(bool is_oop, bool disjoint, bool aligned, Label &small, ++ Label &large, const char *name, int small_limit, ++ int log2_elem_size, bool dest_uninitialized = false) { ++ Label post, _large; ++ DecoratorSet decorators = DECORATORS_NONE; ++ BarrierSetAssembler *bs = nullptr; ++ ++ if (is_oop) { ++ decorators = IN_HEAP | IS_ARRAY; ++ ++ if (disjoint) { ++ decorators |= ARRAYCOPY_DISJOINT; ++ } ++ ++ if (aligned) { ++ decorators |= ARRAYCOPY_ALIGNED; ++ } ++ ++ if (dest_uninitialized) { ++ decorators |= IS_DEST_UNINITIALIZED; ++ } ++ ++ __ push(RegSet::of(RA)); ++ ++ bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ bs->arraycopy_prologue(_masm, decorators, is_oop, A0, A1, A2, RegSet::of(A0, A1, A2)); ++ ++ __ push(RegSet::of(A1, A2)); ++ } ++ ++ __ sltui(T0, A2, small_limit); ++ if (is_oop) { ++ __ beqz(T0, _large); ++ __ bl(small); ++ __ b(post); ++ } else { ++ __ bnez(T0, small); ++ } ++ ++ __ bind(_large); ++ __ slli_d(A2, A2, log2_elem_size); ++ ++ if (is_oop) { ++ __ bl(large); ++ } else { ++ __ b(large); ++ } ++ ++ if (is_oop) { ++ __ bind(post); ++ __ pop(RegSet::of(A1, A2)); ++ ++ bs->arraycopy_epilogue(_masm, decorators, is_oop, A1, A2, T1, RegSet()); ++ ++ __ pop(RegSet::of(RA)); ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ } ++ ++ // Arguments: ++ // aligned - true 
=> Input and output aligned on a HeapWord == 8-byte boundary ++ // ignored ++ // is_oop - true => oop array, so generate store check code ++ // name - stub name string ++ // ++ // Inputs: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - element count, treated as ssize_t, can be zero ++ // ++ // If 'from' and/or 'to' are aligned on 4-byte boundaries, we let ++ // the hardware handle it. The two dwords within qwords that span ++ // cache line boundaries will still be loaded and stored atomically. ++ // ++ // Side Effects: ++ // disjoint_int_copy_entry is set to the no-overlap entry point ++ // used by generate_conjoint_int_oop_copy(). ++ // ++ address generate_disjoint_int_oop_copy(bool aligned, bool is_oop, Label &small, ++ Label &large, const char *name, int small_limit, ++ bool dest_uninitialized = false) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ gen_maybe_oop_copy(is_oop, true, aligned, small, large, name, ++ small_limit, 2, dest_uninitialized); ++ ++ return start; ++ } ++ ++ // Arguments: ++ // aligned - true => Input and output aligned on a HeapWord == 8-byte boundary ++ // ignored ++ // is_oop - true => oop array, so generate store check code ++ // name - stub name string ++ // ++ // Inputs: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - element count, treated as ssize_t, can be zero ++ // ++ // If 'from' and/or 'to' are aligned on 4-byte boundaries, we let ++ // the hardware handle it. The two dwords within qwords that span ++ // cache line boundaries will still be loaded and stored atomically. ++ // ++ address generate_conjoint_int_oop_copy(bool aligned, bool is_oop, Label &small, ++ Label &large, const char *name, int small_limit, ++ bool dest_uninitialized = false) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ if (is_oop) { ++ array_overlap_test(StubRoutines::oop_disjoint_arraycopy(), 2); ++ } else { ++ array_overlap_test(StubRoutines::jint_disjoint_arraycopy(), 2); ++ } ++ ++ gen_maybe_oop_copy(is_oop, false, aligned, small, large, name, ++ small_limit, 2, dest_uninitialized); ++ ++ return start; ++ } ++ ++ // Long small copy: less than { int:4, lsx:4, lasx:5 } elements. 
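//
// Unlike the byte/short/int tables above, this variant (and, under
// generational ZGC, its oop flavours) routes every access through the
// BarrierSetAssembler (copy_load_at / copy_store_at) so GC load/store
// barriers can be applied per element, and it dispatches with an explicit
// compare-and-branch chain on A2 rather than a padded jump table, presumably
// because the barrier-expanded cases are no longer a fixed size.
// Conceptually (sketch only, ignoring the LSX/LASX special cases):
//
//   for (i = 0; i < count; i++) {
//     tmp = copy_load_at(&src[i]);     // load, with GC barrier if needed
//     copy_store_at(&dst[i], tmp);     // store, with GC barrier if needed
//   }
//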
++ void generate_long_small_copy(DecoratorSet decorators, BasicType type, Label &entry, const char *name) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ Register gct1 = T2; ++ Register gct2 = T3; ++ Register gct3 = T4; ++ Register gct4 = T5; ++ FloatRegister gcvt1 = FT8; ++ FloatRegister gcvt2 = FT9; ++ BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ ++ Label L, L1, L2, L3, L4; ++ __ bind(entry); ++ __ beqz(A2, L); ++ __ li(SCR1, 1); ++ __ beq(A2, SCR1, L1); ++ __ li(SCR1, 2); ++ __ beq(A2, SCR1, L2); ++ __ li(SCR1, 3); ++ __ beq(A2, SCR1, L3); ++ __ li(SCR1, 4); ++ __ beq(A2, SCR1, L4); ++ ++ __ bind(L); ++ // 0: ++ __ move(A0, R0); ++ __ jr(RA); ++ ++ { ++ UnsafeCopyMemoryMark ucmm(this, true, true); ++ // 1: ++ __ bind(L1); ++ bs->copy_load_at(_masm, decorators, type, 8, T8, Address(A0, 0), gct1); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 0), T8, gct1, gct2, gct3); ++ __ move(A0, R0); ++ __ jr(RA); ++ ++ // 2: ++ __ bind(L2); ++ if (UseLSX && !ZGenerational) { ++ bs->copy_load_at(_masm, decorators, type, 16, F0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A1, 0), F0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ __ move(A0, R0); ++ __ jr(RA); ++ } else { ++ bs->copy_load_at(_masm, decorators, type, 8, T8, Address(A0, 0), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, A2, Address(A0, 8), gct1); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 0), T8, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 8), A2, gct1, gct2, gct3); ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // 3: ++ __ bind(L3); ++ if (UseLSX && !ZGenerational) { ++ bs->copy_load_at(_masm, decorators, type, 16, F0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_load_at(_masm, decorators, type, 8, T8, Address(A0, 16), gct1); ++ bs->copy_store_at(_masm, decorators, type, 16, Address(A1, 0), F0, gct1, gct2, gct3, gct4, gcvt1, gcvt2); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 16), T8, gct1, gct2, gct3); ++ __ move(A0, R0); ++ __ jr(RA); ++ } else { ++ bs->copy_load_at(_masm, decorators, type, 8, T8, Address(A0, 0), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, A2, Address(A0, 8), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, A3, Address(A0, 16), gct1); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 0), T8, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 8), A2, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 16), A3, gct1, gct2, gct3); ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ __ bind(L4); ++ // 4: ++ if (UseLASX) { ++ bs->copy_load_at(_masm, decorators, type, 32, F0, Address(A0, 0), gct1, gct2, gcvt1); ++ bs->copy_store_at(_masm, decorators, type, 32, Address(A1, 0), F0, gct1, gct2, gct3, gct4, gcvt1, gcvt2, false /* need_save_restore */); ++ } else { ++ bs->copy_load_at(_masm, decorators, type, 8, T8, Address(A0, 0), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, A2, Address(A0, 8), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, A3, Address(A0, 16), gct1); ++ bs->copy_load_at(_masm, decorators, type, 8, A4, Address(A0, 32), gct1); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 0), T8, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 8), A2, gct1, gct2, gct3); ++ bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 16), A3, gct1, gct2, gct3); ++ 
bs->copy_store_at(_masm, decorators, type, 8, Address(A1, 32), A4, gct1, gct2, gct3); ++ } ++ } ++ ++ __ move(A0, R0); ++ __ jr(RA); ++ } ++ ++ // Arguments: ++ // aligned - true => Input and output aligned on a HeapWord == 8-byte boundary ++ // ignored ++ // is_oop - true => oop array, so generate store check code ++ // name - stub name string ++ // ++ // Inputs: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - element count, treated as ssize_t, can be zero ++ // ++ // If 'from' and/or 'to' are aligned on 4-byte boundaries, we let ++ // the hardware handle it. The two dwords within qwords that span ++ // cache line boundaries will still be loaded and stored atomically. ++ // ++ // Side Effects: ++ // disjoint_int_copy_entry is set to the no-overlap entry point ++ // used by generate_conjoint_int_oop_copy(). ++ // ++ address generate_disjoint_long_oop_copy(bool aligned, bool is_oop, Label &small, ++ Label &large, const char *name, int small_limit, ++ bool dest_uninitialized = false) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ gen_maybe_oop_copy(is_oop, true, aligned, small, large, name, ++ small_limit, 3, dest_uninitialized); ++ ++ return start; ++ } ++ ++ // Arguments: ++ // aligned - true => Input and output aligned on a HeapWord == 8-byte boundary ++ // ignored ++ // is_oop - true => oop array, so generate store check code ++ // name - stub name string ++ // ++ // Inputs: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - element count, treated as ssize_t, can be zero ++ // ++ // If 'from' and/or 'to' are aligned on 4-byte boundaries, we let ++ // the hardware handle it. The two dwords within qwords that span ++ // cache line boundaries will still be loaded and stored atomically. ++ // ++ address generate_conjoint_long_oop_copy(bool aligned, bool is_oop, Label &small, ++ Label &large, const char *name, int small_limit, ++ bool dest_uninitialized = false) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ if (is_oop) { ++ array_overlap_test(StubRoutines::oop_disjoint_arraycopy(dest_uninitialized /* ZGC */), 3); ++ } else { ++ array_overlap_test(StubRoutines::jlong_disjoint_arraycopy(), 3); ++ } ++ ++ gen_maybe_oop_copy(is_oop, false, aligned, small, large, name, ++ small_limit, 3, dest_uninitialized); ++ ++ return start; ++ } ++ ++ // Helper for generating a dynamic type check. ++ // Smashes scratch1, scratch2. ++ void generate_type_check(Register sub_klass, ++ Register super_check_offset, ++ Register super_klass, ++ Register tmp1, ++ Register tmp2, ++ Label& L_success) { ++ assert_different_registers(sub_klass, super_check_offset, super_klass); ++ ++ __ block_comment("type_check:"); ++ ++ Label L_miss; ++ ++ __ check_klass_subtype_fast_path(sub_klass, super_klass, tmp1, &L_success, &L_miss, nullptr, ++ super_check_offset); ++ __ check_klass_subtype_slow_path(sub_klass, super_klass, tmp1, tmp2, &L_success, nullptr); ++ ++ // Fall through on failure! 
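//
// As on other ports, the fast path probes the word at
// [sub_klass + super_check_offset] and compares it against super_klass,
// which settles the common cases (primary supers and the secondary-super
// cache); only an inconclusive probe reaches the slow path, which scans the
// secondary supers array. Both paths branch to L_success on a hit, so a miss
// ends up at L_miss below.
//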
++ __ bind(L_miss); ++ } ++ ++ // ++ // Generate checkcasting array copy stub ++ // ++ // Input: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - element count, treated as ssize_t, can be zero ++ // A3 - size_t ckoff (super_check_offset) ++ // A4 - oop ckval (super_klass) ++ // ++ // Output: ++ // V0 == 0 - success ++ // V0 == -1^K - failure, where K is partial transfer count ++ // ++ address generate_checkcast_copy(const char *name, bool dest_uninitialized = false) { ++ Label L_load_element, L_store_element, L_do_card_marks, L_done, L_done_pop; ++ ++ // Input registers (after setup_arg_regs) ++ const Register from = A0; // source array address ++ const Register to = A1; // destination array address ++ const Register count = A2; // elementscount ++ const Register ckoff = A3; // super_check_offset ++ const Register ckval = A4; // super_klass ++ ++ RegSet wb_pre_saved_regs = RegSet::range(A0, A4); ++ RegSet wb_post_saved_regs = RegSet::of(count); ++ ++ // Registers used as temps (S0, S1, S2, S3 are save-on-entry) ++ const Register copied_oop = S0; // actual oop copied ++ const Register count_save = S1; // orig elementscount ++ const Register start_to = S2; // destination array start address ++ const Register oop_klass = S3; // oop._klass ++ const Register tmp1 = A5; ++ const Register tmp2 = A6; ++ const Register tmp3 = A7; ++ ++ //--------------------------------------------------------------- ++ // Assembler stub will be used for this call to arraycopy ++ // if the two arrays are subtypes of Object[] but the ++ // destination array type is not equal to or a supertype ++ // of the source type. Each element must be separately ++ // checked. ++ ++ assert_different_registers(from, to, count, ckoff, ckval, start_to, ++ copied_oop, oop_klass, count_save); ++ ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ // caller guarantees that the arrays really are different ++ // otherwise, we would have to make conjoint checks ++ ++ // Caller of this entry point must set up the argument registers. ++ __ block_comment("Entry:"); ++ ++ // Empty array: Nothing to do. ++ __ beqz(count, L_done); ++ ++ __ push(RegSet::of(S0, S1, S2, S3) + RA); ++ ++#ifdef ASSERT ++ __ block_comment("assert consistent ckoff/ckval"); ++ // The ckoff and ckval must be mutually consistent, ++ // even though caller generates both. ++ { Label L; ++ int sco_offset = in_bytes(Klass::super_check_offset_offset()); ++ __ ld_w(start_to, Address(ckval, sco_offset)); ++ __ beq(ckoff, start_to, L); ++ __ stop("super_check_offset inconsistent"); ++ __ bind(L); ++ } ++#endif //ASSERT ++ ++ DecoratorSet decorators = IN_HEAP | IS_ARRAY | ARRAYCOPY_CHECKCAST | ARRAYCOPY_DISJOINT; ++ bool is_oop = true; ++ if (dest_uninitialized) { ++ decorators |= IS_DEST_UNINITIALIZED; ++ } ++ ++ BarrierSetAssembler *bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ bs->arraycopy_prologue(_masm, decorators, is_oop, from, to, count, wb_pre_saved_regs); ++ ++ // save the original count ++ __ move(count_save, count); ++ ++ // Copy from low to high addresses ++ __ move(start_to, to); // Save destination array start address ++ __ b(L_load_element); ++ ++ // ======== begin loop ======== ++ // (Loop is rotated; its entry is L_load_element.) ++ // Loop control: ++ // for (; count != 0; count--) { ++ // copied_oop = load_heap_oop(from++); ++ // ... 
generate_type_check ...; ++ // store_heap_oop(to++, copied_oop); ++ // } ++ __ align(OptoLoopAlignment); ++ ++ __ bind(L_store_element); ++ bs->copy_store_at(_masm, decorators, T_OBJECT, UseCompressedOops ? 4 : 8, ++ Address(to, 0), copied_oop, ++ tmp1, tmp2, tmp3); ++ __ addi_d(to, to, UseCompressedOops ? 4 : 8); ++ __ addi_d(count, count, -1); ++ __ beqz(count, L_do_card_marks); ++ ++ // ======== loop entry is here ======== ++ __ bind(L_load_element); ++ bs->copy_load_at(_masm, decorators, T_OBJECT, UseCompressedOops ? 4 : 8, ++ copied_oop, Address(from, 0), ++ tmp1); ++ __ addi_d(from, from, UseCompressedOops ? 4 : 8); ++ __ beqz(copied_oop, L_store_element); ++ ++ __ load_klass(oop_klass, copied_oop); // query the object klass ++ generate_type_check(oop_klass, ckoff, ckval, tmp1, tmp2, L_store_element); ++ // ======== end loop ======== ++ ++ // Register count = remaining oops, count_orig = total oops. ++ // Emit GC store barriers for the oops we have copied and report ++ // their number to the caller. ++ ++ __ sub_d(tmp1, count_save, count); // K = partially copied oop count ++ __ nor(count, tmp1, R0); // report (-1^K) to caller ++ __ beqz(tmp1, L_done_pop); ++ ++ __ bind(L_do_card_marks); ++ ++ bs->arraycopy_epilogue(_masm, decorators, is_oop, start_to, count_save, tmp2, wb_post_saved_regs); ++ ++ __ bind(L_done_pop); ++ __ pop(RegSet::of(S0, S1, S2, S3) + RA); ++ ++#ifndef PRODUCT ++ __ li(SCR2, (address)&SharedRuntime::_checkcast_array_copy_ctr); ++ __ increment(Address(SCR2, 0), 1); ++#endif ++ ++ __ bind(L_done); ++ __ move(A0, count); ++ __ jr(RA); ++ ++ return start; ++ } ++ ++ // ++ // Generate 'unsafe' array copy stub ++ // Though just as safe as the other stubs, it takes an unscaled ++ // size_t argument instead of an element count. ++ // ++ // Input: ++ // A0 - source array address ++ // A1 - destination array address ++ // A2 - byte count, treated as ssize_t, can be zero ++ // ++ // Examines the alignment of the operands and dispatches ++ // to a long, int, short, or byte copy loop. ++ // ++ address generate_unsafe_copy(const char *name) { ++ Label L_long_aligned, L_int_aligned, L_short_aligned; ++ Register s = A0, d = A1, count = A2; ++ ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ address start = __ pc(); ++ ++ __ orr(AT, s, d); ++ __ orr(AT, AT, count); ++ ++ __ andi(AT, AT, BytesPerLong-1); ++ __ beqz(AT, L_long_aligned); ++ __ andi(AT, AT, BytesPerInt-1); ++ __ beqz(AT, L_int_aligned); ++ __ andi(AT, AT, BytesPerShort-1); ++ __ beqz(AT, L_short_aligned); ++ __ b(StubRoutines::_jbyte_arraycopy); ++ ++ __ bind(L_short_aligned); ++ __ srli_d(count, count, LogBytesPerShort); // size => short_count ++ __ b(StubRoutines::_jshort_arraycopy); ++ __ bind(L_int_aligned); ++ __ srli_d(count, count, LogBytesPerInt); // size => int_count ++ __ b(StubRoutines::_jint_arraycopy); ++ __ bind(L_long_aligned); ++ __ srli_d(count, count, LogBytesPerLong); // size => long_count ++ __ b(StubRoutines::_jlong_arraycopy); ++ ++ return start; ++ } ++ ++ // Perform range checks on the proposed arraycopy. ++ // Kills temp, but nothing else. ++ // Also, clean the sign bits of src_pos and dst_pos. 
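// The checks emitted below amount to the guard sketched here; the helper
// name arraycopy_in_bounds is hypothetical, and the plain 32-bit unsigned
// arithmetic is what the add_w/bltu pairs implement (the generic copy stub
// has already rejected negative positions and length before calling this).

#include <cstdint>

// Sketch only: C-level equivalent of the emitted range checks.
static bool arraycopy_in_bounds(uint32_t src_len, uint32_t dst_len,
                                uint32_t src_pos, uint32_t dst_pos,
                                uint32_t length) {
  if (src_pos + length > src_len) return false;  // add_w + bltu against src->length()
  if (dst_pos + length > dst_len) return false;  // add_w + bltu against dst->length()
  return true;
}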
++ void arraycopy_range_checks(Register src, // source array oop (A0) ++ Register src_pos, // source position (A1) ++ Register dst, // destination array oo (A2) ++ Register dst_pos, // destination position (A3) ++ Register length, ++ Register temp, ++ Label& L_failed) { ++ __ block_comment("arraycopy_range_checks:"); ++ ++ assert_different_registers(SCR1, temp); ++ ++ // if (src_pos + length > arrayOop(src)->length()) FAIL; ++ __ ld_w(SCR1, Address(src, arrayOopDesc::length_offset_in_bytes())); ++ __ add_w(temp, length, src_pos); ++ __ bltu(SCR1, temp, L_failed); ++ ++ // if (dst_pos + length > arrayOop(dst)->length()) FAIL; ++ __ ld_w(SCR1, Address(dst, arrayOopDesc::length_offset_in_bytes())); ++ __ add_w(temp, length, dst_pos); ++ __ bltu(SCR1, temp, L_failed); ++ ++ // Have to clean up high 32 bits of 'src_pos' and 'dst_pos'. ++ __ move(src_pos, src_pos); ++ __ move(dst_pos, dst_pos); ++ ++ __ block_comment("arraycopy_range_checks done"); ++ } ++ ++ // ++ // Generate generic array copy stubs ++ // ++ // Input: ++ // A0 - src oop ++ // A1 - src_pos (32-bits) ++ // A2 - dst oop ++ // A3 - dst_pos (32-bits) ++ // A4 - element count (32-bits) ++ // ++ // Output: ++ // V0 == 0 - success ++ // V0 == -1^K - failure, where K is partial transfer count ++ // ++ address generate_generic_copy(const char *name) { ++ Label L_failed, L_objArray; ++ Label L_copy_bytes, L_copy_shorts, L_copy_ints, L_copy_longs; ++ ++ // Input registers ++ const Register src = A0; // source array oop ++ const Register src_pos = A1; // source position ++ const Register dst = A2; // destination array oop ++ const Register dst_pos = A3; // destination position ++ const Register length = A4; ++ ++ // Registers used as temps ++ const Register dst_klass = A5; ++ ++ __ align(CodeEntryAlignment); ++ ++ StubCodeMark mark(this, "StubRoutines", name); ++ ++ address start = __ pc(); ++ ++#ifndef PRODUCT ++ // bump this on entry, not on exit: ++ __ li(SCR2, (address)&SharedRuntime::_generic_array_copy_ctr); ++ __ increment(Address(SCR2, 0), 1); ++#endif ++ ++ //----------------------------------------------------------------------- ++ // Assembler stub will be used for this call to arraycopy ++ // if the following conditions are met: ++ // ++ // (1) src and dst must not be null. ++ // (2) src_pos must not be negative. ++ // (3) dst_pos must not be negative. ++ // (4) length must not be negative. ++ // (5) src klass and dst klass should be the same and not null. ++ // (6) src and dst should be arrays. ++ // (7) src_pos + length must not exceed length of src. ++ // (8) dst_pos + length must not exceed length of dst. 
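//
// In outline the code below is:
//
//   check (1)..(8), branching to L_failed (return -1) on any violation;
//   lh = src_klass->layout_helper();
//   if (src is not an objArray) {          // typeArray path
//     require src_klass == dst_klass, then tail-call the
//     jbyte/jshort/jint/jlong stub selected from log2(element size);
//   } else if (src_klass == dst_klass) {
//     tail-call oop_arraycopy (L_plain_copy);
//   } else {
//     subtype-check the array klasses (a hit still takes L_plain_copy),
//     otherwise set up ckoff/ckval and tail-call checkcast_arraycopy.
//   }
//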
++ // ++ ++ // if (src == nullptr) return -1; ++ __ beqz(src, L_failed); ++ ++ // if (src_pos < 0) return -1; ++ __ blt(src_pos, R0, L_failed); ++ ++ // if (dst == nullptr) return -1; ++ __ beqz(dst, L_failed); ++ ++ // if (dst_pos < 0) return -1; ++ __ blt(dst_pos, R0, L_failed); ++ ++ // registers used as temp ++ const Register scratch_length = T0; // elements count to copy ++ const Register scratch_src_klass = T1; // array klass ++ const Register lh = T2; // layout helper ++ const Register tmp1 = T3; ++ const Register tmp2 = T4; ++ ++ // if (length < 0) return -1; ++ __ move(scratch_length, length); // length (elements count, 32-bits value) ++ __ blt(scratch_length, R0, L_failed); ++ ++ __ load_klass(scratch_src_klass, src); ++#ifdef ASSERT ++ // assert(src->klass() != nullptr); ++ { ++ __ block_comment("assert klasses not null {"); ++ Label L1, L2; ++ __ bnez(scratch_src_klass, L2); // it is broken if klass is null ++ __ bind(L1); ++ __ stop("broken null klass"); ++ __ bind(L2); ++ __ load_klass(SCR2, dst); ++ __ beqz(SCR2, L1); // this would be broken also ++ __ block_comment("} assert klasses not null done"); ++ } ++#endif ++ ++ // Load layout helper (32-bits) ++ // ++ // |array_tag| | header_size | element_type | |log2_element_size| ++ // 32 30 24 16 8 2 0 ++ // ++ // array_tag: typeArray = 0x3, objArray = 0x2, non-array = 0x0 ++ // ++ ++ const int lh_offset = in_bytes(Klass::layout_helper_offset()); ++ ++ // Handle objArrays completely differently... ++ const jint objArray_lh = Klass::array_layout_helper(T_OBJECT); ++ __ ld_w(lh, Address(scratch_src_klass, lh_offset)); ++ __ li(SCR1, objArray_lh); ++ __ xorr(SCR2, lh, SCR1); ++ __ beqz(SCR2, L_objArray); ++ ++ // if (src->klass() != dst->klass()) return -1; ++ __ load_klass(SCR2, dst); ++ __ xorr(SCR2, SCR2, scratch_src_klass); ++ __ bnez(SCR2, L_failed); ++ ++ // if (!src->is_Array()) return -1; ++ __ bge(lh, R0, L_failed); // i.e. (lh >= 0) ++ ++ // At this point, it is known to be a typeArray (array_tag 0x3). ++#ifdef ASSERT ++ { ++ __ block_comment("assert primitive array {"); ++ Label L; ++ __ li(SCR2, (int)(Klass::_lh_array_tag_type_value << Klass::_lh_array_tag_shift)); ++ __ bge(lh, SCR2, L); ++ __ stop("must be a primitive array"); ++ __ bind(L); ++ __ block_comment("} assert primitive array done"); ++ } ++#endif ++ ++ arraycopy_range_checks(src, src_pos, dst, dst_pos, scratch_length, SCR2, L_failed); ++ ++ // TypeArrayKlass ++ // ++ // src_addr = (src + array_header_in_bytes()) + (src_pos << log2elemsize); ++ // dst_addr = (dst + array_header_in_bytes()) + (dst_pos << log2elemsize); ++ // ++ ++ const Register scr1_offset = SCR1; // array offset ++ const Register elsize = lh; // element size ++ ++ __ bstrpick_d(scr1_offset, lh, Klass::_lh_header_size_shift + ++ exact_log2(Klass::_lh_header_size_mask+1) - 1, ++ Klass::_lh_header_size_shift); // array_offset ++ __ add_d(src, src, scr1_offset); // src array offset ++ __ add_d(dst, dst, scr1_offset); // dst array offset ++ __ block_comment("choose copy loop based on element size"); ++ ++ // next registers should be set before the jump to corresponding stub ++ const Register from = A0; // source array address ++ const Register to = A1; // destination array address ++ const Register count = A2; // elements count ++ ++ // 'from', 'to', 'count' registers should be set in such order ++ // since they are the same as 'src', 'src_pos', 'dst'. ++ ++ assert(Klass::_lh_log2_element_size_shift == 0, "fix this code"); ++ ++ // The possible values of elsize are 0-3, i.e. 
exact_log2(element ++ // size in bytes). We do a simple bitwise binary search. ++ __ bind(L_copy_bytes); ++ __ andi(tmp1, elsize, 2); ++ __ bnez(tmp1, L_copy_ints); ++ __ andi(tmp1, elsize, 1); ++ __ bnez(tmp1, L_copy_shorts); ++ __ lea(from, Address(src, src_pos, Address::no_scale)); // src_addr ++ __ lea(to, Address(dst, dst_pos, Address::no_scale)); // dst_addr ++ __ move(count, scratch_length); // length ++ __ b(StubRoutines::_jbyte_arraycopy); ++ ++ __ bind(L_copy_shorts); ++ __ lea(from, Address(src, src_pos, Address::times_2)); // src_addr ++ __ lea(to, Address(dst, dst_pos, Address::times_2)); // dst_addr ++ __ move(count, scratch_length); // length ++ __ b(StubRoutines::_jshort_arraycopy); ++ ++ __ bind(L_copy_ints); ++ __ andi(tmp1, elsize, 1); ++ __ bnez(tmp1, L_copy_longs); ++ __ lea(from, Address(src, src_pos, Address::times_4)); // src_addr ++ __ lea(to, Address(dst, dst_pos, Address::times_4)); // dst_addr ++ __ move(count, scratch_length); // length ++ __ b(StubRoutines::_jint_arraycopy); ++ ++ __ bind(L_copy_longs); ++#ifdef ASSERT ++ { ++ __ block_comment("assert long copy {"); ++ Label L; ++ __ andi(lh, lh, Klass::_lh_log2_element_size_mask); // lh -> elsize ++ __ li(tmp1, LogBytesPerLong); ++ __ beq(elsize, tmp1, L); ++ __ stop("must be long copy, but elsize is wrong"); ++ __ bind(L); ++ __ block_comment("} assert long copy done"); ++ } ++#endif ++ __ lea(from, Address(src, src_pos, Address::times_8)); // src_addr ++ __ lea(to, Address(dst, dst_pos, Address::times_8)); // dst_addr ++ __ move(count, scratch_length); // length ++ __ b(StubRoutines::_jlong_arraycopy); ++ ++ // ObjArrayKlass ++ __ bind(L_objArray); ++ // live at this point: scratch_src_klass, scratch_length, src[_pos], dst[_pos] ++ ++ Label L_plain_copy, L_checkcast_copy; ++ // test array classes for subtyping ++ __ load_klass(tmp1, dst); ++ __ bne(scratch_src_klass, tmp1, L_checkcast_copy); // usual case is exact equality ++ ++ // Identically typed arrays can be copied without element-wise checks. ++ arraycopy_range_checks(src, src_pos, dst, dst_pos, scratch_length, SCR2, L_failed); ++ ++ __ lea(from, Address(src, src_pos, Address::ScaleFactor(LogBytesPerHeapOop))); ++ __ addi_d(from, from, arrayOopDesc::base_offset_in_bytes(T_OBJECT)); ++ __ lea(to, Address(dst, dst_pos, Address::ScaleFactor(LogBytesPerHeapOop))); ++ __ addi_d(to, to, arrayOopDesc::base_offset_in_bytes(T_OBJECT)); ++ __ move(count, scratch_length); // length ++ __ bind(L_plain_copy); ++ __ b(StubRoutines::_oop_arraycopy); ++ ++ __ bind(L_checkcast_copy); ++ // live at this point: scratch_src_klass, scratch_length, tmp1 (dst_klass) ++ { ++ // Before looking at dst.length, make sure dst is also an objArray. ++ __ ld_w(SCR1, Address(tmp1, lh_offset)); ++ __ li(SCR2, objArray_lh); ++ __ xorr(SCR1, SCR1, SCR2); ++ __ bnez(SCR1, L_failed); ++ ++ // It is safe to examine both src.length and dst.length. ++ arraycopy_range_checks(src, src_pos, dst, dst_pos, scratch_length, tmp1, L_failed); ++ ++ __ load_klass(dst_klass, dst); // reload ++ ++ // Marshal the base address arguments now, freeing registers. 
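//
// The same A0/A1/A2 (from/to/count) argument triple is set up here as on the
// plain-copy path, since checkcast_arraycopy shares that part of the calling
// convention; its two extra inputs, ckoff in A3 and the destination element
// klass in A4, are filled in from dst_klass just below.
//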
++ __ lea(from, Address(src, src_pos, Address::ScaleFactor(LogBytesPerHeapOop))); ++ __ addi_d(from, from, arrayOopDesc::base_offset_in_bytes(T_OBJECT)); ++ __ lea(to, Address(dst, dst_pos, Address::ScaleFactor(LogBytesPerHeapOop))); ++ __ addi_d(to, to, arrayOopDesc::base_offset_in_bytes(T_OBJECT)); ++ __ move(count, length); // length (reloaded) ++ Register sco_temp = A3; // this register is free now ++ assert_different_registers(from, to, count, sco_temp, dst_klass, scratch_src_klass); ++ // assert_clean_int(count, sco_temp); ++ ++ // Generate the type check. ++ const int sco_offset = in_bytes(Klass::super_check_offset_offset()); ++ __ ld_w(sco_temp, Address(dst_klass, sco_offset)); ++ ++ // Smashes SCR1, SCR2 ++ generate_type_check(scratch_src_klass, sco_temp, dst_klass, tmp1, tmp2, L_plain_copy); ++ ++ // Fetch destination element klass from the ObjArrayKlass header. ++ int ek_offset = in_bytes(ObjArrayKlass::element_klass_offset()); ++ __ ld_d(dst_klass, Address(dst_klass, ek_offset)); ++ __ ld_w(sco_temp, Address(dst_klass, sco_offset)); ++ ++ // the checkcast_copy loop needs two extra arguments: ++ assert(A3 == sco_temp, "#3 already in place"); ++ // Set up arguments for checkcast_arraycopy. ++ __ move(A4, dst_klass); // dst.klass.element_klass ++ __ b(StubRoutines::_checkcast_arraycopy); ++ } ++ ++ __ bind(L_failed); ++ __ li(V0, -1); ++ __ jr(RA); ++ ++ return start; ++ } ++ ++ void generate_arraycopy_stubs() { ++ Label disjoint_large_copy, conjoint_large_copy; ++#if INCLUDE_ZGC ++ Label disjoint_large_copy_oop, conjoint_large_copy_oop; ++ Label disjoint_large_copy_oop_uninit, conjoint_large_copy_oop_uninit; ++#endif ++ Label byte_small_copy, short_small_copy, int_small_copy, long_small_copy; ++#if INCLUDE_ZGC ++ Label long_small_copy_oop, long_small_copy_oop_uninit; ++#endif ++ int int_oop_small_limit, long_oop_small_limit; ++ ++ if (UseLASX) { ++ int_oop_small_limit = 9; ++ long_oop_small_limit = 5; ++ generate_disjoint_large_copy_lasx(DECORATORS_NONE, T_LONG, disjoint_large_copy, "disjoint_large_copy_lasx"); ++ generate_conjoint_large_copy_lasx(DECORATORS_NONE, T_LONG, conjoint_large_copy, "conjoint_large_copy_lasx"); ++#if INCLUDE_ZGC ++ if (UseZGC && ZGenerational) { ++ generate_disjoint_large_copy_lasx(IN_HEAP | IS_ARRAY | ARRAYCOPY_DISJOINT, T_OBJECT, disjoint_large_copy_oop, "disjoint_large_copy_oop_lasx"); ++ generate_conjoint_large_copy_lasx(IN_HEAP | IS_ARRAY, T_OBJECT, conjoint_large_copy_oop, "conjoint_large_copy_oop_lasx"); ++ generate_disjoint_large_copy_lasx(IN_HEAP | IS_ARRAY | ARRAYCOPY_DISJOINT | IS_DEST_UNINITIALIZED, T_OBJECT, disjoint_large_copy_oop_uninit, "disjoint_large_copy_oop_uninit_lasx"); ++ generate_conjoint_large_copy_lasx(IN_HEAP | IS_ARRAY | IS_DEST_UNINITIALIZED, T_OBJECT, conjoint_large_copy_oop_uninit, "conjoint_large_copy_oop_uninit_lasx"); ++ } ++#endif ++ } else if (UseLSX) { ++ int_oop_small_limit = 7; ++ long_oop_small_limit = 4; ++ generate_disjoint_large_copy_lsx(DECORATORS_NONE, T_LONG, disjoint_large_copy, "disjoint_large_copy_lsx"); ++ generate_conjoint_large_copy_lsx(DECORATORS_NONE, T_LONG, conjoint_large_copy, "conjoint_large_copy_lsx"); ++#if INCLUDE_ZGC ++ if (UseZGC && ZGenerational) { ++ generate_disjoint_large_copy_lsx(IN_HEAP | IS_ARRAY | ARRAYCOPY_DISJOINT, T_OBJECT, disjoint_large_copy_oop, "disjoint_large_copy_oop_lsx"); ++ generate_conjoint_large_copy_lsx(IN_HEAP | IS_ARRAY, T_OBJECT, conjoint_large_copy_oop, "conjoint_large_copy_oop_lsx"); ++ generate_disjoint_large_copy_lsx(IN_HEAP | IS_ARRAY | ARRAYCOPY_DISJOINT | 
IS_DEST_UNINITIALIZED, T_OBJECT, disjoint_large_copy_oop_uninit, "disjoint_large_copy_oop_uninit_lsx"); ++ generate_conjoint_large_copy_lsx(IN_HEAP | IS_ARRAY | IS_DEST_UNINITIALIZED, T_OBJECT, conjoint_large_copy_oop_uninit, "conjoint_large_copy_oop_uninit_lsx"); ++ } ++#endif ++ } else { ++ int_oop_small_limit = 7; ++ long_oop_small_limit = 4; ++ generate_disjoint_large_copy(DECORATORS_NONE, T_LONG, disjoint_large_copy, "disjoint_large_copy_int"); ++ generate_conjoint_large_copy(DECORATORS_NONE, T_LONG, conjoint_large_copy, "conjoint_large_copy_int"); ++#if INCLUDE_ZGC ++ if (UseZGC && ZGenerational) { ++ generate_disjoint_large_copy(IN_HEAP | IS_ARRAY | ARRAYCOPY_DISJOINT, T_OBJECT, disjoint_large_copy_oop, "disjoint_large_copy_oop"); ++ generate_conjoint_large_copy(IN_HEAP | IS_ARRAY, T_OBJECT, conjoint_large_copy_oop, "conjoint_large_copy_oop"); ++ generate_disjoint_large_copy(IN_HEAP | IS_ARRAY | ARRAYCOPY_DISJOINT | IS_DEST_UNINITIALIZED, T_OBJECT, disjoint_large_copy_oop_uninit, "disjoint_large_copy_oop_uninit"); ++ generate_conjoint_large_copy(IN_HEAP | IS_ARRAY | IS_DEST_UNINITIALIZED, T_OBJECT, conjoint_large_copy_oop_uninit, "conjoint_large_copy_oop_uninit"); ++ } ++#endif ++ } ++ generate_byte_small_copy(byte_small_copy, "jbyte_small_copy"); ++ generate_short_small_copy(short_small_copy, "jshort_small_copy"); ++ generate_int_small_copy(int_small_copy, "jint_small_copy"); ++ generate_long_small_copy(DECORATORS_NONE, T_LONG, long_small_copy, "jlong_small_copy"); ++#if INCLUDE_ZGC ++ if (UseZGC && ZGenerational) { ++ generate_long_small_copy(IN_HEAP | IS_ARRAY | ARRAYCOPY_DISJOINT, T_OBJECT, long_small_copy_oop, "jlong_small_copy_oop"); ++ generate_long_small_copy(IN_HEAP | IS_ARRAY | ARRAYCOPY_DISJOINT | IS_DEST_UNINITIALIZED, T_OBJECT, long_small_copy_oop_uninit, "jlong_small_copy_oop_uninit"); ++ } ++#endif ++ ++ if (UseCompressedOops) { ++ StubRoutines::_oop_disjoint_arraycopy = generate_disjoint_int_oop_copy(false, true, int_small_copy, disjoint_large_copy, ++ "oop_disjoint_arraycopy", int_oop_small_limit); ++ StubRoutines::_oop_disjoint_arraycopy_uninit = generate_disjoint_int_oop_copy(false, true, int_small_copy, disjoint_large_copy, ++ "oop_disjoint_arraycopy_uninit", int_oop_small_limit, true); ++ StubRoutines::_oop_arraycopy = generate_conjoint_int_oop_copy(false, true, int_small_copy, conjoint_large_copy, ++ "oop_arraycopy", int_oop_small_limit); ++ StubRoutines::_oop_arraycopy_uninit = generate_conjoint_int_oop_copy(false, true, int_small_copy, conjoint_large_copy, ++ "oop_arraycopy_uninit", int_oop_small_limit, true); ++ } else { ++#if INCLUDE_ZGC ++ if (UseZGC && ZGenerational) { ++ StubRoutines::_oop_disjoint_arraycopy = generate_disjoint_long_oop_copy(false, true, long_small_copy_oop, disjoint_large_copy_oop, ++ "oop_disjoint_arraycopy", long_oop_small_limit); ++ StubRoutines::_oop_disjoint_arraycopy_uninit = generate_disjoint_long_oop_copy(false, true, long_small_copy_oop_uninit, disjoint_large_copy_oop_uninit, ++ "oop_disjoint_arraycopy_uninit", long_oop_small_limit, true); ++ StubRoutines::_oop_arraycopy = generate_conjoint_long_oop_copy(false, true, long_small_copy_oop, conjoint_large_copy_oop, ++ "oop_arraycopy", long_oop_small_limit); ++ StubRoutines::_oop_arraycopy_uninit = generate_conjoint_long_oop_copy(false, true, long_small_copy_oop_uninit, conjoint_large_copy_oop_uninit, ++ "oop_arraycopy_uninit", long_oop_small_limit, true); ++ } else { ++#endif ++ StubRoutines::_oop_disjoint_arraycopy = generate_disjoint_long_oop_copy(false, true, long_small_copy, 
disjoint_large_copy, ++ "oop_disjoint_arraycopy", long_oop_small_limit); ++ StubRoutines::_oop_disjoint_arraycopy_uninit = generate_disjoint_long_oop_copy(false, true, long_small_copy, disjoint_large_copy, ++ "oop_disjoint_arraycopy_uninit", long_oop_small_limit, true); ++ StubRoutines::_oop_arraycopy = generate_conjoint_long_oop_copy(false, true, long_small_copy, conjoint_large_copy, ++ "oop_arraycopy", long_oop_small_limit); ++ StubRoutines::_oop_arraycopy_uninit = generate_conjoint_long_oop_copy(false, true, long_small_copy, conjoint_large_copy, ++ "oop_arraycopy_uninit", long_oop_small_limit, true); ++#if INCLUDE_ZGC ++ } ++#endif ++ } ++ ++ StubRoutines::_jbyte_disjoint_arraycopy = generate_disjoint_byte_copy(false, byte_small_copy, disjoint_large_copy, "jbyte_disjoint_arraycopy"); ++ StubRoutines::_jshort_disjoint_arraycopy = generate_disjoint_short_copy(false, short_small_copy, disjoint_large_copy, "jshort_disjoint_arraycopy"); ++ StubRoutines::_jint_disjoint_arraycopy = generate_disjoint_int_oop_copy(false, false, int_small_copy, disjoint_large_copy, ++ "jint_disjoint_arraycopy", int_oop_small_limit); ++ ++ StubRoutines::_jbyte_arraycopy = generate_conjoint_byte_copy(false, byte_small_copy, conjoint_large_copy, "jbyte_arraycopy"); ++ StubRoutines::_jshort_arraycopy = generate_conjoint_short_copy(false, short_small_copy, conjoint_large_copy, "jshort_arraycopy"); ++ StubRoutines::_jint_arraycopy = generate_conjoint_int_oop_copy(false, false, int_small_copy, conjoint_large_copy, ++ "jint_arraycopy", int_oop_small_limit); ++ ++ StubRoutines::_jlong_disjoint_arraycopy = generate_disjoint_long_oop_copy(false, false, long_small_copy, disjoint_large_copy, ++ "jlong_disjoint_arraycopy", long_oop_small_limit); ++ StubRoutines::_jlong_arraycopy = generate_conjoint_long_oop_copy(false, false, long_small_copy, conjoint_large_copy, ++ "jlong_arraycopy", long_oop_small_limit); ++ ++ // We don't generate specialized code for HeapWord-aligned source ++ // arrays, so just use the code we've already generated ++ StubRoutines::_arrayof_jbyte_disjoint_arraycopy = StubRoutines::_jbyte_disjoint_arraycopy; ++ StubRoutines::_arrayof_jbyte_arraycopy = StubRoutines::_jbyte_arraycopy; ++ ++ StubRoutines::_arrayof_jshort_disjoint_arraycopy = StubRoutines::_jshort_disjoint_arraycopy; ++ StubRoutines::_arrayof_jshort_arraycopy = StubRoutines::_jshort_arraycopy; ++ ++ StubRoutines::_arrayof_jint_disjoint_arraycopy = StubRoutines::_jint_disjoint_arraycopy; ++ StubRoutines::_arrayof_jint_arraycopy = StubRoutines::_jint_arraycopy; ++ ++ StubRoutines::_arrayof_jlong_disjoint_arraycopy = StubRoutines::_jlong_disjoint_arraycopy; ++ StubRoutines::_arrayof_jlong_arraycopy = StubRoutines::_jlong_arraycopy; ++ ++ StubRoutines::_arrayof_oop_disjoint_arraycopy = StubRoutines::_oop_disjoint_arraycopy; ++ StubRoutines::_arrayof_oop_arraycopy = StubRoutines::_oop_arraycopy; ++ ++ StubRoutines::_arrayof_oop_disjoint_arraycopy_uninit = StubRoutines::_oop_disjoint_arraycopy_uninit; ++ StubRoutines::_arrayof_oop_arraycopy_uninit = StubRoutines::_oop_arraycopy_uninit; ++ ++ StubRoutines::_checkcast_arraycopy = generate_checkcast_copy("checkcast_arraycopy"); ++ StubRoutines::_checkcast_arraycopy_uninit = generate_checkcast_copy("checkcast_arraycopy_uninit", true); ++ ++ StubRoutines::_unsafe_arraycopy = generate_unsafe_copy("unsafe_arraycopy"); ++ ++ StubRoutines::_generic_arraycopy = generate_generic_copy("generic_arraycopy"); ++ ++ StubRoutines::_jbyte_fill = generate_fill(T_BYTE, false, "jbyte_fill"); ++ StubRoutines::_jshort_fill 
= generate_fill(T_SHORT, false, "jshort_fill"); ++ StubRoutines::_jint_fill = generate_fill(T_INT, false, "jint_fill"); ++ StubRoutines::_arrayof_jbyte_fill = generate_fill(T_BYTE, true, "arrayof_jbyte_fill"); ++ StubRoutines::_arrayof_jshort_fill = generate_fill(T_SHORT, true, "arrayof_jshort_fill"); ++ StubRoutines::_arrayof_jint_fill = generate_fill(T_INT, true, "arrayof_jint_fill"); ++ ++ StubRoutines::la::_jlong_fill = generate_fill(T_LONG, false, "jlong_fill"); ++ StubRoutines::la::_arrayof_jlong_fill = generate_fill(T_LONG, true, "arrayof_jlong_fill"); ++ ++#if INCLUDE_ZGC ++ if (!(UseZGC && ZGenerational)) { ++#endif ++ Copy::_conjoint_words = reinterpret_cast(StubRoutines::jlong_arraycopy()); ++ Copy::_disjoint_words = reinterpret_cast(StubRoutines::jlong_disjoint_arraycopy()); ++ Copy::_disjoint_words_atomic = reinterpret_cast(StubRoutines::jlong_disjoint_arraycopy()); ++ Copy::_aligned_conjoint_words = reinterpret_cast(StubRoutines::jlong_arraycopy()); ++ Copy::_aligned_disjoint_words = reinterpret_cast(StubRoutines::jlong_disjoint_arraycopy()); ++ Copy::_conjoint_bytes = reinterpret_cast(StubRoutines::jbyte_arraycopy()); ++ Copy::_conjoint_bytes_atomic = reinterpret_cast(StubRoutines::jbyte_arraycopy()); ++ Copy::_conjoint_jshorts_atomic = reinterpret_cast(StubRoutines::jshort_arraycopy()); ++ Copy::_conjoint_jints_atomic = reinterpret_cast(StubRoutines::jint_arraycopy()); ++ Copy::_conjoint_jlongs_atomic = reinterpret_cast(StubRoutines::jlong_arraycopy()); ++ Copy::_conjoint_oops_atomic = reinterpret_cast(StubRoutines::jlong_arraycopy()); ++ Copy::_arrayof_conjoint_bytes = reinterpret_cast(StubRoutines::arrayof_jbyte_arraycopy()); ++ Copy::_arrayof_conjoint_jshorts = reinterpret_cast(StubRoutines::arrayof_jshort_arraycopy()); ++ Copy::_arrayof_conjoint_jints = reinterpret_cast(StubRoutines::arrayof_jint_arraycopy()); ++ Copy::_arrayof_conjoint_jlongs = reinterpret_cast(StubRoutines::arrayof_jlong_arraycopy()); ++ Copy::_arrayof_conjoint_oops = reinterpret_cast(StubRoutines::arrayof_jlong_arraycopy()); ++ Copy::_fill_to_bytes = reinterpret_cast(StubRoutines::jbyte_fill()); ++ Copy::_fill_to_words = reinterpret_cast(StubRoutines::la::jlong_fill()); ++ Copy::_fill_to_aligned_words = reinterpret_cast(StubRoutines::la::arrayof_jlong_fill());; ++#if INCLUDE_ZGC ++ } ++#endif ++ } ++ ++ address generate_method_entry_barrier() { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", "nmethod_entry_barrier"); ++ ++ Label deoptimize_label; ++ Register rscratch2 = T8; ++ ++ address start = __ pc(); ++ ++ BarrierSetAssembler* bs_asm = BarrierSet::barrier_set()->barrier_set_assembler(); ++ ++ if (bs_asm->nmethod_patching_type() == NMethodPatchingType::conc_instruction_and_data_patch) { ++ BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod(); ++ Address thread_epoch_addr(TREG, in_bytes(bs_nm->thread_disarmed_guard_value_offset()) + 4); ++ __ lea_long(SCR1, ExternalAddress(bs_asm->patching_epoch_addr())); ++ __ ld_wu(SCR1, SCR1, 0); ++ __ st_w(SCR1, thread_epoch_addr); ++ __ ibar(0); ++ __ membar(__ LoadLoad); ++ } ++ ++ __ set_last_Java_frame(SP, FP, RA); ++ ++ __ enter(); ++ __ addi_d(T4, SP, wordSize); // T4 points to the saved RA ++ ++ __ addi_d(SP, SP, -4 * wordSize); // four words for the returned {SP, FP, RA, PC} ++ ++ __ push(V0); ++ __ push_call_clobbered_registers_except(RegSet::of(V0)); ++ ++ __ move(A0, T4); ++ __ call_VM_leaf ++ (CAST_FROM_FN_PTR ++ (address, BarrierSetNMethod::nmethod_stub_entry_barrier), 1); ++ ++ __ 
reset_last_Java_frame(true); ++ ++ __ pop_call_clobbered_registers_except(RegSet::of(V0)); ++ ++ __ bnez(V0, deoptimize_label); ++ ++ __ pop(V0); ++ __ leave(); ++ __ jr(RA); ++ ++ __ bind(deoptimize_label); ++ ++ __ pop(V0); ++ __ ld_d(rscratch2, SP, 0); ++ __ ld_d(FP, SP, 1 * wordSize); ++ __ ld_d(RA, SP, 2 * wordSize); ++ __ ld_d(T4, SP, 3 * wordSize); ++ ++ __ move(SP, rscratch2); ++ __ jr(T4); ++ ++ return start; ++ } ++ ++ // T8 result ++ // A4 src ++ // A5 src count ++ // A6 pattern ++ // A7 pattern count ++ address generate_string_indexof_linear(bool needle_isL, bool haystack_isL) ++ { ++ const char* stubName = needle_isL ++ ? (haystack_isL ? "indexof_linear_ll" : "indexof_linear_ul") ++ : "indexof_linear_uu"; ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", stubName); ++ address entry = __ pc(); ++ ++ int needle_chr_size = needle_isL ? 1 : 2; ++ int haystack_chr_size = haystack_isL ? 1 : 2; ++ int needle_chr_shift = needle_isL ? 0 : 1; ++ int haystack_chr_shift = haystack_isL ? 0 : 1; ++ bool isL = needle_isL && haystack_isL; ++ ++ // parameters ++ Register result = T8, haystack = A4, haystack_len = A5, needle = A6, needle_len = A7; ++ ++ // temporary registers ++ Register match_mask = T0, mask1 = T1, mask2 = T2; ++ Register first = T3, trailing_zeros = T4; ++ Register ch1 = T5, ch2 = T6; ++ ++ RegSet spilled_regs = RegSet::range(T0, T6); ++ ++ __ push(spilled_regs); ++ ++ Label L_LOOP, L_LOOP_PROCEED, L_SMALL, L_HAS_ZERO, L_SMALL_HAS_ZERO, ++ L_HAS_ZERO_LOOP, L_CMP_LOOP, L_CMP_LOOP_NOMATCH, ++ L_SMALL_HAS_ZERO_LOOP, L_SMALL_CMP_LOOP_NOMATCH, L_SMALL_CMP_LOOP, ++ L_POST_LOOP, L_CMP_LOOP_LAST_CMP, L_HAS_ZERO_LOOP_NOMATCH, ++ L_SMALL_CMP_LOOP_LAST_CMP, L_SMALL_CMP_LOOP_LAST_CMP2, ++ L_CMP_LOOP_LAST_CMP2, DONE, NOMATCH; ++ ++ __ ld_d(ch1, Address(needle)); ++ ++ // src.length - pattern.length ++ __ sub_d(haystack_len, haystack_len, needle_len); ++ ++ // first is needle[0] ++ __ bstrpick_d(first, ch1, needle_isL ? 7 : 15, 0); ++ ++ uint64_t mask0101 = UCONST64(0x0101010101010101); ++ uint64_t mask0001 = UCONST64(0x0001000100010001); ++ __ li(mask1, haystack_isL ? mask0101 : mask0001); ++ ++ uint64_t mask7f7f = UCONST64(0x7f7f7f7f7f7f7f7f); ++ uint64_t mask7fff = UCONST64(0x7fff7fff7fff7fff); ++ __ li(mask2, haystack_isL ? mask7f7f : mask7fff); ++ ++ // first -> needle[0]needle[0]needle[0]needle[0] ++ if (haystack_isL) __ bstrins_d(first, first, 15, 8); ++ __ bstrins_d(first, first, 31, 16); ++ __ bstrins_d(first, first, 63, 32); ++ ++ if (needle_isL != haystack_isL) { ++ // convert Latin1 to UTF. eg: 0x0000abcd -> 0x0a0b0c0d ++ __ move(AT, ch1); ++ __ bstrpick_d(ch1, AT, 7, 0); ++ __ srli_d(AT, AT, 8); ++ __ bstrins_d(ch1, AT, 23, 16); ++ __ srli_d(AT, AT, 8); ++ __ bstrins_d(ch1, AT, 39, 32); ++ __ srli_d(AT, AT, 8); ++ __ bstrins_d(ch1, AT, 55, 48); ++ } ++ ++ __ addi_d(haystack_len, haystack_len, -1 * (wordSize / haystack_chr_size - 1)); ++ __ bge(R0, haystack_len, L_SMALL); ++ ++ // compare and set match_mask[i] with 0x80/0x8000 (Latin1/UTF16) if ch2[i] == first[i] ++ // eg: ++ // first: aa aa aa aa aa aa aa aa ++ // ch2: aa aa li nx jd ka aa aa ++ // match_mask: 80 80 00 00 00 00 80 80 ++ ++ __ bind(L_LOOP); ++ __ ld_d(ch2, Address(haystack)); ++ // compute match_mask ++ __ xorr(ch2, first, ch2); ++ __ sub_d(match_mask, ch2, mask1); ++ __ orr(ch2, ch2, mask2); ++ __ andn(match_mask, match_mask, ch2); ++ // search first char of needle, goto L_HAS_ZERO if success. 
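//
// This is the usual SWAR zero-lane test: after the xor, a byte (Latin1) or
// halfword (UTF-16) lane of ch2 is zero exactly where the haystack matched
// needle[0]. With mask1 = 0x01..01 / 0x0001..0001 and
// mask2 = 0x7f..7f / 0x7fff..7fff, the sequence computes
//
//   match_mask = (v - mask1) & ~(v | mask2)    // == (v - mask1) & ~v & 0x80..80
//
// which flags the lowest matching lane exactly; lanes above the first match
// may be flagged spuriously (borrow propagation), which is harmless because
// every candidate position is re-verified by the comparison loops below
// before a match is reported.
//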
++ __ bnez(match_mask, L_HAS_ZERO); ++ ++ __ bind(L_LOOP_PROCEED); ++ __ addi_d(haystack_len, haystack_len, -1 * (wordSize / haystack_chr_size)); ++ __ addi_d(haystack, haystack, wordSize); ++ __ addi_d(result, result, wordSize / haystack_chr_size); ++ __ bge(haystack_len, R0, L_LOOP); ++ ++ __ bind(L_POST_LOOP); ++ __ li(ch2, -1 * (wordSize / haystack_chr_size)); ++ __ bge(ch2, haystack_len, NOMATCH); // no extra characters to check ++ ++ __ bind(L_SMALL); ++ __ ld_d(ch2, Address(haystack)); ++ __ slli_d(haystack_len, haystack_len, LogBitsPerByte + haystack_chr_shift); ++ __ sub_d(haystack_len, R0, haystack_len); ++ // compute match_mask ++ __ xorr(ch2, first, ch2); ++ __ sub_d(match_mask, ch2, mask1); ++ __ orr(ch2, ch2, mask2); ++ __ andn(match_mask, match_mask, ch2); ++ // clear useless match_mask bits and check ++ __ nor(trailing_zeros, R0, R0); // all bits set ++ __ srl_d(trailing_zeros, trailing_zeros, haystack_len); // zeroes on useless bits. ++ __ andr(match_mask, match_mask, trailing_zeros); // refine match_mask ++ __ beqz(match_mask, NOMATCH); ++ ++ __ bind(L_SMALL_HAS_ZERO); ++ __ ctz_d(trailing_zeros, match_mask); ++ __ li(AT, wordSize / haystack_chr_size); ++ __ bge(AT, needle_len, L_SMALL_CMP_LOOP_LAST_CMP2); ++ ++ __ bind(L_SMALL_HAS_ZERO_LOOP); ++ // compute index ++ __ srl_d(match_mask, match_mask, trailing_zeros); ++ __ srli_d(match_mask, match_mask, 1); ++ __ srli_d(AT, trailing_zeros, LogBitsPerByte); ++ if (!haystack_isL) __ andi(AT, AT, 0xE); ++ __ add_d(haystack, haystack, AT); ++ __ ld_d(ch2, Address(haystack)); ++ if (!haystack_isL) __ srli_d(AT, AT, haystack_chr_shift); ++ __ add_d(result, result, AT); ++ ++ __ li(trailing_zeros, wordSize / haystack_chr_size); ++ __ bne(ch1, ch2, L_SMALL_CMP_LOOP_NOMATCH); ++ ++ __ bind(L_SMALL_CMP_LOOP); ++ needle_isL ? __ ld_bu(first, Address(needle, trailing_zeros, Address::no_scale, 0)) ++ : __ ld_hu(first, Address(needle, trailing_zeros, Address::times_2, 0)); ++ haystack_isL ? 
__ ld_bu(ch2, Address(haystack, trailing_zeros, Address::no_scale, 0)) ++ : __ ld_hu(ch2, Address(haystack, trailing_zeros, Address::times_2, 0)); ++ __ addi_d(trailing_zeros, trailing_zeros, 1); ++ __ bge(trailing_zeros, needle_len, L_SMALL_CMP_LOOP_LAST_CMP); ++ __ beq(first, ch2, L_SMALL_CMP_LOOP); ++ ++ __ bind(L_SMALL_CMP_LOOP_NOMATCH); ++ __ beqz(match_mask, NOMATCH); ++ __ ctz_d(trailing_zeros, match_mask); ++ __ addi_d(result, result, 1); ++ __ addi_d(haystack, haystack, haystack_chr_size); ++ __ b(L_SMALL_HAS_ZERO_LOOP); ++ ++ __ bind(L_SMALL_CMP_LOOP_LAST_CMP); ++ __ bne(first, ch2, L_SMALL_CMP_LOOP_NOMATCH); ++ __ b(DONE); ++ ++ __ bind(L_SMALL_CMP_LOOP_LAST_CMP2); ++ // compute index ++ __ srl_d(match_mask, match_mask, trailing_zeros); ++ __ srli_d(match_mask, match_mask, 1); ++ __ srli_d(AT, trailing_zeros, LogBitsPerByte); ++ if (!haystack_isL) __ andi(AT, AT, 0xE); ++ __ add_d(haystack, haystack, AT); ++ __ ld_d(ch2, Address(haystack)); ++ if (!haystack_isL) __ srli_d(AT, AT, haystack_chr_shift); ++ __ add_d(result, result, AT); ++ ++ __ bne(ch1, ch2, L_SMALL_CMP_LOOP_NOMATCH); ++ __ b(DONE); ++ ++ __ bind(L_HAS_ZERO); ++ __ ctz_d(trailing_zeros, match_mask); ++ __ li(AT, wordSize / haystack_chr_size); ++ __ bge(AT, needle_len, L_CMP_LOOP_LAST_CMP2); ++ __ addi_d(result, result, -1); // array index from 0, so result -= 1 ++ ++ __ bind(L_HAS_ZERO_LOOP); ++ // compute index ++ __ srl_d(match_mask, match_mask, trailing_zeros); ++ __ srli_d(match_mask, match_mask, 1); ++ __ srli_d(AT, trailing_zeros, LogBitsPerByte); ++ if (!haystack_isL) __ andi(AT, AT, 0xE); ++ __ add_d(haystack, haystack, AT); ++ __ ld_d(ch2, Address(haystack)); ++ if (!haystack_isL) __ srli_d(AT, AT, haystack_chr_shift); ++ __ add_d(result, result, AT); ++ ++ __ addi_d(result, result, 1); ++ __ li(trailing_zeros, wordSize / haystack_chr_size); ++ __ bne(ch1, ch2, L_CMP_LOOP_NOMATCH); ++ ++ // compare one char ++ __ bind(L_CMP_LOOP); ++ haystack_isL ? __ ld_bu(ch2, Address(haystack, trailing_zeros, Address::no_scale, 0)) ++ : __ ld_hu(ch2, Address(haystack, trailing_zeros, Address::times_2, 0)); ++ needle_isL ? __ ld_bu(AT, Address(needle, trailing_zeros, Address::no_scale, 0)) ++ : __ ld_hu(AT, Address(needle, trailing_zeros, Address::times_2, 0)); ++ __ addi_d(trailing_zeros, trailing_zeros, 1); // next char index ++ __ bge(trailing_zeros, needle_len, L_CMP_LOOP_LAST_CMP); ++ __ beq(AT, ch2, L_CMP_LOOP); ++ ++ __ bind(L_CMP_LOOP_NOMATCH); ++ __ beqz(match_mask, L_HAS_ZERO_LOOP_NOMATCH); ++ __ ctz_d(trailing_zeros, match_mask); ++ __ addi_d(haystack, haystack, haystack_chr_size); ++ __ b(L_HAS_ZERO_LOOP); ++ ++ __ bind(L_CMP_LOOP_LAST_CMP); ++ __ bne(AT, ch2, L_CMP_LOOP_NOMATCH); ++ __ b(DONE); ++ ++ __ bind(L_CMP_LOOP_LAST_CMP2); ++ // compute index ++ __ srl_d(match_mask, match_mask, trailing_zeros); ++ __ srli_d(match_mask, match_mask, 1); ++ __ srli_d(AT, trailing_zeros, LogBitsPerByte); ++ if (!haystack_isL) __ andi(AT, AT, 0xE); ++ __ add_d(haystack, haystack, AT); ++ __ ld_d(ch2, Address(haystack)); ++ if (!haystack_isL) __ srli_d(AT, AT, haystack_chr_shift); ++ __ add_d(result, result, AT); ++ ++ __ addi_d(result, result, 1); ++ __ bne(ch1, ch2, L_CMP_LOOP_NOMATCH); ++ __ b(DONE); ++ ++ __ bind(L_HAS_ZERO_LOOP_NOMATCH); ++ // 1) Restore "result" index. Index was wordSize/str2_chr_size * N until ++ // L_HAS_ZERO block. Byte octet was analyzed in L_HAS_ZERO_LOOP, ++ // so, result was increased at max by wordSize/str2_chr_size - 1, so, ++ // respective high bit wasn't changed. 
L_LOOP_PROCEED will increase ++ // result by analyzed characters value, so, we can just reset lower bits ++ // in result here. Clear 2 lower bits for UU/UL and 3 bits for LL ++ // 2) advance haystack value to represent next haystack octet. result & 7/3 is ++ // index of last analyzed substring inside current octet. So, haystack in at ++ // respective start address. We need to advance it to next octet ++ __ andi(match_mask, result, wordSize / haystack_chr_size - 1); ++ __ sub_d(result, result, match_mask); ++ if (!haystack_isL) __ slli_d(match_mask, match_mask, haystack_chr_shift); ++ __ sub_d(haystack, haystack, match_mask); ++ __ b(L_LOOP_PROCEED); ++ ++ __ bind(NOMATCH); ++ __ nor(result, R0, R0); // result = -1 ++ ++ __ bind(DONE); ++ __ pop(spilled_regs); ++ __ jr(RA); ++ return entry; ++ } ++ ++ void generate_string_indexof_stubs() ++ { ++ StubRoutines::la::_string_indexof_linear_ll = generate_string_indexof_linear(true, true); ++ StubRoutines::la::_string_indexof_linear_uu = generate_string_indexof_linear(false, false); ++ StubRoutines::la::_string_indexof_linear_ul = generate_string_indexof_linear(true, false); ++ } ++ ++ address generate_mulAdd() { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", "mulAdd"); ++ ++ address entry = __ pc(); ++ ++ const Register out = c_rarg0; ++ const Register in = c_rarg1; ++ const Register offset = c_rarg2; ++ const Register len = c_rarg3; ++ const Register k = c_rarg4; ++ ++ __ block_comment("Entry:"); ++ __ mul_add(out, in, offset, len, k); ++ __ jr(RA); ++ ++ return entry; ++ } ++ ++ // Arguments: ++ // ++ // Inputs: ++ // A0 - source byte array address ++ // A1 - destination byte array address ++ // A2 - K (key) in little endian int array ++ // A3 - r vector byte array address ++ // A4 - input length ++ // ++ // Output: ++ // A0 - input length ++ // ++ address generate_aescrypt_encryptBlock(bool cbc) { ++ static const uint32_t ft_consts[256] = { ++ 0xc66363a5, 0xf87c7c84, 0xee777799, 0xf67b7b8d, ++ 0xfff2f20d, 0xd66b6bbd, 0xde6f6fb1, 0x91c5c554, ++ 0x60303050, 0x02010103, 0xce6767a9, 0x562b2b7d, ++ 0xe7fefe19, 0xb5d7d762, 0x4dababe6, 0xec76769a, ++ 0x8fcaca45, 0x1f82829d, 0x89c9c940, 0xfa7d7d87, ++ 0xeffafa15, 0xb25959eb, 0x8e4747c9, 0xfbf0f00b, ++ 0x41adadec, 0xb3d4d467, 0x5fa2a2fd, 0x45afafea, ++ 0x239c9cbf, 0x53a4a4f7, 0xe4727296, 0x9bc0c05b, ++ 0x75b7b7c2, 0xe1fdfd1c, 0x3d9393ae, 0x4c26266a, ++ 0x6c36365a, 0x7e3f3f41, 0xf5f7f702, 0x83cccc4f, ++ 0x6834345c, 0x51a5a5f4, 0xd1e5e534, 0xf9f1f108, ++ 0xe2717193, 0xabd8d873, 0x62313153, 0x2a15153f, ++ 0x0804040c, 0x95c7c752, 0x46232365, 0x9dc3c35e, ++ 0x30181828, 0x379696a1, 0x0a05050f, 0x2f9a9ab5, ++ 0x0e070709, 0x24121236, 0x1b80809b, 0xdfe2e23d, ++ 0xcdebeb26, 0x4e272769, 0x7fb2b2cd, 0xea75759f, ++ 0x1209091b, 0x1d83839e, 0x582c2c74, 0x341a1a2e, ++ 0x361b1b2d, 0xdc6e6eb2, 0xb45a5aee, 0x5ba0a0fb, ++ 0xa45252f6, 0x763b3b4d, 0xb7d6d661, 0x7db3b3ce, ++ 0x5229297b, 0xdde3e33e, 0x5e2f2f71, 0x13848497, ++ 0xa65353f5, 0xb9d1d168, 0x00000000, 0xc1eded2c, ++ 0x40202060, 0xe3fcfc1f, 0x79b1b1c8, 0xb65b5bed, ++ 0xd46a6abe, 0x8dcbcb46, 0x67bebed9, 0x7239394b, ++ 0x944a4ade, 0x984c4cd4, 0xb05858e8, 0x85cfcf4a, ++ 0xbbd0d06b, 0xc5efef2a, 0x4faaaae5, 0xedfbfb16, ++ 0x864343c5, 0x9a4d4dd7, 0x66333355, 0x11858594, ++ 0x8a4545cf, 0xe9f9f910, 0x04020206, 0xfe7f7f81, ++ 0xa05050f0, 0x783c3c44, 0x259f9fba, 0x4ba8a8e3, ++ 0xa25151f3, 0x5da3a3fe, 0x804040c0, 0x058f8f8a, ++ 0x3f9292ad, 0x219d9dbc, 0x70383848, 0xf1f5f504, ++ 0x63bcbcdf, 0x77b6b6c1, 0xafdada75, 0x42212163, ++ 0x20101030, 
0xe5ffff1a, 0xfdf3f30e, 0xbfd2d26d, ++ 0x81cdcd4c, 0x180c0c14, 0x26131335, 0xc3ecec2f, ++ 0xbe5f5fe1, 0x359797a2, 0x884444cc, 0x2e171739, ++ 0x93c4c457, 0x55a7a7f2, 0xfc7e7e82, 0x7a3d3d47, ++ 0xc86464ac, 0xba5d5de7, 0x3219192b, 0xe6737395, ++ 0xc06060a0, 0x19818198, 0x9e4f4fd1, 0xa3dcdc7f, ++ 0x44222266, 0x542a2a7e, 0x3b9090ab, 0x0b888883, ++ 0x8c4646ca, 0xc7eeee29, 0x6bb8b8d3, 0x2814143c, ++ 0xa7dede79, 0xbc5e5ee2, 0x160b0b1d, 0xaddbdb76, ++ 0xdbe0e03b, 0x64323256, 0x743a3a4e, 0x140a0a1e, ++ 0x924949db, 0x0c06060a, 0x4824246c, 0xb85c5ce4, ++ 0x9fc2c25d, 0xbdd3d36e, 0x43acacef, 0xc46262a6, ++ 0x399191a8, 0x319595a4, 0xd3e4e437, 0xf279798b, ++ 0xd5e7e732, 0x8bc8c843, 0x6e373759, 0xda6d6db7, ++ 0x018d8d8c, 0xb1d5d564, 0x9c4e4ed2, 0x49a9a9e0, ++ 0xd86c6cb4, 0xac5656fa, 0xf3f4f407, 0xcfeaea25, ++ 0xca6565af, 0xf47a7a8e, 0x47aeaee9, 0x10080818, ++ 0x6fbabad5, 0xf0787888, 0x4a25256f, 0x5c2e2e72, ++ 0x381c1c24, 0x57a6a6f1, 0x73b4b4c7, 0x97c6c651, ++ 0xcbe8e823, 0xa1dddd7c, 0xe874749c, 0x3e1f1f21, ++ 0x964b4bdd, 0x61bdbddc, 0x0d8b8b86, 0x0f8a8a85, ++ 0xe0707090, 0x7c3e3e42, 0x71b5b5c4, 0xcc6666aa, ++ 0x904848d8, 0x06030305, 0xf7f6f601, 0x1c0e0e12, ++ 0xc26161a3, 0x6a35355f, 0xae5757f9, 0x69b9b9d0, ++ 0x17868691, 0x99c1c158, 0x3a1d1d27, 0x279e9eb9, ++ 0xd9e1e138, 0xebf8f813, 0x2b9898b3, 0x22111133, ++ 0xd26969bb, 0xa9d9d970, 0x078e8e89, 0x339494a7, ++ 0x2d9b9bb6, 0x3c1e1e22, 0x15878792, 0xc9e9e920, ++ 0x87cece49, 0xaa5555ff, 0x50282878, 0xa5dfdf7a, ++ 0x038c8c8f, 0x59a1a1f8, 0x09898980, 0x1a0d0d17, ++ 0x65bfbfda, 0xd7e6e631, 0x844242c6, 0xd06868b8, ++ 0x824141c3, 0x299999b0, 0x5a2d2d77, 0x1e0f0f11, ++ 0x7bb0b0cb, 0xa85454fc, 0x6dbbbbd6, 0x2c16163a ++ }; ++ static const uint8_t fsb_consts[256] = { ++ 0x63, 0x7c, 0x77, 0x7b, 0xf2, 0x6b, 0x6f, 0xc5, ++ 0x30, 0x01, 0x67, 0x2b, 0xfe, 0xd7, 0xab, 0x76, ++ 0xca, 0x82, 0xc9, 0x7d, 0xfa, 0x59, 0x47, 0xf0, ++ 0xad, 0xd4, 0xa2, 0xaf, 0x9c, 0xa4, 0x72, 0xc0, ++ 0xb7, 0xfd, 0x93, 0x26, 0x36, 0x3f, 0xf7, 0xcc, ++ 0x34, 0xa5, 0xe5, 0xf1, 0x71, 0xd8, 0x31, 0x15, ++ 0x04, 0xc7, 0x23, 0xc3, 0x18, 0x96, 0x05, 0x9a, ++ 0x07, 0x12, 0x80, 0xe2, 0xeb, 0x27, 0xb2, 0x75, ++ 0x09, 0x83, 0x2c, 0x1a, 0x1b, 0x6e, 0x5a, 0xa0, ++ 0x52, 0x3b, 0xd6, 0xb3, 0x29, 0xe3, 0x2f, 0x84, ++ 0x53, 0xd1, 0x00, 0xed, 0x20, 0xfc, 0xb1, 0x5b, ++ 0x6a, 0xcb, 0xbe, 0x39, 0x4a, 0x4c, 0x58, 0xcf, ++ 0xd0, 0xef, 0xaa, 0xfb, 0x43, 0x4d, 0x33, 0x85, ++ 0x45, 0xf9, 0x02, 0x7f, 0x50, 0x3c, 0x9f, 0xa8, ++ 0x51, 0xa3, 0x40, 0x8f, 0x92, 0x9d, 0x38, 0xf5, ++ 0xbc, 0xb6, 0xda, 0x21, 0x10, 0xff, 0xf3, 0xd2, ++ 0xcd, 0x0c, 0x13, 0xec, 0x5f, 0x97, 0x44, 0x17, ++ 0xc4, 0xa7, 0x7e, 0x3d, 0x64, 0x5d, 0x19, 0x73, ++ 0x60, 0x81, 0x4f, 0xdc, 0x22, 0x2a, 0x90, 0x88, ++ 0x46, 0xee, 0xb8, 0x14, 0xde, 0x5e, 0x0b, 0xdb, ++ 0xe0, 0x32, 0x3a, 0x0a, 0x49, 0x06, 0x24, 0x5c, ++ 0xc2, 0xd3, 0xac, 0x62, 0x91, 0x95, 0xe4, 0x79, ++ 0xe7, 0xc8, 0x37, 0x6d, 0x8d, 0xd5, 0x4e, 0xa9, ++ 0x6c, 0x56, 0xf4, 0xea, 0x65, 0x7a, 0xae, 0x08, ++ 0xba, 0x78, 0x25, 0x2e, 0x1c, 0xa6, 0xb4, 0xc6, ++ 0xe8, 0xdd, 0x74, 0x1f, 0x4b, 0xbd, 0x8b, 0x8a, ++ 0x70, 0x3e, 0xb5, 0x66, 0x48, 0x03, 0xf6, 0x0e, ++ 0x61, 0x35, 0x57, 0xb9, 0x86, 0xc1, 0x1d, 0x9e, ++ 0xe1, 0xf8, 0x98, 0x11, 0x69, 0xd9, 0x8e, 0x94, ++ 0x9b, 0x1e, 0x87, 0xe9, 0xce, 0x55, 0x28, 0xdf, ++ 0x8c, 0xa1, 0x89, 0x0d, 0xbf, 0xe6, 0x42, 0x68, ++ 0x41, 0x99, 0x2d, 0x0f, 0xb0, 0x54, 0xbb, 0x16 ++ }; ++ ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", "aescrypt_encryptBlock"); ++ ++ // Allocate registers ++ Register src = A0; ++ Register dst = A1; ++ Register key = A2; 
++ Register rve = A3; ++ Register srclen = A4; ++ Register keylen = T8; ++ Register srcend = A5; ++ Register keyold = A6; ++ Register t0 = A7; ++ Register t1, t2, t3, ftp; ++ Register xa[4] = { T0, T1, T2, T3 }; ++ Register ya[4] = { T4, T5, T6, T7 }; ++ ++ Label loop, tail, done; ++ address start = __ pc(); ++ ++ if (cbc) { ++ t1 = S0; ++ t2 = S1; ++ t3 = S2; ++ ftp = S3; ++ ++ __ beqz(srclen, done); ++ ++ __ addi_d(SP, SP, -4 * wordSize); ++ __ st_d(S3, SP, 3 * wordSize); ++ __ st_d(S2, SP, 2 * wordSize); ++ __ st_d(S1, SP, 1 * wordSize); ++ __ st_d(S0, SP, 0 * wordSize); ++ ++ __ add_d(srcend, src, srclen); ++ __ move(keyold, key); ++ } else { ++ t1 = A3; ++ t2 = A4; ++ t3 = A5; ++ ftp = A6; ++ } ++ ++ __ ld_w(keylen, key, arrayOopDesc::length_offset_in_bytes() - arrayOopDesc::base_offset_in_bytes(T_INT)); ++ ++ // Round 1 ++ if (cbc) { ++ for (int i = 0; i < 4; i++) { ++ __ ld_w(xa[i], rve, 4 * i); ++ } ++ ++ __ bind(loop); ++ ++ for (int i = 0; i < 4; i++) { ++ __ ld_w(ya[i], src, 4 * i); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ XOR(xa[i], xa[i], ya[i]); ++ } ++ } else { ++ for (int i = 0; i < 4; i++) { ++ __ ld_w(xa[i], src, 4 * i); ++ } ++ } ++ for (int i = 0; i < 4; i++) { ++ __ ld_w(ya[i], key, 4 * i); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ revb_2h(xa[i], xa[i]); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ rotri_w(xa[i], xa[i], 16); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ XOR(xa[i], xa[i], ya[i]); ++ } ++ ++ __ li(ftp, (intptr_t)ft_consts); ++ ++ // Round 2 - (N-1) ++ for (int r = 0; r < 14; r++) { ++ Register *xp; ++ Register *yp; ++ ++ if (r & 1) { ++ xp = xa; ++ yp = ya; ++ } else { ++ xp = ya; ++ yp = xa; ++ } ++ ++ for (int i = 0; i < 4; i++) { ++ __ ld_w(xp[i], key, 4 * (4 * (r + 1) + i)); ++ } ++ ++ for (int i = 0; i < 4; i++) { ++ __ bstrpick_d(t0, yp[(i + 3) & 3], 7, 0); ++ __ bstrpick_d(t1, yp[(i + 2) & 3], 15, 8); ++ __ bstrpick_d(t2, yp[(i + 1) & 3], 23, 16); ++ __ bstrpick_d(t3, yp[(i + 0) & 3], 31, 24); ++ __ slli_w(t0, t0, 2); ++ __ slli_w(t1, t1, 2); ++ __ slli_w(t2, t2, 2); ++ __ slli_w(t3, t3, 2); ++ __ ldx_w(t0, ftp, t0); ++ __ ldx_w(t1, ftp, t1); ++ __ ldx_w(t2, ftp, t2); ++ __ ldx_w(t3, ftp, t3); ++ __ rotri_w(t0, t0, 24); ++ __ rotri_w(t1, t1, 16); ++ __ rotri_w(t2, t2, 8); ++ __ XOR(xp[i], xp[i], t0); ++ __ XOR(t0, t1, t2); ++ __ XOR(xp[i], xp[i], t3); ++ __ XOR(xp[i], xp[i], t0); ++ } ++ ++ if (r == 8) { ++ // AES 128 ++ __ li(t0, 44); ++ __ beq(t0, keylen, tail); ++ } else if (r == 10) { ++ // AES 192 ++ __ li(t0, 52); ++ __ beq(t0, keylen, tail); ++ } ++ } ++ ++ __ bind(tail); ++ __ li(ftp, (intptr_t)fsb_consts); ++ __ alsl_d(key, keylen, key, 2 - 1); ++ ++ // Round N ++ for (int i = 0; i < 4; i++) { ++ __ bstrpick_d(t0, ya[(i + 3) & 3], 7, 0); ++ __ bstrpick_d(t1, ya[(i + 2) & 3], 15, 8); ++ __ bstrpick_d(t2, ya[(i + 1) & 3], 23, 16); ++ __ bstrpick_d(t3, ya[(i + 0) & 3], 31, 24); ++ __ ldx_bu(t0, ftp, t0); ++ __ ldx_bu(t1, ftp, t1); ++ __ ldx_bu(t2, ftp, t2); ++ __ ldx_bu(t3, ftp, t3); ++ __ ld_w(xa[i], key, 4 * i - 16); ++ __ slli_w(t1, t1, 8); ++ __ slli_w(t2, t2, 16); ++ __ slli_w(t3, t3, 24); ++ __ XOR(xa[i], xa[i], t0); ++ __ XOR(t0, t1, t2); ++ __ XOR(xa[i], xa[i], t3); ++ __ XOR(xa[i], xa[i], t0); ++ } ++ ++ for (int i = 0; i < 4; i++) { ++ __ revb_2h(xa[i], xa[i]); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ rotri_w(xa[i], xa[i], 16); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ st_w(xa[i], dst, 4 * i); ++ } ++ ++ if (cbc) { ++ __ move(key, keyold); ++ __ addi_d(src, src, 16); ++ __ addi_d(dst, dst, 16); ++ __ blt(src, srcend, loop); 
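++      // CBC chaining: once the last block has been processed, the final
++      // ciphertext block (still held in xa[0..3]) is stored back into the
++      // r vector below, so the next invocation continues the chain.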
++ ++ for (int i = 0; i < 4; i++) { ++ __ st_w(xa[i], rve, 4 * i); ++ } ++ ++ __ ld_d(S3, SP, 3 * wordSize); ++ __ ld_d(S2, SP, 2 * wordSize); ++ __ ld_d(S1, SP, 1 * wordSize); ++ __ ld_d(S0, SP, 0 * wordSize); ++ __ addi_d(SP, SP, 4 * wordSize); ++ ++ __ bind(done); ++ __ move(A0, srclen); ++ } ++ ++ __ jr(RA); ++ ++ return start; ++ } ++ ++ // Arguments: ++ // ++ // Inputs: ++ // A0 - source byte array address ++ // A1 - destination byte array address ++ // A2 - K (key) in little endian int array ++ // A3 - r vector byte array address ++ // A4 - input length ++ // ++ // Output: ++ // A0 - input length ++ // ++ address generate_aescrypt_decryptBlock(bool cbc) { ++ static const uint32_t rt_consts[256] = { ++ 0x51f4a750, 0x7e416553, 0x1a17a4c3, 0x3a275e96, ++ 0x3bab6bcb, 0x1f9d45f1, 0xacfa58ab, 0x4be30393, ++ 0x2030fa55, 0xad766df6, 0x88cc7691, 0xf5024c25, ++ 0x4fe5d7fc, 0xc52acbd7, 0x26354480, 0xb562a38f, ++ 0xdeb15a49, 0x25ba1b67, 0x45ea0e98, 0x5dfec0e1, ++ 0xc32f7502, 0x814cf012, 0x8d4697a3, 0x6bd3f9c6, ++ 0x038f5fe7, 0x15929c95, 0xbf6d7aeb, 0x955259da, ++ 0xd4be832d, 0x587421d3, 0x49e06929, 0x8ec9c844, ++ 0x75c2896a, 0xf48e7978, 0x99583e6b, 0x27b971dd, ++ 0xbee14fb6, 0xf088ad17, 0xc920ac66, 0x7dce3ab4, ++ 0x63df4a18, 0xe51a3182, 0x97513360, 0x62537f45, ++ 0xb16477e0, 0xbb6bae84, 0xfe81a01c, 0xf9082b94, ++ 0x70486858, 0x8f45fd19, 0x94de6c87, 0x527bf8b7, ++ 0xab73d323, 0x724b02e2, 0xe31f8f57, 0x6655ab2a, ++ 0xb2eb2807, 0x2fb5c203, 0x86c57b9a, 0xd33708a5, ++ 0x302887f2, 0x23bfa5b2, 0x02036aba, 0xed16825c, ++ 0x8acf1c2b, 0xa779b492, 0xf307f2f0, 0x4e69e2a1, ++ 0x65daf4cd, 0x0605bed5, 0xd134621f, 0xc4a6fe8a, ++ 0x342e539d, 0xa2f355a0, 0x058ae132, 0xa4f6eb75, ++ 0x0b83ec39, 0x4060efaa, 0x5e719f06, 0xbd6e1051, ++ 0x3e218af9, 0x96dd063d, 0xdd3e05ae, 0x4de6bd46, ++ 0x91548db5, 0x71c45d05, 0x0406d46f, 0x605015ff, ++ 0x1998fb24, 0xd6bde997, 0x894043cc, 0x67d99e77, ++ 0xb0e842bd, 0x07898b88, 0xe7195b38, 0x79c8eedb, ++ 0xa17c0a47, 0x7c420fe9, 0xf8841ec9, 0x00000000, ++ 0x09808683, 0x322bed48, 0x1e1170ac, 0x6c5a724e, ++ 0xfd0efffb, 0x0f853856, 0x3daed51e, 0x362d3927, ++ 0x0a0fd964, 0x685ca621, 0x9b5b54d1, 0x24362e3a, ++ 0x0c0a67b1, 0x9357e70f, 0xb4ee96d2, 0x1b9b919e, ++ 0x80c0c54f, 0x61dc20a2, 0x5a774b69, 0x1c121a16, ++ 0xe293ba0a, 0xc0a02ae5, 0x3c22e043, 0x121b171d, ++ 0x0e090d0b, 0xf28bc7ad, 0x2db6a8b9, 0x141ea9c8, ++ 0x57f11985, 0xaf75074c, 0xee99ddbb, 0xa37f60fd, ++ 0xf701269f, 0x5c72f5bc, 0x44663bc5, 0x5bfb7e34, ++ 0x8b432976, 0xcb23c6dc, 0xb6edfc68, 0xb8e4f163, ++ 0xd731dcca, 0x42638510, 0x13972240, 0x84c61120, ++ 0x854a247d, 0xd2bb3df8, 0xaef93211, 0xc729a16d, ++ 0x1d9e2f4b, 0xdcb230f3, 0x0d8652ec, 0x77c1e3d0, ++ 0x2bb3166c, 0xa970b999, 0x119448fa, 0x47e96422, ++ 0xa8fc8cc4, 0xa0f03f1a, 0x567d2cd8, 0x223390ef, ++ 0x87494ec7, 0xd938d1c1, 0x8ccaa2fe, 0x98d40b36, ++ 0xa6f581cf, 0xa57ade28, 0xdab78e26, 0x3fadbfa4, ++ 0x2c3a9de4, 0x5078920d, 0x6a5fcc9b, 0x547e4662, ++ 0xf68d13c2, 0x90d8b8e8, 0x2e39f75e, 0x82c3aff5, ++ 0x9f5d80be, 0x69d0937c, 0x6fd52da9, 0xcf2512b3, ++ 0xc8ac993b, 0x10187da7, 0xe89c636e, 0xdb3bbb7b, ++ 0xcd267809, 0x6e5918f4, 0xec9ab701, 0x834f9aa8, ++ 0xe6956e65, 0xaaffe67e, 0x21bccf08, 0xef15e8e6, ++ 0xbae79bd9, 0x4a6f36ce, 0xea9f09d4, 0x29b07cd6, ++ 0x31a4b2af, 0x2a3f2331, 0xc6a59430, 0x35a266c0, ++ 0x744ebc37, 0xfc82caa6, 0xe090d0b0, 0x33a7d815, ++ 0xf104984a, 0x41ecdaf7, 0x7fcd500e, 0x1791f62f, ++ 0x764dd68d, 0x43efb04d, 0xccaa4d54, 0xe49604df, ++ 0x9ed1b5e3, 0x4c6a881b, 0xc12c1fb8, 0x4665517f, ++ 0x9d5eea04, 0x018c355d, 0xfa877473, 0xfb0b412e, ++ 0xb3671d5a, 0x92dbd252, 
0xe9105633, 0x6dd64713, ++ 0x9ad7618c, 0x37a10c7a, 0x59f8148e, 0xeb133c89, ++ 0xcea927ee, 0xb761c935, 0xe11ce5ed, 0x7a47b13c, ++ 0x9cd2df59, 0x55f2733f, 0x1814ce79, 0x73c737bf, ++ 0x53f7cdea, 0x5ffdaa5b, 0xdf3d6f14, 0x7844db86, ++ 0xcaaff381, 0xb968c43e, 0x3824342c, 0xc2a3405f, ++ 0x161dc372, 0xbce2250c, 0x283c498b, 0xff0d9541, ++ 0x39a80171, 0x080cb3de, 0xd8b4e49c, 0x6456c190, ++ 0x7bcb8461, 0xd532b670, 0x486c5c74, 0xd0b85742 ++ }; ++ static const uint8_t rsb_consts[256] = { ++ 0x52, 0x09, 0x6a, 0xd5, 0x30, 0x36, 0xa5, 0x38, ++ 0xbf, 0x40, 0xa3, 0x9e, 0x81, 0xf3, 0xd7, 0xfb, ++ 0x7c, 0xe3, 0x39, 0x82, 0x9b, 0x2f, 0xff, 0x87, ++ 0x34, 0x8e, 0x43, 0x44, 0xc4, 0xde, 0xe9, 0xcb, ++ 0x54, 0x7b, 0x94, 0x32, 0xa6, 0xc2, 0x23, 0x3d, ++ 0xee, 0x4c, 0x95, 0x0b, 0x42, 0xfa, 0xc3, 0x4e, ++ 0x08, 0x2e, 0xa1, 0x66, 0x28, 0xd9, 0x24, 0xb2, ++ 0x76, 0x5b, 0xa2, 0x49, 0x6d, 0x8b, 0xd1, 0x25, ++ 0x72, 0xf8, 0xf6, 0x64, 0x86, 0x68, 0x98, 0x16, ++ 0xd4, 0xa4, 0x5c, 0xcc, 0x5d, 0x65, 0xb6, 0x92, ++ 0x6c, 0x70, 0x48, 0x50, 0xfd, 0xed, 0xb9, 0xda, ++ 0x5e, 0x15, 0x46, 0x57, 0xa7, 0x8d, 0x9d, 0x84, ++ 0x90, 0xd8, 0xab, 0x00, 0x8c, 0xbc, 0xd3, 0x0a, ++ 0xf7, 0xe4, 0x58, 0x05, 0xb8, 0xb3, 0x45, 0x06, ++ 0xd0, 0x2c, 0x1e, 0x8f, 0xca, 0x3f, 0x0f, 0x02, ++ 0xc1, 0xaf, 0xbd, 0x03, 0x01, 0x13, 0x8a, 0x6b, ++ 0x3a, 0x91, 0x11, 0x41, 0x4f, 0x67, 0xdc, 0xea, ++ 0x97, 0xf2, 0xcf, 0xce, 0xf0, 0xb4, 0xe6, 0x73, ++ 0x96, 0xac, 0x74, 0x22, 0xe7, 0xad, 0x35, 0x85, ++ 0xe2, 0xf9, 0x37, 0xe8, 0x1c, 0x75, 0xdf, 0x6e, ++ 0x47, 0xf1, 0x1a, 0x71, 0x1d, 0x29, 0xc5, 0x89, ++ 0x6f, 0xb7, 0x62, 0x0e, 0xaa, 0x18, 0xbe, 0x1b, ++ 0xfc, 0x56, 0x3e, 0x4b, 0xc6, 0xd2, 0x79, 0x20, ++ 0x9a, 0xdb, 0xc0, 0xfe, 0x78, 0xcd, 0x5a, 0xf4, ++ 0x1f, 0xdd, 0xa8, 0x33, 0x88, 0x07, 0xc7, 0x31, ++ 0xb1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xec, 0x5f, ++ 0x60, 0x51, 0x7f, 0xa9, 0x19, 0xb5, 0x4a, 0x0d, ++ 0x2d, 0xe5, 0x7a, 0x9f, 0x93, 0xc9, 0x9c, 0xef, ++ 0xa0, 0xe0, 0x3b, 0x4d, 0xae, 0x2a, 0xf5, 0xb0, ++ 0xc8, 0xeb, 0xbb, 0x3c, 0x83, 0x53, 0x99, 0x61, ++ 0x17, 0x2b, 0x04, 0x7e, 0xba, 0x77, 0xd6, 0x26, ++ 0xe1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0c, 0x7d ++ }; ++ ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", "aescrypt_decryptBlock"); ++ ++ // Allocate registers ++ Register src = A0; ++ Register dst = A1; ++ Register key = A2; ++ Register rve = A3; ++ Register srclen = A4; ++ Register keylen = T8; ++ Register srcend = A5; ++ Register t0 = A6; ++ Register t1 = A7; ++ Register t2, t3, rtp, rvp; ++ Register xa[4] = { T0, T1, T2, T3 }; ++ Register ya[4] = { T4, T5, T6, T7 }; ++ ++ Label loop, tail, done; ++ address start = __ pc(); ++ ++ if (cbc) { ++ t2 = S0; ++ t3 = S1; ++ rtp = S2; ++ rvp = S3; ++ ++ __ beqz(srclen, done); ++ ++ __ addi_d(SP, SP, -4 * wordSize); ++ __ st_d(S3, SP, 3 * wordSize); ++ __ st_d(S2, SP, 2 * wordSize); ++ __ st_d(S1, SP, 1 * wordSize); ++ __ st_d(S0, SP, 0 * wordSize); ++ ++ __ add_d(srcend, src, srclen); ++ __ move(rvp, rve); ++ } else { ++ t2 = A3; ++ t3 = A4; ++ rtp = A5; ++ } ++ ++ __ ld_w(keylen, key, arrayOopDesc::length_offset_in_bytes() - arrayOopDesc::base_offset_in_bytes(T_INT)); ++ ++ __ bind(loop); ++ ++ // Round 1 ++ for (int i = 0; i < 4; i++) { ++ __ ld_w(xa[i], src, 4 * i); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ ld_w(ya[i], key, 4 * (4 + i)); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ revb_2h(xa[i], xa[i]); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ rotri_w(xa[i], xa[i], 16); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ XOR(xa[i], xa[i], ya[i]); ++ } ++ ++ __ li(rtp, (intptr_t)rt_consts); ++ 
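++    // The decryption rounds below mirror the encryption path above, but use
++    // the inverse tables (rt_consts/rsb_consts) and pick the source bytes in
++    // the opposite rotation order, i.e. the inverse SubBytes/ShiftRows/
++    // MixColumns transforms folded into a single table lookup per byte.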
++ // Round 2 - (N-1) ++ for (int r = 0; r < 14; r++) { ++ Register *xp; ++ Register *yp; ++ ++ if (r & 1) { ++ xp = xa; ++ yp = ya; ++ } else { ++ xp = ya; ++ yp = xa; ++ } ++ ++ for (int i = 0; i < 4; i++) { ++ __ ld_w(xp[i], key, 4 * (4 * (r + 1) + 4 + i)); ++ } ++ ++ for (int i = 0; i < 4; i++) { ++ __ bstrpick_d(t0, yp[(i + 1) & 3], 7, 0); ++ __ bstrpick_d(t1, yp[(i + 2) & 3], 15, 8); ++ __ bstrpick_d(t2, yp[(i + 3) & 3], 23, 16); ++ __ bstrpick_d(t3, yp[(i + 0) & 3], 31, 24); ++ __ slli_w(t0, t0, 2); ++ __ slli_w(t1, t1, 2); ++ __ slli_w(t2, t2, 2); ++ __ slli_w(t3, t3, 2); ++ __ ldx_w(t0, rtp, t0); ++ __ ldx_w(t1, rtp, t1); ++ __ ldx_w(t2, rtp, t2); ++ __ ldx_w(t3, rtp, t3); ++ __ rotri_w(t0, t0, 24); ++ __ rotri_w(t1, t1, 16); ++ __ rotri_w(t2, t2, 8); ++ __ XOR(xp[i], xp[i], t0); ++ __ XOR(t0, t1, t2); ++ __ XOR(xp[i], xp[i], t3); ++ __ XOR(xp[i], xp[i], t0); ++ } ++ ++ if (r == 8) { ++ // AES 128 ++ __ li(t0, 44); ++ __ beq(t0, keylen, tail); ++ } else if (r == 10) { ++ // AES 192 ++ __ li(t0, 52); ++ __ beq(t0, keylen, tail); ++ } ++ } ++ ++ __ bind(tail); ++ __ li(rtp, (intptr_t)rsb_consts); ++ ++ // Round N ++ for (int i = 0; i < 4; i++) { ++ __ bstrpick_d(t0, ya[(i + 1) & 3], 7, 0); ++ __ bstrpick_d(t1, ya[(i + 2) & 3], 15, 8); ++ __ bstrpick_d(t2, ya[(i + 3) & 3], 23, 16); ++ __ bstrpick_d(t3, ya[(i + 0) & 3], 31, 24); ++ __ ldx_bu(t0, rtp, t0); ++ __ ldx_bu(t1, rtp, t1); ++ __ ldx_bu(t2, rtp, t2); ++ __ ldx_bu(t3, rtp, t3); ++ __ ld_w(xa[i], key, 4 * i); ++ __ slli_w(t1, t1, 8); ++ __ slli_w(t2, t2, 16); ++ __ slli_w(t3, t3, 24); ++ __ XOR(xa[i], xa[i], t0); ++ __ XOR(t0, t1, t2); ++ __ XOR(xa[i], xa[i], t3); ++ __ XOR(xa[i], xa[i], t0); ++ } ++ ++ if (cbc) { ++ for (int i = 0; i < 4; i++) { ++ __ ld_w(ya[i], rvp, 4 * i); ++ } ++ } ++ for (int i = 0; i < 4; i++) { ++ __ revb_2h(xa[i], xa[i]); ++ } ++ for (int i = 0; i < 4; i++) { ++ __ rotri_w(xa[i], xa[i], 16); ++ } ++ if (cbc) { ++ for (int i = 0; i < 4; i++) { ++ __ XOR(xa[i], xa[i], ya[i]); ++ } ++ } ++ for (int i = 0; i < 4; i++) { ++ __ st_w(xa[i], dst, 4 * i); ++ } ++ ++ if (cbc) { ++ __ move(rvp, src); ++ __ addi_d(src, src, 16); ++ __ addi_d(dst, dst, 16); ++ __ blt(src, srcend, loop); ++ ++ __ ld_d(t0, src, -16); ++ __ ld_d(t1, src, -8); ++ __ st_d(t0, rve, 0); ++ __ st_d(t1, rve, 8); ++ ++ __ ld_d(S3, SP, 3 * wordSize); ++ __ ld_d(S2, SP, 2 * wordSize); ++ __ ld_d(S1, SP, 1 * wordSize); ++ __ ld_d(S0, SP, 0 * wordSize); ++ __ addi_d(SP, SP, 4 * wordSize); ++ ++ __ bind(done); ++ __ move(A0, srclen); ++ } ++ ++ __ jr(RA); ++ ++ return start; ++ } ++ ++ // Arguments: ++ // ++ // Inputs: ++ // A0 - byte[] source+offset ++ // A1 - int[] SHA.state ++ // A2 - int offset ++ // A3 - int limit ++ // ++ void generate_md5_implCompress(const char *name, address &entry, address &entry_mb) { ++ static const uint32_t round_consts[64] = { ++ 0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee, ++ 0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501, ++ 0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be, ++ 0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821, ++ 0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa, ++ 0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8, ++ 0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed, ++ 0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a, ++ 0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c, ++ 0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70, ++ 0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05, ++ 0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665, ++ 0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039, ++ 0x655b59c3, 0x8f0ccc92, 0xffeff47d, 
0x85845dd1, ++ 0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1, ++ 0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391, ++ }; ++ static const uint8_t round_offs[64] = { ++ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ++ 1, 6, 11, 0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12, ++ 5, 8, 11, 14, 1, 4, 7, 10, 13, 0, 3, 6, 9, 12, 15, 2, ++ 0, 7, 14, 5, 12, 3, 10, 1, 8, 15, 6, 13, 4, 11, 2, 9, ++ }; ++ static const uint8_t round_shfs[64] = { ++ 25, 20, 15, 10, 25, 20, 15, 10, 25, 20, 15, 10, 25, 20, 15, 10, ++ 27, 23, 18, 12, 27, 23, 18, 12, 27, 23, 18, 12, 27, 23, 18, 12, ++ 28, 21, 16, 9, 28, 21, 16, 9, 28, 21, 16, 9, 28, 21, 16, 9, ++ 26, 22, 17, 11, 26, 22, 17, 11, 26, 22, 17, 11, 26, 22, 17, 11, ++ }; ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ Label loop; ++ ++ // Allocate registers ++ Register t0 = T4; ++ Register t1 = T5; ++ Register t2 = T6; ++ Register t3 = T7; ++ Register buf = A0; ++ Register state = A1; ++ Register ofs = A2; ++ Register limit = A3; ++ Register kptr = T8; ++ Register sa[4] = { T0, T1, T2, T3 }; ++ Register sb[4] = { A4, A5, A6, A7 }; ++ ++ // Entry ++ entry = __ pc(); ++ __ move(ofs, R0); ++ __ move(limit, R0); ++ ++ // Entry MB ++ entry_mb = __ pc(); ++ ++ // Load keys base address ++ __ li(kptr, (intptr_t)round_consts); ++ ++ // Load states ++ __ ld_w(sa[0], state, 0); ++ __ ld_w(sa[1], state, 4); ++ __ ld_w(sa[2], state, 8); ++ __ ld_w(sa[3], state, 12); ++ ++ __ bind(loop); ++ __ move(sb[0], sa[0]); ++ __ move(sb[1], sa[1]); ++ __ move(sb[2], sa[2]); ++ __ move(sb[3], sa[3]); ++ ++ // 64 rounds of hashing ++ for (int i = 0; i < 64; i++) { ++ Register a = sa[(0 - i) & 3]; ++ Register b = sa[(1 - i) & 3]; ++ Register c = sa[(2 - i) & 3]; ++ Register d = sa[(3 - i) & 3]; ++ ++ __ ld_w(t2, kptr, i * 4); ++ __ ld_w(t3, buf, round_offs[i] * 4); ++ ++ if (i < 16) { ++ __ XOR(t0, c, d); ++ __ AND(t0, t0, b); ++ __ XOR(t0, t0, d); ++ } else if (i < 32) { ++ __ andn(t0, c, d); ++ __ AND(t1, d, b); ++ __ OR(t0, t0, t1); ++ } else if (i < 48) { ++ __ XOR(t0, c, d); ++ __ XOR(t0, t0, b); ++ } else { ++ __ orn(t0, b, d); ++ __ XOR(t0, t0, c); ++ } ++ ++ __ add_w(a, a, t2); ++ __ add_w(a, a, t3); ++ __ add_w(a, a, t0); ++ __ rotri_w(a, a, round_shfs[i]); ++ __ add_w(a, a, b); ++ } ++ ++ __ add_w(sa[0], sa[0], sb[0]); ++ __ add_w(sa[1], sa[1], sb[1]); ++ __ add_w(sa[2], sa[2], sb[2]); ++ __ add_w(sa[3], sa[3], sb[3]); ++ ++ __ addi_w(ofs, ofs, 64); ++ __ addi_d(buf, buf, 64); ++ __ bge(limit, ofs, loop); ++ __ move(V0, ofs); // return ofs ++ ++ // Save updated state ++ __ st_w(sa[0], state, 0); ++ __ st_w(sa[1], state, 4); ++ __ st_w(sa[2], state, 8); ++ __ st_w(sa[3], state, 12); ++ ++ __ jr(RA); ++ } ++ ++ // Arguments: ++ // ++ // Inputs: ++ // A0 - byte[] source+offset ++ // A1 - int[] SHA.state ++ // A2 - int offset ++ // A3 - int limit ++ // ++ void generate_sha1_implCompress(const char *name, address &entry, address &entry_mb) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ Label keys, loop; ++ ++ // Keys ++ __ bind(keys); ++ __ emit_int32(0x5a827999); ++ __ emit_int32(0x6ed9eba1); ++ __ emit_int32(0x8f1bbcdc); ++ __ emit_int32(0xca62c1d6); ++ ++ // Allocate registers ++ Register t0 = T5; ++ Register t1 = T6; ++ Register t2 = T7; ++ Register t3 = T8; ++ Register buf = A0; ++ Register state = A1; ++ Register ofs = A2; ++ Register limit = A3; ++ Register ka[4] = { A4, A5, A6, A7 }; ++ Register sa[5] = { T0, T1, T2, T3, T4 }; ++ ++ // Entry ++ entry = __ pc(); ++ __ move(ofs, R0); ++ __ 
move(limit, R0); ++ ++ // Entry MB ++ entry_mb = __ pc(); ++ ++ // Allocate scratch space ++ __ addi_d(SP, SP, -64); ++ ++ // Load keys ++ __ lipc(t0, keys); ++ __ ld_w(ka[0], t0, 0); ++ __ ld_w(ka[1], t0, 4); ++ __ ld_w(ka[2], t0, 8); ++ __ ld_w(ka[3], t0, 12); ++ ++ __ bind(loop); ++ // Load arguments ++ __ ld_w(sa[0], state, 0); ++ __ ld_w(sa[1], state, 4); ++ __ ld_w(sa[2], state, 8); ++ __ ld_w(sa[3], state, 12); ++ __ ld_w(sa[4], state, 16); ++ ++ // 80 rounds of hashing ++ for (int i = 0; i < 80; i++) { ++ Register a = sa[(5 - (i % 5)) % 5]; ++ Register b = sa[(6 - (i % 5)) % 5]; ++ Register c = sa[(7 - (i % 5)) % 5]; ++ Register d = sa[(8 - (i % 5)) % 5]; ++ Register e = sa[(9 - (i % 5)) % 5]; ++ ++ if (i < 16) { ++ __ ld_w(t0, buf, i * 4); ++ __ revb_2h(t0, t0); ++ __ rotri_w(t0, t0, 16); ++ __ add_w(e, e, t0); ++ __ st_w(t0, SP, i * 4); ++ __ XOR(t0, c, d); ++ __ AND(t0, t0, b); ++ __ XOR(t0, t0, d); ++ } else { ++ __ ld_w(t0, SP, ((i - 3) & 0xF) * 4); ++ __ ld_w(t1, SP, ((i - 8) & 0xF) * 4); ++ __ ld_w(t2, SP, ((i - 14) & 0xF) * 4); ++ __ ld_w(t3, SP, ((i - 16) & 0xF) * 4); ++ __ XOR(t0, t0, t1); ++ __ XOR(t0, t0, t2); ++ __ XOR(t0, t0, t3); ++ __ rotri_w(t0, t0, 31); ++ __ add_w(e, e, t0); ++ __ st_w(t0, SP, (i & 0xF) * 4); ++ ++ if (i < 20) { ++ __ XOR(t0, c, d); ++ __ AND(t0, t0, b); ++ __ XOR(t0, t0, d); ++ } else if (i < 40 || i >= 60) { ++ __ XOR(t0, b, c); ++ __ XOR(t0, t0, d); ++ } else if (i < 60) { ++ __ OR(t0, c, d); ++ __ AND(t0, t0, b); ++ __ AND(t2, c, d); ++ __ OR(t0, t0, t2); ++ } ++ } ++ ++ __ rotri_w(b, b, 2); ++ __ add_w(e, e, t0); ++ __ add_w(e, e, ka[i / 20]); ++ __ rotri_w(t0, a, 27); ++ __ add_w(e, e, t0); ++ } ++ ++ // Save updated state ++ __ ld_w(t0, state, 0); ++ __ ld_w(t1, state, 4); ++ __ ld_w(t2, state, 8); ++ __ ld_w(t3, state, 12); ++ __ add_w(sa[0], sa[0], t0); ++ __ ld_w(t0, state, 16); ++ __ add_w(sa[1], sa[1], t1); ++ __ add_w(sa[2], sa[2], t2); ++ __ add_w(sa[3], sa[3], t3); ++ __ add_w(sa[4], sa[4], t0); ++ __ st_w(sa[0], state, 0); ++ __ st_w(sa[1], state, 4); ++ __ st_w(sa[2], state, 8); ++ __ st_w(sa[3], state, 12); ++ __ st_w(sa[4], state, 16); ++ ++ __ addi_w(ofs, ofs, 64); ++ __ addi_d(buf, buf, 64); ++ __ bge(limit, ofs, loop); ++ __ move(V0, ofs); // return ofs ++ ++ __ addi_d(SP, SP, 64); ++ __ jr(RA); ++ } ++ ++ // Arguments: ++ // ++ // Inputs: ++ // A0 - byte[] source+offset ++ // A1 - int[] SHA.state ++ // A2 - int offset ++ // A3 - int limit ++ // ++ void generate_sha256_implCompress(const char *name, address &entry, address &entry_mb) { ++ static const uint32_t round_consts[64] = { ++ 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, ++ 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5, ++ 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, ++ 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174, ++ 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, ++ 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da, ++ 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, ++ 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967, ++ 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, ++ 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85, ++ 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, ++ 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070, ++ 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, ++ 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3, ++ 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, ++ 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2, ++ }; ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", name); ++ Label loop; ++ ++ // Allocate registers ++ Register t0 
= A4; ++ Register t1 = A5; ++ Register t2 = A6; ++ Register t3 = A7; ++ Register buf = A0; ++ Register state = A1; ++ Register ofs = A2; ++ Register limit = A3; ++ Register kptr = T8; ++ Register sa[8] = { T0, T1, T2, T3, T4, T5, T6, T7 }; ++ ++ // Entry ++ entry = __ pc(); ++ __ move(ofs, R0); ++ __ move(limit, R0); ++ ++ // Entry MB ++ entry_mb = __ pc(); ++ ++ // Allocate scratch space ++ __ addi_d(SP, SP, -64); ++ ++ // Load keys base address ++ __ li(kptr, (intptr_t)round_consts); ++ ++ __ bind(loop); ++ // Load state ++ __ ld_w(sa[0], state, 0); ++ __ ld_w(sa[1], state, 4); ++ __ ld_w(sa[2], state, 8); ++ __ ld_w(sa[3], state, 12); ++ __ ld_w(sa[4], state, 16); ++ __ ld_w(sa[5], state, 20); ++ __ ld_w(sa[6], state, 24); ++ __ ld_w(sa[7], state, 28); ++ ++ // Do 64 rounds of hashing ++ for (int i = 0; i < 64; i++) { ++ Register a = sa[(0 - i) & 7]; ++ Register b = sa[(1 - i) & 7]; ++ Register c = sa[(2 - i) & 7]; ++ Register d = sa[(3 - i) & 7]; ++ Register e = sa[(4 - i) & 7]; ++ Register f = sa[(5 - i) & 7]; ++ Register g = sa[(6 - i) & 7]; ++ Register h = sa[(7 - i) & 7]; ++ ++ if (i < 16) { ++ __ ld_w(t1, buf, i * 4); ++ __ revb_2h(t1, t1); ++ __ rotri_w(t1, t1, 16); ++ } else { ++ __ ld_w(t0, SP, ((i - 15) & 0xF) * 4); ++ __ ld_w(t1, SP, ((i - 16) & 0xF) * 4); ++ __ ld_w(t2, SP, ((i - 7) & 0xF) * 4); ++ __ add_w(t1, t1, t2); ++ __ rotri_w(t2, t0, 18); ++ __ srli_w(t3, t0, 3); ++ __ rotri_w(t0, t0, 7); ++ __ XOR(t2, t2, t3); ++ __ XOR(t0, t0, t2); ++ __ add_w(t1, t1, t0); ++ __ ld_w(t0, SP, ((i - 2) & 0xF) * 4); ++ __ rotri_w(t2, t0, 19); ++ __ srli_w(t3, t0, 10); ++ __ rotri_w(t0, t0, 17); ++ __ XOR(t2, t2, t3); ++ __ XOR(t0, t0, t2); ++ __ add_w(t1, t1, t0); ++ } ++ ++ __ rotri_w(t2, e, 11); ++ __ rotri_w(t3, e, 25); ++ __ rotri_w(t0, e, 6); ++ __ XOR(t2, t2, t3); ++ __ XOR(t0, t0, t2); ++ __ XOR(t2, g, f); ++ __ ld_w(t3, kptr, i * 4); ++ __ AND(t2, t2, e); ++ __ XOR(t2, t2, g); ++ __ add_w(t0, t0, t2); ++ __ add_w(t0, t0, t3); ++ __ add_w(h, h, t1); ++ __ add_w(h, h, t0); ++ __ add_w(d, d, h); ++ __ rotri_w(t2, a, 13); ++ __ rotri_w(t3, a, 22); ++ __ rotri_w(t0, a, 2); ++ __ XOR(t2, t2, t3); ++ __ XOR(t0, t0, t2); ++ __ add_w(h, h, t0); ++ __ OR(t0, c, b); ++ __ AND(t2, c, b); ++ __ AND(t0, t0, a); ++ __ OR(t0, t0, t2); ++ __ add_w(h, h, t0); ++ __ st_w(t1, SP, (i & 0xF) * 4); ++ } ++ ++ // Add to state ++ __ ld_w(t0, state, 0); ++ __ ld_w(t1, state, 4); ++ __ ld_w(t2, state, 8); ++ __ ld_w(t3, state, 12); ++ __ add_w(sa[0], sa[0], t0); ++ __ add_w(sa[1], sa[1], t1); ++ __ add_w(sa[2], sa[2], t2); ++ __ add_w(sa[3], sa[3], t3); ++ __ ld_w(t0, state, 16); ++ __ ld_w(t1, state, 20); ++ __ ld_w(t2, state, 24); ++ __ ld_w(t3, state, 28); ++ __ add_w(sa[4], sa[4], t0); ++ __ add_w(sa[5], sa[5], t1); ++ __ add_w(sa[6], sa[6], t2); ++ __ add_w(sa[7], sa[7], t3); ++ __ st_w(sa[0], state, 0); ++ __ st_w(sa[1], state, 4); ++ __ st_w(sa[2], state, 8); ++ __ st_w(sa[3], state, 12); ++ __ st_w(sa[4], state, 16); ++ __ st_w(sa[5], state, 20); ++ __ st_w(sa[6], state, 24); ++ __ st_w(sa[7], state, 28); ++ ++ __ addi_w(ofs, ofs, 64); ++ __ addi_d(buf, buf, 64); ++ __ bge(limit, ofs, loop); ++ __ move(V0, ofs); // return ofs ++ ++ __ addi_d(SP, SP, 64); ++ __ jr(RA); ++ } ++ ++ // Do NOT delete this node which stands for stub routine placeholder ++ address generate_updateBytesCRC32() { ++ assert(UseCRC32Intrinsics, "need CRC32 instructions support"); ++ ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", "updateBytesCRC32"); ++ ++ address start = __ pc(); ++ ++ const 
Register crc = A0; // crc ++ const Register buf = A1; // source java byte array address ++ const Register len = A2; // length ++ const Register tmp = A3; ++ ++ __ enter(); // required for proper stackwalking of RuntimeStub frame ++ ++ __ kernel_crc32(crc, buf, len, tmp); ++ ++ __ leave(); // required for proper stackwalking of RuntimeStub frame ++ __ jr(RA); ++ ++ return start; ++ } ++ ++ // Do NOT delete this node which stands for stub routine placeholder ++ address generate_updateBytesCRC32C() { ++ assert(UseCRC32CIntrinsics, "need CRC32C instructions support"); ++ ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", "updateBytesCRC32C"); ++ ++ address start = __ pc(); ++ ++ const Register crc = A0; // crc ++ const Register buf = A1; // source java byte array address ++ const Register len = A2; // length ++ const Register tmp = A3; ++ ++ __ enter(); // required for proper stackwalking of RuntimeStub frame ++ ++ __ kernel_crc32c(crc, buf, len, tmp); ++ ++ __ leave(); // required for proper stackwalking of RuntimeStub frame ++ __ jr(RA); ++ ++ return start; ++ } ++ ++ // ChaCha20 block function. This version parallelizes by loading ++ // individual 32-bit state elements into vectors for four blocks ++ // ++ // state (int[16]) = c_rarg0 ++ // keystream (byte[1024]) = c_rarg1 ++ // return - number of bytes of keystream (always 256) ++ address generate_chacha20Block_blockpar() { ++ Label L_twoRounds, L_cc20_const; ++ // Add masks for 4-block ChaCha20 Block calculations, ++ // creates a +0/+1/+2/+3 add overlay. ++ __ bind(L_cc20_const); ++ __ emit_int64(0x0000000000000001UL); ++ __ emit_int64(0x0000000000000000UL); ++ ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", "chacha20Block"); ++ address start = __ pc(); ++ __ enter(); ++ ++ int i; ++ const Register state = c_rarg0; ++ const Register keystream = c_rarg1; ++ const Register loopCtr = SCR1; ++ const Register tmpAddr = SCR2; ++ ++ const FloatRegister aState = F0; ++ const FloatRegister bState = F1; ++ const FloatRegister cState = F2; ++ const FloatRegister dState = F3; ++ const FloatRegister origCtrState = F20; ++ const FloatRegister dState1 = F21; ++ const FloatRegister dState2 = F22; ++ const FloatRegister dState3 = F23; ++ ++ // Organize SIMD registers in four arrays that facilitates ++ // putting repetitive opcodes into loop structures. ++ const FloatRegister aVec[4] = { ++ F4, F5, F6, F7 ++ }; ++ const FloatRegister bVec[4] = { ++ F8, F9, F10, F11 ++ }; ++ const FloatRegister cVec[4] = { ++ F12, F13, F14, F15 ++ }; ++ const FloatRegister dVec[4] = { ++ F16, F17, F18, F19 ++ }; ++ ++ // Load the initial state in columnar orientation and then copy ++ // that starting state to the working register set. ++ // Also load the address of the add mask for later use in handling ++ // multi-block counter increments. 
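++    // The add mask emitted at L_cc20_const is {1, 0, 0, 0} in 32-bit lanes,
++    // so each vadd_w below bumps only lane 0 of the d-row, i.e. the per-block
++    // counter (state word 12), giving the four parallel blocks the counters
++    // n, n+1, n+2 and n+3.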
++ __ vld(aState, state, 0); ++ __ vld(bState, state, 16); ++ __ vld(cState, state, 32); ++ __ vld(dState, state, 48); ++ __ lipc(tmpAddr, L_cc20_const); ++ __ vld(origCtrState, tmpAddr, 0); ++ __ vadd_w(dState1, dState, origCtrState); ++ __ vadd_w(dState2, dState1, origCtrState); ++ __ vadd_w(dState3, dState2, origCtrState); ++ for (i = 0; i < 4; i++) { ++ __ vori_b(aVec[i], aState, 0); ++ __ vori_b(bVec[i], bState, 0); ++ __ vori_b(cVec[i], cState, 0); ++ } ++ __ vori_b(dVec[0], dState, 0); ++ __ vori_b(dVec[1], dState1, 0); ++ __ vori_b(dVec[2], dState2, 0); ++ __ vori_b(dVec[3], dState3, 0); ++ ++ // Set up the 10 iteration loop and perform all 8 quarter round ops ++ __ li(loopCtr, 10); ++ __ bind(L_twoRounds); ++ ++ // The first quarter round macro call covers the first 4 QR operations: ++ // Qround(state, 0, 4, 8,12) ++ // Qround(state, 1, 5, 9,13) ++ // Qround(state, 2, 6,10,14) ++ // Qround(state, 3, 7,11,15) ++ __ cc20_quarter_round(aVec[0], bVec[0], cVec[0], dVec[0]); ++ __ cc20_quarter_round(aVec[1], bVec[1], cVec[1], dVec[1]); ++ __ cc20_quarter_round(aVec[2], bVec[2], cVec[2], dVec[2]); ++ __ cc20_quarter_round(aVec[3], bVec[3], cVec[3], dVec[3]); ++ ++ // Shuffle the bVec/cVec/dVec to reorganize the state vectors ++ // to diagonals. The aVec does not need to change orientation. ++ __ cc20_shift_lane_org(bVec[0], cVec[0], dVec[0], true); ++ __ cc20_shift_lane_org(bVec[1], cVec[1], dVec[1], true); ++ __ cc20_shift_lane_org(bVec[2], cVec[2], dVec[2], true); ++ __ cc20_shift_lane_org(bVec[3], cVec[3], dVec[3], true); ++ ++ // The second set of operations on the vectors covers the second 4 quarter ++ // round operations, now acting on the diagonals: ++ // Qround(state, 0, 5,10,15) ++ // Qround(state, 1, 6,11,12) ++ // Qround(state, 2, 7, 8,13) ++ // Qround(state, 3, 4, 9,14) ++ __ cc20_quarter_round(aVec[0], bVec[0], cVec[0], dVec[0]); ++ __ cc20_quarter_round(aVec[1], bVec[1], cVec[1], dVec[1]); ++ __ cc20_quarter_round(aVec[2], bVec[2], cVec[2], dVec[2]); ++ __ cc20_quarter_round(aVec[3], bVec[3], cVec[3], dVec[3]); ++ ++ // Before we start the next iteration, we need to perform shuffles ++ // on the b/c/d vectors to move them back to columnar organizations ++ // from their current diagonal orientation. ++ __ cc20_shift_lane_org(bVec[0], cVec[0], dVec[0], false); ++ __ cc20_shift_lane_org(bVec[1], cVec[1], dVec[1], false); ++ __ cc20_shift_lane_org(bVec[2], cVec[2], dVec[2], false); ++ __ cc20_shift_lane_org(bVec[3], cVec[3], dVec[3], false); ++ ++ // Decrement and iterate ++ __ addi_d(loopCtr, loopCtr, -1); ++ __ bnez(loopCtr, L_twoRounds); ++ ++ // Add the original start state back into the current state. 
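++    // (The keystream block is defined as working_state + initial_state; each
++    // dVec[i] therefore adds back its own counter-adjusted copy of the d-row.)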
++ for (i = 0; i < 4; i++) { ++ __ vadd_w(aVec[i], aVec[i], aState); ++ __ vadd_w(bVec[i], bVec[i], bState); ++ __ vadd_w(cVec[i], cVec[i], cState); ++ } ++ __ vadd_w(dVec[0], dVec[0], dState); ++ __ vadd_w(dVec[1], dVec[1], dState1); ++ __ vadd_w(dVec[2], dVec[2], dState2); ++ __ vadd_w(dVec[3], dVec[3], dState3); ++ ++ // Write the data to the keystream array ++ for (i = 0; i < 4; i++) { ++ __ vst(aVec[i], keystream, 0); ++ __ vst(bVec[i], keystream, 16); ++ __ vst(cVec[i], keystream, 32); ++ __ vst(dVec[i], keystream, 48); ++ __ addi_d(keystream, keystream, 64); ++ } ++ ++ __ li(A0, 256); // Return length of output keystream ++ __ leave(); ++ __ jr(RA); ++ ++ return start; ++ } ++ ++ // Arguments: ++ // ++ // Input: ++ // c_rarg0 - newArr address ++ // c_rarg1 - oldArr address ++ // c_rarg2 - newIdx ++ // c_rarg3 - shiftCount ++ // c_rarg4 - numIter ++ // ++ address generate_bigIntegerLeftShift() { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", "bigIntegerLeftShiftWorker"); ++ address entry = __ pc(); ++ ++ Label loop_eight, loop_four, once, exit; ++ ++ Register newArr = c_rarg0; ++ Register oldArr = c_rarg1; ++ Register newIdx = c_rarg2; ++ Register shiftCount = c_rarg3; ++ Register numIter = c_rarg4; ++ ++ Register shiftRevCount = c_rarg5; ++ ++ FloatRegister vShiftCount = c_farg0; ++ FloatRegister vShiftRevCount = c_farg1; ++ ++ __ beqz(numIter, exit); ++ ++ __ alsl_d(newArr, newIdx, newArr, 1); ++ __ li(shiftRevCount, 32); ++ __ sub_w(shiftRevCount, shiftRevCount, shiftCount); ++ ++ __ li(SCR2, 4); ++ __ blt(numIter, SCR2, once); ++ ++ __ xvreplgr2vr_w(vShiftCount, shiftCount); ++ __ xvreplgr2vr_w(vShiftRevCount, shiftRevCount); ++ ++ __ li(SCR1, 8); ++ __ blt(numIter, SCR1, loop_four); ++ ++ __ bind(loop_eight); ++ __ xvld(FT0, oldArr, 0); ++ __ xvld(FT1, oldArr, 4); ++ __ xvsll_w(FT0, FT0, vShiftCount); ++ __ xvsrl_w(FT1, FT1, vShiftRevCount); ++ __ xvor_v(FT0, FT0, FT1); ++ __ xvst(FT0, newArr, 0); ++ __ addi_d(numIter, numIter, -8); ++ __ addi_d(oldArr, oldArr, 32); ++ __ addi_d(newArr, newArr, 32); ++ __ bge(numIter, SCR1, loop_eight); ++ ++ __ bind(loop_four); ++ __ blt(numIter, SCR2, once); ++ __ vld(FT0, oldArr, 0); ++ __ vld(FT1, oldArr, 4); ++ __ vsll_w(FT0, FT0, vShiftCount); ++ __ vsrl_w(FT1, FT1, vShiftRevCount); ++ __ vor_v(FT0, FT0, FT1); ++ __ vst(FT0, newArr, 0); ++ __ addi_d(numIter, numIter, -4); ++ __ addi_d(oldArr, oldArr, 16); ++ __ addi_d(newArr, newArr, 16); ++ __ b(loop_four); ++ ++ __ bind(once); ++ __ beqz(numIter, exit); ++ __ ld_w(SCR1, oldArr, 0); ++ __ ld_w(SCR2, oldArr, 4); ++ __ sll_w(SCR1, SCR1, shiftCount); ++ __ srl_w(SCR2, SCR2, shiftRevCount); ++ __ orr(SCR1, SCR1, SCR2); ++ __ st_w(SCR1, newArr, 0); ++ __ addi_d(numIter, numIter, -1); ++ __ addi_d(oldArr, oldArr, 4); ++ __ addi_d(newArr, newArr, 4); ++ __ b(once); ++ ++ __ bind(exit); ++ __ jr(RA); ++ ++ return entry; ++ } ++ ++ address generate_bigIntegerRightShift() { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", "bigIntegerRightShiftWorker"); ++ address entry = __ pc(); ++ ++ Label loop_eight, loop_four, once, exit; ++ ++ Register newArr = c_rarg0; ++ Register oldArr = c_rarg1; ++ Register newIdx = c_rarg2; ++ Register shiftCount = c_rarg3; ++ Register numIter = c_rarg4; ++ Register nidx = numIter; ++ ++ Register shiftRevCount = c_rarg5; ++ ++ FloatRegister vShiftCount = c_farg0; ++ FloatRegister vShiftRevCount = c_farg1; ++ ++ __ beqz(nidx, exit); ++ ++ __ alsl_d(newArr, newIdx, newArr, 1); ++ __ alsl_d(newArr, nidx, newArr, 1); ++ 
__ alsl_d(oldArr, numIter, oldArr, 1); ++ ++ __ li(shiftRevCount, 32); ++ __ sub_w(shiftRevCount, shiftRevCount, shiftCount); ++ ++ __ li(SCR2, 4); ++ __ blt(nidx, SCR2, once); ++ ++ __ xvreplgr2vr_w(vShiftCount, shiftCount); ++ __ xvreplgr2vr_w(vShiftRevCount, shiftRevCount); ++ ++ __ li(SCR1, 8); ++ __ blt(nidx, SCR1, loop_four); ++ ++ __ bind(loop_eight); ++ ++ __ addi_d(nidx, nidx, -8); ++ __ addi_d(oldArr, oldArr, -32); ++ __ addi_d(newArr, newArr, -32); ++ ++ __ xvld(FT0, oldArr, 4); ++ __ xvld(FT1, oldArr, 0); ++ __ xvsrl_w(FT0, FT0, vShiftCount); ++ __ xvsll_w(FT1, FT1, vShiftRevCount); ++ __ xvor_v(FT0, FT0, FT1); ++ __ xvst(FT0, newArr, 0); ++ __ bge(nidx, SCR1, loop_eight); ++ ++ __ bind(loop_four); ++ __ blt(nidx, SCR2, once); ++ __ addi_d(nidx, nidx, -4); ++ __ addi_d(oldArr, oldArr, -16); ++ __ addi_d(newArr, newArr, -16); ++ __ xvld(FT0, oldArr, 4); ++ __ xvld(FT1, oldArr, 0); ++ __ xvsrl_w(FT0, FT0, vShiftCount); ++ __ xvsll_w(FT1, FT1, vShiftRevCount); ++ __ vor_v(FT0, FT0, FT1); ++ __ vst(FT0, newArr, 0); ++ ++ __ b(loop_four); ++ ++ __ bind(once); ++ __ beqz(nidx, exit); ++ __ addi_d(nidx, nidx, -1); ++ __ addi_d(oldArr, oldArr, -4); ++ __ addi_d(newArr, newArr, -4); ++ __ ld_w(SCR1, oldArr, 4); ++ __ ld_w(SCR2, oldArr, 0); ++ __ srl_w(SCR1, SCR1, shiftCount); ++ __ sll_w(SCR2, SCR2, shiftRevCount); ++ __ orr(SCR1, SCR1, SCR2); ++ __ st_w(SCR1, newArr, 0); ++ ++ __ b(once); ++ ++ __ bind(exit); ++ __ jr(RA); ++ ++ return entry; ++ } ++ ++ address generate_dsin_dcos(bool isCos) { ++ __ align(CodeEntryAlignment); ++ StubCodeMark mark(this, "StubRoutines", isCos ? "libmDcos" : "libmDsin"); ++ address start = __ pc(); ++ __ generate_dsin_dcos(isCos, (address)StubRoutines::la::_npio2_hw, ++ (address)StubRoutines::la::_two_over_pi, ++ (address)StubRoutines::la::_pio2, ++ (address)StubRoutines::la::_dsin_coef, ++ (address)StubRoutines::la::_dcos_coef); ++ return start; ++ } ++ ++ address generate_cont_thaw(Continuation::thaw_kind kind) { ++ bool return_barrier = Continuation::is_thaw_return_barrier(kind); ++ bool return_barrier_exception = Continuation::is_thaw_return_barrier_exception(kind); ++ ++ address start = __ pc(); ++ ++ if (return_barrier) { ++ __ ld_d(SP, Address(TREG, JavaThread::cont_entry_offset())); ++ } ++ ++#ifndef PRODUCT ++ { ++ Label OK; ++ __ ld_d(AT, Address(TREG, JavaThread::cont_entry_offset())); ++ __ beq(SP, AT, OK); ++ __ stop("incorrect sp before prepare_thaw"); ++ __ bind(OK); ++ } ++#endif ++ ++ if (return_barrier) { ++ // preserve possible return value from a method returning to the return barrier ++ __ addi_d(SP, SP, - 2 * wordSize); ++ __ fst_d(FA0, Address(SP, 0 * wordSize)); ++ __ st_d(A0, Address(SP, 1 * wordSize)); ++ } ++ ++ __ addi_w(c_rarg1, R0, (return_barrier ? 
1 : 0)); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, Continuation::prepare_thaw), TREG, c_rarg1); ++ __ move(T4, A0); // A0 contains the size of the frames to thaw, 0 if overflow or no more frames ++ ++ if (return_barrier) { ++ // restore return value (no safepoint in the call to thaw, so even an oop return value should be OK) ++ __ ld_d(A0, Address(SP, 1 * wordSize)); ++ __ fld_d(FA0, Address(SP, 0 * wordSize)); ++ __ addi_d(SP, SP, 2 * wordSize); ++ } ++ ++#ifndef PRODUCT ++ { ++ Label OK; ++ __ ld_d(AT, Address(TREG, JavaThread::cont_entry_offset())); ++ __ beq(SP, AT, OK); ++ __ stop("incorrect sp after prepare_thaw"); ++ __ bind(OK); ++ } ++#endif ++ ++ Label thaw_success; ++ // T4 contains the size of the frames to thaw, 0 if overflow or no more frames ++ __ bnez(T4, thaw_success); ++ __ jmp(StubRoutines::throw_StackOverflowError_entry()); ++ __ bind(thaw_success); ++ ++ // make room for the thawed frames ++ __ sub_d(SP, SP, T4); ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); ++ ++ if (return_barrier) { ++ // save original return value -- again ++ __ addi_d(SP, SP, - 2 * wordSize); ++ __ fst_d(FA0, Address(SP, 0 * wordSize)); ++ __ st_d(A0, Address(SP, 1 * wordSize)); ++ } ++ ++ // If we want, we can templatize thaw by kind, and have three different entries ++ __ li(c_rarg1, (uint32_t)kind); ++ ++ __ call_VM_leaf(Continuation::thaw_entry(), TREG, c_rarg1); ++ __ move(T4, A0); // A0 is the sp of the yielding frame ++ ++ if (return_barrier) { ++ // restore return value (no safepoint in the call to thaw, so even an oop return value should be OK) ++ __ ld_d(A0, Address(SP, 1 * wordSize)); ++ __ fld_d(FA0, Address(SP, 0 * wordSize)); ++ __ addi_d(SP, SP, 2 * wordSize); ++ } else { ++ __ move(A0, R0); // return 0 (success) from doYield ++ } ++ ++ // we're now on the yield frame (which is in an address above us b/c sp has been pushed down) ++ __ move(FP, T4); ++ __ addi_d(SP, T4, - 2 * wordSize); // now pointing to fp spill ++ ++ if (return_barrier_exception) { ++ __ ld_d(c_rarg1, Address(FP, -1 * wordSize)); // return address ++ __ verify_oop(A0); ++ __ move(TSR, A0); // save return value contaning the exception oop in callee-saved TSR ++ ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::exception_handler_for_return_address), TREG, c_rarg1); ++ ++ // Continue at exception handler: ++ // A0: exception oop ++ // T4: exception handler ++ // A1: exception pc ++ __ move(T4, A0); ++ __ move(A0, TSR); ++ __ verify_oop(A0); ++ ++ __ leave(); ++ __ move(A1, RA); ++ __ jr(T4); ++ } else { ++ // We're "returning" into the topmost thawed frame; see Thaw::push_return_frame ++ __ leave(); ++ __ jr(RA); ++ } ++ ++ return start; ++ } ++ ++ address generate_cont_thaw() { ++ if (!Continuations::enabled()) return nullptr; ++ ++ StubCodeMark mark(this, "StubRoutines", "Cont thaw"); ++ address start = __ pc(); ++ generate_cont_thaw(Continuation::thaw_top); ++ return start; ++ } ++ ++ address generate_cont_returnBarrier() { ++ if (!Continuations::enabled()) return nullptr; ++ ++ // TODO: will probably need multiple return barriers depending on return type ++ StubCodeMark mark(this, "StubRoutines", "cont return barrier"); ++ address start = __ pc(); ++ ++ generate_cont_thaw(Continuation::thaw_return_barrier); ++ ++ return start; ++ } ++ ++ address generate_cont_returnBarrier_exception() { ++ if (!Continuations::enabled()) return nullptr; ++ ++ StubCodeMark mark(this, "StubRoutines", "cont return barrier exception handler"); ++ address start = __ pc(); ++ ++ 
generate_cont_thaw(Continuation::thaw_return_barrier_exception); ++ ++ return start; ++ } ++ ++#if INCLUDE_JFR ++ ++ // For c2: c_rarg0 is junk, call to runtime to write a checkpoint. ++ // It returns a jobject handle to the event writer. ++ // The handle is dereferenced and the return value is the event writer oop. ++ RuntimeStub* generate_jfr_write_checkpoint() { ++ enum layout { ++ fp_off, ++ fp_off2, ++ return_off, ++ return_off2, ++ framesize // inclusive of return address ++ }; ++ ++ CodeBuffer code("jfr_write_checkpoint", 1024, 64); ++ MacroAssembler* _masm = new MacroAssembler(&code); ++ ++ address start = __ pc(); ++ __ enter(); ++ int frame_complete = __ pc() - start; ++ ++ Label L; ++ address the_pc = __ pc(); ++ __ bind(L); ++ __ set_last_Java_frame(TREG, SP, FP, L); ++ __ move(c_rarg0, TREG); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, JfrIntrinsicSupport::write_checkpoint), 1); ++ __ reset_last_Java_frame(true); ++ // A0 is jobject handle result, unpack and process it through a barrier. ++ // For zBarrierSet, tmp1 shall not be SCR1 or same as dst ++ __ resolve_global_jobject(A0, SCR2, SCR1); ++ ++ __ leave(); ++ __ jr(RA); ++ ++ OopMapSet* oop_maps = new OopMapSet(); ++ OopMap* map = new OopMap(framesize, 1); ++ oop_maps->add_gc_map(frame_complete, map); ++ ++ RuntimeStub* stub = ++ RuntimeStub::new_runtime_stub(code.name(), ++ &code, ++ frame_complete, ++ (framesize >> (LogBytesPerWord - LogBytesPerInt)), ++ oop_maps, ++ false); ++ return stub; ++ } ++ ++ // For c2: call to return a leased buffer. ++ static RuntimeStub* generate_jfr_return_lease() { ++ enum layout { ++ fp_off, ++ fp_off2, ++ return_off, ++ return_off2, ++ framesize // inclusive of return address ++ }; ++ ++ int insts_size = 1024; ++ int locs_size = 64; ++ CodeBuffer code("jfr_return_lease", insts_size, locs_size); ++ OopMapSet* oop_maps = new OopMapSet(); ++ MacroAssembler* masm = new MacroAssembler(&code); ++ MacroAssembler* _masm = masm; ++ ++ Label L; ++ address start = __ pc(); ++ __ enter(); ++ int frame_complete = __ pc() - start; ++ address the_pc = __ pc(); ++ __ bind(L); ++ __ set_last_Java_frame(TREG, SP, FP, L); ++ __ move(c_rarg0, TREG); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, JfrIntrinsicSupport::return_lease), 1); ++ ++ __ reset_last_Java_frame(true); ++ __ leave(); ++ __ jr(RA); ++ ++ OopMap* map = new OopMap(framesize, 1); ++ oop_maps->add_gc_map(the_pc - start, map); ++ ++ RuntimeStub* stub = // codeBlob framesize is in words (not VMRegImpl::slot_size) ++ RuntimeStub::new_runtime_stub("jfr_return_lease", &code, frame_complete, ++ (framesize >> (LogBytesPerWord - LogBytesPerInt)), ++ oop_maps, false); ++ return stub; ++ } ++ ++#endif // INCLUDE_JFR ++ ++#undef __ ++#define __ masm-> ++ ++ // Continuation point for throwing of implicit exceptions that are ++ // not handled in the current activation. Fabricates an exception ++ // oop and initiates normal exception dispatching in this ++ // frame. Since we need to preserve callee-saved values (currently ++ // only for C2, but done for C1 as well) we need a callee-saved oop ++ // map and therefore have to make these stubs into RuntimeStubs ++ // rather than BufferBlobs. If the compiler needs all registers to ++ // be preserved between the fault point and the exception handler ++ // then it must assume responsibility for that in ++ // AbstractCompiler::continuation_for_implicit_null_exception or ++ // continuation_for_implicit_division_by_zero_exception. 
All other ++ // implicit exceptions (e.g., NullPointerException or ++ // AbstractMethodError on entry) are either at call sites or ++ // otherwise assume that stack unwinding will be initiated, so ++ // caller saved registers were assumed volatile in the compiler. ++ address generate_throw_exception(const char* name, ++ address runtime_entry) { ++ // Information about frame layout at time of blocking runtime call. ++ // Note that we only have to preserve callee-saved registers since ++ // the compilers are responsible for supplying a continuation point ++ // if they expect all registers to be preserved. ++ assert(frame::arg_reg_save_area_bytes == 0, "not expecting frame reg save area"); ++ ++ enum layout { ++ fp_off = 0, ++ fp_off2, ++ return_off, ++ return_off2, ++ framesize // inclusive of return address ++ }; ++ ++ const int insts_size = 1024; ++ const int locs_size = 64; ++ ++ CodeBuffer code(name, insts_size, locs_size); ++ OopMapSet* oop_maps = new OopMapSet(); ++ MacroAssembler* masm = new MacroAssembler(&code); ++ ++ address start = __ pc(); ++ ++ // This is an inlined and slightly modified version of call_VM ++ // which has the ability to fetch the return PC out of ++ // thread-local storage and also sets up last_Java_sp slightly ++ // differently than the real call_VM ++ ++ __ enter(); // Save FP and RA before call ++ ++ // RA and FP are already in place ++ __ addi_d(SP, FP, 0 - ((unsigned)framesize << LogBytesPerInt)); // prolog ++ ++ int frame_complete = __ pc() - start; ++ ++ // Set up last_Java_sp and last_Java_fp ++ Label before_call; ++ address the_pc = __ pc(); ++ __ bind(before_call); ++ __ set_last_Java_frame(SP, FP, before_call); ++ ++ // TODO: the stack is unaligned before calling this stub ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); ++ ++ __ move(c_rarg0, TREG); ++ __ call(runtime_entry, relocInfo::runtime_call_type); ++ ++ // Generate oop map ++ OopMap* map = new OopMap(framesize, 0); ++ oop_maps->add_gc_map(the_pc - start, map); ++ ++ __ reset_last_Java_frame(true); ++ ++ __ leave(); ++ ++ // check for pending exceptions ++#ifdef ASSERT ++ Label L; ++ __ ld_d(AT, Address(TREG, Thread::pending_exception_offset())); ++ __ bnez(AT, L); ++ __ should_not_reach_here(); ++ __ bind(L); ++#endif //ASSERT ++ __ jmp(StubRoutines::forward_exception_entry(), relocInfo::runtime_call_type); ++ ++ // codeBlob framesize is in words (not VMRegImpl::slot_size) ++ RuntimeStub* stub = ++ RuntimeStub::new_runtime_stub(name, ++ &code, ++ frame_complete, ++ (framesize >> (LogBytesPerWord - LogBytesPerInt)), ++ oop_maps, false); ++ ++ return stub->entry_point(); ++ } ++ ++ class MontgomeryMultiplyGenerator : public MacroAssembler { ++ ++ Register Pa_base, Pb_base, Pn_base, Pm_base, inv, Rlen, Rlen2, Ra, Rb, Rm, ++ Rn, Iam, Ibn, Rhi_ab, Rlo_ab, Rhi_mn, Rlo_mn, t0, t1, t2, Ri, Rj; ++ ++ bool _squaring; ++ ++ public: ++ MontgomeryMultiplyGenerator (Assembler *as, bool squaring) ++ : MacroAssembler(as->code()), _squaring(squaring) { ++ ++ // Register allocation ++ ++ RegSetIterator regs = (RegSet::range(A0, T8) \ ++ + RegSet::range(S0, S3)).begin(); ++ ++ Pa_base = *regs; // Argument registers: ++ if (squaring) ++ Pb_base = Pa_base; ++ else ++ Pb_base = *++regs; ++ Pn_base = *++regs; ++ Rlen = *++regs; ++ inv = *++regs; ++ Rlen2 = inv; // Reuse inv ++ Pm_base = *++regs; ++ ++ // Working registers: ++ Ra = *++regs; // The current digit of a, b, n, and m. 
++ Rb = *++regs;
++ Rm = *++regs;
++ Rn = *++regs;
++
++ Iam = *++regs; // Index to the current/next digit of a, b, n, and m.
++ Ibn = *++regs;
++
++ t0 = *++regs; // Three registers which form a
++ t1 = *++regs; // triple-precision accumulator.
++ t2 = *++regs;
++
++ Ri = *++regs; // Inner and outer loop indexes.
++ Rj = *++regs;
++
++ Rhi_ab = *++regs; // Product registers: low and high parts
++ Rlo_ab = *++regs; // of a*b and m*n.
++ Rhi_mn = *++regs;
++ Rlo_mn = *++regs;
++ }
++
++ private:
++ void enter() {
++ addi_d(SP, SP, -6 * wordSize);
++ st_d(FP, SP, 0 * wordSize);
++ move(FP, SP);
++ }
++
++ void leave() {
++ addi_d(T0, FP, 6 * wordSize);
++ ld_d(FP, FP, 0 * wordSize);
++ move(SP, T0);
++ }
++
++ void save_regs() {
++ if (!_squaring)
++ st_d(Rhi_ab, FP, 5 * wordSize);
++ st_d(Rlo_ab, FP, 4 * wordSize);
++ st_d(Rhi_mn, FP, 3 * wordSize);
++ st_d(Rlo_mn, FP, 2 * wordSize);
++ st_d(Pm_base, FP, 1 * wordSize);
++ }
++
++ void restore_regs() {
++ if (!_squaring)
++ ld_d(Rhi_ab, FP, 5 * wordSize);
++ ld_d(Rlo_ab, FP, 4 * wordSize);
++ ld_d(Rhi_mn, FP, 3 * wordSize);
++ ld_d(Rlo_mn, FP, 2 * wordSize);
++ ld_d(Pm_base, FP, 1 * wordSize);
++ }
++
++ template <typename T>
++ void unroll_2(Register count, T block, Register tmp) {
++ Label loop, end, odd;
++ andi(tmp, count, 1);
++ bnez(tmp, odd);
++ beqz(count, end);
++ align(16);
++ bind(loop);
++ (this->*block)();
++ bind(odd);
++ (this->*block)();
++ addi_w(count, count, -2);
++ blt(R0, count, loop);
++ bind(end);
++ }
++
++ template <typename T>
++ void unroll_2(Register count, T block, Register d, Register s, Register tmp) {
++ Label loop, end, odd;
++ andi(tmp, count, 1);
++ bnez(tmp, odd);
++ beqz(count, end);
++ align(16);
++ bind(loop);
++ (this->*block)(d, s, tmp);
++ bind(odd);
++ (this->*block)(d, s, tmp);
++ addi_w(count, count, -2);
++ blt(R0, count, loop);
++ bind(end);
++ }
++
++ void acc(Register Rhi, Register Rlo,
++ Register t0, Register t1, Register t2, Register t, Register c) {
++ add_d(t0, t0, Rlo);
++ OR(t, t1, Rhi);
++ sltu(c, t0, Rlo);
++ add_d(t1, t1, Rhi);
++ add_d(t1, t1, c);
++ sltu(c, t1, t);
++ add_d(t2, t2, c);
++ }
++
++ void pre1(Register i) {
++ block_comment("pre1");
++ // Iam = 0;
++ // Ibn = i;
++
++ slli_w(Ibn, i, LogBytesPerWord);
++
++ // Ra = Pa_base[Iam];
++ // Rb = Pb_base[Ibn];
++ // Rm = Pm_base[Iam];
++ // Rn = Pn_base[Ibn];
++
++ ld_d(Ra, Pa_base, 0);
++ ldx_d(Rb, Pb_base, Ibn);
++ ld_d(Rm, Pm_base, 0);
++ ldx_d(Rn, Pn_base, Ibn);
++
++ move(Iam, R0);
++
++ // Zero the m*n result.
++ move(Rhi_mn, R0);
++ move(Rlo_mn, R0);
++ }
++
++ // The core multiply-accumulate step of a Montgomery
++ // multiplication. The idea is to schedule operations as a
++ // pipeline so that instructions with long latencies (loads and
++ // multiplies) have time to complete before their results are
++ // used. This most benefits in-order implementations of the
++ // architecture but out-of-order ones also benefit.
++ void step() {
++ block_comment("step");
++ // MACC(Ra, Rb, t0, t1, t2);
++ // Ra = Pa_base[++Iam];
++ // Rb = Pb_base[--Ibn];
++ addi_d(Iam, Iam, wordSize);
++ addi_d(Ibn, Ibn, -wordSize);
++ mul_d(Rlo_ab, Ra, Rb);
++ mulh_du(Rhi_ab, Ra, Rb);
++ acc(Rhi_mn, Rlo_mn, t0, t1, t2, Ra, Rb); // The pending m*n from the
++ // previous iteration.
++ ldx_d(Ra, Pa_base, Iam); ++ ldx_d(Rb, Pb_base, Ibn); ++ ++ // MACC(Rm, Rn, t0, t1, t2); ++ // Rm = Pm_base[Iam]; ++ // Rn = Pn_base[Ibn]; ++ mul_d(Rlo_mn, Rm, Rn); ++ mulh_du(Rhi_mn, Rm, Rn); ++ acc(Rhi_ab, Rlo_ab, t0, t1, t2, Rm, Rn); ++ ldx_d(Rm, Pm_base, Iam); ++ ldx_d(Rn, Pn_base, Ibn); ++ } ++ ++ void post1() { ++ block_comment("post1"); ++ ++ // MACC(Ra, Rb, t0, t1, t2); ++ mul_d(Rlo_ab, Ra, Rb); ++ mulh_du(Rhi_ab, Ra, Rb); ++ acc(Rhi_mn, Rlo_mn, t0, t1, t2, Ra, Rb); // The pending m*n ++ acc(Rhi_ab, Rlo_ab, t0, t1, t2, Ra, Rb); ++ ++ // Pm_base[Iam] = Rm = t0 * inv; ++ mul_d(Rm, t0, inv); ++ stx_d(Rm, Pm_base, Iam); ++ ++ // MACC(Rm, Rn, t0, t1, t2); ++ // t0 = t1; t1 = t2; t2 = 0; ++ mulh_du(Rhi_mn, Rm, Rn); ++ ++#ifndef PRODUCT ++ // assert(m[i] * n[0] + t0 == 0, "broken Montgomery multiply"); ++ { ++ mul_d(Rlo_mn, Rm, Rn); ++ add_d(Rlo_mn, t0, Rlo_mn); ++ Label ok; ++ beqz(Rlo_mn, ok); { ++ stop("broken Montgomery multiply"); ++ } bind(ok); ++ } ++#endif ++ ++ // We have very carefully set things up so that ++ // m[i]*n[0] + t0 == 0 (mod b), so we don't have to calculate ++ // the lower half of Rm * Rn because we know the result already: ++ // it must be -t0. t0 + (-t0) must generate a carry iff ++ // t0 != 0. So, rather than do a mul and an adds we just set ++ // the carry flag iff t0 is nonzero. ++ // ++ // mul_d(Rlo_mn, Rm, Rn); ++ // add_d(t0, t0, Rlo_mn); ++ OR(Ra, t1, Rhi_mn); ++ sltu(Rb, R0, t0); ++ add_d(t0, t1, Rhi_mn); ++ add_d(t0, t0, Rb); ++ sltu(Rb, t0, Ra); ++ add_d(t1, t2, Rb); ++ move(t2, R0); ++ } ++ ++ void pre2(Register i, Register len) { ++ block_comment("pre2"); ++ ++ // Rj == i-len ++ sub_w(Rj, i, len); ++ ++ // Iam = i - len; ++ // Ibn = len; ++ slli_w(Iam, Rj, LogBytesPerWord); ++ slli_w(Ibn, len, LogBytesPerWord); ++ ++ // Ra = Pa_base[++Iam]; ++ // Rb = Pb_base[--Ibn]; ++ // Rm = Pm_base[++Iam]; ++ // Rn = Pn_base[--Ibn]; ++ addi_d(Iam, Iam, wordSize); ++ addi_d(Ibn, Ibn, -wordSize); ++ ++ ldx_d(Ra, Pa_base, Iam); ++ ldx_d(Rb, Pb_base, Ibn); ++ ldx_d(Rm, Pm_base, Iam); ++ ldx_d(Rn, Pn_base, Ibn); ++ ++ move(Rhi_mn, R0); ++ move(Rlo_mn, R0); ++ } ++ ++ void post2(Register i, Register len) { ++ block_comment("post2"); ++ ++ sub_w(Rj, i, len); ++ alsl_d(Iam, Rj, Pm_base, LogBytesPerWord - 1); ++ ++ add_d(t0, t0, Rlo_mn); // The pending m*n, low part ++ ++ // As soon as we know the least significant digit of our result, ++ // store it. ++ // Pm_base[i-len] = t0; ++ st_d(t0, Iam, 0); ++ ++ // t0 = t1; t1 = t2; t2 = 0; ++ OR(Ra, t1, Rhi_mn); ++ sltu(Rb, t0, Rlo_mn); ++ add_d(t0, t1, Rhi_mn); // The pending m*n, high part ++ add_d(t0, t0, Rb); ++ sltu(Rb, t0, Ra); ++ add_d(t1, t2, Rb); ++ move(t2, R0); ++ } ++ ++ // A carry in t0 after Montgomery multiplication means that we ++ // should subtract multiples of n from our result in m. We'll ++ // keep doing that until there is no carry. 
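// For reference only (not part of the patch): the MACC(a, b, t0, t1, t2)
// pseudo-op in the comments of step()/post1()/post2() above adds the 128-bit
// product a*b into the triple-precision accumulator t2:t1:t0; acc() builds the
// same carry chain with sltu. A minimal C++ sketch, assuming 64-bit words and
// a compiler that provides unsigned __int128:
#include <cassert>
typedef unsigned long long u64;
typedef unsigned __int128 u128;
static void macc(u64 a, u64 b, u64* t0, u64* t1, u64* t2) {
  u128 p = (u128)a * b;                   // full 128-bit product
  u128 s = (u128)*t0 + (u64)p;            // low accumulator word + low product word
  *t0 = (u64)s;
  s = (s >> 64) + *t1 + (u64)(p >> 64);   // carry + middle word + high product word
  *t1 = (u64)s;
  *t2 += (u64)(s >> 64);                  // the top word only ever absorbs carries
}
int main() {
  u64 t0 = ~0ULL, t1 = 0, t2 = 0;
  macc(1, 1, &t0, &t1, &t2);              // 0xffff...ffff + 1 carries into t1
  assert(t0 == 0 && t1 == 1 && t2 == 0);
  return 0;
}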
++ void normalize(Register len) { ++ block_comment("normalize"); ++ // while (t0) ++ // t0 = sub(Pm_base, Pn_base, t0, len); ++ Label loop, post, again; ++ Register cnt = t1, i = t2, b = Ra, t = Rb; // Re-use registers; we're done with them now ++ beqz(t0, post); { ++ bind(again); { ++ move(i, R0); ++ move(b, R0); ++ slli_w(cnt, len, LogBytesPerWord); ++ align(16); ++ bind(loop); { ++ ldx_d(Rm, Pm_base, i); ++ ldx_d(Rn, Pn_base, i); ++ sltu(t, Rm, b); ++ sub_d(Rm, Rm, b); ++ sltu(b, Rm, Rn); ++ sub_d(Rm, Rm, Rn); ++ OR(b, b, t); ++ stx_d(Rm, Pm_base, i); ++ addi_w(i, i, BytesPerWord); ++ } blt(i, cnt, loop); ++ sub_d(t0, t0, b); ++ } bnez(t0, again); ++ } bind(post); ++ } ++ ++ // Move memory at s to d, reversing words. ++ // Increments d to end of copied memory ++ // Destroys tmp1, tmp2, tmp3 ++ // Preserves len ++ // Leaves s pointing to the address which was in d at start ++ void reverse(Register d, Register s, Register len, Register tmp1, Register tmp2) { ++ assert(tmp1->encoding() < S0->encoding(), "register corruption"); ++ assert(tmp2->encoding() < S0->encoding(), "register corruption"); ++ ++ alsl_d(s, len, s, LogBytesPerWord - 1); ++ move(tmp1, len); ++ unroll_2(tmp1, &MontgomeryMultiplyGenerator::reverse1, d, s, tmp2); ++ slli_w(s, len, LogBytesPerWord); ++ sub_d(s, d, s); ++ } ++ ++ // where ++ void reverse1(Register d, Register s, Register tmp) { ++ ld_d(tmp, s, -wordSize); ++ addi_d(s, s, -wordSize); ++ addi_d(d, d, wordSize); ++ rotri_d(tmp, tmp, 32); ++ st_d(tmp, d, -wordSize); ++ } ++ ++ public: ++ /** ++ * Fast Montgomery multiplication. The derivation of the ++ * algorithm is in A Cryptographic Library for the Motorola ++ * DSP56000, Dusse and Kaliski, Proc. EUROCRYPT 90, pp. 230-237. ++ * ++ * Arguments: ++ * ++ * Inputs for multiplication: ++ * A0 - int array elements a ++ * A1 - int array elements b ++ * A2 - int array elements n (the modulus) ++ * A3 - int length ++ * A4 - int inv ++ * A5 - int array elements m (the result) ++ * ++ * Inputs for squaring: ++ * A0 - int array elements a ++ * A1 - int array elements n (the modulus) ++ * A2 - int length ++ * A3 - int inv ++ * A4 - int array elements m (the result) ++ * ++ */ ++ address generate_multiply() { ++ Label argh, nothing; ++ bind(argh); ++ stop("MontgomeryMultiply total_allocation must be <= 8192"); ++ ++ align(CodeEntryAlignment); ++ address entry = pc(); ++ ++ beqz(Rlen, nothing); ++ ++ enter(); ++ ++ // Make room. ++ sltui(Ra, Rlen, 513); ++ beqz(Ra, argh); ++ slli_w(Ra, Rlen, exact_log2(4 * sizeof (jint))); ++ sub_d(Ra, SP, Ra); ++ ++ srli_w(Rlen, Rlen, 1); // length in longwords = len/2 ++ ++ { ++ // Copy input args, reversing as we go. We use Ra as a ++ // temporary variable. ++ reverse(Ra, Pa_base, Rlen, t0, t1); ++ if (!_squaring) ++ reverse(Ra, Pb_base, Rlen, t0, t1); ++ reverse(Ra, Pn_base, Rlen, t0, t1); ++ } ++ ++ // Push all call-saved registers and also Pm_base which we'll need ++ // at the end. 
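// For illustration only (not from the patch): reverse()/reverse1() above
// reverse the order of the 64-bit words while swapping the two 32-bit halves
// of each word (the rotri_d by 32). Assuming the usual java.math.BigInteger
// layout of 32-bit digits stored most-significant-first, the copy produces
// 64-bit digits stored least-significant-first, which is what the multiply
// loops index. A hypothetical equivalent in plain C++:
#include <cassert>
#include <cstddef>
#include <cstdint>
static void reverse_to_longwords(const uint32_t* in, uint64_t* out, size_t n_ints) {
  size_t n_words = n_ints / 2;                // n_ints assumed even (Rlen is halved above)
  for (size_t j = 0; j < n_words; j++) {
    uint64_t hi = in[n_ints - 2 - 2 * j];     // more-significant 32-bit digit
    uint64_t lo = in[n_ints - 1 - 2 * j];     // less-significant 32-bit digit
    out[j] = (hi << 32) | lo;                 // 64-bit digit j, least significant first
  }
}
int main() {
  const uint32_t in[4] = { 4, 3, 2, 1 };      // digits, most significant first
  uint64_t out[2];
  reverse_to_longwords(in, out, 4);
  assert(out[0] == 0x0000000200000001ULL);    // least-significant 64-bit digit
  assert(out[1] == 0x0000000400000003ULL);
  return 0;
}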
++ save_regs(); ++ ++#ifndef PRODUCT ++ // assert(inv * n[0] == -1UL, "broken inverse in Montgomery multiply"); ++ { ++ ld_d(Rn, Pn_base, 0); ++ li(t0, -1); ++ mul_d(Rlo_mn, Rn, inv); ++ Label ok; ++ beq(Rlo_mn, t0, ok); { ++ stop("broken inverse in Montgomery multiply"); ++ } bind(ok); ++ } ++#endif ++ ++ move(Pm_base, Ra); ++ ++ move(t0, R0); ++ move(t1, R0); ++ move(t2, R0); ++ ++ block_comment("for (int i = 0; i < len; i++) {"); ++ move(Ri, R0); { ++ Label loop, end; ++ bge(Ri, Rlen, end); ++ ++ bind(loop); ++ pre1(Ri); ++ ++ block_comment(" for (j = i; j; j--) {"); { ++ move(Rj, Ri); ++ unroll_2(Rj, &MontgomeryMultiplyGenerator::step, Rlo_ab); ++ } block_comment(" } // j"); ++ ++ post1(); ++ addi_w(Ri, Ri, 1); ++ blt(Ri, Rlen, loop); ++ bind(end); ++ block_comment("} // i"); ++ } ++ ++ block_comment("for (int i = len; i < 2*len; i++) {"); ++ move(Ri, Rlen); ++ slli_w(Rlen2, Rlen, 1); { ++ Label loop, end; ++ bge(Ri, Rlen2, end); ++ ++ bind(loop); ++ pre2(Ri, Rlen); ++ ++ block_comment(" for (j = len*2-i-1; j; j--) {"); { ++ sub_w(Rj, Rlen2, Ri); ++ addi_w(Rj, Rj, -1); ++ unroll_2(Rj, &MontgomeryMultiplyGenerator::step, Rlo_ab); ++ } block_comment(" } // j"); ++ ++ post2(Ri, Rlen); ++ addi_w(Ri, Ri, 1); ++ blt(Ri, Rlen2, loop); ++ bind(end); ++ } ++ block_comment("} // i"); ++ ++ normalize(Rlen); ++ ++ move(Ra, Pm_base); // Save Pm_base in Ra ++ restore_regs(); // Restore caller's Pm_base ++ ++ // Copy our result into caller's Pm_base ++ reverse(Pm_base, Ra, Rlen, t0, t1); ++ ++ leave(); ++ bind(nothing); ++ jr(RA); ++ ++ return entry; ++ } ++ // In C, approximately: ++ ++ // void ++ // montgomery_multiply(unsigned long Pa_base[], unsigned long Pb_base[], ++ // unsigned long Pn_base[], unsigned long Pm_base[], ++ // unsigned long inv, int len) { ++ // unsigned long t0 = 0, t1 = 0, t2 = 0; // Triple-precision accumulator ++ // unsigned long Ra, Rb, Rn, Rm; ++ // int i, Iam, Ibn; ++ ++ // assert(inv * Pn_base[0] == -1UL, "broken inverse in Montgomery multiply"); ++ ++ // for (i = 0; i < len; i++) { ++ // int j; ++ ++ // Iam = 0; ++ // Ibn = i; ++ ++ // Ra = Pa_base[Iam]; ++ // Rb = Pb_base[Iam]; ++ // Rm = Pm_base[Ibn]; ++ // Rn = Pn_base[Ibn]; ++ ++ // int iters = i; ++ // for (j = 0; iters--; j++) { ++ // assert(Ra == Pa_base[j] && Rb == Pb_base[i-j], "must be"); ++ // MACC(Ra, Rb, t0, t1, t2); ++ // Ra = Pa_base[++Iam]; ++ // Rb = pb_base[--Ibn]; ++ // assert(Rm == Pm_base[j] && Rn == Pn_base[i-j], "must be"); ++ // MACC(Rm, Rn, t0, t1, t2); ++ // Rm = Pm_base[++Iam]; ++ // Rn = Pn_base[--Ibn]; ++ // } ++ ++ // assert(Ra == Pa_base[i] && Rb == Pb_base[0], "must be"); ++ // MACC(Ra, Rb, t0, t1, t2); ++ // Pm_base[Iam] = Rm = t0 * inv; ++ // assert(Rm == Pm_base[i] && Rn == Pn_base[0], "must be"); ++ // MACC(Rm, Rn, t0, t1, t2); ++ ++ // assert(t0 == 0, "broken Montgomery multiply"); ++ ++ // t0 = t1; t1 = t2; t2 = 0; ++ // } ++ ++ // for (i = len; i < 2*len; i++) { ++ // int j; ++ ++ // Iam = i - len; ++ // Ibn = len; ++ ++ // Ra = Pa_base[++Iam]; ++ // Rb = Pb_base[--Ibn]; ++ // Rm = Pm_base[++Iam]; ++ // Rn = Pn_base[--Ibn]; ++ ++ // int iters = len*2-i-1; ++ // for (j = i-len+1; iters--; j++) { ++ // assert(Ra == Pa_base[j] && Rb == Pb_base[i-j], "must be"); ++ // MACC(Ra, Rb, t0, t1, t2); ++ // Ra = Pa_base[++Iam]; ++ // Rb = Pb_base[--Ibn]; ++ // assert(Rm == Pm_base[j] && Rn == Pn_base[i-j], "must be"); ++ // MACC(Rm, Rn, t0, t1, t2); ++ // Rm = Pm_base[++Iam]; ++ // Rn = Pn_base[--Ibn]; ++ // } ++ ++ // Pm_base[i-len] = t0; ++ // t0 = t1; t1 = t2; t2 = 0; ++ // } ++ ++ // while 
(t0) ++ // t0 = sub(Pm_base, Pn_base, t0, len); ++ // } ++ }; ++ ++ // Initialization ++ void generate_initial_stubs() { ++ // Generates all stubs and initializes the entry points ++ ++ //------------------------------------------------------------- ++ //----------------------------------------------------------- ++ // entry points that exist in all platforms ++ // Note: This is code that could be shared among different platforms - however the benefit seems to be smaller ++ // than the disadvantage of having a much more complicated generator structure. ++ // See also comment in stubRoutines.hpp. ++ StubRoutines::_forward_exception_entry = generate_forward_exception(); ++ StubRoutines::_call_stub_entry = generate_call_stub(StubRoutines::_call_stub_return_address); ++ // is referenced by megamorphic call ++ StubRoutines::_catch_exception_entry = generate_catch_exception(); ++ ++ StubRoutines::_throw_StackOverflowError_entry = ++ generate_throw_exception("StackOverflowError throw_exception", ++ CAST_FROM_FN_PTR(address, ++ SharedRuntime::throw_StackOverflowError)); ++ ++ StubRoutines::_throw_delayed_StackOverflowError_entry = ++ generate_throw_exception("delayed StackOverflowError throw_exception", ++ CAST_FROM_FN_PTR(address, ++ SharedRuntime::throw_delayed_StackOverflowError)); ++ ++ // Initialize table for copy memory (arraycopy) check. ++ if (UnsafeCopyMemory::_table == nullptr) { ++ UnsafeCopyMemory::create_table(8 ZGC_ONLY(+ (UseZGC && ZGenerational ? 14 : 0))); ++ } ++ ++ if (UseCRC32Intrinsics) { ++ // set table address before stub generation which use it ++ StubRoutines::_crc_table_adr = (address)StubRoutines::la::_crc_table; ++ StubRoutines::_updateBytesCRC32 = generate_updateBytesCRC32(); ++ } ++ ++ if (UseCRC32CIntrinsics) { ++ StubRoutines::_updateBytesCRC32C = generate_updateBytesCRC32C(); ++ } ++ } ++ ++ void generate_continuation_stubs() { ++ // Continuation stubs: ++ StubRoutines::_cont_thaw = generate_cont_thaw(); ++ StubRoutines::_cont_returnBarrier = generate_cont_returnBarrier(); ++ StubRoutines::_cont_returnBarrierExc = generate_cont_returnBarrier_exception(); ++ ++ JFR_ONLY(generate_jfr_stubs();) ++ } ++ ++#if INCLUDE_JFR ++ void generate_jfr_stubs() { ++ StubRoutines::_jfr_write_checkpoint_stub = generate_jfr_write_checkpoint(); ++ StubRoutines::_jfr_write_checkpoint = StubRoutines::_jfr_write_checkpoint_stub->entry_point(); ++ StubRoutines::_jfr_return_lease_stub = generate_jfr_return_lease(); ++ StubRoutines::_jfr_return_lease = StubRoutines::_jfr_return_lease_stub->entry_point(); ++ } ++#endif // INCLUDE_JFR ++ ++ void generate_final_stubs() { ++ // Generates all stubs and initializes the entry points ++ ++ // These entry points require SharedInfo::stack0 to be set up in ++ // non-core builds and need to be relocatable, so they each ++ // fabricate a RuntimeStub internally. 
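// Side note, not part of the patch: the UseCRC32Intrinsics block in
// generate_initial_stubs() above points StubRoutines::_crc_table_adr at the
// standard zlib CRC-32 table (reflected polynomial 0xEDB88320) defined in
// stubRoutines_loongarch_64.cpp below. A self-contained byte-at-a-time version
// of the same table-driven update, including the pre/post inversion the
// interpreter CRC32 entry performs with nor:
#include <cassert>
#include <cstddef>
#include <cstdint>
static void make_table(uint32_t table[256]) {
  for (uint32_t i = 0; i < 256; i++) {
    uint32_t c = i;
    for (int k = 0; k < 8; k++)
      c = (c & 1) ? 0xEDB88320u ^ (c >> 1) : (c >> 1);
    table[i] = c;                              // table[1] == 0x77073096, as in _crc_table
  }
}
static uint32_t crc32_update(uint32_t crc, const uint8_t* buf, size_t len,
                             const uint32_t table[256]) {
  crc = ~crc;                                  // pre-inversion
  for (size_t i = 0; i < len; i++)
    crc = table[(crc ^ buf[i]) & 0xffu] ^ (crc >> 8);
  return ~crc;                                 // post-inversion
}
int main() {
  uint32_t table[256];
  make_table(table);
  const uint8_t msg[] = { '1','2','3','4','5','6','7','8','9' };
  assert(crc32_update(0, msg, sizeof(msg), table) == 0xCBF43926u); // CRC-32 check value
  return 0;
}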
++ StubRoutines::_throw_AbstractMethodError_entry = ++ generate_throw_exception("AbstractMethodError throw_exception", ++ CAST_FROM_FN_PTR(address, ++ SharedRuntime::throw_AbstractMethodError)); ++ ++ StubRoutines::_throw_IncompatibleClassChangeError_entry = ++ generate_throw_exception("IncompatibleClassChangeError throw_exception", ++ CAST_FROM_FN_PTR(address, ++ SharedRuntime::throw_IncompatibleClassChangeError)); ++ ++ StubRoutines::_throw_NullPointerException_at_call_entry = ++ generate_throw_exception("NullPointerException at call throw_exception", ++ CAST_FROM_FN_PTR(address, ++ SharedRuntime::throw_NullPointerException_at_call)); ++ ++ // support for verify_oop (must happen after universe_init) ++ if (VerifyOops) { ++ StubRoutines::_verify_oop_subroutine_entry = generate_verify_oop(); ++ } ++ ++ // arraycopy stubs used by compilers ++ generate_arraycopy_stubs(); ++ ++ if (UseLSX && vmIntrinsics::is_intrinsic_available(vmIntrinsics::_dsin)) { ++ StubRoutines::_dsin = generate_dsin_dcos(/* isCos = */ false); ++ } ++ ++ if (UseLSX && vmIntrinsics::is_intrinsic_available(vmIntrinsics::_dcos)) { ++ StubRoutines::_dcos = generate_dsin_dcos(/* isCos = */ true); ++ } ++ ++ BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod(); ++ if (bs_nm != nullptr) { ++ StubRoutines::la::_method_entry_barrier = generate_method_entry_barrier(); ++ } ++ } ++ ++ void generate_compiler_stubs() { ++#if COMPILER2_OR_JVMCI ++#ifdef COMPILER2 ++ if (UseMulAddIntrinsic) { ++ StubRoutines::_mulAdd = generate_mulAdd(); ++ } ++ ++ if (UseMontgomeryMultiplyIntrinsic) { ++ StubCodeMark mark(this, "StubRoutines", "montgomeryMultiply"); ++ MontgomeryMultiplyGenerator g(_masm, false /* squaring */); ++ StubRoutines::_montgomeryMultiply = g.generate_multiply(); ++ } ++ ++ if (UseMontgomerySquareIntrinsic) { ++ StubCodeMark mark(this, "StubRoutines", "montgomerySquare"); ++ MontgomeryMultiplyGenerator g(_masm, true /* squaring */); ++ // We use generate_multiply() rather than generate_square() ++ // because it's faster for the sizes of modulus we care about. 
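// For reference only (not from the patch): a single-word sketch of the
// Montgomery multiplication that generate_multiply() above performs across
// many words. It assumes unsigned __int128, an odd modulus n below 2^63 and
// inputs already reduced mod n; mont_inv() is a hypothetical helper computing
// inv = -n^-1 mod 2^64, the same quantity the stub receives as 'inv'. The
// carry shortcut mirrors the comment in post1(): t_lo plus the low word of m*n
// is zero by construction, and it carries exactly when t_lo != 0.
#include <cassert>
#include <cstdint>
typedef unsigned __int128 u128;
static uint64_t mont_inv(uint64_t n) {         // n must be odd
  uint64_t x = n;                              // correct to 3 low bits
  for (int i = 0; i < 5; i++) x *= 2 - n * x;  // Newton step doubles the correct bits
  return 0 - x;                                // -n^-1 mod 2^64
}
static uint64_t mont_mul(uint64_t a, uint64_t b, uint64_t n, uint64_t inv) {
  u128 t = (u128)a * b;
  uint64_t t_lo = (uint64_t)t, t_hi = (uint64_t)(t >> 64);
  uint64_t m = t_lo * inv;                     // chosen so t_lo + low64(m*n) == 0 mod 2^64
  u128 mn = (u128)m * n;
  uint64_t carry = (t_lo != 0);                // the post1() trick
  uint64_t r = t_hi + (uint64_t)(mn >> 64) + carry;  // (t + m*n) / 2^64
  return r >= n ? r - n : r;                   // result == a*b*2^-64 mod n
}
int main() {
  uint64_t n = 0x7fffffffffffffe7ULL;          // some odd modulus < 2^63
  uint64_t inv = mont_inv(n);
  assert(n * (0 - inv) == 1);                  // inv really is -n^-1 mod 2^64
  uint64_t a = 123456789, b = 987654321, r = mont_mul(a, b, n, inv);
  assert((uint64_t)(((u128)r << 64) % n) == (uint64_t)(((u128)a * b) % n));
  return 0;
}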
++ StubRoutines::_montgomerySquare = g.generate_multiply(); ++ } ++ ++ if (UseBigIntegerShiftIntrinsic) { ++ StubRoutines::_bigIntegerLeftShiftWorker = generate_bigIntegerLeftShift(); ++ StubRoutines::_bigIntegerRightShiftWorker = generate_bigIntegerRightShift(); ++ } ++#endif ++ ++ StubRoutines::la::_vector_iota_indices = generate_iota_indices("iota_indices"); ++ ++ if (UseAESIntrinsics) { ++ StubRoutines::_aescrypt_encryptBlock = generate_aescrypt_encryptBlock(false); ++ StubRoutines::_aescrypt_decryptBlock = generate_aescrypt_decryptBlock(false); ++ StubRoutines::_cipherBlockChaining_encryptAESCrypt = generate_aescrypt_encryptBlock(true); ++ StubRoutines::_cipherBlockChaining_decryptAESCrypt = generate_aescrypt_decryptBlock(true); ++ } ++ ++ if (UseMD5Intrinsics) { ++ generate_md5_implCompress("md5_implCompress", StubRoutines::_md5_implCompress, StubRoutines::_md5_implCompressMB); ++ } ++ ++ if (UseSHA1Intrinsics) { ++ generate_sha1_implCompress("sha1_implCompress", StubRoutines::_sha1_implCompress, StubRoutines::_sha1_implCompressMB); ++ } ++ ++ if (UseSHA256Intrinsics) { ++ generate_sha256_implCompress("sha256_implCompress", StubRoutines::_sha256_implCompress, StubRoutines::_sha256_implCompressMB); ++ } ++ ++ if (UseChaCha20Intrinsics) { ++ StubRoutines::_chacha20Block = generate_chacha20Block_blockpar(); ++ } ++ ++ generate_string_indexof_stubs(); ++ ++#endif // COMPILER2_OR_JVMCI ++ } ++ ++ public: ++ StubGenerator(CodeBuffer* code, StubsKind kind) : StubCodeGenerator(code) { ++ switch(kind) { ++ case Initial_stubs: ++ generate_initial_stubs(); ++ break; ++ case Continuation_stubs: ++ generate_continuation_stubs(); ++ break; ++ case Compiler_stubs: ++ generate_compiler_stubs(); ++ break; ++ case Final_stubs: ++ generate_final_stubs(); ++ break; ++ default: ++ fatal("unexpected stubs kind: %d", kind); ++ break; ++ }; ++ } ++}; // end class declaration ++ ++void StubGenerator_generate(CodeBuffer* code, StubCodeGenerator::StubsKind kind) { ++ StubGenerator g(code, kind); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/stubRoutines_loongarch_64.cpp b/src/hotspot/cpu/loongarch/stubRoutines_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/stubRoutines_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/stubRoutines_loongarch_64.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,196 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "runtime/deoptimization.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/javaThread.hpp" ++#include "runtime/stubRoutines.hpp" ++ ++// a description of how to extend it, see the stubRoutines.hpp file. ++ ++//find the last fp value ++address StubRoutines::la::_method_entry_barrier = nullptr; ++address StubRoutines::la::_vector_iota_indices = nullptr; ++address StubRoutines::la::_string_indexof_linear_ll = nullptr; ++address StubRoutines::la::_string_indexof_linear_uu = nullptr; ++address StubRoutines::la::_string_indexof_linear_ul = nullptr; ++address StubRoutines::la::_jlong_fill = nullptr; ++address StubRoutines::la::_arrayof_jlong_fill = nullptr; ++ ++/** ++ * crc_table[] from jdk/src/share/native/java/util/zip/zlib-1.2.5/crc32.h ++ */ ++juint StubRoutines::la::_crc_table[] = ++{ ++ 0x00000000UL, 0x77073096UL, 0xee0e612cUL, 0x990951baUL, 0x076dc419UL, ++ 0x706af48fUL, 0xe963a535UL, 0x9e6495a3UL, 0x0edb8832UL, 0x79dcb8a4UL, ++ 0xe0d5e91eUL, 0x97d2d988UL, 0x09b64c2bUL, 0x7eb17cbdUL, 0xe7b82d07UL, ++ 0x90bf1d91UL, 0x1db71064UL, 0x6ab020f2UL, 0xf3b97148UL, 0x84be41deUL, ++ 0x1adad47dUL, 0x6ddde4ebUL, 0xf4d4b551UL, 0x83d385c7UL, 0x136c9856UL, ++ 0x646ba8c0UL, 0xfd62f97aUL, 0x8a65c9ecUL, 0x14015c4fUL, 0x63066cd9UL, ++ 0xfa0f3d63UL, 0x8d080df5UL, 0x3b6e20c8UL, 0x4c69105eUL, 0xd56041e4UL, ++ 0xa2677172UL, 0x3c03e4d1UL, 0x4b04d447UL, 0xd20d85fdUL, 0xa50ab56bUL, ++ 0x35b5a8faUL, 0x42b2986cUL, 0xdbbbc9d6UL, 0xacbcf940UL, 0x32d86ce3UL, ++ 0x45df5c75UL, 0xdcd60dcfUL, 0xabd13d59UL, 0x26d930acUL, 0x51de003aUL, ++ 0xc8d75180UL, 0xbfd06116UL, 0x21b4f4b5UL, 0x56b3c423UL, 0xcfba9599UL, ++ 0xb8bda50fUL, 0x2802b89eUL, 0x5f058808UL, 0xc60cd9b2UL, 0xb10be924UL, ++ 0x2f6f7c87UL, 0x58684c11UL, 0xc1611dabUL, 0xb6662d3dUL, 0x76dc4190UL, ++ 0x01db7106UL, 0x98d220bcUL, 0xefd5102aUL, 0x71b18589UL, 0x06b6b51fUL, ++ 0x9fbfe4a5UL, 0xe8b8d433UL, 0x7807c9a2UL, 0x0f00f934UL, 0x9609a88eUL, ++ 0xe10e9818UL, 0x7f6a0dbbUL, 0x086d3d2dUL, 0x91646c97UL, 0xe6635c01UL, ++ 0x6b6b51f4UL, 0x1c6c6162UL, 0x856530d8UL, 0xf262004eUL, 0x6c0695edUL, ++ 0x1b01a57bUL, 0x8208f4c1UL, 0xf50fc457UL, 0x65b0d9c6UL, 0x12b7e950UL, ++ 0x8bbeb8eaUL, 0xfcb9887cUL, 0x62dd1ddfUL, 0x15da2d49UL, 0x8cd37cf3UL, ++ 0xfbd44c65UL, 0x4db26158UL, 0x3ab551ceUL, 0xa3bc0074UL, 0xd4bb30e2UL, ++ 0x4adfa541UL, 0x3dd895d7UL, 0xa4d1c46dUL, 0xd3d6f4fbUL, 0x4369e96aUL, ++ 0x346ed9fcUL, 0xad678846UL, 0xda60b8d0UL, 0x44042d73UL, 0x33031de5UL, ++ 0xaa0a4c5fUL, 0xdd0d7cc9UL, 0x5005713cUL, 0x270241aaUL, 0xbe0b1010UL, ++ 0xc90c2086UL, 0x5768b525UL, 0x206f85b3UL, 0xb966d409UL, 0xce61e49fUL, ++ 0x5edef90eUL, 0x29d9c998UL, 0xb0d09822UL, 0xc7d7a8b4UL, 0x59b33d17UL, ++ 0x2eb40d81UL, 0xb7bd5c3bUL, 0xc0ba6cadUL, 0xedb88320UL, 0x9abfb3b6UL, ++ 0x03b6e20cUL, 0x74b1d29aUL, 0xead54739UL, 0x9dd277afUL, 0x04db2615UL, ++ 0x73dc1683UL, 0xe3630b12UL, 0x94643b84UL, 0x0d6d6a3eUL, 0x7a6a5aa8UL, ++ 0xe40ecf0bUL, 0x9309ff9dUL, 0x0a00ae27UL, 0x7d079eb1UL, 0xf00f9344UL, ++ 0x8708a3d2UL, 0x1e01f268UL, 0x6906c2feUL, 0xf762575dUL, 0x806567cbUL, ++ 0x196c3671UL, 0x6e6b06e7UL, 0xfed41b76UL, 0x89d32be0UL, 0x10da7a5aUL, ++ 0x67dd4accUL, 0xf9b9df6fUL, 0x8ebeeff9UL, 0x17b7be43UL, 0x60b08ed5UL, ++ 0xd6d6a3e8UL, 0xa1d1937eUL, 0x38d8c2c4UL, 0x4fdff252UL, 0xd1bb67f1UL, ++ 0xa6bc5767UL, 0x3fb506ddUL, 0x48b2364bUL, 0xd80d2bdaUL, 0xaf0a1b4cUL, ++ 0x36034af6UL, 0x41047a60UL, 
0xdf60efc3UL, 0xa867df55UL, 0x316e8eefUL, ++ 0x4669be79UL, 0xcb61b38cUL, 0xbc66831aUL, 0x256fd2a0UL, 0x5268e236UL, ++ 0xcc0c7795UL, 0xbb0b4703UL, 0x220216b9UL, 0x5505262fUL, 0xc5ba3bbeUL, ++ 0xb2bd0b28UL, 0x2bb45a92UL, 0x5cb36a04UL, 0xc2d7ffa7UL, 0xb5d0cf31UL, ++ 0x2cd99e8bUL, 0x5bdeae1dUL, 0x9b64c2b0UL, 0xec63f226UL, 0x756aa39cUL, ++ 0x026d930aUL, 0x9c0906a9UL, 0xeb0e363fUL, 0x72076785UL, 0x05005713UL, ++ 0x95bf4a82UL, 0xe2b87a14UL, 0x7bb12baeUL, 0x0cb61b38UL, 0x92d28e9bUL, ++ 0xe5d5be0dUL, 0x7cdcefb7UL, 0x0bdbdf21UL, 0x86d3d2d4UL, 0xf1d4e242UL, ++ 0x68ddb3f8UL, 0x1fda836eUL, 0x81be16cdUL, 0xf6b9265bUL, 0x6fb077e1UL, ++ 0x18b74777UL, 0x88085ae6UL, 0xff0f6a70UL, 0x66063bcaUL, 0x11010b5cUL, ++ 0x8f659effUL, 0xf862ae69UL, 0x616bffd3UL, 0x166ccf45UL, 0xa00ae278UL, ++ 0xd70dd2eeUL, 0x4e048354UL, 0x3903b3c2UL, 0xa7672661UL, 0xd06016f7UL, ++ 0x4969474dUL, 0x3e6e77dbUL, 0xaed16a4aUL, 0xd9d65adcUL, 0x40df0b66UL, ++ 0x37d83bf0UL, 0xa9bcae53UL, 0xdebb9ec5UL, 0x47b2cf7fUL, 0x30b5ffe9UL, ++ 0xbdbdf21cUL, 0xcabac28aUL, 0x53b39330UL, 0x24b4a3a6UL, 0xbad03605UL, ++ 0xcdd70693UL, 0x54de5729UL, 0x23d967bfUL, 0xb3667a2eUL, 0xc4614ab8UL, ++ 0x5d681b02UL, 0x2a6f2b94UL, 0xb40bbe37UL, 0xc30c8ea1UL, 0x5a05df1bUL, ++ 0x2d02ef8dUL ++}; ++ ++ATTRIBUTE_ALIGNED(64) juint StubRoutines::la::_npio2_hw[] = { ++ // first, various coefficient values: 0.5, invpio2, pio2_1, pio2_1t, pio2_2, ++ // pio2_2t, pio2_3, pio2_3t ++ // This is a small optimization which keeping double[8] values in int[] table ++ // to have less address calculation instructions ++ // ++ // invpio2: 53 bits of 2/pi (enough for cases when trigonometric argument is small) ++ // pio2_1: first 33 bit of pi/2 ++ // pio2_1t: pi/2 - pio2_1 ++ // pio2_2: second 33 bit of pi/2 ++ // pio2_2t: pi/2 - (pio2_1+pio2_2) ++ // pio2_3: third 33 bit of pi/2 ++ // pio2_3t: pi/2 - (pio2_1+pio2_2+pio2_3) ++ 0x00000000, 0x3fe00000, // 0.5 ++ 0x6DC9C883, 0x3FE45F30, // invpio2 = 6.36619772367581382433e-01 ++ 0x54400000, 0x3FF921FB, // pio2_1 = 1.57079632673412561417e+00 ++ 0x1A626331, 0x3DD0B461, // pio2_1t = 6.07710050650619224932e-11 ++ 0x1A600000, 0x3DD0B461, // pio2_2 = 6.07710050630396597660e-11 ++ 0x2E037073, 0x3BA3198A, // pio2_2t = 2.02226624879595063154e-21 ++ 0x2E000000, 0x3BA3198A, // pio2_3 = 2.02226624871116645580e-21 ++ 0x252049C1, 0x397B839A, // pio2_3t = 8.47842766036889956997e-32 ++ // now, npio2_hw itself ++ 0x3FF921FB, 0x400921FB, 0x4012D97C, 0x401921FB, 0x401F6A7A, 0x4022D97C, ++ 0x4025FDBB, 0x402921FB, 0x402C463A, 0x402F6A7A, 0x4031475C, 0x4032D97C, ++ 0x40346B9C, 0x4035FDBB, 0x40378FDB, 0x403921FB, 0x403AB41B, 0x403C463A, ++ 0x403DD85A, 0x403F6A7A, 0x40407E4C, 0x4041475C, 0x4042106C, 0x4042D97C, ++ 0x4043A28C, 0x40446B9C, 0x404534AC, 0x4045FDBB, 0x4046C6CB, 0x40478FDB, ++ 0x404858EB, 0x404921FB ++}; ++ ++// Coefficients for sin(x) polynomial approximation: S1..S6. ++// See kernel_sin comments in macroAssembler_loongarch64_trig.cpp for details ++ATTRIBUTE_ALIGNED(64) jdouble StubRoutines::la::_dsin_coef[] = { ++ -1.66666666666666324348e-01, // 0xBFC5555555555549 ++ 8.33333333332248946124e-03, // 0x3F8111111110F8A6 ++ -1.98412698298579493134e-04, // 0xBF2A01A019C161D5 ++ 2.75573137070700676789e-06, // 0x3EC71DE357B1FE7D ++ -2.50507602534068634195e-08, // 0xBE5AE5E68A2B9CEB ++ 1.58969099521155010221e-10 // 0x3DE5D93A5ACFD57C ++}; ++ ++// Coefficients for cos(x) polynomial approximation: C1..C6. 
++// See kernel_cos comments in macroAssembler_loongarch64_trig.cpp for details ++ATTRIBUTE_ALIGNED(64) jdouble StubRoutines::la::_dcos_coef[] = { ++ 4.16666666666666019037e-02, // c0x3FA555555555554C ++ -1.38888888888741095749e-03, // 0xBF56C16C16C15177 ++ 2.48015872894767294178e-05, // 0x3EFA01A019CB1590 ++ -2.75573143513906633035e-07, // 0xBE927E4F809C52AD ++ 2.08757232129817482790e-09, // 0x3E21EE9EBDB4B1C4 ++ -1.13596475577881948265e-11 // 0xBDA8FAE9BE8838D4 ++}; ++ ++ATTRIBUTE_ALIGNED(128) julong StubRoutines::la::_string_compress_index[] = { ++ 0x0e0c0a0806040200UL, 0x1e1c1a1816141210UL // 128-bit shuffle index ++}; ++ ++// Table of constants for 2/pi, 396 Hex digits (476 decimal) of 2/pi. ++// Used in cases of very large argument. 396 hex digits is enough to support ++// required precision. ++// Converted to double to avoid unnecessary conversion in code ++// NOTE: table looks like original int table: {0xA2F983, 0x6E4E44,...} with ++// only (double) conversion added ++ATTRIBUTE_ALIGNED(64) jdouble StubRoutines::la::_two_over_pi[] = { ++ (double)0xA2F983, (double)0x6E4E44, (double)0x1529FC, (double)0x2757D1, (double)0xF534DD, (double)0xC0DB62, ++ (double)0x95993C, (double)0x439041, (double)0xFE5163, (double)0xABDEBB, (double)0xC561B7, (double)0x246E3A, ++ (double)0x424DD2, (double)0xE00649, (double)0x2EEA09, (double)0xD1921C, (double)0xFE1DEB, (double)0x1CB129, ++ (double)0xA73EE8, (double)0x8235F5, (double)0x2EBB44, (double)0x84E99C, (double)0x7026B4, (double)0x5F7E41, ++ (double)0x3991D6, (double)0x398353, (double)0x39F49C, (double)0x845F8B, (double)0xBDF928, (double)0x3B1FF8, ++ (double)0x97FFDE, (double)0x05980F, (double)0xEF2F11, (double)0x8B5A0A, (double)0x6D1F6D, (double)0x367ECF, ++ (double)0x27CB09, (double)0xB74F46, (double)0x3F669E, (double)0x5FEA2D, (double)0x7527BA, (double)0xC7EBE5, ++ (double)0xF17B3D, (double)0x0739F7, (double)0x8A5292, (double)0xEA6BFB, (double)0x5FB11F, (double)0x8D5D08, ++ (double)0x560330, (double)0x46FC7B, (double)0x6BABF0, (double)0xCFBC20, (double)0x9AF436, (double)0x1DA9E3, ++ (double)0x91615E, (double)0xE61B08, (double)0x659985, (double)0x5F14A0, (double)0x68408D, (double)0xFFD880, ++ (double)0x4D7327, (double)0x310606, (double)0x1556CA, (double)0x73A8C9, (double)0x60E27B, (double)0xC08C6B, ++}; ++ ++// Pi over 2 value ++ATTRIBUTE_ALIGNED(64) jdouble StubRoutines::la::_pio2[] = { ++ 1.57079625129699707031e+00, // 0x3FF921FB40000000 ++ 7.54978941586159635335e-08, // 0x3E74442D00000000 ++ 5.39030252995776476554e-15, // 0x3CF8469880000000 ++ 3.28200341580791294123e-22, // 0x3B78CC5160000000 ++ 1.27065575308067607349e-29, // 0x39F01B8380000000 ++ 1.22933308981111328932e-36, // 0x387A252040000000 ++ 2.73370053816464559624e-44, // 0x36E3822280000000 ++ 2.16741683877804819444e-51, // 0x3569F31D00000000 ++}; ++ ++ATTRIBUTE_ALIGNED(64) jfloat StubRoutines::la::_round_float_imm[] = { ++ -0.5f, 0.49999997f // magic number for ties ++}; ++ ++ATTRIBUTE_ALIGNED(64) jdouble StubRoutines::la::_round_double_imm[] = { ++ -0.5d, 0.49999999999999994d // magic number for ties ++}; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/stubRoutines_loongarch.hpp b/src/hotspot/cpu/loongarch/stubRoutines_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/stubRoutines_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/stubRoutines_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,118 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. 
++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_STUBROUTINES_LOONGARCH_64_HPP ++#define CPU_LOONGARCH_STUBROUTINES_LOONGARCH_64_HPP ++ ++// This file holds the platform specific parts of the StubRoutines ++// definition. See stubRoutines.hpp for a description on how to ++// extend it. ++ ++static bool returns_to_call_stub(address return_pc){ ++ return return_pc == _call_stub_return_address; ++} ++ ++enum platform_dependent_constants { ++ // simply increase sizes if too small (assembler will crash if too small) ++ _initial_stubs_code_size = 20000, ++ _continuation_stubs_code_size = 2000, ++ _compiler_stubs_code_size = 60000, ++ _final_stubs_code_size = 60000 ZGC_ONLY(+477000) ++}; ++ ++class la { ++ friend class StubGenerator; ++ friend class VMStructs; ++ private: ++ // If we call compiled code directly from the call stub we will ++ // need to adjust the return back to the call stub to a specialized ++ // piece of code that can handle compiled results and cleaning the fpu ++ // stack. The variable holds that location. ++ static address _vector_iota_indices; ++ static juint _crc_table[]; ++ static address _method_entry_barrier; ++ ++ static address _string_indexof_linear_ll; ++ static address _string_indexof_linear_uu; ++ static address _string_indexof_linear_ul; ++ ++ static address _jlong_fill; ++ static address _arrayof_jlong_fill; ++ ++ static julong _string_compress_index[]; ++ ++ static jfloat _round_float_imm[]; ++ static jdouble _round_double_imm[]; ++ ++ // begin trigonometric tables block. 
See comments in .cpp file ++ static juint _npio2_hw[]; ++ static jdouble _two_over_pi[]; ++ static jdouble _pio2[]; ++ static jdouble _dsin_coef[]; ++ static jdouble _dcos_coef[]; ++ // end trigonometric tables block ++ ++public: ++ // Call back points for traps in compiled code ++ static address vector_iota_indices() { return _vector_iota_indices; } ++ ++ static address method_entry_barrier() { ++ return _method_entry_barrier; ++ } ++ ++ static address string_indexof_linear_ul() { ++ return _string_indexof_linear_ul; ++ } ++ ++ static address string_indexof_linear_ll() { ++ return _string_indexof_linear_ll; ++ } ++ ++ static address string_indexof_linear_uu() { ++ return _string_indexof_linear_uu; ++ } ++ ++ static address jlong_fill() { ++ return _jlong_fill; ++ } ++ ++ static address arrayof_jlong_fill() { ++ return _arrayof_jlong_fill; ++ } ++ ++ static address string_compress_index() { ++ return (address) _string_compress_index; ++ } ++ ++ static address round_float_imm() { ++ return (address) _round_float_imm; ++ } ++ ++ static address round_double_imm() { ++ return (address) _round_double_imm; ++ } ++}; ++ ++#endif // CPU_LOONGARCH_STUBROUTINES_LOONGARCH_64_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/templateInterpreterGenerator_loongarch.cpp b/src/hotspot/cpu/loongarch/templateInterpreterGenerator_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/templateInterpreterGenerator_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/templateInterpreterGenerator_loongarch.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,2106 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "classfile/javaClasses.hpp" ++#include "gc/shared/barrierSetAssembler.hpp" ++#include "interpreter/bytecodeHistogram.hpp" ++#include "interpreter/interp_masm.hpp" ++#include "interpreter/interpreter.hpp" ++#include "interpreter/interpreterRuntime.hpp" ++#include "interpreter/templateInterpreterGenerator.hpp" ++#include "interpreter/templateTable.hpp" ++#include "oops/arrayOop.hpp" ++#include "oops/methodData.hpp" ++#include "oops/method.hpp" ++#include "oops/oop.inline.hpp" ++#include "prims/jvmtiExport.hpp" ++#include "prims/jvmtiThreadState.hpp" ++#include "runtime/arguments.hpp" ++#include "runtime/deoptimization.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/globals.hpp" ++#include "runtime/jniHandles.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "runtime/synchronizer.hpp" ++#include "runtime/timer.hpp" ++#include "runtime/vframeArray.hpp" ++#include "utilities/debug.hpp" ++ ++#define __ _masm-> ++ ++int TemplateInterpreter::InterpreterCodeSize = 500 * K; ++ ++#ifdef PRODUCT ++#define BLOCK_COMMENT(str) /* nothing */ ++#else ++#define BLOCK_COMMENT(str) __ block_comment(str) ++#endif ++ ++address TemplateInterpreterGenerator::generate_slow_signature_handler() { ++ address entry = __ pc(); ++ // Rmethod: method ++ // LVP: pointer to locals ++ // A3: first stack arg ++ __ move(A3, SP); ++ __ addi_d(SP, SP, -18 * wordSize); ++ __ st_d(RA, SP, 0); ++ __ call_VM(noreg, ++ CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::slow_signature_handler), ++ Rmethod, LVP, A3); ++ ++ // V0: result handler ++ ++ // Stack layout: ++ // ... ++ // 18 stack arg0 <--- old sp ++ // 17 floatReg arg7 ++ // ... ++ // 10 floatReg arg0 ++ // 9 float/double identifiers ++ // 8 IntReg arg7 ++ // ... ++ // 2 IntReg arg1 ++ // 1 aligned slot ++ // SP: 0 return address ++ ++ // Do FPU first so we can use A3 as temp ++ __ ld_d(A3, Address(SP, 9 * wordSize)); // float/double identifiers ++ ++ for (int i= 0; i < Argument::n_float_register_parameters_c; i++) { ++ FloatRegister floatreg = as_FloatRegister(i + FA0->encoding()); ++ Label isdouble, done; ++ ++ __ andi(AT, A3, 1 << i); ++ __ bnez(AT, isdouble); ++ __ fld_s(floatreg, SP, (10 + i) * wordSize); ++ __ b(done); ++ __ bind(isdouble); ++ __ fld_d(floatreg, SP, (10 + i) * wordSize); ++ __ bind(done); ++ } ++ ++ // A0 is for env. ++ // If the mothed is not static, A1 will be corrected in generate_native_entry. ++ for (int i= 1; i < Argument::n_int_register_parameters_c; i++) { ++ Register reg = as_Register(i + A0->encoding()); ++ __ ld_d(reg, SP, (1 + i) * wordSize); ++ } ++ ++ // A0/V0 contains the result from the call of ++ // InterpreterRuntime::slow_signature_handler so we don't touch it ++ // here. It will be loaded with the JNIEnv* later. ++ __ ld_d(RA, SP, 0); ++ __ addi_d(SP, SP, 18 * wordSize); ++ __ jr(RA); ++ return entry; ++} ++ ++/** ++ * Method entry for static native methods: ++ * int java.util.zip.CRC32.update(int crc, int b) ++ */ ++address TemplateInterpreterGenerator::generate_CRC32_update_entry() { ++ assert(UseCRC32Intrinsics, "this intrinsic is not supported"); ++ address entry = __ pc(); ++ ++ // rmethod: Method* ++ // Rsender: senderSP must preserved for slow path ++ // SP: args ++ ++ Label slow_path; ++ // If we need a safepoint check, generate full interpreter entry. 
++ __ safepoint_poll(slow_path, TREG, false /* at_return */, false /* acquire */, false /* in_nmethod */); ++ ++ // We don't generate local frame and don't align stack because ++ // we call stub code and there is no safepoint on this path. ++ ++ const Register crc = A0; // crc ++ const Register val = A1; // source java byte value ++ const Register tbl = A2; // scratch ++ ++ // Arguments are reversed on java expression stack ++ __ ld_w(val, SP, 0); // byte value ++ __ ld_w(crc, SP, wordSize); // Initial CRC ++ ++ __ li(tbl, (long)StubRoutines::crc_table_addr()); ++ ++ __ nor(crc, crc, R0); // ~crc ++ __ update_byte_crc32(crc, val, tbl); ++ __ nor(crc, crc, R0); // ~crc ++ ++ // restore caller SP ++ __ move(SP, Rsender); ++ __ jr(RA); ++ ++ // generate a vanilla native entry as the slow path ++ __ bind(slow_path); ++ __ jump_to_entry(Interpreter::entry_for_kind(Interpreter::native)); ++ return entry; ++} ++ ++/** ++ * Method entry for static native methods: ++ * int java.util.zip.CRC32.updateBytes(int crc, byte[] b, int off, int len) ++ * int java.util.zip.CRC32.updateByteBuffer(int crc, long buf, int off, int len) ++ */ ++address TemplateInterpreterGenerator::generate_CRC32_updateBytes_entry(AbstractInterpreter::MethodKind kind) { ++ assert(UseCRC32Intrinsics, "this intrinsic is not supported"); ++ address entry = __ pc(); ++ ++ // rmethod: Method* ++ // Rsender: senderSP must preserved for slow path ++ // SP: args ++ ++ Label slow_path; ++ // If we need a safepoint check, generate full interpreter entry. ++ __ safepoint_poll(slow_path, TREG, false /* at_return */, false /* acquire */, false /* in_nmethod */); ++ ++ // We don't generate local frame and don't align stack because ++ // we call stub code and there is no safepoint on this path. ++ ++ const Register crc = A0; // crc ++ const Register buf = A1; // source java byte array address ++ const Register len = A2; // length ++ const Register tmp = A3; ++ ++ const Register off = len; // offset (never overlaps with 'len') ++ ++ // Arguments are reversed on java expression stack ++ // Calculate address of start element ++ __ ld_w(off, SP, wordSize); // int offset ++ __ ld_d(buf, SP, 2 * wordSize); // byte[] buf | long buf ++ __ add_d(buf, buf, off); // + offset ++ if (kind == Interpreter::java_util_zip_CRC32_updateByteBuffer) { ++ __ ld_w(crc, SP, 4 * wordSize); // long crc ++ } else { ++ __ addi_d(buf, buf, arrayOopDesc::base_offset_in_bytes(T_BYTE)); // + header size ++ __ ld_w(crc, SP, 3 * wordSize); // long crc ++ } ++ ++ // Can now load 'len' since we're finished with 'off' ++ __ ld_w(len, SP, 0); // length ++ ++ __ kernel_crc32(crc, buf, len, tmp); ++ ++ // restore caller SP ++ __ move(SP, Rsender); ++ __ jr(RA); ++ ++ // generate a vanilla native entry as the slow path ++ __ bind(slow_path); ++ __ jump_to_entry(Interpreter::entry_for_kind(Interpreter::native)); ++ return entry; ++} ++ ++/** ++ * Method entry for intrinsic-candidate (non-native) methods: ++ * int java.util.zip.CRC32C.updateBytes(int crc, byte[] b, int off, int end) ++ * int java.util.zip.CRC32C.updateDirectByteBuffer(int crc, long buf, int off, int end) ++ * Unlike CRC32, CRC32C does not have any methods marked as native ++ * CRC32C also uses an "end" variable instead of the length variable CRC32 uses ++ */ ++address TemplateInterpreterGenerator::generate_CRC32C_updateBytes_entry(AbstractInterpreter::MethodKind kind) { ++ assert(UseCRC32CIntrinsics, "this intrinsic is not supported"); ++ address entry = __ pc(); ++ ++ const Register crc = A0; // initial crc ++ const 
Register buf = A1; // source java byte array address ++ const Register len = A2; // len argument to the kernel ++ const Register tmp = A3; ++ ++ const Register end = len; // index of last element to process ++ const Register off = crc; // offset ++ ++ __ ld_w(end, SP, 0); // int end ++ __ ld_w(off, SP, wordSize); // int offset ++ __ sub_w(len, end, off); // calculate length ++ __ ld_d(buf, SP, 2 * wordSize); // byte[] buf | long buf ++ __ add_d(buf, buf, off); // + offset ++ if (kind == Interpreter::java_util_zip_CRC32C_updateDirectByteBuffer) { ++ __ ld_w(crc, SP, 4 * wordSize); // int crc ++ } else { ++ __ addi_d(buf, buf, arrayOopDesc::base_offset_in_bytes(T_BYTE)); // + header size ++ __ ld_w(crc, SP, 3 * wordSize); // int crc ++ } ++ ++ __ kernel_crc32c(crc, buf, len, tmp); ++ ++ // restore caller SP ++ __ move(SP, Rsender); ++ __ jr(RA); ++ ++ return entry; ++} ++ ++// ++// Various method entries ++// ++ ++address TemplateInterpreterGenerator::generate_math_entry(AbstractInterpreter::MethodKind kind) { ++ // These don't need a safepoint check because they aren't virtually ++ // callable. We won't enter these intrinsics from compiled code. ++ // If in the future we added an intrinsic which was virtually callable ++ // we'd have to worry about how to safepoint so that this code is used. ++ ++ // mathematical functions inlined by compiler ++ // (interpreter must provide identical implementation ++ // in order to avoid monotonicity bugs when switching ++ // from interpreter to compiler in the middle of some ++ // computation) ++ // ++ // stack: ++ // [ arg ] <-- sp ++ // [ arg ] ++ // retaddr in ra ++ ++ address entry_point = nullptr; ++ switch (kind) { ++ case Interpreter::java_lang_math_abs: ++ entry_point = __ pc(); ++ __ fld_d(FA0, SP, 0); ++ __ fabs_d(F0, FA0); ++ __ move(SP, Rsender); ++ break; ++ case Interpreter::java_lang_math_sqrt: ++ entry_point = __ pc(); ++ __ fld_d(FA0, SP, 0); ++ __ fsqrt_d(F0, FA0); ++ __ move(SP, Rsender); ++ break; ++ case Interpreter::java_lang_math_sin : ++ case Interpreter::java_lang_math_cos : ++ case Interpreter::java_lang_math_tan : ++ case Interpreter::java_lang_math_log : ++ case Interpreter::java_lang_math_log10 : ++ case Interpreter::java_lang_math_exp : ++ entry_point = __ pc(); ++ __ fld_d(FA0, SP, 0); ++ __ move(SP, Rsender); ++ __ movgr2fr_d(FS0, RA); ++ __ movgr2fr_d(FS1, SP); ++ __ bstrins_d(SP, R0, exact_log2(StackAlignmentInBytes) - 1, 0); ++ generate_transcendental_entry(kind, 1); ++ __ movfr2gr_d(SP, FS1); ++ __ movfr2gr_d(RA, FS0); ++ break; ++ case Interpreter::java_lang_math_pow : ++ entry_point = __ pc(); ++ __ fld_d(FA0, SP, 2 * Interpreter::stackElementSize); ++ __ fld_d(FA1, SP, 0); ++ __ move(SP, Rsender); ++ __ movgr2fr_d(FS0, RA); ++ __ movgr2fr_d(FS1, SP); ++ __ bstrins_d(SP, R0, exact_log2(StackAlignmentInBytes) - 1, 0); ++ generate_transcendental_entry(kind, 2); ++ __ movfr2gr_d(SP, FS1); ++ __ movfr2gr_d(RA, FS0); ++ break; ++ case Interpreter::java_lang_math_fmaD : ++ if (UseFMA) { ++ entry_point = __ pc(); ++ __ fld_d(FA0, SP, 4 * Interpreter::stackElementSize); ++ __ fld_d(FA1, SP, 2 * Interpreter::stackElementSize); ++ __ fld_d(FA2, SP, 0); ++ __ fmadd_d(F0, FA0, FA1, FA2); ++ __ move(SP, Rsender); ++ } ++ break; ++ case Interpreter::java_lang_math_fmaF : ++ if (UseFMA) { ++ entry_point = __ pc(); ++ __ fld_s(FA0, SP, 2 * Interpreter::stackElementSize); ++ __ fld_s(FA1, SP, Interpreter::stackElementSize); ++ __ fld_s(FA2, SP, 0); ++ __ fmadd_s(F0, FA0, FA1, FA2); ++ __ move(SP, Rsender); ++ } ++ break; ++ default: 
++ ; ++ } ++ if (entry_point) { ++ __ jr(RA); ++ } ++ ++ return entry_point; ++} ++ ++/** ++ * Method entry for static method: ++ * java.lang.Float.float16ToFloat(short floatBinary16) ++ */ ++address TemplateInterpreterGenerator::generate_Float_float16ToFloat_entry() { ++ assert(VM_Version::supports_float16(), "this intrinsic is not supported"); ++ address entry_point = __ pc(); ++ __ ld_w(A0, SP, 0); ++ __ flt16_to_flt(F0, A0, F1); ++ __ move(SP, Rsender); // Restore caller's SP ++ __ jr(RA); ++ return entry_point; ++} ++ ++/** ++ * Method entry for static method: ++ * java.lang.Float.floatToFloat16(float value) ++ */ ++address TemplateInterpreterGenerator::generate_Float_floatToFloat16_entry() { ++ assert(VM_Version::supports_float16(), "this intrinsic is not supported"); ++ address entry_point = __ pc(); ++ __ fld_s(F0, SP, 0); ++ __ flt_to_flt16(A0, F0, F1); ++ __ move(SP, Rsender); // Restore caller's SP ++ __ jr(RA); ++ return entry_point; ++} ++ ++// Not supported ++address TemplateInterpreterGenerator::generate_Float_intBitsToFloat_entry() { return nullptr; } ++address TemplateInterpreterGenerator::generate_Float_floatToRawIntBits_entry() { return nullptr; } ++address TemplateInterpreterGenerator::generate_Double_longBitsToDouble_entry() { return nullptr; } ++address TemplateInterpreterGenerator::generate_Double_doubleToRawLongBits_entry() { return nullptr; } ++ ++// Method entry for java.lang.Thread.currentThread ++address TemplateInterpreterGenerator::generate_currentThread() { ++ ++ address entry_point = __ pc(); ++ ++ __ ld_d(A0, Address(TREG, JavaThread::vthread_offset())); ++ __ resolve_oop_handle(A0, SCR2, SCR1); ++ __ jr(RA); ++ ++ return entry_point; ++} ++ ++ // double trigonometrics and transcendentals ++ // static jdouble dsin(jdouble x); ++ // static jdouble dcos(jdouble x); ++ // static jdouble dtan(jdouble x); ++ // static jdouble dlog(jdouble x); ++ // static jdouble dlog10(jdouble x); ++ // static jdouble dexp(jdouble x); ++ // static jdouble dpow(jdouble x, jdouble y); ++ ++void TemplateInterpreterGenerator::generate_transcendental_entry(AbstractInterpreter::MethodKind kind, int fpargs) { ++ address fn; ++ switch (kind) { ++ case Interpreter::java_lang_math_sin : ++ if (StubRoutines::dsin() == nullptr) { ++ fn = CAST_FROM_FN_PTR(address, SharedRuntime::dsin); ++ } else { ++ fn = CAST_FROM_FN_PTR(address, StubRoutines::dsin()); ++ } ++ break; ++ case Interpreter::java_lang_math_cos : ++ if (StubRoutines::dcos() == nullptr) { ++ fn = CAST_FROM_FN_PTR(address, SharedRuntime::dcos); ++ } else { ++ fn = CAST_FROM_FN_PTR(address, StubRoutines::dcos()); ++ } ++ break; ++ case Interpreter::java_lang_math_tan : ++ if (StubRoutines::dtan() == nullptr) { ++ fn = CAST_FROM_FN_PTR(address, SharedRuntime::dtan); ++ } else { ++ fn = CAST_FROM_FN_PTR(address, StubRoutines::dtan()); ++ } ++ break; ++ case Interpreter::java_lang_math_log : ++ if (StubRoutines::dlog() == nullptr) { ++ fn = CAST_FROM_FN_PTR(address, SharedRuntime::dlog); ++ } else { ++ fn = CAST_FROM_FN_PTR(address, StubRoutines::dlog()); ++ } ++ break; ++ case Interpreter::java_lang_math_log10 : ++ if (StubRoutines::dlog10() == nullptr) { ++ fn = CAST_FROM_FN_PTR(address, SharedRuntime::dlog10); ++ } else { ++ fn = CAST_FROM_FN_PTR(address, StubRoutines::dlog10()); ++ } ++ break; ++ case Interpreter::java_lang_math_exp : ++ if (StubRoutines::dexp() == nullptr) { ++ fn = CAST_FROM_FN_PTR(address, SharedRuntime::dexp); ++ } else { ++ fn = CAST_FROM_FN_PTR(address, StubRoutines::dexp()); ++ } ++ break; ++ case 
Interpreter::java_lang_math_pow : ++ if (StubRoutines::dpow() == nullptr) { ++ fn = CAST_FROM_FN_PTR(address, SharedRuntime::dpow); ++ } else { ++ fn = CAST_FROM_FN_PTR(address, StubRoutines::dpow()); ++ } ++ break; ++ default: ++ ShouldNotReachHere(); ++ fn = nullptr; // unreachable ++ } ++ __ li(T4, fn); ++ __ jalr(T4); ++} ++ ++// Abstract method entry ++// Attempt to execute abstract method. Throw exception ++address TemplateInterpreterGenerator::generate_abstract_entry(void) { ++ ++ // Rmethod: Method* ++ // V0: receiver (unused) ++ // Rsender : sender 's sp ++ address entry_point = __ pc(); ++ ++ // abstract method entry ++ // throw exception ++ // adjust stack to what a normal return would do ++ __ empty_expression_stack(); ++ __ restore_bcp(); ++ __ restore_locals(); ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::throw_AbstractMethodErrorWithMethod), Rmethod); ++ // the call_VM checks for exception, so we should never return here. ++ __ should_not_reach_here(); ++ ++ return entry_point; ++} ++ ++ ++const int method_offset = frame::interpreter_frame_method_offset * wordSize; ++const int bci_offset = frame::interpreter_frame_bcp_offset * wordSize; ++const int locals_offset = frame::interpreter_frame_locals_offset * wordSize; ++ ++//----------------------------------------------------------------------------- ++ ++address TemplateInterpreterGenerator::generate_StackOverflowError_handler() { ++ address entry = __ pc(); ++ ++#ifdef ASSERT ++ { ++ Label L; ++ __ addi_d(T1, FP, frame::interpreter_frame_monitor_block_top_offset * wordSize); ++ __ sub_d(T1, T1, SP); // T1 = maximal sp for current fp ++ __ bge(T1, R0, L); // check if frame is complete ++ __ stop("interpreter frame not set up"); ++ __ bind(L); ++ } ++#endif // ASSERT ++ // Restore bcp under the assumption that the current frame is still ++ // interpreted ++ __ restore_bcp(); ++ ++ // expression stack must be empty before entering the VM if an ++ // exception happened ++ __ empty_expression_stack(); ++ // throw exception ++ __ call_VM(NOREG, CAST_FROM_FN_PTR(address, InterpreterRuntime::throw_StackOverflowError)); ++ return entry; ++} ++ ++address TemplateInterpreterGenerator::generate_ArrayIndexOutOfBounds_handler() { ++ address entry = __ pc(); ++ // expression stack must be empty before entering the VM if an ++ // exception happened ++ __ empty_expression_stack(); ++ // ??? 
convention: expect array in register A1 ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::throw_ArrayIndexOutOfBoundsException), A1, A2); ++ return entry; ++} ++ ++address TemplateInterpreterGenerator::generate_ClassCastException_handler() { ++ address entry = __ pc(); ++ // expression stack must be empty before entering the VM if an ++ // exception happened ++ __ empty_expression_stack(); ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::throw_ClassCastException), FSR); ++ return entry; ++} ++ ++address TemplateInterpreterGenerator::generate_exception_handler_common( ++ const char* name, const char* message, bool pass_oop) { ++ assert(!pass_oop || message == nullptr, "either oop or message but not both"); ++ address entry = __ pc(); ++ ++ // expression stack must be empty before entering the VM if an exception happened ++ __ empty_expression_stack(); ++ // setup parameters ++ __ li(A1, (long)name); ++ if (pass_oop) { ++ __ call_VM(V0, ++ CAST_FROM_FN_PTR(address, InterpreterRuntime::create_klass_exception), A1, FSR); ++ } else { ++ __ li(A2, (long)message); ++ __ call_VM(V0, ++ CAST_FROM_FN_PTR(address, InterpreterRuntime::create_exception), A1, A2); ++ } ++ // throw exception ++ __ jmp(Interpreter::throw_exception_entry(), relocInfo::none); ++ return entry; ++} ++ ++address TemplateInterpreterGenerator::generate_return_entry_for(TosState state, int step, size_t index_size) { ++ ++ address entry = __ pc(); ++ ++ // Restore stack bottom in case i2c adjusted stack ++ __ ld_d(SP, Address(FP, frame::interpreter_frame_last_sp_offset * wordSize)); ++ // and null it as marker that sp is now tos until next java call ++ __ st_d(R0, FP, frame::interpreter_frame_last_sp_offset * wordSize); ++ ++ __ restore_bcp(); ++ __ restore_locals(); ++ ++ // mdp: T8 ++ // ret: FSR ++ // tmp: T4 ++ if (state == atos) { ++ Register mdp = T8; ++ Register tmp = T4; ++ __ profile_return_type(mdp, FSR, tmp); ++ } ++ ++ ++ const Register cache = T4; ++ const Register index = T3; ++ if (index_size == sizeof(u4)) { ++ __ load_resolved_indy_entry(cache, index); ++ __ ld_hu(cache, Address(cache, in_bytes(ResolvedIndyEntry::num_parameters_offset()))); ++ __ alsl_d(SP, cache, SP, Interpreter::logStackElementSize - 1); ++ } else { ++ __ get_cache_and_index_at_bcp(cache, index, 1, index_size); ++ __ alsl_d(AT, index, cache, Address::times_ptr - 1); ++ __ ld_d(cache, AT, in_bytes(ConstantPoolCache::base_offset() + ConstantPoolCacheEntry::flags_offset())); ++ __ andi(cache, cache, ConstantPoolCacheEntry::parameter_size_mask); ++ __ alsl_d(SP, cache, SP, Interpreter::logStackElementSize - 1); ++ } ++ ++ __ check_and_handle_popframe(TREG); ++ __ check_and_handle_earlyret(TREG); ++ ++ __ get_dispatch(); ++ __ dispatch_next(state, step); ++ ++ return entry; ++} ++ ++ ++address TemplateInterpreterGenerator::generate_deopt_entry_for(TosState state, ++ int step, ++ address continuation) { ++ address entry = __ pc(); ++ __ restore_bcp(); ++ __ restore_locals(); ++ __ get_dispatch(); ++ ++ // null last_sp until next java call ++ __ st_d(R0, FP, frame::interpreter_frame_last_sp_offset * wordSize); ++ ++#if INCLUDE_JVMCI ++ // Check if we need to take lock at entry of synchronized method. This can ++ // only occur on method entry so emit it only for vtos with step 0. ++ if (EnableJVMCI && state == vtos && step == 0) { ++ Label L; ++ __ ld_b(AT, Address(TREG, JavaThread::pending_monitorenter_offset())); ++ __ beqz(AT, L); ++ // Clear flag. 
++ __ st_b(R0, Address(TREG, JavaThread::pending_monitorenter_offset())); ++ // Take lock. ++ lock_method(); ++ __ bind(L); ++ } else { ++#ifdef ASSERT ++ if (EnableJVMCI) { ++ Label L; ++ __ ld_b(AT, Address(TREG, JavaThread::pending_monitorenter_offset())); ++ __ beqz(AT, L); ++ __ stop("unexpected pending monitor in deopt entry"); ++ __ bind(L); ++ } ++#endif ++ } ++#endif ++ ++ // handle exceptions ++ { ++ Label L; ++ __ ld_d(AT, TREG, in_bytes(Thread::pending_exception_offset())); ++ __ beq(AT, R0, L); ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::throw_pending_exception)); ++ __ should_not_reach_here(); ++ __ bind(L); ++ } ++ if (continuation == nullptr) { ++ __ dispatch_next(state, step); ++ } else { ++ __ jump_to_entry(continuation); ++ } ++ return entry; ++} ++ ++address TemplateInterpreterGenerator::generate_result_handler_for(BasicType type) { ++ address entry = __ pc(); ++ if (type == T_OBJECT) { ++ // retrieve result from frame ++ __ ld_d(V0, FP, frame::interpreter_frame_oop_temp_offset * wordSize); ++ // and verify it ++ __ verify_oop(V0); ++ } else { ++ __ cast_primitive_type(type, V0); ++ } ++ ++ __ jr(RA); // return from result handler ++ return entry; ++} ++ ++address TemplateInterpreterGenerator::generate_safept_entry_for( ++ TosState state, ++ address runtime_entry) { ++ address entry = __ pc(); ++ __ push(state); ++ __ push_cont_fastpath(TREG); ++ __ call_VM(noreg, runtime_entry); ++ __ pop_cont_fastpath(TREG); ++ __ dispatch_via(vtos, Interpreter::_normal_table.table_for(vtos)); ++ return entry; ++} ++ ++ ++ ++// Helpers for commoning out cases in the various type of method entries. ++// ++ ++ ++// increment invocation count & check for overflow ++// ++// Note: checking for negative value instead of overflow ++// so we have a 'sticky' overflow test ++// ++// Rmethod: method ++void TemplateInterpreterGenerator::generate_counter_incr(Label* overflow) { ++ Label done; ++ int increment = InvocationCounter::count_increment; ++ Label no_mdo; ++ if (ProfileInterpreter) { ++ // Are we profiling? ++ __ ld_d(T0, Address(Rmethod, Method::method_data_offset())); ++ __ beq(T0, R0, no_mdo); ++ // Increment counter in the MDO ++ const Address mdo_invocation_counter(T0, in_bytes(MethodData::invocation_counter_offset()) + ++ in_bytes(InvocationCounter::counter_offset())); ++ const Address mask(T0, in_bytes(MethodData::invoke_mask_offset())); ++ __ increment_mask_and_jump(mdo_invocation_counter, increment, mask, T1, false, Assembler::zero, overflow); ++ __ b(done); ++ } ++ __ bind(no_mdo); ++ // Increment counter in MethodCounters ++ const Address invocation_counter(T0, ++ MethodCounters::invocation_counter_offset() + ++ InvocationCounter::counter_offset()); ++ __ get_method_counters(Rmethod, T0, done); ++ const Address mask(T0, in_bytes(MethodCounters::invoke_mask_offset())); ++ __ increment_mask_and_jump(invocation_counter, increment, mask, T1, false, Assembler::zero, overflow); ++ __ bind(done); ++} ++ ++void TemplateInterpreterGenerator::generate_counter_overflow(Label& do_continue) { ++ ++ // Asm interpreter on entry ++ // S7 - locals ++ // S0 - bcp ++ // Rmethod - method ++ // FP - interpreter frame ++ ++ // On return (i.e. 
jump to entry_point) ++ // Rmethod - method ++ // RA - return address of interpreter caller ++ // tos - the last parameter to Java method ++ // SP - sender_sp ++ ++ // the bcp is valid if and only if it's not null ++ __ call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::frequency_counter_overflow), R0); ++ __ ld_d(Rmethod, FP, method_offset); ++ // Preserve invariant that S0/S7 contain bcp/locals of sender frame ++ __ b_far(do_continue); ++} ++ ++// See if we've got enough room on the stack for locals plus overhead. ++// The expression stack grows down incrementally, so the normal guard ++// page mechanism will work for that. ++// ++// NOTE: Since the additional locals are also always pushed (wasn't ++// obvious in generate_method_entry) so the guard should work for them ++// too. ++// ++// Args: ++// T2: number of additional locals this frame needs (what we must check) ++// T0: Method* ++// ++void TemplateInterpreterGenerator::generate_stack_overflow_check(void) { ++ // see if we've got enough room on the stack for locals plus overhead. ++ // the expression stack grows down incrementally, so the normal guard ++ // page mechanism will work for that. ++ // ++ // Registers live on entry: ++ // ++ // T0: Method* ++ // T2: number of additional locals this frame needs (what we must check) ++ ++ // NOTE: since the additional locals are also always pushed (wasn't obvious in ++ // generate_method_entry) so the guard should work for them too. ++ // ++ ++ const int entry_size = frame::interpreter_frame_monitor_size_in_bytes(); ++ ++ // total overhead size: entry_size + (saved fp thru expr stack bottom). ++ // be sure to change this if you add/subtract anything to/from the overhead area ++ const int overhead_size = -(frame::interpreter_frame_initial_sp_offset*wordSize) ++ + entry_size; ++ ++ const size_t page_size = os::vm_page_size(); ++ Label after_frame_check; ++ ++ // see if the frame is greater than one page in size. If so, ++ // then we need to verify there is enough stack space remaining ++ // for the additional locals. ++ __ li(AT, (page_size - overhead_size) / Interpreter::stackElementSize); ++ __ bge(AT, T2, after_frame_check); ++ ++ // compute sp as if this were going to be the last frame on ++ // the stack before the red zone ++ ++ // locals + overhead, in bytes ++ __ slli_d(T3, T2, Interpreter::logStackElementSize); ++ __ addi_d(T3, T3, overhead_size); // locals * 4 + overhead_size --> T3 ++ ++#ifdef ASSERT ++ Label stack_base_okay, stack_size_okay; ++ // verify that thread stack base is non-zero ++ __ ld_d(AT, TREG, in_bytes(Thread::stack_base_offset())); ++ __ bne(AT, R0, stack_base_okay); ++ __ stop("stack base is zero"); ++ __ bind(stack_base_okay); ++ // verify that thread stack size is non-zero ++ __ ld_d(AT, TREG, in_bytes(Thread::stack_size_offset())); ++ __ bne(AT, R0, stack_size_okay); ++ __ stop("stack size is zero"); ++ __ bind(stack_size_okay); ++#endif ++ ++ // Add stack base to locals and subtract stack size ++ __ ld_d(AT, TREG, in_bytes(Thread::stack_base_offset())); // stack_base --> AT ++ __ add_d(T3, T3, AT); // locals * 4 + overhead_size + stack_base--> T3 ++ __ ld_d(AT, TREG, in_bytes(Thread::stack_size_offset())); // stack_size --> AT ++ __ sub_d(T3, T3, AT); // locals * 4 + overhead_size + stack_base - stack_size --> T3 ++ ++ // Use the bigger size for banging. 
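++  // Explanatory summary of the check below (note, not generated code):
++  //   limit = stack_base - stack_size + MAX2(shadow_zone, guard_zone)
++  //           + additional_locals_in_bytes + overhead_size
++  //   if (limit < SP) the new frame fits; otherwise StackOverflowError is
++  //   thrown via the shared runtime stub.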
++ const int max_bang_size = (int)MAX2(StackOverflow::stack_shadow_zone_size(), StackOverflow::stack_guard_zone_size()); ++ ++ // add in the redzone and yellow size ++ __ li(AT, max_bang_size); ++ __ add_d(T3, T3, AT); ++ ++ // check against the current stack bottom ++ __ blt(T3, SP, after_frame_check); ++ ++ // Note: the restored frame is not necessarily interpreted. ++ // Use the shared runtime version of the StackOverflowError. ++ __ move(SP, Rsender); ++ assert(StubRoutines::throw_StackOverflowError_entry() != nullptr, "stub not yet generated"); ++ __ jmp(StubRoutines::throw_StackOverflowError_entry(), relocInfo::runtime_call_type); ++ ++ // all done with frame size check ++ __ bind(after_frame_check); ++} ++ ++// Allocate monitor and lock method (asm interpreter) ++// Rmethod - Method* ++void TemplateInterpreterGenerator::lock_method(void) { ++ // synchronize method ++ const int entry_size = frame::interpreter_frame_monitor_size_in_bytes(); ++ ++#ifdef ASSERT ++ { Label L; ++ __ ld_w(T0, Rmethod, in_bytes(Method::access_flags_offset())); ++ __ andi(T0, T0, JVM_ACC_SYNCHRONIZED); ++ __ bne(T0, R0, L); ++ __ stop("method doesn't need synchronization"); ++ __ bind(L); ++ } ++#endif // ASSERT ++ // get synchronization object ++ { ++ Label done; ++ __ ld_w(T0, Rmethod, in_bytes(Method::access_flags_offset())); ++ __ andi(T2, T0, JVM_ACC_STATIC); ++ __ ld_d(T0, LVP, Interpreter::local_offset_in_bytes(0)); ++ __ beq(T2, R0, done); ++ __ load_mirror(T0, Rmethod, SCR2, SCR1); ++ __ bind(done); ++ } ++ // add space for monitor & lock ++ __ addi_d(SP, SP, (-1) * entry_size); // add space for a monitor entry ++ __ st_d(SP, FP, frame::interpreter_frame_monitor_block_top_offset * wordSize); ++ // set new monitor block top ++ __ st_d(T0, Address(SP, BasicObjectLock::obj_offset())); // store object ++ ++ const Register lock_reg = T0; ++ __ move(lock_reg, SP); // object address ++ __ lock_object(lock_reg); ++} ++ ++// Generate a fixed interpreter frame. This is identical setup for ++// interpreted methods and for native methods hence the shared code. ++void TemplateInterpreterGenerator::generate_fixed_frame(bool native_call) { ++ ++ // [ local var m-1 ] <--- sp ++ // ... ++ // [ local var 0 ] ++ // [ argument word n-1 ] <--- T0(sender's sp) ++ // ... ++ // [ argument word 0 ] <--- S7 ++ ++ // initialize fixed part of activation frame ++ // sender's sp in Rsender ++ int i = 2; ++ int frame_size = 11; ++ ++ __ addi_d(SP, SP, (-frame_size) * wordSize); ++ __ st_d(RA, SP, (frame_size - 1) * wordSize); // save return address ++ __ st_d(FP, SP, (frame_size - 2) * wordSize); // save sender's fp ++ __ addi_d(FP, SP, (frame_size) * wordSize); ++ __ st_d(Rsender, FP, (-++i) * wordSize); // save sender's sp ++ __ st_d(R0, FP,(-++i) * wordSize); //save last_sp as null ++ __ sub_d(AT, LVP, FP); ++ __ srli_d(AT, AT, Interpreter::logStackElementSize); ++ // Store relativized LVP, see frame::interpreter_frame_locals(). 
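++  // Note (explanatory, assumption about intent): the word stored below is
++  // (LVP - FP) >> logStackElementSize, i.e. the locals base kept as a word
++  // offset from FP rather than an absolute pointer, so it can be turned back
++  // into an address from whatever FP the frame ends up at (see
++  // frame::interpreter_frame_locals()).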
++ __ st_d(AT, FP, (-++i) * wordSize); // save locals offset ++ __ ld_d(BCP, Rmethod, in_bytes(Method::const_offset())); // get constMethodOop ++ __ addi_d(BCP, BCP, in_bytes(ConstMethod::codes_offset())); // get codebase ++ __ st_d(Rmethod, FP, (-++i) * wordSize); // save Method* ++ // Get mirror and store it in the frame as GC root for this Method* ++ __ load_mirror(T2, Rmethod, SCR2, SCR1); ++ __ st_d(T2, FP, (-++i) * wordSize); // Mirror ++ ++ if (ProfileInterpreter) { ++ Label method_data_continue; ++ __ ld_d(AT, Rmethod, in_bytes(Method::method_data_offset())); ++ __ beq(AT, R0, method_data_continue); ++ __ addi_d(AT, AT, in_bytes(MethodData::data_offset())); ++ __ bind(method_data_continue); ++ __ st_d(AT, FP, (-++i) * wordSize); ++ } else { ++ __ st_d(R0, FP, (-++i) * wordSize); ++ } ++ ++ __ ld_d(T2, Rmethod, in_bytes(Method::const_offset())); ++ __ ld_d(T2, T2, in_bytes(ConstMethod::constants_offset())); ++ __ ld_d(T2, Address(T2, ConstantPool::cache_offset())); ++ __ st_d(T2, FP, (-++i) * wordSize); // set constant pool cache ++ if (native_call) { ++ __ st_d(R0, FP, (-++i) * wordSize); // no bcp ++ } else { ++ __ st_d(BCP, FP, (-++i) * wordSize); // set bcp ++ } ++ __ st_d(SP, FP, (-++i) * wordSize); // reserve word for pointer to expression stack bottom ++ assert(i == frame_size, "i should be equal to frame_size"); ++} ++ ++// End of helpers ++ ++// Various method entries ++//------------------------------------------------------------------------------------------------------------------------ ++// ++// ++ ++// Method entry for java.lang.ref.Reference.get. ++address TemplateInterpreterGenerator::generate_Reference_get_entry(void) { ++ // Code: _aload_0, _getfield, _areturn ++ // parameter size = 1 ++ // ++ // The code that gets generated by this routine is split into 2 parts: ++ // 1. The "intrinsified" code for G1 (or any SATB based GC), ++ // 2. The slow path - which is an expansion of the regular method entry. ++ // ++ // Notes:- ++ // * In the G1 code we do not check whether we need to block for ++ // a safepoint. If G1 is enabled then we must execute the specialized ++ // code for Reference.get (except when the Reference object is null) ++ // so that we can log the value in the referent field with an SATB ++ // update buffer. ++ // If the code for the getfield template is modified so that the ++ // G1 pre-barrier code is executed when the current method is ++ // Reference.get() then going through the normal method entry ++ // will be fine. ++ // * The G1 code can, however, check the receiver object (the instance ++ // of java.lang.Reference) and jump to the slow path if null. If the ++ // Reference object is null then we obviously cannot fetch the referent ++ // and so we don't need to call the G1 pre-barrier. Thus we can use the ++ // regular method entry code to generate the NPE. ++ // ++ // This code is based on generate_accessor_entry. ++ // ++ // Rmethod: Method* ++ // Rsender: senderSP must preserve for slow path, set SP to it on fast path ++ // RA is live. It must be saved around calls. ++ ++ address entry = __ pc(); ++ ++ const int referent_offset = java_lang_ref_Reference::referent_offset(); ++ ++ Label slow_path; ++ const Register local_0 = A0; ++ // Check if local 0 != nullptr ++ // If the receiver is null then it is OK to jump to the slow path. ++ __ ld_d(local_0, Address(SP, 0)); ++ __ beqz(local_0, slow_path); ++ ++ // Load the value of the referent field. 
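++  // Explanatory note: IN_HEAP | ON_WEAK_OOP_REF routes this load through the
++  // BarrierSetAssembler, so an SATB collector such as G1 can record the
++  // referent in its pre-barrier queue even though the getfield bytecode's
++  // barrier never runs on this fast path.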
++ const Address field_address(local_0, referent_offset); ++ BarrierSetAssembler *bs = BarrierSet::barrier_set()->barrier_set_assembler(); ++ bs->load_at(_masm, IN_HEAP | ON_WEAK_OOP_REF, T_OBJECT, local_0, field_address, /*tmp1*/ SCR2, /*tmp2*/ SCR1); ++ ++ // areturn ++ __ move(SP, Rsender); ++ __ jr(RA); ++ ++ // generate a vanilla interpreter entry as the slow path ++ __ bind(slow_path); ++ __ jump_to_entry(Interpreter::entry_for_kind(Interpreter::zerolocals)); ++ return entry; ++} ++ ++// Interpreter stub for calling a native method. (asm interpreter) ++// This sets up a somewhat different looking stack for calling the ++// native method than the typical interpreter frame setup. ++address TemplateInterpreterGenerator::generate_native_entry(bool synchronized) { ++ // determine code generation flags ++ bool inc_counter = UseCompiler || CountCompiledCalls; ++ // Rsender: sender's sp ++ // Rmethod: Method* ++ address entry_point = __ pc(); ++ ++ // get parameter size (always needed) ++ // the size in the java stack ++ __ ld_d(V0, Rmethod, in_bytes(Method::const_offset())); ++ __ ld_hu(V0, V0, in_bytes(ConstMethod::size_of_parameters_offset())); ++ ++ // native calls don't need the stack size check since they have no expression stack ++ // and the arguments are already on the stack and we only add a handful of words ++ // to the stack ++ ++ // Rmethod: Method* ++ // V0: size of parameters ++ // Layout of frame at this point ++ // ++ // [ argument word n-1 ] <--- sp ++ // ... ++ // [ argument word 0 ] ++ ++ // for natives the size of locals is zero ++ ++ // compute beginning of parameters (S7) ++ __ slli_d(LVP, V0, Address::times_8); ++ __ addi_d(LVP, LVP, (-1) * wordSize); ++ __ add_d(LVP, LVP, SP); ++ ++ ++ // add 2 zero-initialized slots for native calls ++ // 1 slot for native oop temp offset (setup via runtime) ++ // 1 slot for static native result handler3 (setup via runtime) ++ __ push2(R0, R0); ++ ++ // Layout of frame at this point ++ // [ method holder mirror ] <--- sp ++ // [ result type info ] ++ // [ argument word n-1 ] <--- T0 ++ // ... ++ // [ argument word 0 ] <--- LVP ++ ++ // initialize fixed part of activation frame ++ generate_fixed_frame(true); ++ // after this function, the layout of frame is as following ++ // ++ // [ monitor block top ] <--- sp ( the top monitor entry ) ++ // [ byte code pointer (0) ] (if native, bcp = 0) ++ // [ constant pool cache ] ++ // [ Mirror ] ++ // [ Method* ] ++ // [ locals offset ] ++ // [ sender's sp ] ++ // [ sender's fp ] ++ // [ return address ] <--- fp ++ // [ method holder mirror ] ++ // [ result type info ] ++ // [ argument word n-1 ] <--- sender's sp ++ // ... ++ // [ argument word 0 ] <--- S7 ++ ++ ++ // make sure method is native & not abstract ++#ifdef ASSERT ++ __ ld_w(T0, Rmethod, in_bytes(Method::access_flags_offset())); ++ { ++ Label L; ++ __ andi(AT, T0, JVM_ACC_NATIVE); ++ __ bne(AT, R0, L); ++ __ stop("tried to execute native method as non-native"); ++ __ bind(L); ++ } ++ { ++ Label L; ++ __ andi(AT, T0, JVM_ACC_ABSTRACT); ++ __ beq(AT, R0, L); ++ __ stop("tried to execute abstract method in interpreter"); ++ __ bind(L); ++ } ++#endif ++ ++ // Since at this point in the method invocation the exception handler ++ // would try to exit the monitor of synchronized methods which hasn't ++ // been entered yet, we set the thread local variable ++ // _do_not_unlock_if_synchronized to true. The remove_activation will ++ // check this flag. 
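++  // Explanatory note: the flag is a per-JavaThread byte; it is cleared again
++  // further down, after bang_stack_shadow_pages(true), once it is safe for
++  // the exception path to unlock the receiver.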
++ ++ __ li(AT, (int)true); ++ __ st_b(AT, TREG, in_bytes(JavaThread::do_not_unlock_if_synchronized_offset())); ++ ++ // increment invocation count & check for overflow ++ Label invocation_counter_overflow; ++ if (inc_counter) { ++ generate_counter_incr(&invocation_counter_overflow); ++ } ++ ++ Label continue_after_compile; ++ __ bind(continue_after_compile); ++ ++ bang_stack_shadow_pages(true); ++ ++ // reset the _do_not_unlock_if_synchronized flag ++ __ st_b(R0, TREG, in_bytes(JavaThread::do_not_unlock_if_synchronized_offset())); ++ ++ // check for synchronized methods ++ // Must happen AFTER invocation_counter check and stack overflow check, ++ // so method is not locked if overflows. ++ if (synchronized) { ++ lock_method(); ++ } else { ++ // no synchronization necessary ++#ifdef ASSERT ++ { ++ Label L; ++ __ ld_w(T0, Rmethod, in_bytes(Method::access_flags_offset())); ++ __ andi(AT, T0, JVM_ACC_SYNCHRONIZED); ++ __ beq(AT, R0, L); ++ __ stop("method needs synchronization"); ++ __ bind(L); ++ } ++#endif ++ } ++ ++ // after method_lock, the layout of frame is as following ++ // ++ // [ monitor entry ] <--- sp ++ // ... ++ // [ monitor entry ] ++ // [ monitor block top ] ( the top monitor entry ) ++ // [ byte code pointer (0) ] (if native, bcp = 0) ++ // [ constant pool cache ] ++ // [ Mirror ] ++ // [ Method* ] ++ // [ locals offset ] ++ // [ sender's sp ] ++ // [ sender's fp ] ++ // [ return address ] <--- fp ++ // [ method holder mirror ] ++ // [ result type info ] ++ // [ argument word n-1 ] <--- ( sender's sp ) ++ // ... ++ // [ argument word 0 ] <--- S7 ++ ++ // start execution ++#ifdef ASSERT ++ { ++ Label L; ++ __ ld_d(AT, FP, frame::interpreter_frame_monitor_block_top_offset * wordSize); ++ __ beq(AT, SP, L); ++ __ stop("broken stack frame setup in interpreter in asm"); ++ __ bind(L); ++ } ++#endif ++ ++ // jvmti/jvmpi support ++ __ notify_method_entry(); ++ ++ // work registers ++ const Register method = Rmethod; ++ const Register t = T8; ++ ++ __ get_method(method); ++ { ++ Label L, Lstatic; ++ __ ld_d(t,method,in_bytes(Method::const_offset())); ++ __ ld_hu(t, t, in_bytes(ConstMethod::size_of_parameters_offset())); ++ // LoongArch ABI: caller does not reserve space for the register auguments. ++ // A0 and A1(if needed) ++ __ ld_w(AT, Rmethod, in_bytes(Method::access_flags_offset())); ++ __ andi(AT, AT, JVM_ACC_STATIC); ++ __ beq(AT, R0, Lstatic); ++ __ addi_d(t, t, 1); ++ __ bind(Lstatic); ++ __ addi_d(t, t, -7); ++ __ bge(R0, t, L); ++ __ slli_d(t, t, Address::times_8); ++ __ sub_d(SP, SP, t); ++ __ bind(L); ++ } ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); ++ __ move(AT, SP); ++ // [ ] <--- sp ++ // ... (size of parameters - 8 ) ++ // [ monitor entry ] ++ // ... ++ // [ monitor entry ] ++ // [ monitor block top ] ( the top monitor entry ) ++ // [ byte code pointer (0) ] (if native, bcp = 0) ++ // [ constant pool cache ] ++ // [ Mirror ] ++ // [ Method* ] ++ // [ locals offset ] ++ // [ sender's sp ] ++ // [ sender's fp ] ++ // [ return address ] <--- fp ++ // [ method holder mirror ] ++ // [ result type info ] ++ // [ argument word n-1 ] <--- ( sender's sp ) ++ // ... 
++ // [ argument word 0 ] <--- LVP ++ ++ // get signature handler ++ { ++ Label L; ++ __ ld_d(T4, method, in_bytes(Method::signature_handler_offset())); ++ __ bne(T4, R0, L); ++ __ call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::prepare_native_call), method); ++ __ get_method(method); ++ __ ld_d(T4, method, in_bytes(Method::signature_handler_offset())); ++ __ bind(L); ++ } ++ ++ // call signature handler ++ // FIXME: when change codes in InterpreterRuntime, note this point ++ // from: begin of parameters ++ assert(InterpreterRuntime::SignatureHandlerGenerator::from() == LVP, "adjust this code"); ++ // to: current sp ++ assert(InterpreterRuntime::SignatureHandlerGenerator::to () == SP, "adjust this code"); ++ // temp: T3 ++ assert(InterpreterRuntime::SignatureHandlerGenerator::temp() == t , "adjust this code"); ++ ++ __ jalr(T4); ++ __ get_method(method); ++ ++ // ++ // if native function is static, and its second parameter has type length of double word, ++ // and first parameter has type length of word, we have to reserve one word ++ // for the first parameter, according to LoongArch abi. ++ // if native function is not static, and its third parameter has type length of double word, ++ // and second parameter has type length of word, we have to reserve one word for the second ++ // parameter. ++ // ++ ++ ++ // result handler is in V0 ++ // set result handler ++ __ st_d(V0, FP, (frame::interpreter_frame_result_handler_offset)*wordSize); ++ ++#define FIRSTPARA_SHIFT_COUNT 5 ++#define SECONDPARA_SHIFT_COUNT 9 ++#define THIRDPARA_SHIFT_COUNT 13 ++#define PARA_MASK 0xf ++ ++ // pass mirror handle if static call ++ { ++ Label L; ++ __ ld_w(t, method, in_bytes(Method::access_flags_offset())); ++ __ andi(AT, t, JVM_ACC_STATIC); ++ __ beq(AT, R0, L); ++ ++ // get mirror ++ __ load_mirror(t, method, SCR2, SCR1); ++ // copy mirror into activation frame ++ __ st_d(t, FP, frame::interpreter_frame_oop_temp_offset * wordSize); ++ // pass handle to mirror ++ __ addi_d(t, FP, frame::interpreter_frame_oop_temp_offset * wordSize); ++ __ move(A1, t); ++ __ bind(L); ++ } ++ ++ // [ mthd holder mirror ptr ] <--- sp --------------------| (only for static method) ++ // [ ] | ++ // ... size of parameters(or +1) | ++ // [ monitor entry ] | ++ // ... | ++ // [ monitor entry ] | ++ // [ monitor block top ] ( the top monitor entry ) | ++ // [ byte code pointer (0) ] (if native, bcp = 0) | ++ // [ constant pool cache ] | ++ // [ Mirror ] | ++ // [ Method* ] | ++ // [ locals offset ] | ++ // [ sender's sp ] | ++ // [ sender's fp ] | ++ // [ return address ] <--- fp | ++ // [ method holder mirror ] <----------------------------| ++ // [ result type info ] ++ // [ argument word n-1 ] <--- ( sender's sp ) ++ // ... ++ // [ argument word 0 ] <--- S7 ++ ++ // get native function entry point ++ { Label L; ++ __ ld_d(T4, method, in_bytes(Method::native_function_offset())); ++ __ li(T6, SharedRuntime::native_method_throw_unsatisfied_link_error_entry()); ++ __ bne(T6, T4, L); ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::prepare_native_call), method); ++ __ get_method(method); ++ __ ld_d(T4, method, in_bytes(Method::native_function_offset())); ++ __ bind(L); ++ } ++ ++ // pass JNIEnv ++ __ addi_d(A0, TREG, in_bytes(JavaThread::jni_environment_offset())); ++ // [ jni environment ] <--- sp ++ // [ mthd holder mirror ptr ] ---------------------------->| (only for static method) ++ // [ ] | ++ // ... size of parameters | ++ // [ monitor entry ] | ++ // ... 
| ++ // [ monitor entry ] | ++ // [ monitor block top ] ( the top monitor entry ) | ++ // [ byte code pointer (0) ] (if native, bcp = 0) | ++ // [ constant pool cache ] | ++ // [ Mirror ] | ++ // [ Method* ] | ++ // [ locals offset ] | ++ // [ sender's sp ] | ++ // [ sender's fp ] | ++ // [ return address ] <--- fp | ++ // [ method holder mirror ] <----------------------------| ++ // [ result type info ] ++ // [ argument word n-1 ] <--- ( sender's sp ) ++ // ... ++ // [ argument word 0 ] <--- S7 ++ ++ // Set the last Java PC in the frame anchor to be the return address from ++ // the call to the native method: this will allow the debugger to ++ // generate an accurate stack trace. ++ Label native_return; ++ __ set_last_Java_frame(TREG, SP, FP, native_return); ++ ++ // change thread state ++#ifdef ASSERT ++ { ++ Label L; ++ __ ld_w(t, TREG, in_bytes(JavaThread::thread_state_offset())); ++ __ addi_d(t, t, (-1) * _thread_in_Java); ++ __ beq(t, R0, L); ++ __ stop("Wrong thread state in native stub"); ++ __ bind(L); ++ } ++#endif ++ ++ __ li(t, _thread_in_native); ++ if (os::is_MP()) { ++ __ addi_d(AT, TREG, in_bytes(JavaThread::thread_state_offset())); ++ __ amswap_db_w(R0, t, AT); ++ } else { ++ __ st_w(t, TREG, in_bytes(JavaThread::thread_state_offset())); ++ } ++ ++ // call native method ++ __ jalr(T4); ++ __ bind(native_return); ++ // result potentially in V0 or F0 ++ ++ ++ // via _last_native_pc and not via _last_jave_sp ++ // NOTE: the order of these push(es) is known to frame::interpreter_frame_result. ++ // If the order changes or anything else is added to the stack the code in ++ // interpreter_frame_result will have to be changed. ++ //FIXME, should modify here ++ // save return value to keep the value from being destroyed by other calls ++ __ push(dtos); ++ __ push(ltos); ++ ++ // change thread state ++ __ li(t, _thread_in_native_trans); ++ if (os::is_MP()) { ++ __ addi_d(AT, TREG, in_bytes(JavaThread::thread_state_offset())); ++ __ amswap_db_w(R0, t, AT); // Release-Store ++ ++ // Force this write out before the read below ++ if (!UseSystemMemoryBarrier) { ++ __ membar(__ AnyAny); ++ } ++ } else { ++ __ st_w(t, TREG, in_bytes(JavaThread::thread_state_offset())); ++ } ++ ++ // check for safepoint operation in progress and/or pending suspend requests ++ { Label Continue; ++ ++ // Don't use call_VM as it will see a possible pending exception and forward it ++ // and never return here preventing us from clearing _last_native_pc down below. ++ // Also can't use call_VM_leaf either as it will check to see if BCP & LVP are ++ // preserved and correspond to the bcp/locals pointers. So we do a runtime call ++ // by hand. ++ // ++ Label slow_path; ++ ++ // We need an acquire here to ensure that any subsequent load of the ++ // global SafepointSynchronize::_state flag is ordered after this load ++ // of the thread-local polling word. We don't want this poll to ++ // return false (i.e. not safepointing) and a later poll of the global ++ // SafepointSynchronize::_state spuriously to return true. ++ // ++ // This is to avoid a race when we're in a native->Java transition ++ // racing the code which wakes up from a safepoint. 
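++    // Explanatory summary of the transition protocol around this poll:
++    //   1. thread state was set to _thread_in_native_trans above (store-release)
++    //   2. poll the thread-local polling word (acquire) and the suspend flags
++    //   3. if either indicates pending work, call
++    //      JavaThread::check_special_condition_for_native_trans
++    //   4. only then is the state advanced to _thread_in_Java below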
++ __ safepoint_poll(slow_path, TREG, true /* at_return */, true /* acquire */, false /* in_nmethod */); ++ __ ld_w(AT, TREG, in_bytes(JavaThread::suspend_flags_offset())); ++ __ beq(AT, R0, Continue); ++ __ bind(slow_path); ++ __ move(A0, TREG); ++ __ call(CAST_FROM_FN_PTR(address, JavaThread::check_special_condition_for_native_trans), ++ relocInfo::runtime_call_type); ++ ++ //add for compressedoops ++ __ reinit_heapbase(); ++ __ bind(Continue); ++ } ++ ++ // change thread state ++ __ li(t, _thread_in_Java); ++ if (os::is_MP()) { ++ __ addi_d(AT, TREG, in_bytes(JavaThread::thread_state_offset())); ++ __ amswap_db_w(R0, t, AT); ++ } else { ++ __ st_w(t, TREG, in_bytes(JavaThread::thread_state_offset())); ++ } ++ __ reset_last_Java_frame(TREG, true); ++ ++ if (CheckJNICalls) { ++ // clear_pending_jni_exception_check ++ __ st_d(R0, TREG, in_bytes(JavaThread::pending_jni_exception_check_fn_offset())); ++ } ++ ++ // reset handle block ++ __ ld_d(t, TREG, in_bytes(JavaThread::active_handles_offset())); ++ __ st_w(R0, Address(t, JNIHandleBlock::top_offset())); ++ ++ // If result was an oop then unbox and save it in the frame ++ { ++ Label no_oop; ++ __ ld_d(AT, FP, frame::interpreter_frame_result_handler_offset*wordSize); ++ __ li(T0, AbstractInterpreter::result_handler(T_OBJECT)); ++ __ bne(AT, T0, no_oop); ++ __ pop(ltos); ++ // Unbox oop result, e.g. JNIHandles::resolve value. ++ __ resolve_jobject(V0, SCR2, SCR1); ++ __ st_d(V0, FP, (frame::interpreter_frame_oop_temp_offset)*wordSize); ++ // keep stack depth as expected by pushing oop which will eventually be discarded ++ __ push(ltos); ++ __ bind(no_oop); ++ } ++ { ++ Label no_reguard; ++ __ ld_w(t, TREG, in_bytes(JavaThread::stack_guard_state_offset())); ++ __ li(AT, (u1)StackOverflow::stack_guard_yellow_reserved_disabled); ++ __ bne(t, AT, no_reguard); ++ __ push_call_clobbered_registers(); ++ __ move(S5_heapbase, SP); ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); ++ __ call(CAST_FROM_FN_PTR(address, SharedRuntime::reguard_yellow_pages), relocInfo::runtime_call_type); ++ __ move(SP, S5_heapbase); ++ __ pop_call_clobbered_registers(); ++ //add for compressedoops ++ __ reinit_heapbase(); ++ __ bind(no_reguard); ++ } ++ // restore BCP to have legal interpreter frame, ++ // i.e., bci == 0 <=> BCP == code_base() ++ // Can't call_VM until bcp is within reasonable. ++ __ get_method(method); // method is junk from thread_in_native to now. ++ __ ld_d(BCP, method, in_bytes(Method::const_offset())); ++ __ lea(BCP, Address(BCP, in_bytes(ConstMethod::codes_offset()))); ++ // handle exceptions (exception handling will handle unlocking!) ++ { ++ Label L; ++ __ ld_d(t, TREG, in_bytes(Thread::pending_exception_offset())); ++ __ beq(t, R0, L); ++ // Note: At some point we may want to unify this with the code used in ++ // call_VM_base(); ++ // i.e., we should use the StubRoutines::forward_exception code. For now this ++ // doesn't work here because the sp is not correctly set at this point. 
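++    // Explanatory note: throw_pending_exception unwinds through the exception
++    // mechanism and does not return here, hence the should_not_reach_here()
++    // that follows the call.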
++ __ MacroAssembler::call_VM(noreg, ++ CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::throw_pending_exception)); ++ __ should_not_reach_here(); ++ __ bind(L); ++ } ++ ++ // do unlocking if necessary ++ { ++ const Register monitor_reg = T0; ++ Label L; ++ __ ld_w(t, method, in_bytes(Method::access_flags_offset())); ++ __ andi(t, t, JVM_ACC_SYNCHRONIZED); ++ __ addi_d(monitor_reg, FP, frame::interpreter_frame_initial_sp_offset * wordSize - (int)sizeof(BasicObjectLock)); ++ __ beq(t, R0, L); ++ // the code below should be shared with interpreter macro assembler implementation ++ { ++ Label unlock; ++ // BasicObjectLock will be first in list, ++ // since this is a synchronized method. However, need ++ // to check that the object has not been unlocked by ++ // an explicit monitorexit bytecode. ++ // address of first monitor ++ ++ __ ld_d(t, Address(monitor_reg, BasicObjectLock::obj_offset())); ++ __ bne(t, R0, unlock); ++ ++ // Entry already unlocked, need to throw exception ++ __ MacroAssembler::call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::throw_illegal_monitor_state_exception)); ++ __ should_not_reach_here(); ++ ++ __ bind(unlock); ++ __ unlock_object(monitor_reg); ++ } ++ __ bind(L); ++ } ++ ++ // jvmti/jvmpi support ++ // Note: This must happen _after_ handling/throwing any exceptions since ++ // the exception handler code notifies the runtime of method exits ++ // too. If this happens before, method entry/exit notifications are ++ // not properly paired (was bug - gri 11/22/99). ++ __ notify_method_exit(vtos, InterpreterMacroAssembler::NotifyJVMTI); ++ ++ // restore potential result in V0, ++ // call result handler to restore potential result in ST0 & handle result ++ ++ __ pop(ltos); ++ __ pop(dtos); ++ ++ __ ld_d(t, FP, (frame::interpreter_frame_result_handler_offset) * wordSize); ++ __ jalr(t); ++ ++ ++ // remove activation ++ __ ld_d(SP, FP, frame::interpreter_frame_sender_sp_offset * wordSize); // get sender sp ++ __ ld_d(RA, FP, frame::return_addr_offset * wordSize); // get return address ++ __ ld_d(FP, FP, frame::link_offset * wordSize); // restore sender's fp ++ __ jr(RA); ++ ++ if (inc_counter) { ++ // Handle overflow of counter and compile method ++ __ bind(invocation_counter_overflow); ++ generate_counter_overflow(continue_after_compile); ++ } ++ ++ return entry_point; ++} ++ ++void TemplateInterpreterGenerator::bang_stack_shadow_pages(bool native_call) { ++ // Quick & dirty stack overflow checking: bang the stack & handle trap. ++ // Note that we do the banging after the frame is setup, since the exception ++ // handling code expects to find a valid interpreter frame on the stack. ++ // Doing the banging earlier fails if the caller frame is not an interpreter ++ // frame. ++ // (Also, the exception throwing code expects to unlock any synchronized ++ // method receiever, so do the banging after locking the receiver.) ++ ++ // Bang each page in the shadow zone. We can't assume it's been done for ++ // an interpreter frame with greater than a page of locals, so each page ++ // needs to be checked. Only true for non-native. ++ const int page_size = (int)os::vm_page_size(); ++ const int n_shadow_pages = ((int)StackOverflow::stack_shadow_zone_size()) / page_size; ++ const int start_page = native_call ? 
n_shadow_pages : 1; ++ BLOCK_COMMENT("bang_stack_shadow_pages:"); ++ for (int pages = start_page; pages <= n_shadow_pages; pages++) { ++ __ bang_stack_with_offset(pages*page_size); ++ } ++} ++ ++// ++// Generic interpreted method entry to (asm) interpreter ++// ++// Layout of frame just at the entry ++// ++// [ argument word n-1 ] <--- sp ++// ... ++// [ argument word 0 ] ++// assume Method* in Rmethod before call this method. ++// prerequisites to the generated stub : the callee Method* in Rmethod ++// note you must save the caller bcp before call the generated stub ++// ++address TemplateInterpreterGenerator::generate_normal_entry(bool synchronized) { ++ // determine code generation flags ++ bool inc_counter = UseCompiler || CountCompiledCalls; ++ ++ // Rmethod: Method* ++ // Rsender: sender 's sp ++ address entry_point = __ pc(); ++ ++ const Address invocation_counter(Rmethod, ++ in_bytes(MethodCounters::invocation_counter_offset() + InvocationCounter::counter_offset())); ++ ++ // get parameter size (always needed) ++ __ ld_d(T3, Rmethod, in_bytes(Method::const_offset())); //T3 --> Rmethod._constMethod ++ __ ld_hu(V0, T3, in_bytes(ConstMethod::size_of_parameters_offset())); ++ ++ // Rmethod: Method* ++ // V0: size of parameters ++ // Rsender: sender 's sp ,could be different from sp+ wordSize if we call via c2i ++ // get size of locals in words to T2 ++ __ ld_hu(T2, T3, in_bytes(ConstMethod::size_of_locals_offset())); ++ // T2 = no. of additional locals, locals include parameters ++ __ sub_d(T2, T2, V0); ++ ++ // see if we've got enough room on the stack for locals plus overhead. ++ // Layout of frame at this point ++ // ++ // [ argument word n-1 ] <--- sp ++ // ... ++ // [ argument word 0 ] ++ generate_stack_overflow_check(); ++ // after this function, the layout of frame does not change ++ ++ // compute beginning of parameters (LVP) ++ __ slli_d(LVP, V0, LogBytesPerWord); ++ __ addi_d(LVP, LVP, (-1) * wordSize); ++ __ add_d(LVP, LVP, SP); ++ ++ // T2 - # of additional locals ++ // allocate space for locals ++ // explicitly initialize locals ++ { ++ Label exit, loop; ++ __ beq(T2, R0, exit); ++ ++ __ bind(loop); ++ __ addi_d(SP, SP, (-1) * wordSize); ++ __ addi_d(T2, T2, -1); // until everything initialized ++ __ st_d(R0, SP, 0); // initialize local variables ++ __ bne(T2, R0, loop); ++ ++ __ bind(exit); ++ } ++ ++ // And the base dispatch table ++ __ get_dispatch(); ++ ++ // initialize fixed part of activation frame ++ generate_fixed_frame(false); ++ ++ ++ // after this function, the layout of frame is as following ++ // ++ // [ monitor block top ] <--- sp ( the top monitor entry ) ++ // [ byte code pointer ] (if native, bcp = 0) ++ // [ constant pool cache ] ++ // [ Method* ] ++ // [ locals offset ] ++ // [ sender's sp ] ++ // [ sender's fp ] <--- fp ++ // [ return address ] ++ // [ local var m-1 ] ++ // ... ++ // [ local var 0 ] ++ // [ argument word n-1 ] <--- ( sender's sp ) ++ // ... 
++ // [ argument word 0 ] <--- LVP ++ ++ ++ // make sure method is not native & not abstract ++#ifdef ASSERT ++ __ ld_d(AT, Rmethod, in_bytes(Method::access_flags_offset())); ++ { ++ Label L; ++ __ andi(T2, AT, JVM_ACC_NATIVE); ++ __ beq(T2, R0, L); ++ __ stop("tried to execute native method as non-native"); ++ __ bind(L); ++ } ++ { ++ Label L; ++ __ andi(T2, AT, JVM_ACC_ABSTRACT); ++ __ beq(T2, R0, L); ++ __ stop("tried to execute abstract method in interpreter"); ++ __ bind(L); ++ } ++#endif ++ ++ // Since at this point in the method invocation the exception handler ++ // would try to exit the monitor of synchronized methods which hasn't ++ // been entered yet, we set the thread local variable ++ // _do_not_unlock_if_synchronized to true. The remove_activation will ++ // check this flag. ++ ++ __ li(AT, (int)true); ++ __ st_b(AT, TREG, in_bytes(JavaThread::do_not_unlock_if_synchronized_offset())); ++ ++ // mdp : T8 ++ // tmp1: T4 ++ // tmp2: T2 ++ __ profile_parameters_type(T8, T4, T2); ++ ++ // increment invocation count & check for overflow ++ Label invocation_counter_overflow; ++ if (inc_counter) { ++ generate_counter_incr(&invocation_counter_overflow); ++ } ++ ++ Label continue_after_compile; ++ __ bind(continue_after_compile); ++ ++ bang_stack_shadow_pages(false); ++ ++ // reset the _do_not_unlock_if_synchronized flag ++ __ st_b(R0, TREG, in_bytes(JavaThread::do_not_unlock_if_synchronized_offset())); ++ ++ // check for synchronized methods ++ // Must happen AFTER invocation_counter check and stack overflow check, ++ // so method is not locked if overflows. ++ // ++ if (synchronized) { ++ // Allocate monitor and lock method ++ lock_method(); ++ } else { ++ // no synchronization necessary ++#ifdef ASSERT ++ { Label L; ++ __ ld_w(AT, Rmethod, in_bytes(Method::access_flags_offset())); ++ __ andi(T2, AT, JVM_ACC_SYNCHRONIZED); ++ __ beq(T2, R0, L); ++ __ stop("method needs synchronization"); ++ __ bind(L); ++ } ++#endif ++ } ++ ++ // layout of frame after lock_method ++ // [ monitor entry ] <--- sp ++ // ... ++ // [ monitor entry ] ++ // [ monitor block top ] ( the top monitor entry ) ++ // [ byte code pointer ] (if native, bcp = 0) ++ // [ constant pool cache ] ++ // [ Method* ] ++ // [ locals offset ] ++ // [ sender's sp ] ++ // [ sender's fp ] ++ // [ return address ] <--- fp ++ // [ local var m-1 ] ++ // ... ++ // [ local var 0 ] ++ // [ argument word n-1 ] <--- ( sender's sp ) ++ // ... ++ // [ argument word 0 ] <--- LVP ++ ++ ++ // start execution ++#ifdef ASSERT ++ { ++ Label L; ++ __ ld_d(AT, FP, frame::interpreter_frame_monitor_block_top_offset * wordSize); ++ __ beq(AT, SP, L); ++ __ stop("broken stack frame setup in interpreter in native"); ++ __ bind(L); ++ } ++#endif ++ ++ // jvmti/jvmpi support ++ __ notify_method_entry(); ++ ++ __ dispatch_next(vtos); ++ ++ // invocation counter overflow ++ if (inc_counter) { ++ // Handle overflow of counter and compile method ++ __ bind(invocation_counter_overflow); ++ generate_counter_overflow(continue_after_compile); ++ } ++ ++ return entry_point; ++} ++ ++//----------------------------------------------------------------------------- ++// Exceptions ++ ++void TemplateInterpreterGenerator::generate_throw_exception() { ++ // Entry point in previous activation (i.e., if the caller was ++ // interpreted) ++ Interpreter::_rethrow_exception_entry = __ pc(); ++ // Restore sp to interpreter_frame_last_sp even though we are going ++ // to empty the expression stack for the exception processing. 
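++  // Explanatory note: storing null into interpreter_frame_last_sp below marks
++  // SP as the current top-of-stack until the next Java call, the same
++  // convention maintained in generate_return_entry_for().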
++ __ st_d(R0,FP, frame::interpreter_frame_last_sp_offset * wordSize); ++ ++ // V0: exception ++ // V1: return address/pc that threw exception ++ __ restore_bcp(); // BCP points to call/send ++ __ restore_locals(); ++ __ reinit_heapbase(); ++ __ get_dispatch(); ++ ++ // Entry point for exceptions thrown within interpreter code ++ Interpreter::_throw_exception_entry = __ pc(); ++ // expression stack is undefined here ++ // V0: exception ++ // BCP: exception bcp ++ __ verify_oop(V0); ++ ++ // expression stack must be empty before entering the VM in case of an exception ++ __ empty_expression_stack(); ++ // find exception handler address and preserve exception oop ++ __ move(A1, V0); ++ __ call_VM(V1, CAST_FROM_FN_PTR(address, InterpreterRuntime::exception_handler_for_exception), A1); ++ // V0: exception handler entry point ++ // V1: preserved exception oop ++ // S0: bcp for exception handler ++ __ push(V1); // push exception which is now the only value on the stack ++ __ jr(V0); // jump to exception handler (may be _remove_activation_entry!) ++ ++ // If the exception is not handled in the current frame the frame is removed and ++ // the exception is rethrown (i.e. exception continuation is _rethrow_exception). ++ // ++ // Note: At this point the bci is still the bxi for the instruction which caused ++ // the exception and the expression stack is empty. Thus, for any VM calls ++ // at this point, GC will find a legal oop map (with empty expression stack). ++ ++ // In current activation ++ // V0: exception ++ // BCP: exception bcp ++ ++ // ++ // JVMTI PopFrame support ++ // ++ ++ Interpreter::_remove_activation_preserving_args_entry = __ pc(); ++ __ empty_expression_stack(); ++ // Set the popframe_processing bit in pending_popframe_condition indicating that we are ++ // currently handling popframe, so that call_VMs that may happen later do not trigger new ++ // popframe handling cycles. ++ __ ld_w(T3, TREG, in_bytes(JavaThread::popframe_condition_offset())); ++ __ ori(T3, T3, JavaThread::popframe_processing_bit); ++ __ st_w(T3, TREG, in_bytes(JavaThread::popframe_condition_offset())); ++ ++ { ++ // Check to see whether we are returning to a deoptimized frame. ++ // (The PopFrame call ensures that the caller of the popped frame is ++ // either interpreted or compiled and deoptimizes it if compiled.) ++ // In this case, we can't call dispatch_next() after the frame is ++ // popped, but instead must save the incoming arguments and restore ++ // them after deoptimization has occurred. ++ // ++ // Note that we don't compare the return PC against the ++ // deoptimization blob's unpack entry because of the presence of ++ // adapter frames in C2. 
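++    // Explanatory note: interpreter_contains(return_pc) != 0 means the caller
++    // is still an interpreted frame, so its arguments need no special handling;
++    // otherwise the outgoing arguments are saved via
++    // Deoptimization::popframe_preserve_args before this activation is removed.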
++ Label caller_not_deoptimized; ++ __ ld_d(A0, FP, frame::return_addr_offset * wordSize); ++ __ super_call_VM_leaf(CAST_FROM_FN_PTR(address, InterpreterRuntime::interpreter_contains), A0); ++ __ bne(V0, R0, caller_not_deoptimized); ++ ++ // Compute size of arguments for saving when returning to deoptimized caller ++ __ get_method(A1); ++ __ ld_d(A1, A1, in_bytes(Method::const_offset())); ++ __ ld_hu(A1, A1, in_bytes(ConstMethod::size_of_parameters_offset())); ++ __ slli_d(A1, A1, Interpreter::logStackElementSize); ++ __ restore_locals(); ++ __ sub_d(A2, LVP, A1); ++ __ addi_d(A2, A2, wordSize); ++ // Save these arguments ++ __ move(A0, TREG); ++ __ super_call_VM_leaf(CAST_FROM_FN_PTR(address, Deoptimization::popframe_preserve_args), A0, A1, A2); ++ ++ __ remove_activation(vtos, ++ /* throw_monitor_exception */ false, ++ /* install_monitor_exception */ false, ++ /* notify_jvmdi */ false); ++ ++ // Inform deoptimization that it is responsible for restoring these arguments ++ __ li(AT, JavaThread::popframe_force_deopt_reexecution_bit); ++ __ st_w(AT, TREG, in_bytes(JavaThread::popframe_condition_offset())); ++ // Continue in deoptimization handler ++ __ jr(RA); ++ ++ __ bind(caller_not_deoptimized); ++ } ++ ++ __ remove_activation(vtos, ++ /* throw_monitor_exception */ false, ++ /* install_monitor_exception */ false, ++ /* notify_jvmdi */ false); ++ ++ // Clear the popframe condition flag ++ // Finish with popframe handling ++ // A previous I2C followed by a deoptimization might have moved the ++ // outgoing arguments further up the stack. PopFrame expects the ++ // mutations to those outgoing arguments to be preserved and other ++ // constraints basically require this frame to look exactly as ++ // though it had previously invoked an interpreted activation with ++ // no space between the top of the expression stack (current ++ // last_sp) and the top of stack. Rather than force deopt to ++ // maintain this kind of invariant all the time we call a small ++ // fixup routine to move the mutated arguments onto the top of our ++ // expression stack if necessary. ++ __ move(T8, SP); ++ __ ld_d(A2, FP, frame::interpreter_frame_last_sp_offset * wordSize); ++ // PC must point into interpreter here ++ Label L; ++ __ bind(L); ++ __ set_last_Java_frame(TREG, noreg, FP, L); ++ __ super_call_VM_leaf(CAST_FROM_FN_PTR(address, InterpreterRuntime::popframe_move_outgoing_args), TREG, T8, A2); ++ __ reset_last_Java_frame(TREG, true); ++ // Restore the last_sp and null it out ++ __ ld_d(SP, FP, frame::interpreter_frame_last_sp_offset * wordSize); ++ __ st_d(R0, FP, frame::interpreter_frame_last_sp_offset * wordSize); ++ ++ ++ ++ __ li(AT, JavaThread::popframe_inactive); ++ __ st_w(AT, TREG, in_bytes(JavaThread::popframe_condition_offset())); ++ ++ // Finish with popframe handling ++ __ restore_bcp(); ++ __ restore_locals(); ++ __ get_method(Rmethod); ++ __ get_dispatch(); ++ ++ // The method data pointer was incremented already during ++ // call profiling. We have to restore the mdp for the current bcp. ++ if (ProfileInterpreter) { ++ __ set_method_data_pointer_for_bcp(); ++ } ++ ++ // Clear the popframe condition flag ++ __ li(AT, JavaThread::popframe_inactive); ++ __ st_w(AT, TREG, in_bytes(JavaThread::popframe_condition_offset())); ++ ++#if INCLUDE_JVMTI ++ { ++ Label L_done; ++ ++ __ ld_bu(AT, BCP, 0); ++ __ addi_d(AT, AT, -1 * Bytecodes::_invokestatic); ++ __ bne(AT, R0, L_done); ++ ++ // The member name argument must be restored if _invokestatic is re-executed after a PopFrame call. 
++ // Detect such a case in the InterpreterRuntime function and return the member name argument, or null. ++ ++ __ ld_d(T8, LVP, 0); ++ __ call_VM(T8, CAST_FROM_FN_PTR(address, InterpreterRuntime::member_name_arg_or_null), T8, Rmethod, BCP); ++ ++ __ beq(T8, R0, L_done); ++ ++ __ st_d(T8, SP, 0); ++ __ bind(L_done); ++ } ++#endif // INCLUDE_JVMTI ++ ++ __ dispatch_next(vtos); ++ // end of PopFrame support ++ ++ Interpreter::_remove_activation_entry = __ pc(); ++ ++ // preserve exception over this code sequence ++ __ pop(T0); ++ __ st_d(T0, TREG, in_bytes(JavaThread::vm_result_offset())); ++ // remove the activation (without doing throws on illegalMonitorExceptions) ++ __ remove_activation(vtos, false, true, false); ++ // restore exception ++ __ get_vm_result(T0, TREG); ++ ++ // In between activations - previous activation type unknown yet ++ // compute continuation point - the continuation point expects ++ // the following registers set up: ++ // ++ // T0: exception ++ // RA: return address/pc that threw exception ++ // SP: expression stack of caller ++ // FP: fp of caller ++ __ push2(T0, RA); // save exception and return address ++ __ super_call_VM_leaf(CAST_FROM_FN_PTR(address, ++ SharedRuntime::exception_handler_for_return_address), TREG, RA); ++ __ move(T4, A0); // save exception handler ++ __ pop2(A0, A1); // restore return address and exception ++ ++ // Note that an "issuing PC" is actually the next PC after the call ++ __ jr(T4); // jump to exception handler of caller ++} ++ ++ ++// ++// JVMTI ForceEarlyReturn support ++// ++address TemplateInterpreterGenerator::generate_earlyret_entry_for(TosState state) { ++ address entry = __ pc(); ++ ++ __ restore_bcp(); ++ __ restore_locals(); ++ __ empty_expression_stack(); ++ __ load_earlyret_value(state); ++ ++ __ ld_d(T4, Address(TREG, JavaThread::jvmti_thread_state_offset())); ++ const Address cond_addr(T4, in_bytes(JvmtiThreadState::earlyret_state_offset())); ++ ++ // Clear the earlyret state ++ __ li(AT, JvmtiThreadState::earlyret_inactive); ++ __ st_w(AT, cond_addr); ++ ++ __ remove_activation(state, ++ false, /* throw_monitor_exception */ ++ false, /* install_monitor_exception */ ++ true); /* notify_jvmdi */ ++ __ jr(RA); ++ ++ return entry; ++} // end of ForceEarlyReturn support ++ ++ ++//----------------------------------------------------------------------------- ++// Helper for vtos entry point generation ++ ++void TemplateInterpreterGenerator::set_vtos_entry_points(Template* t, ++ address& bep, ++ address& cep, ++ address& sep, ++ address& aep, ++ address& iep, ++ address& lep, ++ address& fep, ++ address& dep, ++ address& vep) { ++ assert(t->is_valid() && t->tos_in() == vtos, "illegal template"); ++ Label L; ++ fep = __ pc(); __ push(ftos); __ b(L); ++ dep = __ pc(); __ push(dtos); __ b(L); ++ lep = __ pc(); __ push(ltos); __ b(L); ++ aep =__ pc(); __ push(atos); __ b(L); ++ bep = cep = sep = ++ iep = __ pc(); __ push(itos); ++ vep = __ pc(); ++ __ bind(L); ++ generate_and_dispatch(t); ++} ++ ++//----------------------------------------------------------------------------- ++ ++// Non-product code ++#ifndef PRODUCT ++address TemplateInterpreterGenerator::generate_trace_code(TosState state) { ++ address entry = __ pc(); ++ ++ // prepare expression stack ++ __ push(state); // save tosca ++ ++ // tos & tos2 ++ // trace_bytecode need actually 4 args, the last two is tos&tos2 ++ // this work fine for x86. 
but LA ABI calling convention will store A2-A3 ++ // to the stack position it think is the tos&tos2 ++ // when the expression stack have no more than 2 data, error occur. ++ __ ld_d(A2, SP, 0); ++ __ ld_d(A3, SP, 1 * wordSize); ++ ++ // pass arguments & call tracer ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::trace_bytecode), RA, A2, A3); ++ __ move(RA, V0); // make sure return address is not destroyed by pop(state) ++ ++ // restore expression stack ++ __ pop(state); // restore tosca ++ ++ // return ++ __ jr(RA); ++ return entry; ++} ++ ++void TemplateInterpreterGenerator::count_bytecode() { ++ __ li(T8, (long)&BytecodeCounter::_counter_value); ++ __ ld_w(AT, T8, 0); ++ __ addi_d(AT, AT, 1); ++ __ st_w(AT, T8, 0); ++} ++ ++void TemplateInterpreterGenerator::histogram_bytecode(Template* t) { ++ __ li(T8, (long)&BytecodeHistogram::_counters[t->bytecode()]); ++ __ ld_w(AT, T8, 0); ++ __ addi_d(AT, AT, 1); ++ __ st_w(AT, T8, 0); ++} ++ ++void TemplateInterpreterGenerator::histogram_bytecode_pair(Template* t) { ++ __ li(T8, (long)&BytecodePairHistogram::_index); ++ __ ld_w(T4, T8, 0); ++ __ srli_d(T4, T4, BytecodePairHistogram::log2_number_of_codes); ++ __ li(T8, ((long)t->bytecode()) << BytecodePairHistogram::log2_number_of_codes); ++ __ orr(T4, T4, T8); ++ __ li(T8, (long)&BytecodePairHistogram::_index); ++ __ st_w(T4, T8, 0); ++ __ slli_d(T4, T4, 2); ++ __ li(T8, (long)BytecodePairHistogram::_counters); ++ __ add_d(T8, T8, T4); ++ __ ld_w(AT, T8, 0); ++ __ addi_d(AT, AT, 1); ++ __ st_w(AT, T8, 0); ++} ++ ++ ++void TemplateInterpreterGenerator::trace_bytecode(Template* t) { ++ // Call a little run-time stub to avoid blow-up for each bytecode. ++ // The run-time runtime saves the right registers, depending on ++ // the tosca in-state for the given template. ++ address entry = Interpreter::trace_code(t->tos_in()); ++ assert(entry != nullptr, "entry must have been generated"); ++ __ call(entry, relocInfo::none); ++ //add for compressedoops ++ __ reinit_heapbase(); ++} ++ ++ ++void TemplateInterpreterGenerator::stop_interpreter_at() { ++ Label L; ++ __ li(T8, long(&BytecodeCounter::_counter_value)); ++ __ ld_w(T8, T8, 0); ++ __ li(AT, StopInterpreterAt); ++ __ bne(T8, AT, L); ++ __ brk(5); ++ __ bind(L); ++} ++#endif // !PRODUCT +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/templateTable_loongarch_64.cpp b/src/hotspot/cpu/loongarch/templateTable_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/templateTable_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/templateTable_loongarch_64.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,4007 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "interpreter/interpreter.hpp" ++#include "interpreter/interpreterRuntime.hpp" ++#include "interpreter/interp_masm.hpp" ++#include "interpreter/templateTable.hpp" ++#include "gc/shared/collectedHeap.hpp" ++#include "memory/universe.hpp" ++#include "oops/klass.inline.hpp" ++#include "oops/methodData.hpp" ++#include "oops/objArrayKlass.hpp" ++#include "oops/oop.inline.hpp" ++#include "prims/jvmtiExport.hpp" ++#include "prims/methodHandles.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "runtime/synchronizer.hpp" ++#include "utilities/macros.hpp" ++ ++ ++#define __ _masm-> ++ ++ ++// Address computation: local variables ++ ++static inline Address iaddress(int n) { ++ return Address(LVP, Interpreter::local_offset_in_bytes(n)); ++} ++ ++static inline Address laddress(int n) { ++ return iaddress(n + 1); ++} ++ ++static inline Address faddress(int n) { ++ return iaddress(n); ++} ++ ++static inline Address daddress(int n) { ++ return laddress(n); ++} ++ ++static inline Address aaddress(int n) { ++ return iaddress(n); ++} ++static inline Address haddress(int n) { return iaddress(n + 0); } ++ ++ ++static inline Address at_sp() { return Address(SP, 0); } ++static inline Address at_sp_p1() { return Address(SP, 1 * wordSize); } ++static inline Address at_sp_p2() { return Address(SP, 2 * wordSize); } ++ ++// At top of Java expression stack which may be different than sp(). ++// It isn't for category 1 objects. ++static inline Address at_tos () { ++ Address tos = Address(SP, Interpreter::expr_offset_in_bytes(0)); ++ return tos; ++} ++ ++static inline Address at_tos_p1() { ++ return Address(SP, Interpreter::expr_offset_in_bytes(1)); ++} ++ ++static inline Address at_tos_p2() { ++ return Address(SP, Interpreter::expr_offset_in_bytes(2)); ++} ++ ++static inline Address at_tos_p3() { ++ return Address(SP, Interpreter::expr_offset_in_bytes(3)); ++} ++ ++// we use S0 as bcp, be sure you have bcp in S0 before you call any of the Template generator ++Address TemplateTable::at_bcp(int offset) { ++ assert(_desc->uses_bcp(), "inconsistent uses_bcp information"); ++ return Address(BCP, offset); ++} ++ ++// Miscellaneous helper routines ++// Store an oop (or null) at the address described by obj. 
++// If val == noreg this means store a null ++static void do_oop_store(InterpreterMacroAssembler* _masm, ++ Address dst, ++ Register val, ++ DecoratorSet decorators = 0) { ++ assert(val == noreg || val == V0, "parameter is just for looks"); ++ __ store_heap_oop(dst, val, T8, T1, T3, decorators); ++} ++ ++static void do_oop_load(InterpreterMacroAssembler* _masm, ++ Address src, ++ Register dst, ++ DecoratorSet decorators = 0) { ++ __ load_heap_oop(dst, src, T4, T8, decorators); ++} ++ ++// bytecode folding ++void TemplateTable::patch_bytecode(Bytecodes::Code bc, Register bc_reg, ++ Register tmp_reg, bool load_bc_into_bc_reg/*=true*/, ++ int byte_no) { ++ if (!RewriteBytecodes) return; ++ Label L_patch_done; ++ ++ switch (bc) { ++ case Bytecodes::_fast_aputfield: ++ case Bytecodes::_fast_bputfield: ++ case Bytecodes::_fast_zputfield: ++ case Bytecodes::_fast_cputfield: ++ case Bytecodes::_fast_dputfield: ++ case Bytecodes::_fast_fputfield: ++ case Bytecodes::_fast_iputfield: ++ case Bytecodes::_fast_lputfield: ++ case Bytecodes::_fast_sputfield: ++ { ++ // We skip bytecode quickening for putfield instructions when ++ // the put_code written to the constant pool cache is zero. ++ // This is required so that every execution of this instruction ++ // calls out to InterpreterRuntime::resolve_get_put to do ++ // additional, required work. ++ assert(byte_no == f1_byte || byte_no == f2_byte, "byte_no out of range"); ++ assert(load_bc_into_bc_reg, "we use bc_reg as temp"); ++ __ get_cache_and_index_and_bytecode_at_bcp(tmp_reg, bc_reg, tmp_reg, byte_no, 1); ++ __ addi_d(bc_reg, R0, bc); ++ __ beq(tmp_reg, R0, L_patch_done); ++ } ++ break; ++ default: ++ assert(byte_no == -1, "sanity"); ++ // the pair bytecodes have already done the load. ++ if (load_bc_into_bc_reg) { ++ __ li(bc_reg, bc); ++ } ++ } ++ ++ if (JvmtiExport::can_post_breakpoint()) { ++ Label L_fast_patch; ++ // if a breakpoint is present we can't rewrite the stream directly ++ __ ld_bu(tmp_reg, at_bcp(0)); ++ __ li(AT, Bytecodes::_breakpoint); ++ __ bne(tmp_reg, AT, L_fast_patch); ++ ++ __ get_method(tmp_reg); ++ // Let breakpoint table handling rewrite to quicker bytecode ++ __ call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::set_original_bytecode_at), tmp_reg, BCP, bc_reg); ++ ++ __ b(L_patch_done); ++ __ bind(L_fast_patch); ++ } ++ ++#ifdef ASSERT ++ Label L_okay; ++ __ ld_bu(tmp_reg, at_bcp(0)); ++ __ li(AT, (int)Bytecodes::java_code(bc)); ++ __ beq(tmp_reg, AT, L_okay); ++ __ beq(tmp_reg, bc_reg, L_patch_done); ++ __ stop("patching the wrong bytecode"); ++ __ bind(L_okay); ++#endif ++ ++ // patch bytecode ++ __ st_b(bc_reg, at_bcp(0)); ++ __ bind(L_patch_done); ++} ++ ++ ++// Individual instructions ++ ++void TemplateTable::nop() { ++ transition(vtos, vtos); ++ // nothing to do ++} ++ ++void TemplateTable::shouldnotreachhere() { ++ transition(vtos, vtos); ++ __ stop("shouldnotreachhere bytecode"); ++} ++ ++void TemplateTable::aconst_null() { ++ transition(vtos, atos); ++ __ move(FSR, R0); ++} ++ ++void TemplateTable::iconst(int value) { ++ transition(vtos, itos); ++ if (value == 0) { ++ __ move(FSR, R0); ++ } else { ++ __ li(FSR, value); ++ } ++} ++ ++void TemplateTable::lconst(int value) { ++ transition(vtos, ltos); ++ if (value == 0) { ++ __ move(FSR, R0); ++ } else { ++ __ li(FSR, value); ++ } ++} ++ ++void TemplateTable::fconst(int value) { ++ transition(vtos, ftos); ++ switch( value ) { ++ case 0: __ movgr2fr_w(FSF, R0); return; ++ case 1: __ addi_d(AT, R0, 1); break; ++ case 2: __ addi_d(AT, R0, 2); break; ++ 
default: ShouldNotReachHere(); ++ } ++ __ movgr2fr_w(FSF, AT); ++ __ ffint_s_w(FSF, FSF); ++} ++ ++void TemplateTable::dconst(int value) { ++ transition(vtos, dtos); ++ switch( value ) { ++ case 0: __ movgr2fr_d(FSF, R0); ++ return; ++ case 1: __ addi_d(AT, R0, 1); ++ __ movgr2fr_d(FSF, AT); ++ __ ffint_d_w(FSF, FSF); ++ break; ++ default: ShouldNotReachHere(); ++ } ++} ++ ++void TemplateTable::bipush() { ++ transition(vtos, itos); ++ __ ld_b(FSR, at_bcp(1)); ++} ++ ++void TemplateTable::sipush() { ++ transition(vtos, itos); ++ __ ld_b(FSR, BCP, 1); ++ __ ld_bu(AT, BCP, 2); ++ __ slli_d(FSR, FSR, 8); ++ __ orr(FSR, FSR, AT); ++} ++ ++// T1 : tags ++// T2 : index ++// T3 : cpool ++// T8 : tag ++void TemplateTable::ldc(LdcType type) { ++ transition(vtos, vtos); ++ Label call_ldc, notFloat, notClass, notInt, Done; ++ // get index in cpool ++ if (is_ldc_wide(type)) { ++ __ get_unsigned_2_byte_index_at_bcp(T2, 1); ++ } else { ++ __ ld_bu(T2, at_bcp(1)); ++ } ++ ++ __ get_cpool_and_tags(T3, T1); ++ ++ const int base_offset = ConstantPool::header_size() * wordSize; ++ const int tags_offset = Array::base_offset_in_bytes(); ++ ++ // get type ++ __ add_d(AT, T1, T2); ++ __ ld_b(T1, AT, tags_offset); ++ if(os::is_MP()) { ++ __ membar(Assembler::Membar_mask_bits(__ LoadLoad|__ LoadStore)); ++ } ++ //now T1 is the tag ++ ++ // unresolved class - get the resolved class ++ __ addi_d(AT, T1, - JVM_CONSTANT_UnresolvedClass); ++ __ beq(AT, R0, call_ldc); ++ ++ // unresolved class in error (resolution failed) - call into runtime ++ // so that the same error from first resolution attempt is thrown. ++ __ addi_d(AT, T1, -JVM_CONSTANT_UnresolvedClassInError); ++ __ beq(AT, R0, call_ldc); ++ ++ // resolved class - need to call vm to get java mirror of the class ++ __ addi_d(AT, T1, - JVM_CONSTANT_Class); ++ __ slli_d(T2, T2, Address::times_8); ++ __ bne(AT, R0, notClass); ++ ++ __ bind(call_ldc); ++ __ li(A1, is_ldc_wide(type) ? 1 : 0); ++ call_VM(FSR, CAST_FROM_FN_PTR(address, InterpreterRuntime::ldc), A1); ++ //__ push(atos); ++ __ addi_d(SP, SP, - Interpreter::stackElementSize); ++ __ st_d(FSR, SP, 0); ++ __ b(Done); ++ ++ __ bind(notClass); ++ __ addi_d(AT, T1, -JVM_CONSTANT_Float); ++ __ bne(AT, R0, notFloat); ++ // ftos ++ __ add_d(AT, T3, T2); ++ __ fld_s(FSF, AT, base_offset); ++ //__ push_f(); ++ __ addi_d(SP, SP, - Interpreter::stackElementSize); ++ __ fst_s(FSF, SP, 0); ++ __ b(Done); ++ ++ __ bind(notFloat); ++ __ addi_d(AT, T1, -JVM_CONSTANT_Integer); ++ __ bne(AT, R0, notInt); ++ // itos ++ __ add_d(T0, T3, T2); ++ __ ld_w(FSR, T0, base_offset); ++ __ push(itos); ++ __ b(Done); ++ ++ // assume the tag is for condy; if not, the VM runtime will tell us ++ __ bind(notInt); ++ condy_helper(Done); ++ ++ __ bind(Done); ++} ++ ++void TemplateTable::condy_helper(Label& Done) { ++ const Register obj = FSR; ++ const Register rarg = A1; ++ const Register flags = A2; ++ const Register off = A3; ++ ++ __ li(rarg, (int)bytecode()); ++ __ call_VM(obj, CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_ldc), rarg); ++ __ get_vm_result_2(flags, TREG); ++ // VMr = obj = base address to find primitive value to push ++ // VMr2 = flags = (tos, off) using format of CPCE::_flags ++ __ li(AT, ConstantPoolCacheEntry::field_index_mask); ++ __ andr(off, flags, AT); ++ __ add_d(obj, off, obj); ++ const Address field(obj, 0 * wordSize); ++ ++ // What sort of thing are we loading? 
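++  // Explanatory note: CPCE::_flags packs the value offset in its low bits
++  // (already masked out above with field_index_mask) and the TosState at
++  // tos_state_shift; after the shift below, 'flags' holds that TosState and
++  // is compared against itos/ftos/... to select the load width.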
++ __ srli_d(flags, flags, ConstantPoolCacheEntry::tos_state_shift); ++ ConstantPoolCacheEntry::verify_tos_state_shift(); ++ ++ switch (bytecode()) { ++ case Bytecodes::_ldc: ++ case Bytecodes::_ldc_w: ++ { ++ // tos in (itos, ftos, stos, btos, ctos, ztos) ++ Label notInt, notFloat, notShort, notByte, notChar, notBool; ++ __ addi_d(AT, flags, -itos); ++ __ bne(AT, R0, notInt); ++ // itos ++ __ ld_d(obj, field); ++ __ push(itos); ++ __ b(Done); ++ ++ __ bind(notInt); ++ __ addi_d(AT, flags, -ftos); ++ __ bne(AT, R0, notFloat); ++ // ftos ++ __ fld_s(FSF, field); ++ __ push(ftos); ++ __ b(Done); ++ ++ __ bind(notFloat); ++ __ addi_d(AT, flags, -stos); ++ __ bne(AT, R0, notShort); ++ // stos ++ __ ld_h(obj, field); ++ __ push(stos); ++ __ b(Done); ++ ++ __ bind(notShort); ++ __ addi_d(AT, flags, -btos); ++ __ bne(AT, R0, notByte); ++ // btos ++ __ ld_b(obj, field); ++ __ push(btos); ++ __ b(Done); ++ ++ __ bind(notByte); ++ __ addi_d(AT, flags, -ctos); ++ __ bne(AT, R0, notChar); ++ // ctos ++ __ ld_hu(obj, field); ++ __ push(ctos); ++ __ b(Done); ++ ++ __ bind(notChar); ++ __ addi_d(AT, flags, -ztos); ++ __ bne(AT, R0, notBool); ++ // ztos ++ __ ld_bu(obj, field); ++ __ push(ztos); ++ __ b(Done); ++ ++ __ bind(notBool); ++ break; ++ } ++ ++ case Bytecodes::_ldc2_w: ++ { ++ Label notLong, notDouble; ++ __ addi_d(AT, flags, -ltos); ++ __ bne(AT, R0, notLong); ++ // ltos ++ __ ld_d(obj, field); ++ __ push(ltos); ++ __ b(Done); ++ ++ __ bind(notLong); ++ __ addi_d(AT, flags, -dtos); ++ __ bne(AT, R0, notDouble); ++ // dtos ++ __ fld_d(FSF, field); ++ __ push(dtos); ++ __ b(Done); ++ ++ __ bind(notDouble); ++ break; ++ } ++ ++ default: ++ ShouldNotReachHere(); ++ } ++ ++ __ stop("bad ldc/condy"); ++} ++ ++// Fast path for caching oop constants. ++void TemplateTable::fast_aldc(LdcType type) { ++ transition(vtos, atos); ++ ++ Register result = FSR; ++ Register tmp = A1; ++ Register rarg = A2; ++ ++ int index_size = is_ldc_wide(type) ? sizeof(u2) : sizeof(u1); ++ ++ Label resolved; ++ ++ // We are resolved if the resolved reference cache entry contains a ++ // non-null object (String, MethodType, etc.) ++ assert_different_registers(result, tmp); ++ __ get_cache_index_at_bcp(tmp, 1, index_size); ++ __ load_resolved_reference_at_index(result, tmp, T4); ++ __ bne(result, R0, resolved); ++ ++ address entry = CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_ldc); ++ // first time invocation - must resolve first ++ int i = (int)bytecode(); ++ __ li(rarg, i); ++ __ call_VM(result, entry, rarg); ++ ++ __ bind(resolved); ++ ++ { // Check for the null sentinel. ++ // If we just called the VM, it already did the mapping for us, ++ // but it's harmless to retry. 
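++ // A resolved null constant is represented in the resolved-references
++ // array by a sentinel object (a real null there means "unresolved"), so
++ // compare the result against the sentinel and map a match back to a
++ // genuine null before returning.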
++ Label notNull; ++ __ li(rarg, (long)Universe::the_null_sentinel_addr()); ++ __ ld_d(tmp, Address(rarg)); ++ __ resolve_oop_handle(tmp, SCR2, SCR1); ++ __ bne(tmp, result, notNull); ++ __ xorr(result, result, result); // null object reference ++ __ bind(notNull); ++ } ++ ++ if (VerifyOops) { ++ __ verify_oop(result); ++ } ++} ++ ++// used register: T2, T3, T1 ++// T2 : index ++// T3 : cpool ++// T1 : tag ++void TemplateTable::ldc2_w() { ++ transition(vtos, vtos); ++ Label notDouble, notLong, Done; ++ ++ // get index in cpool ++ __ get_unsigned_2_byte_index_at_bcp(T2, 1); ++ ++ __ get_cpool_and_tags(T3, T1); ++ ++ const int base_offset = ConstantPool::header_size() * wordSize; ++ const int tags_offset = Array::base_offset_in_bytes(); ++ ++ // get type in T1 ++ __ add_d(AT, T1, T2); ++ __ ld_b(T1, AT, tags_offset); ++ ++ __ addi_d(AT, T1, -JVM_CONSTANT_Double); ++ __ bne(AT, R0, notDouble); ++ ++ // dtos ++ __ alsl_d(AT, T2, T3, Address::times_8 - 1); ++ __ fld_d(FSF, AT, base_offset); ++ __ push(dtos); ++ __ b(Done); ++ ++ __ bind(notDouble); ++ __ addi_d(AT, T1, -JVM_CONSTANT_Long); ++ __ bne(AT, R0, notLong); ++ ++ // ltos ++ __ slli_d(T2, T2, Address::times_8); ++ __ add_d(AT, T3, T2); ++ __ ld_d(FSR, AT, base_offset); ++ __ push(ltos); ++ __ b(Done); ++ ++ __ bind(notLong); ++ condy_helper(Done); ++ ++ __ bind(Done); ++} ++ ++// we compute the actual local variable address here ++void TemplateTable::locals_index(Register reg, int offset) { ++ __ ld_bu(reg, at_bcp(offset)); ++ __ slli_d(reg, reg, Address::times_8); ++ __ sub_d(reg, LVP, reg); ++} ++ ++void TemplateTable::iload() { ++ iload_internal(); ++} ++ ++void TemplateTable::nofast_iload() { ++ iload_internal(may_not_rewrite); ++} ++ ++// this method will do bytecode folding of the two form: ++// iload iload iload caload ++// used register : T2, T3 ++// T2 : bytecode ++// T3 : folded code ++void TemplateTable::iload_internal(RewriteControl rc) { ++ transition(vtos, itos); ++ if (RewriteFrequentPairs && rc == may_rewrite) { ++ Label rewrite, done; ++ // get the next bytecode in T2 ++ __ ld_bu(T2, at_bcp(Bytecodes::length_for(Bytecodes::_iload))); ++ // if _iload, wait to rewrite to iload2. We only want to rewrite the ++ // last two iloads in a pair. Comparing against fast_iload means that ++ // the next bytecode is neither an iload or a caload, and therefore ++ // an iload pair. ++ __ li(AT, Bytecodes::_iload); ++ __ beq(AT, T2, done); ++ ++ __ li(T3, Bytecodes::_fast_iload2); ++ __ li(AT, Bytecodes::_fast_iload); ++ __ beq(AT, T2, rewrite); ++ ++ // if _caload, rewrite to fast_icaload ++ __ li(T3, Bytecodes::_fast_icaload); ++ __ li(AT, Bytecodes::_caload); ++ __ beq(AT, T2, rewrite); ++ ++ // rewrite so iload doesn't check again. 
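++ // (_fast_iload performs the same local load but skips this pair check on
++ //  later executions.)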
++ __ li(T3, Bytecodes::_fast_iload); ++ ++ // rewrite ++ // T3 : fast bytecode ++ __ bind(rewrite); ++ patch_bytecode(Bytecodes::_iload, T3, T2, false); ++ __ bind(done); ++ } ++ ++ // Get the local value into tos ++ locals_index(T2); ++ __ ld_w(FSR, T2, 0); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::fast_iload2() { ++ transition(vtos, itos); ++ locals_index(T2); ++ __ ld_w(FSR, T2, 0); ++ __ push(itos); ++ locals_index(T2, 3); ++ __ ld_w(FSR, T2, 0); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::fast_iload() { ++ transition(vtos, itos); ++ locals_index(T2); ++ __ ld_w(FSR, T2, 0); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::lload() { ++ transition(vtos, ltos); ++ locals_index(T2); ++ __ ld_d(FSR, T2, -wordSize); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::fload() { ++ transition(vtos, ftos); ++ locals_index(T2); ++ __ fld_s(FSF, T2, 0); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::dload() { ++ transition(vtos, dtos); ++ locals_index(T2); ++ __ fld_d(FSF, T2, -wordSize); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::aload() { ++ transition(vtos, atos); ++ locals_index(T2); ++ __ ld_d(FSR, T2, 0); ++} ++ ++void TemplateTable::locals_index_wide(Register reg) { ++ __ get_unsigned_2_byte_index_at_bcp(reg, 2); ++ __ slli_d(reg, reg, Address::times_8); ++ __ sub_d(reg, LVP, reg); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::wide_iload() { ++ transition(vtos, itos); ++ locals_index_wide(T2); ++ __ ld_d(FSR, T2, 0); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::wide_lload() { ++ transition(vtos, ltos); ++ locals_index_wide(T2); ++ __ ld_d(FSR, T2, -wordSize); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::wide_fload() { ++ transition(vtos, ftos); ++ locals_index_wide(T2); ++ __ fld_s(FSF, T2, 0); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::wide_dload() { ++ transition(vtos, dtos); ++ locals_index_wide(T2); ++ __ fld_d(FSF, T2, -wordSize); ++} ++ ++// used register T2 ++// T2 : index ++void TemplateTable::wide_aload() { ++ transition(vtos, atos); ++ locals_index_wide(T2); ++ __ ld_d(FSR, T2, 0); ++} ++ ++// we use A2 as the register for index, BE CAREFUL! 
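++// Note: index_check pops the array oop, sign-extends the index, and on an
++// out-of-bounds index jumps to the ArrayIndexOutOfBounds entry with the
++// array in A1 and the offending index in A2.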
++// we dont use our tge 29 now, for later optimization ++void TemplateTable::index_check(Register array, Register index) { ++ // Pop ptr into array ++ __ pop_ptr(array); ++ index_check_without_pop(array, index); ++} ++ ++void TemplateTable::index_check_without_pop(Register array, Register index) { ++ // destroys A2 ++ // sign extend since tos (index) might contain garbage in upper bits ++ __ slli_w(index, index, 0); ++ ++ // check index ++ Label ok; ++ __ ld_w(AT, array, arrayOopDesc::length_offset_in_bytes()); ++ __ bltu(index, AT, ok); ++ ++ // throw_ArrayIndexOutOfBoundsException assume abberrant index in A2 ++ assert(index != A1, "smashed arg"); ++ if (A1 != array) __ move(A1, array); ++ if (A2 != index) __ move(A2, index); ++ __ jmp(Interpreter::_throw_ArrayIndexOutOfBoundsException_entry); ++ __ bind(ok); ++} ++ ++void TemplateTable::iaload() { ++ transition(itos, itos); ++ index_check(A1, FSR); ++ __ alsl_d(FSR, FSR, A1, 1); ++ __ access_load_at(T_INT, IN_HEAP | IS_ARRAY, FSR, Address(FSR, arrayOopDesc::base_offset_in_bytes(T_INT)), noreg, noreg); ++} ++ ++void TemplateTable::laload() { ++ transition(itos, ltos); ++ index_check(A1, FSR); ++ __ alsl_d(T4, FSR, A1, Address::times_8 - 1); ++ __ access_load_at(T_LONG, IN_HEAP | IS_ARRAY, FSR, Address(T4, arrayOopDesc::base_offset_in_bytes(T_LONG)), noreg, noreg); ++} ++ ++void TemplateTable::faload() { ++ transition(itos, ftos); ++ index_check(A1, FSR); ++ __ alsl_d(FSR, FSR, A1, Address::times_4 - 1); ++ __ access_load_at(T_FLOAT, IN_HEAP | IS_ARRAY, noreg, Address(FSR, arrayOopDesc::base_offset_in_bytes(T_FLOAT)), noreg, noreg); ++} ++ ++void TemplateTable::daload() { ++ transition(itos, dtos); ++ index_check(A1, FSR); ++ __ alsl_d(T4, FSR, A1, 2); ++ __ access_load_at(T_DOUBLE, IN_HEAP | IS_ARRAY, noreg, Address(T4, arrayOopDesc::base_offset_in_bytes(T_DOUBLE)), noreg, noreg); ++} ++ ++void TemplateTable::aaload() { ++ transition(itos, atos); ++ index_check(A1, FSR); ++ __ alsl_d(FSR, FSR, A1, (UseCompressedOops ? 
Address::times_4 : Address::times_8) - 1); ++ //add for compressedoops ++ do_oop_load(_masm, ++ Address(FSR, arrayOopDesc::base_offset_in_bytes(T_OBJECT)), ++ FSR, ++ IS_ARRAY); ++} ++ ++void TemplateTable::baload() { ++ transition(itos, itos); ++ index_check(A1, FSR); ++ __ add_d(FSR, A1, FSR); ++ __ access_load_at(T_BYTE, IN_HEAP | IS_ARRAY, FSR, Address(FSR, arrayOopDesc::base_offset_in_bytes(T_BYTE)), noreg, noreg); ++} ++ ++void TemplateTable::caload() { ++ transition(itos, itos); ++ index_check(A1, FSR); ++ __ alsl_d(FSR, FSR, A1, Address::times_2 - 1); ++ __ access_load_at(T_CHAR, IN_HEAP | IS_ARRAY, FSR, Address(FSR, arrayOopDesc::base_offset_in_bytes(T_CHAR)), noreg, noreg); ++} ++ ++// iload followed by caload frequent pair ++// used register : T2 ++// T2 : index ++void TemplateTable::fast_icaload() { ++ transition(vtos, itos); ++ // load index out of locals ++ locals_index(T2); ++ __ ld_w(FSR, T2, 0); ++ index_check(A1, FSR); ++ __ alsl_d(FSR, FSR, A1, 0); ++ __ access_load_at(T_CHAR, IN_HEAP | IS_ARRAY, FSR, Address(FSR, arrayOopDesc::base_offset_in_bytes(T_CHAR)), noreg, noreg); ++} ++ ++void TemplateTable::saload() { ++ transition(itos, itos); ++ index_check(A1, FSR); ++ __ alsl_d(FSR, FSR, A1, Address::times_2 - 1); ++ __ access_load_at(T_SHORT, IN_HEAP | IS_ARRAY, FSR, Address(FSR, arrayOopDesc::base_offset_in_bytes(T_SHORT)), noreg, noreg); ++} ++ ++void TemplateTable::iload(int n) { ++ transition(vtos, itos); ++ __ ld_w(FSR, iaddress(n)); ++} ++ ++void TemplateTable::lload(int n) { ++ transition(vtos, ltos); ++ __ ld_d(FSR, laddress(n)); ++} ++ ++void TemplateTable::fload(int n) { ++ transition(vtos, ftos); ++ __ fld_s(FSF, faddress(n)); ++} ++ ++void TemplateTable::dload(int n) { ++ transition(vtos, dtos); ++ __ fld_d(FSF, laddress(n)); ++} ++ ++void TemplateTable::aload(int n) { ++ transition(vtos, atos); ++ __ ld_d(FSR, aaddress(n)); ++} ++ ++void TemplateTable::aload_0() { ++ aload_0_internal(); ++} ++ ++void TemplateTable::nofast_aload_0() { ++ aload_0_internal(may_not_rewrite); ++} ++ ++// used register : T2, T3 ++// T2 : bytecode ++// T3 : folded code ++void TemplateTable::aload_0_internal(RewriteControl rc) { ++ transition(vtos, atos); ++ // According to bytecode histograms, the pairs: ++ // ++ // _aload_0, _fast_igetfield ++ // _aload_0, _fast_agetfield ++ // _aload_0, _fast_fgetfield ++ // ++ // occur frequently. If RewriteFrequentPairs is set, the (slow) ++ // _aload_0 bytecode checks if the next bytecode is either ++ // _fast_igetfield, _fast_agetfield or _fast_fgetfield and then ++ // rewrites the current bytecode into a pair bytecode; otherwise it ++ // rewrites the current bytecode into _fast_aload_0 that doesn't do ++ // the pair check anymore. ++ // ++ // Note: If the next bytecode is _getfield, the rewrite must be ++ // delayed, otherwise we may miss an opportunity for a pair. 
++ // ++ // Also rewrite frequent pairs ++ // aload_0, aload_1 ++ // aload_0, iload_1 ++ // These bytecodes with a small amount of code are most profitable ++ // to rewrite ++ if (RewriteFrequentPairs && rc == may_rewrite) { ++ Label rewrite, done; ++ // get the next bytecode in T2 ++ __ ld_bu(T2, at_bcp(Bytecodes::length_for(Bytecodes::_aload_0))); ++ ++ // do actual aload_0 ++ aload(0); ++ ++ // if _getfield then wait with rewrite ++ __ li(AT, Bytecodes::_getfield); ++ __ beq(AT, T2, done); ++ ++ // if _igetfield then reqrite to _fast_iaccess_0 ++ assert(Bytecodes::java_code(Bytecodes::_fast_iaccess_0) == ++ Bytecodes::_aload_0, ++ "fix bytecode definition"); ++ __ li(T3, Bytecodes::_fast_iaccess_0); ++ __ li(AT, Bytecodes::_fast_igetfield); ++ __ beq(AT, T2, rewrite); ++ ++ // if _agetfield then reqrite to _fast_aaccess_0 ++ assert(Bytecodes::java_code(Bytecodes::_fast_aaccess_0) == ++ Bytecodes::_aload_0, ++ "fix bytecode definition"); ++ __ li(T3, Bytecodes::_fast_aaccess_0); ++ __ li(AT, Bytecodes::_fast_agetfield); ++ __ beq(AT, T2, rewrite); ++ ++ // if _fgetfield then reqrite to _fast_faccess_0 ++ assert(Bytecodes::java_code(Bytecodes::_fast_faccess_0) == ++ Bytecodes::_aload_0, ++ "fix bytecode definition"); ++ __ li(T3, Bytecodes::_fast_faccess_0); ++ __ li(AT, Bytecodes::_fast_fgetfield); ++ __ beq(AT, T2, rewrite); ++ ++ // else rewrite to _fast_aload0 ++ assert(Bytecodes::java_code(Bytecodes::_fast_aload_0) == ++ Bytecodes::_aload_0, ++ "fix bytecode definition"); ++ __ li(T3, Bytecodes::_fast_aload_0); ++ ++ // rewrite ++ __ bind(rewrite); ++ patch_bytecode(Bytecodes::_aload_0, T3, T2, false); ++ ++ __ bind(done); ++ } else { ++ aload(0); ++ } ++} ++ ++void TemplateTable::istore() { ++ transition(itos, vtos); ++ locals_index(T2); ++ __ st_w(FSR, T2, 0); ++} ++ ++void TemplateTable::lstore() { ++ transition(ltos, vtos); ++ locals_index(T2); ++ __ st_d(FSR, T2, -wordSize); ++} ++ ++void TemplateTable::fstore() { ++ transition(ftos, vtos); ++ locals_index(T2); ++ __ fst_s(FSF, T2, 0); ++} ++ ++void TemplateTable::dstore() { ++ transition(dtos, vtos); ++ locals_index(T2); ++ __ fst_d(FSF, T2, -wordSize); ++} ++ ++void TemplateTable::astore() { ++ transition(vtos, vtos); ++ __ pop_ptr(FSR); ++ locals_index(T2); ++ __ st_d(FSR, T2, 0); ++} ++ ++void TemplateTable::wide_istore() { ++ transition(vtos, vtos); ++ __ pop_i(FSR); ++ locals_index_wide(T2); ++ __ st_d(FSR, T2, 0); ++} ++ ++void TemplateTable::wide_lstore() { ++ transition(vtos, vtos); ++ __ pop_l(FSR); ++ locals_index_wide(T2); ++ __ st_d(FSR, T2, -wordSize); ++} ++ ++void TemplateTable::wide_fstore() { ++ wide_istore(); ++} ++ ++void TemplateTable::wide_dstore() { ++ wide_lstore(); ++} ++ ++void TemplateTable::wide_astore() { ++ transition(vtos, vtos); ++ __ pop_ptr(FSR); ++ locals_index_wide(T2); ++ __ st_d(FSR, T2, 0); ++} ++ ++void TemplateTable::iastore() { ++ transition(itos, vtos); ++ __ pop_i(T2); ++ index_check(A1, T2); ++ __ alsl_d(A1, T2, A1, Address::times_4 - 1); ++ __ access_store_at(T_INT, IN_HEAP | IS_ARRAY, Address(A1, arrayOopDesc::base_offset_in_bytes(T_INT)), FSR, noreg, noreg, noreg); ++} ++ ++// used register T2, T3 ++void TemplateTable::lastore() { ++ transition(ltos, vtos); ++ __ pop_i (T2); ++ index_check(T3, T2); ++ __ alsl_d(T3, T2, T3, Address::times_8 - 1); ++ __ access_store_at(T_LONG, IN_HEAP | IS_ARRAY, Address(T3, arrayOopDesc::base_offset_in_bytes(T_LONG)), FSR, noreg, noreg, noreg); ++} ++ ++// used register T2 ++void TemplateTable::fastore() { ++ transition(ftos, vtos); ++ __ pop_i(T2); 
++ index_check(A1, T2); ++ __ alsl_d(A1, T2, A1, Address::times_4 - 1); ++ __ access_store_at(T_FLOAT, IN_HEAP | IS_ARRAY, Address(A1, arrayOopDesc::base_offset_in_bytes(T_FLOAT)), noreg, noreg, noreg, noreg); ++} ++ ++// used register T2, T3 ++void TemplateTable::dastore() { ++ transition(dtos, vtos); ++ __ pop_i (T2); ++ index_check(T3, T2); ++ __ alsl_d(T3, T2, T3, Address::times_8 - 1); ++ __ access_store_at(T_DOUBLE, IN_HEAP | IS_ARRAY, Address(T3, arrayOopDesc::base_offset_in_bytes(T_DOUBLE)), noreg, noreg, noreg, noreg); ++} ++ ++void TemplateTable::aastore() { ++ Label is_null, ok_is_subtype, done; ++ transition(vtos, vtos); ++ // stack: ..., array, index, value ++ __ ld_d(FSR, at_tos()); // Value ++ __ ld_w(T2, at_tos_p1()); // Index ++ __ ld_d(A1, at_tos_p2()); // Array ++ ++ index_check_without_pop(A1, T2); ++ // do array store check - check for null value first ++ __ beq(FSR, R0, is_null); ++ ++ // Move subklass into T3 ++ //add for compressedoops ++ __ load_klass(T3, FSR); ++ // Move superklass into T8 ++ //add for compressedoops ++ __ load_klass(T8, A1); ++ __ ld_d(T8, Address(T8, ObjArrayKlass::element_klass_offset())); ++ // Compress array+index*4+12 into a single register. ++ __ alsl_d(A1, T2, A1, (UseCompressedOops? Address::times_4 : Address::times_8) - 1); ++ __ addi_d(A1, A1, arrayOopDesc::base_offset_in_bytes(T_OBJECT)); ++ ++ // Generate subtype check. ++ // Superklass in T8. Subklass in T3. ++ __ gen_subtype_check(T8, T3, ok_is_subtype); ++ // Come here on failure ++ // object is at FSR ++ __ jmp(Interpreter::_throw_ArrayStoreException_entry); ++ // Come here on success ++ __ bind(ok_is_subtype); ++ do_oop_store(_masm, Address(A1, 0), FSR, IS_ARRAY); ++ __ b(done); ++ ++ // Have a null in FSR, A1=array, T2=index. Store null at ary[idx] ++ __ bind(is_null); ++ __ profile_null_seen(T4); ++ __ alsl_d(A1, T2, A1, (UseCompressedOops? Address::times_4 : Address::times_8) - 1); ++ do_oop_store(_masm, Address(A1, arrayOopDesc::base_offset_in_bytes(T_OBJECT)), noreg, IS_ARRAY); ++ ++ __ bind(done); ++ __ addi_d(SP, SP, 3 * Interpreter::stackElementSize); ++} ++ ++void TemplateTable::bastore() { ++ transition(itos, vtos); ++ __ pop_i(T2); ++ index_check(A1, T2); ++ ++ // Need to check whether array is boolean or byte ++ // since both types share the bastore bytecode. 
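++ // byte[] and boolean[] layout helpers differ in exactly one bit
++ // (layout_helper_boolean_diffbit); if that bit is set the destination is
++ // a boolean array and the value is first normalized to 0 or 1.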
++ __ load_klass(T4, A1); ++ __ ld_w(T4, T4, in_bytes(Klass::layout_helper_offset())); ++ ++ int diffbit = Klass::layout_helper_boolean_diffbit(); ++ __ li(AT, diffbit); ++ ++ Label L_skip; ++ __ andr(AT, T4, AT); ++ __ beq(AT, R0, L_skip); ++ __ andi(FSR, FSR, 0x1); ++ __ bind(L_skip); ++ ++ __ add_d(A1, A1, T2); ++ __ access_store_at(T_BYTE, IN_HEAP | IS_ARRAY, Address(A1, arrayOopDesc::base_offset_in_bytes(T_BYTE)), FSR, noreg, noreg, noreg); ++} ++ ++void TemplateTable::castore() { ++ transition(itos, vtos); ++ __ pop_i(T2); ++ index_check(A1, T2); ++ __ alsl_d(A1, T2, A1, Address::times_2 - 1); ++ __ access_store_at(T_CHAR, IN_HEAP | IS_ARRAY, Address(A1, arrayOopDesc::base_offset_in_bytes(T_CHAR)), FSR, noreg, noreg, noreg); ++} ++ ++void TemplateTable::sastore() { ++ castore(); ++} ++ ++void TemplateTable::istore(int n) { ++ transition(itos, vtos); ++ __ st_w(FSR, iaddress(n)); ++} ++ ++void TemplateTable::lstore(int n) { ++ transition(ltos, vtos); ++ __ st_d(FSR, laddress(n)); ++} ++ ++void TemplateTable::fstore(int n) { ++ transition(ftos, vtos); ++ __ fst_s(FSF, faddress(n)); ++} ++ ++void TemplateTable::dstore(int n) { ++ transition(dtos, vtos); ++ __ fst_d(FSF, laddress(n)); ++} ++ ++void TemplateTable::astore(int n) { ++ transition(vtos, vtos); ++ __ pop_ptr(FSR); ++ __ st_d(FSR, aaddress(n)); ++} ++ ++void TemplateTable::pop() { ++ transition(vtos, vtos); ++ __ addi_d(SP, SP, Interpreter::stackElementSize); ++} ++ ++void TemplateTable::pop2() { ++ transition(vtos, vtos); ++ __ addi_d(SP, SP, 2 * Interpreter::stackElementSize); ++} ++ ++void TemplateTable::dup() { ++ transition(vtos, vtos); ++ // stack: ..., a ++ __ load_ptr(0, FSR); ++ __ push_ptr(FSR); ++ // stack: ..., a, a ++} ++ ++// blows FSR ++void TemplateTable::dup_x1() { ++ transition(vtos, vtos); ++ // stack: ..., a, b ++ __ load_ptr(0, FSR); // load b ++ __ load_ptr(1, A5); // load a ++ __ store_ptr(1, FSR); // store b ++ __ store_ptr(0, A5); // store a ++ __ push_ptr(FSR); // push b ++ // stack: ..., b, a, b ++} ++ ++// blows FSR ++void TemplateTable::dup_x2() { ++ transition(vtos, vtos); ++ // stack: ..., a, b, c ++ __ load_ptr(0, FSR); // load c ++ __ load_ptr(2, A5); // load a ++ __ store_ptr(2, FSR); // store c in a ++ __ push_ptr(FSR); // push c ++ // stack: ..., c, b, c, c ++ __ load_ptr(2, FSR); // load b ++ __ store_ptr(2, A5); // store a in b ++ // stack: ..., c, a, c, c ++ __ store_ptr(1, FSR); // store b in c ++ // stack: ..., c, a, b, c ++} ++ ++// blows FSR ++void TemplateTable::dup2() { ++ transition(vtos, vtos); ++ // stack: ..., a, b ++ __ load_ptr(1, FSR); // load a ++ __ push_ptr(FSR); // push a ++ __ load_ptr(1, FSR); // load b ++ __ push_ptr(FSR); // push b ++ // stack: ..., a, b, a, b ++} ++ ++// blows FSR ++void TemplateTable::dup2_x1() { ++ transition(vtos, vtos); ++ // stack: ..., a, b, c ++ __ load_ptr(0, T2); // load c ++ __ load_ptr(1, FSR); // load b ++ __ push_ptr(FSR); // push b ++ __ push_ptr(T2); // push c ++ // stack: ..., a, b, c, b, c ++ __ store_ptr(3, T2); // store c in b ++ // stack: ..., a, c, c, b, c ++ __ load_ptr(4, T2); // load a ++ __ store_ptr(2, T2); // store a in 2nd c ++ // stack: ..., a, c, a, b, c ++ __ store_ptr(4, FSR); // store b in a ++ // stack: ..., b, c, a, b, c ++ ++ // stack: ..., b, c, a, b, c ++} ++ ++void TemplateTable::dup2_x2() { ++ transition(vtos, vtos); ++ // stack: ..., a, b, c, d ++ // stack: ..., a, b, c, d ++ __ load_ptr(0, T2); // load d ++ __ load_ptr(1, FSR); // load c ++ __ push_ptr(FSR); // push c ++ __ push_ptr(T2); // push d ++ // stack: 
..., a, b, c, d, c, d ++ __ load_ptr(4, FSR); // load b ++ __ store_ptr(2, FSR); // store b in d ++ __ store_ptr(4, T2); // store d in b ++ // stack: ..., a, d, c, b, c, d ++ __ load_ptr(5, T2); // load a ++ __ load_ptr(3, FSR); // load c ++ __ store_ptr(3, T2); // store a in c ++ __ store_ptr(5, FSR); // store c in a ++ // stack: ..., c, d, a, b, c, d ++ ++ // stack: ..., c, d, a, b, c, d ++} ++ ++// blows FSR ++void TemplateTable::swap() { ++ transition(vtos, vtos); ++ // stack: ..., a, b ++ ++ __ load_ptr(1, A5); // load a ++ __ load_ptr(0, FSR); // load b ++ __ store_ptr(0, A5); // store a in b ++ __ store_ptr(1, FSR); // store b in a ++ ++ // stack: ..., b, a ++} ++ ++void TemplateTable::iop2(Operation op) { ++ transition(itos, itos); ++ // FSR(A0) <== A1 op A0 ++ __ pop_i(A1); ++ switch (op) { ++ case add : __ add_w(FSR, A1, FSR); break; ++ case sub : __ sub_w(FSR, A1, FSR); break; ++ case mul : __ mul_w(FSR, A1, FSR); break; ++ case _and : __ andr (FSR, A1, FSR); break; ++ case _or : __ orr (FSR, A1, FSR); break; ++ case _xor : __ xorr (FSR, A1, FSR); break; ++ case shl : __ sll_w(FSR, A1, FSR); break; ++ case shr : __ sra_w(FSR, A1, FSR); break; ++ case ushr : __ srl_w(FSR, A1, FSR); break; ++ default : ShouldNotReachHere(); ++ } ++} ++ ++void TemplateTable::lop2(Operation op) { ++ transition(ltos, ltos); ++ // FSR(A0) <== A1 op A0 ++ __ pop_l(A1); ++ switch (op) { ++ case add : __ add_d(FSR, A1, FSR); break; ++ case sub : __ sub_d(FSR, A1, FSR); break; ++ case _and : __ andr (FSR, A1, FSR); break; ++ case _or : __ orr (FSR, A1, FSR); break; ++ case _xor : __ xorr (FSR, A1, FSR); break; ++ default : ShouldNotReachHere(); ++ } ++} ++ ++// java require this bytecode could handle 0x80000000/-1, dont cause a overflow exception, ++// the result is 0x80000000 ++// the godson2 cpu do the same, so we need not handle this specially like x86 ++void TemplateTable::idiv() { ++ transition(itos, itos); ++ Label not_zero; ++ ++ __ bne(FSR, R0, not_zero); ++ __ jmp(Interpreter::_throw_ArithmeticException_entry); ++ __ bind(not_zero); ++ ++ __ pop_i(A1); ++ __ div_w(FSR, A1, FSR); ++} ++ ++void TemplateTable::irem() { ++ transition(itos, itos); ++ // explicitly check for div0 ++ Label no_div0; ++ __ bnez(FSR, no_div0); ++ __ jmp(Interpreter::_throw_ArithmeticException_entry); ++ ++ __ bind(no_div0); ++ __ pop_i(A1); ++ __ mod_w(FSR, A1, FSR); ++} ++ ++void TemplateTable::lmul() { ++ transition(ltos, ltos); ++ __ pop_l(T2); ++ __ mul_d(FSR, T2, FSR); ++} ++ ++// NOTE: i DONT use the Interpreter::_throw_ArithmeticException_entry ++void TemplateTable::ldiv() { ++ transition(ltos, ltos); ++ Label normal; ++ ++ __ bne(FSR, R0, normal); ++ ++ //__ brk(7); //generate FPE ++ __ jmp(Interpreter::_throw_ArithmeticException_entry); ++ ++ __ bind(normal); ++ __ pop_l(A2); ++ __ div_d(FSR, A2, FSR); ++} ++ ++// NOTE: i DONT use the Interpreter::_throw_ArithmeticException_entry ++void TemplateTable::lrem() { ++ transition(ltos, ltos); ++ Label normal; ++ ++ __ bne(FSR, R0, normal); ++ ++ __ jmp(Interpreter::_throw_ArithmeticException_entry); ++ ++ __ bind(normal); ++ __ pop_l (A2); ++ ++ __ mod_d(FSR, A2, FSR); ++} ++ ++// result in FSR ++// used registers : T0 ++void TemplateTable::lshl() { ++ transition(itos, ltos); ++ __ pop_l(T0); ++ __ sll_d(FSR, T0, FSR); ++} ++ ++// used registers : T0 ++void TemplateTable::lshr() { ++ transition(itos, ltos); ++ __ pop_l(T0); ++ __ sra_d(FSR, T0, FSR); ++} ++ ++// used registers : T0 ++void TemplateTable::lushr() { ++ transition(itos, ltos); ++ __ pop_l(T0); ++ __ 
srl_d(FSR, T0, FSR); ++} ++ ++// result in FSF ++void TemplateTable::fop2(Operation op) { ++ transition(ftos, ftos); ++ switch (op) { ++ case add: ++ __ fld_s(fscratch, at_sp()); ++ __ fadd_s(FSF, fscratch, FSF); ++ break; ++ case sub: ++ __ fld_s(fscratch, at_sp()); ++ __ fsub_s(FSF, fscratch, FSF); ++ break; ++ case mul: ++ __ fld_s(fscratch, at_sp()); ++ __ fmul_s(FSF, fscratch, FSF); ++ break; ++ case div: ++ __ fld_s(fscratch, at_sp()); ++ __ fdiv_s(FSF, fscratch, FSF); ++ break; ++ case rem: ++ __ fmov_s(FA1, FSF); ++ __ fld_s(FA0, at_sp()); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::frem), 2); ++ break; ++ default : ShouldNotReachHere(); ++ } ++ ++ __ addi_d(SP, SP, 1 * wordSize); ++} ++ ++// result in SSF||FSF ++// i dont handle the strict flags ++void TemplateTable::dop2(Operation op) { ++ transition(dtos, dtos); ++ switch (op) { ++ case add: ++ __ fld_d(fscratch, at_sp()); ++ __ fadd_d(FSF, fscratch, FSF); ++ break; ++ case sub: ++ __ fld_d(fscratch, at_sp()); ++ __ fsub_d(FSF, fscratch, FSF); ++ break; ++ case mul: ++ __ fld_d(fscratch, at_sp()); ++ __ fmul_d(FSF, fscratch, FSF); ++ break; ++ case div: ++ __ fld_d(fscratch, at_sp()); ++ __ fdiv_d(FSF, fscratch, FSF); ++ break; ++ case rem: ++ __ fmov_d(FA1, FSF); ++ __ fld_d(FA0, at_sp()); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, SharedRuntime::drem), 2); ++ break; ++ default : ShouldNotReachHere(); ++ } ++ ++ __ addi_d(SP, SP, 2 * wordSize); ++} ++ ++void TemplateTable::ineg() { ++ transition(itos, itos); ++ __ sub_w(FSR, R0, FSR); ++} ++ ++void TemplateTable::lneg() { ++ transition(ltos, ltos); ++ __ sub_d(FSR, R0, FSR); ++} ++ ++void TemplateTable::fneg() { ++ transition(ftos, ftos); ++ __ fneg_s(FSF, FSF); ++} ++ ++void TemplateTable::dneg() { ++ transition(dtos, dtos); ++ __ fneg_d(FSF, FSF); ++} ++ ++// used registers : T2 ++void TemplateTable::iinc() { ++ transition(vtos, vtos); ++ locals_index(T2); ++ __ ld_w(FSR, T2, 0); ++ __ ld_b(AT, at_bcp(2)); // get constant ++ __ add_d(FSR, FSR, AT); ++ __ st_w(FSR, T2, 0); ++} ++ ++// used register : T2 ++void TemplateTable::wide_iinc() { ++ transition(vtos, vtos); ++ locals_index_wide(T2); ++ __ get_2_byte_integer_at_bcp(FSR, AT, 4); ++ __ bswap_h(FSR, FSR); ++ __ ld_w(AT, T2, 0); ++ __ add_d(FSR, AT, FSR); ++ __ st_w(FSR, T2, 0); ++} ++ ++void TemplateTable::convert() { ++ // Checking ++#ifdef ASSERT ++ { ++ TosState tos_in = ilgl; ++ TosState tos_out = ilgl; ++ switch (bytecode()) { ++ case Bytecodes::_i2l: // fall through ++ case Bytecodes::_i2f: // fall through ++ case Bytecodes::_i2d: // fall through ++ case Bytecodes::_i2b: // fall through ++ case Bytecodes::_i2c: // fall through ++ case Bytecodes::_i2s: tos_in = itos; break; ++ case Bytecodes::_l2i: // fall through ++ case Bytecodes::_l2f: // fall through ++ case Bytecodes::_l2d: tos_in = ltos; break; ++ case Bytecodes::_f2i: // fall through ++ case Bytecodes::_f2l: // fall through ++ case Bytecodes::_f2d: tos_in = ftos; break; ++ case Bytecodes::_d2i: // fall through ++ case Bytecodes::_d2l: // fall through ++ case Bytecodes::_d2f: tos_in = dtos; break; ++ default : ShouldNotReachHere(); ++ } ++ switch (bytecode()) { ++ case Bytecodes::_l2i: // fall through ++ case Bytecodes::_f2i: // fall through ++ case Bytecodes::_d2i: // fall through ++ case Bytecodes::_i2b: // fall through ++ case Bytecodes::_i2c: // fall through ++ case Bytecodes::_i2s: tos_out = itos; break; ++ case Bytecodes::_i2l: // fall through ++ case Bytecodes::_f2l: // fall through ++ case Bytecodes::_d2l: tos_out = ltos; break; ++ case 
Bytecodes::_i2f: // fall through ++ case Bytecodes::_l2f: // fall through ++ case Bytecodes::_d2f: tos_out = ftos; break; ++ case Bytecodes::_i2d: // fall through ++ case Bytecodes::_l2d: // fall through ++ case Bytecodes::_f2d: tos_out = dtos; break; ++ default : ShouldNotReachHere(); ++ } ++ transition(tos_in, tos_out); ++ } ++#endif // ASSERT ++ // Conversion ++ switch (bytecode()) { ++ case Bytecodes::_i2l: ++ __ slli_w(FSR, FSR, 0); ++ break; ++ case Bytecodes::_i2f: ++ __ movgr2fr_w(FSF, FSR); ++ __ ffint_s_w(FSF, FSF); ++ break; ++ case Bytecodes::_i2d: ++ __ movgr2fr_w(FSF, FSR); ++ __ ffint_d_w(FSF, FSF); ++ break; ++ case Bytecodes::_i2b: ++ __ ext_w_b(FSR, FSR); ++ break; ++ case Bytecodes::_i2c: ++ __ bstrpick_d(FSR, FSR, 15, 0); // truncate upper 56 bits ++ break; ++ case Bytecodes::_i2s: ++ __ ext_w_h(FSR, FSR); ++ break; ++ case Bytecodes::_l2i: ++ __ slli_w(FSR, FSR, 0); ++ break; ++ case Bytecodes::_l2f: ++ __ movgr2fr_d(FSF, FSR); ++ __ ffint_s_l(FSF, FSF); ++ break; ++ case Bytecodes::_l2d: ++ __ movgr2fr_d(FSF, FSR); ++ __ ffint_d_l(FSF, FSF); ++ break; ++ case Bytecodes::_f2i: ++ __ ftintrz_w_s(fscratch, FSF); ++ __ movfr2gr_s(FSR, fscratch); ++ break; ++ case Bytecodes::_f2l: ++ __ ftintrz_l_s(fscratch, FSF); ++ __ movfr2gr_d(FSR, fscratch); ++ break; ++ case Bytecodes::_f2d: ++ __ fcvt_d_s(FSF, FSF); ++ break; ++ case Bytecodes::_d2i: ++ __ ftintrz_w_d(fscratch, FSF); ++ __ movfr2gr_s(FSR, fscratch); ++ break; ++ case Bytecodes::_d2l: ++ __ ftintrz_l_d(fscratch, FSF); ++ __ movfr2gr_d(FSR, fscratch); ++ break; ++ case Bytecodes::_d2f: ++ __ fcvt_s_d(FSF, FSF); ++ break; ++ default : ++ ShouldNotReachHere(); ++ } ++} ++ ++void TemplateTable::lcmp() { ++ transition(ltos, itos); ++ ++ __ pop(T0); ++ __ pop(R0); ++ ++ __ slt(AT, T0, FSR); ++ __ slt(FSR, FSR, T0); ++ __ sub_d(FSR, FSR, AT); ++} ++ ++void TemplateTable::float_cmp(bool is_float, int unordered_result) { ++ if (is_float) { ++ __ fld_s(fscratch, at_sp()); ++ __ addi_d(SP, SP, 1 * wordSize); ++ ++ if (unordered_result < 0) { ++ __ fcmp_clt_s(FCC0, FSF, fscratch); ++ __ fcmp_cult_s(FCC1, fscratch, FSF); ++ } else { ++ __ fcmp_cult_s(FCC0, FSF, fscratch); ++ __ fcmp_clt_s(FCC1, fscratch, FSF); ++ } ++ } else { ++ __ fld_d(fscratch, at_sp()); ++ __ addi_d(SP, SP, 2 * wordSize); ++ ++ if (unordered_result < 0) { ++ __ fcmp_clt_d(FCC0, FSF, fscratch); ++ __ fcmp_cult_d(FCC1, fscratch, FSF); ++ } else { ++ __ fcmp_cult_d(FCC0, FSF, fscratch); ++ __ fcmp_clt_d(FCC1, fscratch, FSF); ++ } ++ } ++ ++ if (UseCF2GR) { ++ __ movcf2gr(FSR, FCC0); ++ __ movcf2gr(AT, FCC1); ++ } else { ++ __ movcf2fr(fscratch, FCC0); ++ __ movfr2gr_s(FSR, fscratch); ++ __ movcf2fr(fscratch, FCC1); ++ __ movfr2gr_s(AT, fscratch); ++ } ++ __ sub_d(FSR, FSR, AT); ++} ++ ++// used registers : T3, A7, Rnext ++// FSR : return bci, this is defined by the vm specification ++// T2 : MDO taken count ++// T3 : method ++// A7 : offset ++// Rnext : next bytecode, this is required by dispatch_base ++void TemplateTable::branch(bool is_jsr, bool is_wide) { ++ __ get_method(T3); ++ __ profile_taken_branch(A7, T2); // only C2 meaningful ++ ++ const ByteSize be_offset = MethodCounters::backedge_counter_offset() + ++ InvocationCounter::counter_offset(); ++ const ByteSize inv_offset = MethodCounters::invocation_counter_offset() + ++ InvocationCounter::counter_offset(); ++ ++ // Load up T4 with the branch displacement ++ if (!is_wide) { ++ __ ld_b(A7, BCP, 1); ++ __ ld_bu(AT, BCP, 2); ++ __ slli_d(A7, A7, 8); ++ __ orr(A7, A7, AT); ++ } else { ++ __ 
get_4_byte_integer_at_bcp(A7, 1); ++ __ bswap_w(A7, A7); ++ } ++ ++ // Handle all the JSR stuff here, then exit. ++ // It's much shorter and cleaner than intermingling with the non-JSR ++ // normal-branch stuff occurring below. ++ if (is_jsr) { ++ // Pre-load the next target bytecode into Rnext ++ __ ldx_bu(Rnext, BCP, A7); ++ ++ // compute return address as bci in FSR ++ __ addi_d(FSR, BCP, (is_wide?5:3) - in_bytes(ConstMethod::codes_offset())); ++ __ ld_d(AT, T3, in_bytes(Method::const_offset())); ++ __ sub_d(FSR, FSR, AT); ++ // Adjust the bcp in BCP by the displacement in A7 ++ __ add_d(BCP, BCP, A7); ++ // jsr returns atos that is not an oop ++ // Push return address ++ __ push_i(FSR); ++ // jsr returns vtos ++ __ dispatch_only_noverify(vtos); ++ ++ return; ++ } ++ ++ // Normal (non-jsr) branch handling ++ ++ // Adjust the bcp in S0 by the displacement in T4 ++ __ add_d(BCP, BCP, A7); ++ ++ assert(UseLoopCounter || !UseOnStackReplacement, "on-stack-replacement requires loop counters"); ++ Label backedge_counter_overflow; ++ Label profile_method; ++ Label dispatch; ++ if (UseLoopCounter) { ++ // increment backedge counter for backward branches ++ // T3: method ++ // T4: target offset ++ // BCP: target bcp ++ // LVP: locals pointer ++ __ blt(R0, A7, dispatch); // check if forward or backward branch ++ ++ // check if MethodCounters exists ++ Label has_counters; ++ __ ld_d(AT, T3, in_bytes(Method::method_counters_offset())); // use AT as MDO, TEMP ++ __ bne(AT, R0, has_counters); ++ __ push2(T3, A7); ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::build_method_counters), ++ T3); ++ __ pop2(T3, A7); ++ __ ld_d(AT, T3, in_bytes(Method::method_counters_offset())); // use AT as MDO, TEMP ++ __ beq(AT, R0, dispatch); ++ __ bind(has_counters); ++ ++ Label no_mdo; ++ int increment = InvocationCounter::count_increment; ++ if (ProfileInterpreter) { ++ // Are we profiling? ++ __ ld_d(T0, Address(T3, in_bytes(Method::method_data_offset()))); ++ __ beq(T0, R0, no_mdo); ++ // Increment the MDO backedge counter ++ const Address mdo_backedge_counter(T0, in_bytes(MethodData::backedge_counter_offset()) + ++ in_bytes(InvocationCounter::counter_offset())); ++ const Address mask(T0, in_bytes(MethodData::backedge_mask_offset())); ++ __ increment_mask_and_jump(mdo_backedge_counter, increment, mask, ++ T1, false, Assembler::zero, ++ UseOnStackReplacement ? &backedge_counter_overflow : &dispatch); ++ __ beq(R0, R0, dispatch); ++ } ++ __ bind(no_mdo); ++ // Increment backedge counter in MethodCounters* ++ __ ld_d(T0, Address(T3, Method::method_counters_offset())); ++ const Address mask(T0, in_bytes(MethodCounters::backedge_mask_offset())); ++ __ increment_mask_and_jump(Address(T0, be_offset), increment, mask, ++ T1, false, Assembler::zero, ++ UseOnStackReplacement ? 
&backedge_counter_overflow : &dispatch); ++ __ bind(dispatch); ++ } ++ ++ // Pre-load the next target bytecode into Rnext ++ __ ld_bu(Rnext, BCP, 0); ++ ++ // continue with the bytecode @ target ++ // FSR: return bci for jsr's, unused otherwise ++ // Rnext: target bytecode ++ // BCP: target bcp ++ __ dispatch_only(vtos, true); ++ ++ if (UseLoopCounter && UseOnStackReplacement) { ++ // invocation counter overflow ++ __ bind(backedge_counter_overflow); ++ __ sub_d(A7, BCP, A7); // branch bcp ++ call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::frequency_counter_overflow), A7); ++ ++ // V0: osr nmethod (osr ok) or null (osr not possible) ++ // V1: osr adapter frame return address ++ // LVP: locals pointer ++ // BCP: bcp ++ __ beq(V0, R0, dispatch); ++ // nmethod may have been invalidated (VM may block upon call_VM return) ++ __ ld_b(T3, Address(V0, nmethod::state_offset())); ++ __ li(AT, nmethod::in_use); ++ __ bne(AT, T3, dispatch); ++ ++ // We have the address of an on stack replacement routine in rax. ++ // In preparation of invoking it, first we must migrate the locals ++ // and monitors from off the interpreter frame on the stack. ++ // Ensure to save the osr nmethod over the migration call, ++ // it will be preserved in Rnext. ++ __ move(Rnext, V0); ++ call_VM(noreg, CAST_FROM_FN_PTR(address, SharedRuntime::OSR_migration_begin)); ++ ++ // V0 is OSR buffer, move it to expected parameter location ++ // refer to osrBufferPointer in c1_LIRAssembler_loongarch.cpp ++ __ move(T0, V0); ++ ++ // pop the interpreter frame ++ __ ld_d(A7, Address(FP, frame::interpreter_frame_sender_sp_offset * wordSize)); ++ // remove frame anchor ++ __ leave(); ++ __ move(LVP, RA); ++ __ move(SP, A7); ++ ++ assert(StackAlignmentInBytes == 16, "must be"); ++ __ bstrins_d(SP, R0, 3, 0); ++ ++ // push the (possibly adjusted) return address ++ // refer to osr_entry in c1_LIRAssembler_loongarch.cpp ++ __ ld_d(AT, Address(Rnext, nmethod::osr_entry_point_offset())); ++ __ jr(AT); ++ } ++} ++ ++void TemplateTable::if_0cmp(Condition cc) { ++ transition(itos, vtos); ++ // assume branch is more often taken than not (loops use backward branches) ++ Label not_taken; ++ switch(cc) { ++ case not_equal: ++ __ beq(FSR, R0, not_taken); ++ break; ++ case equal: ++ __ bne(FSR, R0, not_taken); ++ break; ++ case less: ++ __ bge(FSR, R0, not_taken); ++ break; ++ case less_equal: ++ __ blt(R0, FSR, not_taken); ++ break; ++ case greater: ++ __ bge(R0, FSR, not_taken); ++ break; ++ case greater_equal: ++ __ blt(FSR, R0, not_taken); ++ break; ++ } ++ ++ branch(false, false); ++ ++ __ bind(not_taken); ++ __ profile_not_taken_branch(FSR); ++} ++ ++void TemplateTable::if_icmp(Condition cc) { ++ transition(itos, vtos); ++ // assume branch is more often taken than not (loops use backward branches) ++ Label not_taken; ++ __ pop_i(A1); ++ __ add_w(FSR, FSR, R0); ++ switch(cc) { ++ case not_equal: ++ __ beq(A1, FSR, not_taken); ++ break; ++ case equal: ++ __ bne(A1, FSR, not_taken); ++ break; ++ case less: ++ __ bge(A1, FSR, not_taken); ++ break; ++ case less_equal: ++ __ blt(FSR, A1, not_taken); ++ break; ++ case greater: ++ __ bge(FSR, A1, not_taken); ++ break; ++ case greater_equal: ++ __ blt(A1, FSR, not_taken); ++ break; ++ } ++ ++ branch(false, false); ++ __ bind(not_taken); ++ __ profile_not_taken_branch(FSR); ++} ++ ++void TemplateTable::if_nullcmp(Condition cc) { ++ transition(atos, vtos); ++ // assume branch is more often taken than not (loops use backward branches) ++ Label not_taken; ++ switch(cc) { ++ case not_equal: ++ 
__ beq(FSR, R0, not_taken); ++ break; ++ case equal: ++ __ bne(FSR, R0, not_taken); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ ++ branch(false, false); ++ __ bind(not_taken); ++ __ profile_not_taken_branch(FSR); ++} ++ ++ ++void TemplateTable::if_acmp(Condition cc) { ++ transition(atos, vtos); ++ // assume branch is more often taken than not (loops use backward branches) ++ Label not_taken; ++ __ pop_ptr(A1); ++ ++ switch(cc) { ++ case not_equal: ++ __ beq(A1, FSR, not_taken); ++ break; ++ case equal: ++ __ bne(A1, FSR, not_taken); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ ++ branch(false, false); ++ ++ __ bind(not_taken); ++ __ profile_not_taken_branch(FSR); ++} ++ ++// used registers : T1, T2, T3 ++// T1 : method ++// T2 : returb bci ++void TemplateTable::ret() { ++ transition(vtos, vtos); ++ ++ locals_index(T2); ++ __ ld_d(T2, T2, 0); ++ __ profile_ret(T2, T3); ++ ++ __ get_method(T1); ++ __ ld_d(BCP, T1, in_bytes(Method::const_offset())); ++ __ add_d(BCP, BCP, T2); ++ __ addi_d(BCP, BCP, in_bytes(ConstMethod::codes_offset())); ++ ++ __ dispatch_next(vtos, 0, true); ++} ++ ++// used registers : T1, T2, T3 ++// T1 : method ++// T2 : returb bci ++void TemplateTable::wide_ret() { ++ transition(vtos, vtos); ++ ++ locals_index_wide(T2); ++ __ ld_d(T2, T2, 0); // get return bci, compute return bcp ++ __ profile_ret(T2, T3); ++ ++ __ get_method(T1); ++ __ ld_d(BCP, T1, in_bytes(Method::const_offset())); ++ __ add_d(BCP, BCP, T2); ++ __ addi_d(BCP, BCP, in_bytes(ConstMethod::codes_offset())); ++ ++ __ dispatch_next(vtos, 0, true); ++} ++ ++// used register T2, T3, A7, Rnext ++// T2 : bytecode pointer ++// T3 : low ++// A7 : high ++// Rnext : dest bytecode, required by dispatch_base ++void TemplateTable::tableswitch() { ++ Label default_case, continue_execution; ++ transition(itos, vtos); ++ ++ // align BCP ++ __ addi_d(T2, BCP, BytesPerInt); ++ __ li(AT, -BytesPerInt); ++ __ andr(T2, T2, AT); ++ ++ // load lo & hi ++ __ ld_w(T3, T2, 1 * BytesPerInt); ++ __ bswap_w(T3, T3); ++ __ ld_w(A7, T2, 2 * BytesPerInt); ++ __ bswap_w(A7, A7); ++ ++ // check against lo & hi ++ __ blt(FSR, T3, default_case); ++ __ blt(A7, FSR, default_case); ++ ++ // lookup dispatch offset, in A7 big endian ++ __ sub_d(FSR, FSR, T3); ++ __ alsl_d(AT, FSR, T2, Address::times_4 - 1); ++ __ ld_w(A7, AT, 3 * BytesPerInt); ++ __ profile_switch_case(FSR, T4, T3); ++ ++ __ bind(continue_execution); ++ __ bswap_w(A7, A7); ++ __ add_d(BCP, BCP, A7); ++ __ ld_bu(Rnext, BCP, 0); ++ __ dispatch_only(vtos, true); ++ ++ // handle default ++ __ bind(default_case); ++ __ profile_switch_default(FSR); ++ __ ld_w(A7, T2, 0); ++ __ b(continue_execution); ++} ++ ++void TemplateTable::lookupswitch() { ++ transition(itos, itos); ++ __ stop("lookupswitch bytecode should have been rewritten"); ++} ++ ++// used registers : T2, T3, A7, Rnext ++// T2 : bytecode pointer ++// T3 : pair index ++// A7 : offset ++// Rnext : dest bytecode ++// the data after the opcode is the same as lookupswitch ++// see Rewriter::rewrite_method for more information ++void TemplateTable::fast_linearswitch() { ++ transition(itos, vtos); ++ Label loop_entry, loop, found, continue_execution; ++ ++ // swap FSR so we can avoid swapping the table entries ++ __ bswap_w(FSR, FSR); ++ ++ // align BCP ++ __ addi_d(T2, BCP, BytesPerInt); ++ __ li(AT, -BytesPerInt); ++ __ andr(T2, T2, AT); ++ ++ // set counter ++ __ ld_w(T3, T2, BytesPerInt); ++ __ bswap_w(T3, T3); ++ __ b(loop_entry); ++ ++ // table search ++ __ bind(loop); ++ // get the entry value ++ __ 
alsl_d(AT, T3, T2, Address::times_8 - 1); ++ __ ld_w(AT, AT, 2 * BytesPerInt); ++ ++ // found? ++ __ beq(FSR, AT, found); ++ ++ __ bind(loop_entry); ++ Label L1; ++ __ bge(R0, T3, L1); ++ __ addi_d(T3, T3, -1); ++ __ b(loop); ++ __ bind(L1); ++ __ addi_d(T3, T3, -1); ++ ++ // default case ++ __ profile_switch_default(FSR); ++ __ ld_w(A7, T2, 0); ++ __ b(continue_execution); ++ ++ // entry found -> get offset ++ __ bind(found); ++ __ alsl_d(AT, T3, T2, Address::times_8 - 1); ++ __ ld_w(A7, AT, 3 * BytesPerInt); ++ __ profile_switch_case(T3, FSR, T2); ++ ++ // continue execution ++ __ bind(continue_execution); ++ __ bswap_w(A7, A7); ++ __ add_d(BCP, BCP, A7); ++ __ ld_bu(Rnext, BCP, 0); ++ __ dispatch_only(vtos, true); ++} ++ ++// used registers : T0, T1, T2, T3, A7, Rnext ++// T2 : pairs address(array) ++// Rnext : dest bytecode ++// the data after the opcode is the same as lookupswitch ++// see Rewriter::rewrite_method for more information ++void TemplateTable::fast_binaryswitch() { ++ transition(itos, vtos); ++ // Implementation using the following core algorithm: ++ // ++ // int binary_search(int key, LookupswitchPair* array, int n) { ++ // // Binary search according to "Methodik des Programmierens" by ++ // // Edsger W. Dijkstra and W.H.J. Feijen, Addison Wesley Germany 1985. ++ // int i = 0; ++ // int j = n; ++ // while (i+1 < j) { ++ // // invariant P: 0 <= i < j <= n and (a[i] <= key < a[j] or Q) ++ // // with Q: for all i: 0 <= i < n: key < a[i] ++ // // where a stands for the array and assuming that the (inexisting) ++ // // element a[n] is infinitely big. ++ // int h = (i + j) >> 1; ++ // // i < h < j ++ // if (key < array[h].fast_match()) { ++ // j = h; ++ // } else { ++ // i = h; ++ // } ++ // } ++ // // R: a[i] <= key < a[i+1] or Q ++ // // (i.e., if key is within array, i is the correct index) ++ // return i; ++ // } ++ ++ // register allocation ++ const Register array = T2; ++ const Register i = T3, j = A7; ++ const Register h = T1; ++ const Register temp = T0; ++ const Register key = FSR; ++ ++ // setup array ++ __ addi_d(array, BCP, 3*BytesPerInt); ++ __ li(AT, -BytesPerInt); ++ __ andr(array, array, AT); ++ ++ // initialize i & j ++ __ move(i, R0); ++ __ ld_w(j, array, - 1 * BytesPerInt); ++ // Convert j into native byteordering ++ __ bswap_w(j, j); ++ ++ // and start ++ Label entry; ++ __ b(entry); ++ ++ // binary search loop ++ { ++ Label loop; ++ __ bind(loop); ++ // int h = (i + j) >> 1; ++ __ add_d(h, i, j); ++ __ srli_d(h, h, 1); ++ // if (key < array[h].fast_match()) { ++ // j = h; ++ // } else { ++ // i = h; ++ // } ++ // Convert array[h].match to native byte-ordering before compare ++ __ alsl_d(AT, h, array, Address::times_8 - 1); ++ __ ld_w(temp, AT, 0 * BytesPerInt); ++ __ bswap_w(temp, temp); ++ ++ __ slt(AT, key, temp); ++ __ maskeqz(i, i, AT); ++ __ masknez(temp, h, AT); ++ __ OR(i, i, temp); ++ __ masknez(j, j, AT); ++ __ maskeqz(temp, h, AT); ++ __ OR(j, j, temp); ++ ++ // while (i+1 < j) ++ __ bind(entry); ++ __ addi_d(h, i, 1); ++ __ blt(h, j, loop); ++ } ++ ++ // end of binary search, result index is i (must check again!) 
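++ // Here a[i] <= key < a[i+1] (or the key is below every entry), so a single
++ // equality test on array[i].match decides between that entry's jump offset
++ // and the default offset.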
++ Label default_case; ++ // Convert array[i].match to native byte-ordering before compare ++ __ alsl_d(AT, i, array, Address::times_8 - 1); ++ __ ld_w(temp, AT, 0 * BytesPerInt); ++ __ bswap_w(temp, temp); ++ __ bne(key, temp, default_case); ++ ++ // entry found -> j = offset ++ __ alsl_d(AT, i, array, Address::times_8 - 1); ++ __ ld_w(j, AT, 1 * BytesPerInt); ++ __ profile_switch_case(i, key, array); ++ __ bswap_w(j, j); ++ ++ __ add_d(BCP, BCP, j); ++ __ ld_bu(Rnext, BCP, 0); ++ __ dispatch_only(vtos, true); ++ ++ // default case -> j = default offset ++ __ bind(default_case); ++ __ profile_switch_default(i); ++ __ ld_w(j, array, - 2 * BytesPerInt); ++ __ bswap_w(j, j); ++ __ add_d(BCP, BCP, j); ++ __ ld_bu(Rnext, BCP, 0); ++ __ dispatch_only(vtos, true); ++} ++ ++void TemplateTable::_return(TosState state) { ++ transition(state, state); ++ assert(_desc->calls_vm(), ++ "inconsistent calls_vm information"); // call in remove_activation ++ ++ if (_desc->bytecode() == Bytecodes::_return_register_finalizer) { ++ assert(state == vtos, "only valid state"); ++ ++ __ ld_d(c_rarg1, aaddress(0)); ++ __ load_klass(LVP, c_rarg1); ++ __ ld_w(LVP, Address(LVP, Klass::access_flags_offset())); ++ __ li(AT, JVM_ACC_HAS_FINALIZER); ++ __ andr(AT, AT, LVP); ++ Label skip_register_finalizer; ++ __ beqz(AT, skip_register_finalizer); ++ ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::register_finalizer), c_rarg1); ++ ++ __ bind(skip_register_finalizer); ++ } ++ ++ // Issue a StoreStore barrier after all stores but before return ++ // from any constructor for any class with a final field. We don't ++ // know if this is a finalizer, so we always do so. ++ if (_desc->bytecode() == Bytecodes::_return) { ++ __ membar(__ StoreStore); ++ } ++ ++ // Narrow result if state is itos but result type is smaller. ++ // Need to narrow in the return bytecode rather than in generate_return_entry ++ // since compiled code callers expect the result to already be narrowed. ++ if (state == itos) { ++ __ narrow(A0); ++ } ++ ++ __ remove_activation(state); ++ __ jr(RA); ++} ++ ++// we dont shift left 2 bits in get_cache_and_index_at_bcp ++// for we always need shift the index we use it. the ConstantPoolCacheEntry ++// is 16-byte long, index is the index in ++// ConstantPoolCache, so cache + base_offset() + index * 16 is ++// the corresponding ConstantPoolCacheEntry ++// used registers : T2 ++// NOTE : the returned index need also shift left 4 to get the address! ++void TemplateTable::resolve_cache_and_index(int byte_no, ++ Register Rcache, ++ Register index, ++ size_t index_size) { ++ assert(byte_no == f1_byte || byte_no == f2_byte, "byte_no out of range"); ++ const Register temp = A1; ++ assert_different_registers(Rcache, index); ++ ++ Label resolved, clinit_barrier_slow; ++ ++ Bytecodes::Code code = bytecode(); ++ switch (code) { ++ case Bytecodes::_nofast_getfield: code = Bytecodes::_getfield; break; ++ case Bytecodes::_nofast_putfield: code = Bytecodes::_putfield; break; ++ default: break; ++ } ++ ++ __ get_cache_and_index_and_bytecode_at_bcp(Rcache, index, temp, byte_no, 1, index_size); ++ // is resolved? ++ int i = (int)code; ++ __ addi_d(temp, temp, -i); ++ __ beq(temp, R0, resolved); ++ ++ // resolve first time through ++ // Class initialization barrier slow path lands here as well. 
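++ // Both the unresolved case above and the invokestatic class-initialization
++ // barrier below land here; the runtime call re-resolves the cache entry
++ // and, for statics, ensures the holder class is initialized before the
++ // bytecode continues.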
++ __ bind(clinit_barrier_slow); ++ address entry = CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_from_cache); ++ ++ __ li(temp, i); ++ __ call_VM(NOREG, entry, temp); ++ ++ // Update registers with resolved info ++ __ get_cache_and_index_at_bcp(Rcache, index, 1, index_size); ++ __ bind(resolved); ++ ++ // Class initialization barrier for static methods ++ if (VM_Version::supports_fast_class_init_checks() && bytecode() == Bytecodes::_invokestatic) { ++ __ load_resolved_method_at_index(byte_no, temp, Rcache, index); ++ __ load_method_holder(temp, temp); ++ __ clinit_barrier(temp, AT, nullptr, &clinit_barrier_slow); ++ } ++} ++//END: LA ++ ++// The Rcache and index registers must be set before call ++void TemplateTable::load_field_cp_cache_entry(Register obj, ++ Register cache, ++ Register index, ++ Register off, ++ Register flags, ++ bool is_static = false) { ++ assert_different_registers(cache, index, flags, off); ++ ++ ByteSize cp_base_offset = ConstantPoolCache::base_offset(); ++ // Field offset ++ __ alsl_d(AT, index, cache, Address::times_ptr - 1); ++ __ ld_d(off, AT, in_bytes(cp_base_offset + ConstantPoolCacheEntry::f2_offset())); ++ // Flags ++ __ ld_d(flags, AT, in_bytes(cp_base_offset + ConstantPoolCacheEntry::flags_offset())); ++ ++ // klass overwrite register ++ if (is_static) { ++ __ ld_d(obj, AT, in_bytes(cp_base_offset + ConstantPoolCacheEntry::f1_offset())); ++ const int mirror_offset = in_bytes(Klass::java_mirror_offset()); ++ __ ld_d(obj, Address(obj, mirror_offset)); ++ ++ __ resolve_oop_handle(obj, SCR2, SCR1); ++ } ++} ++ ++// The Rmethod register is input and overwritten to be the adapter method for the ++// indy call. Return address (ra) is set to the return address for the adapter and ++// an appendix may be pushed to the stack. Registers A2-A3 are clobbered. ++void TemplateTable::load_invokedynamic_entry(Register method) { ++ // setup registers ++ const Register appendix = T2; ++ const Register cache = A2; ++ const Register index = A3; ++ assert_different_registers(method, appendix, cache, index); ++ ++ __ save_bcp(); ++ ++ Label resolved; ++ ++ __ load_resolved_indy_entry(cache, index); ++ __ ld_d(method, Address(cache, in_bytes(ResolvedIndyEntry::method_offset()))); ++ __ membar(Assembler::Membar_mask_bits(Assembler::LoadLoad | Assembler::LoadStore)); ++ ++ // Compare the method to zero ++ __ bnez(method, resolved); ++ ++ Bytecodes::Code code = bytecode(); ++ ++ // Call to the interpreter runtime to resolve invokedynamic ++ address entry = CAST_FROM_FN_PTR(address, InterpreterRuntime::resolve_from_cache); ++ __ li(method, code); // this is essentially Bytecodes::_invokedynamic ++ __ call_VM(noreg, entry, method); ++ // Update registers with resolved info ++ __ load_resolved_indy_entry(cache, index); ++ __ ld_d(method, Address(cache, in_bytes(ResolvedIndyEntry::method_offset()))); ++ __ membar(Assembler::Membar_mask_bits(Assembler::LoadLoad | Assembler::LoadStore)); ++ ++#ifdef ASSERT ++ __ bnez(method, resolved); ++ __ stop("Should be resolved by now"); ++#endif // ASSERT ++ __ bind(resolved); ++ ++ Label L_no_push; ++ // Check if there is an appendix ++ __ ld_bu(index, Address(cache, in_bytes(ResolvedIndyEntry::flags_offset()))); ++ __ andi(AT, index, 1UL << ResolvedIndyEntry::has_appendix_shift); ++ __ beqz(AT, L_no_push); ++ ++ // Get appendix ++ __ ld_hu(index, Address(cache, in_bytes(ResolvedIndyEntry::resolved_references_index_offset()))); ++ // Push the appendix as a trailing parameter ++ // since the parameter_size includes it. 
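++ // 'method' temporarily carries the appendix index (the real Method* is
++ // parked on the stack) and is restored once the resolved reference has
++ // been loaded into 'appendix'.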
++ __ push(method); ++ __ move(method, index); ++ __ load_resolved_reference_at_index(appendix, method, A1); ++ __ verify_oop(appendix); ++ __ pop(method); ++ __ push(appendix); // push appendix (MethodType, CallSite, etc.) ++ __ bind(L_no_push); ++ ++ // compute return type ++ __ ld_bu(index, Address(cache, in_bytes(ResolvedIndyEntry::result_type_offset()))); ++ // load return address ++ // Return address is loaded into ra and not pushed to the stack like x86 ++ { ++ const address table_addr = (address) Interpreter::invoke_return_entry_table_for(code); ++ __ li(AT, table_addr); ++ __ alsl_d(AT, index, AT, 3-1); ++ __ ld_d(RA, AT, 0); ++ } ++} ++ ++// get the method, itable_index and flags of the current invoke ++void TemplateTable::load_invoke_cp_cache_entry(int byte_no, ++ Register method, ++ Register itable_index, ++ Register flags, ++ bool is_invokevirtual, ++ bool is_invokevfinal, /*unused*/ ++ bool is_invokedynamic /*unused*/) { ++ // setup registers ++ const Register cache = T3; ++ const Register index = T1; ++ assert_different_registers(method, flags); ++ assert_different_registers(method, cache, index); ++ assert_different_registers(itable_index, flags); ++ assert_different_registers(itable_index, cache, index); ++ assert(is_invokevirtual == (byte_no == f2_byte), "is invokevirtual flag redundant"); ++ // determine constant pool cache field offsets ++ const int method_offset = in_bytes( ++ ConstantPoolCache::base_offset() + ++ ((byte_no == f2_byte) ++ ? ConstantPoolCacheEntry::f2_offset() ++ : ConstantPoolCacheEntry::f1_offset())); ++ const int flags_offset = in_bytes(ConstantPoolCache::base_offset() + ++ ConstantPoolCacheEntry::flags_offset()); ++ // access constant pool cache fields ++ const int index_offset = in_bytes(ConstantPoolCache::base_offset() + ++ ConstantPoolCacheEntry::f2_offset()); ++ ++ size_t index_size = sizeof(u2); ++ resolve_cache_and_index(byte_no, cache, index, index_size); ++ ++ __ alsl_d(AT, index, cache, Address::times_ptr - 1); ++ __ ld_d(method, AT, method_offset); ++ ++ if (itable_index != NOREG) { ++ __ ld_d(itable_index, AT, index_offset); ++ } ++ __ ld_d(flags, AT, flags_offset); ++} ++ ++// The registers cache and index expected to be set before call. ++// Correct values of the cache and index registers are preserved. ++void TemplateTable::jvmti_post_field_access(Register cache, Register index, ++ bool is_static, bool has_tos) { ++ // do the JVMTI work here to avoid disturbing the register state below ++ // We use c_rarg registers here because we want to use the register used in ++ // the call to the VM ++ if (JvmtiExport::can_post_field_access()) { ++ // Check to see if a field access watch has been set before we ++ // take the time to call into the VM. 
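++ // Read the global field-access watch count; if it is zero no JVMTI agent
++ // has asked for field-access events and the notification path is skipped.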
++ Label L1; ++ // kill FSR ++ Register tmp1 = T2; ++ Register tmp2 = T1; ++ Register tmp3 = T3; ++ assert_different_registers(cache, index, AT); ++ __ li(AT, (intptr_t)JvmtiExport::get_field_access_count_addr()); ++ __ ld_w(AT, AT, 0); ++ __ beq(AT, R0, L1); ++ ++ __ get_cache_and_index_at_bcp(tmp2, tmp3, 1); ++ ++ // cache entry pointer ++ __ addi_d(tmp2, tmp2, in_bytes(ConstantPoolCache::base_offset())); ++ __ alsl_d(tmp2, tmp3, tmp2, LogBytesPerWord - 1); ++ ++ if (is_static) { ++ __ move(tmp1, R0); ++ } else { ++ __ ld_d(tmp1, SP, 0); ++ __ verify_oop(tmp1); ++ } ++ // tmp1: object pointer or null ++ // tmp2: cache entry pointer ++ __ call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::post_field_access), ++ tmp1, tmp2); ++ __ get_cache_and_index_at_bcp(cache, index, 1); ++ __ bind(L1); ++ } ++} ++ ++void TemplateTable::pop_and_check_object(Register r) { ++ __ pop_ptr(r); ++ __ null_check(r); // for field access must check obj. ++ __ verify_oop(r); ++} ++ ++// used registers : T1, T2, T3, T1 ++// T1 : flags ++// T2 : off ++// T3 : obj ++// T1 : field address ++// The flags 31, 30, 29, 28 together build a 4 bit number 0 to 8 with the ++// following mapping to the TosState states: ++// btos: 0 ++// ctos: 1 ++// stos: 2 ++// itos: 3 ++// ltos: 4 ++// ftos: 5 ++// dtos: 6 ++// atos: 7 ++// vtos: 8 ++// see ConstantPoolCacheEntry::set_field for more info ++void TemplateTable::getfield_or_static(int byte_no, bool is_static, RewriteControl rc) { ++ transition(vtos, vtos); ++ ++ const Register cache = T3; ++ const Register index = T0; ++ ++ const Register obj = T3; ++ const Register off = T2; ++ const Register flags = T1; ++ ++ const Register scratch = T8; ++ ++ resolve_cache_and_index(byte_no, cache, index, sizeof(u2)); ++ jvmti_post_field_access(cache, index, is_static, false); ++ load_field_cp_cache_entry(obj, cache, index, off, flags, is_static); ++ ++ { ++ __ li(scratch, 1 << ConstantPoolCacheEntry::is_volatile_shift); ++ __ andr(scratch, scratch, flags); ++ ++ Label notVolatile; ++ __ beq(scratch, R0, notVolatile); ++ __ membar(MacroAssembler::AnyAny); ++ __ bind(notVolatile); ++ } ++ ++ if (!is_static) pop_and_check_object(obj); ++ __ add_d(index, obj, off); ++ ++ const Address field(index, 0); ++ ++ Label Done, notByte, notBool, notInt, notShort, notChar, ++ notLong, notFloat, notObj, notDouble; ++ ++ assert(btos == 0, "change code, btos != 0"); ++ __ srli_d(flags, flags, ConstantPoolCacheEntry::tos_state_shift); ++ __ andi(flags, flags, ConstantPoolCacheEntry::tos_state_mask); ++ __ bne(flags, R0, notByte); ++ ++ // btos ++ __ access_load_at(T_BYTE, IN_HEAP, FSR, field, noreg, noreg); ++ __ push(btos); ++ ++ // Rewrite bytecode to be faster ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_bgetfield, T3, T2); ++ } ++ __ b(Done); ++ ++ ++ __ bind(notByte); ++ __ li(AT, ztos); ++ __ bne(flags, AT, notBool); ++ ++ // ztos ++ __ access_load_at(T_BOOLEAN, IN_HEAP, FSR, field, noreg, noreg); ++ __ push(ztos); ++ ++ // Rewrite bytecode to be faster ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_bgetfield, T3, T2); ++ } ++ __ b(Done); ++ ++ ++ __ bind(notBool); ++ __ li(AT, itos); ++ __ bne(flags, AT, notInt); ++ ++ // itos ++ __ access_load_at(T_INT, IN_HEAP, FSR, field, noreg, noreg); ++ __ push(itos); ++ ++ // Rewrite bytecode to be faster ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_igetfield, T3, T2); ++ } ++ __ b(Done); ++ ++ __ bind(notInt); ++ __ li(AT, atos); ++ __ bne(flags, AT, 
notObj); ++ ++ // atos ++ //add for compressedoops ++ do_oop_load(_masm, Address(index, 0), FSR, IN_HEAP); ++ __ push(atos); ++ ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_agetfield, T3, T2); ++ } ++ __ b(Done); ++ ++ __ bind(notObj); ++ __ li(AT, ctos); ++ __ bne(flags, AT, notChar); ++ ++ // ctos ++ __ access_load_at(T_CHAR, IN_HEAP, FSR, field, noreg, noreg); ++ __ push(ctos); ++ ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_cgetfield, T3, T2); ++ } ++ __ b(Done); ++ ++ __ bind(notChar); ++ __ li(AT, stos); ++ __ bne(flags, AT, notShort); ++ ++ // stos ++ __ access_load_at(T_SHORT, IN_HEAP, FSR, field, noreg, noreg); ++ __ push(stos); ++ ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_sgetfield, T3, T2); ++ } ++ __ b(Done); ++ ++ __ bind(notShort); ++ __ li(AT, ltos); ++ __ bne(flags, AT, notLong); ++ ++ // ltos ++ __ access_load_at(T_LONG, IN_HEAP | MO_RELAXED, FSR, field, noreg, noreg); ++ __ push(ltos); ++ ++ // Don't rewrite to _fast_lgetfield for potential volatile case. ++ __ b(Done); ++ ++ __ bind(notLong); ++ __ li(AT, ftos); ++ __ bne(flags, AT, notFloat); ++ ++ // ftos ++ __ access_load_at(T_FLOAT, IN_HEAP, noreg /* ftos */, field, noreg, noreg); ++ __ push(ftos); ++ ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_fgetfield, T3, T2); ++ } ++ __ b(Done); ++ ++ __ bind(notFloat); ++ __ li(AT, dtos); ++#ifdef ASSERT ++ __ bne(flags, AT, notDouble); ++#endif ++ ++ // dtos ++ __ access_load_at(T_DOUBLE, IN_HEAP, noreg /* dtos */, field, noreg, noreg); ++ __ push(dtos); ++ ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_dgetfield, T3, T2); ++ } ++ ++#ifdef ASSERT ++ __ b(Done); ++ __ bind(notDouble); ++ __ stop("Bad state"); ++#endif ++ ++ __ bind(Done); ++ ++ { ++ Label notVolatile; ++ __ beq(scratch, R0, notVolatile); ++ __ membar(Assembler::Membar_mask_bits(__ LoadLoad | __ LoadStore)); ++ __ bind(notVolatile); ++ } ++} ++ ++void TemplateTable::getfield(int byte_no) { ++ getfield_or_static(byte_no, false); ++} ++ ++void TemplateTable::nofast_getfield(int byte_no) { ++ getfield_or_static(byte_no, false, may_not_rewrite); ++} ++ ++void TemplateTable::getstatic(int byte_no) { ++ getfield_or_static(byte_no, true); ++} ++ ++// The registers cache and index expected to be set before call. ++// The function may destroy various registers, just not the cache and index registers. ++void TemplateTable::jvmti_post_field_mod(Register cache, Register index, bool is_static) { ++ transition(vtos, vtos); ++ ++ ByteSize cp_base_offset = ConstantPoolCache::base_offset(); ++ ++ if (JvmtiExport::can_post_field_modification()) { ++ // Check to see if a field modification watch has been set before ++ // we take the time to call into the VM. ++ Label L1; ++ //kill AT, T1, T2, T3, T4 ++ Register tmp1 = T2; ++ Register tmp2 = T1; ++ Register tmp3 = T3; ++ Register tmp4 = T4; ++ assert_different_registers(cache, index, tmp4); ++ ++ __ li(AT, JvmtiExport::get_field_modification_count_addr()); ++ __ ld_w(AT, AT, 0); ++ __ beq(AT, R0, L1); ++ ++ __ get_cache_and_index_at_bcp(tmp2, tmp4, 1); ++ ++ if (is_static) { ++ __ move(tmp1, R0); ++ } else { ++ // Life is harder. The stack holds the value on top, followed by ++ // the object. We don't know the size of the value, though; it ++ // could be one or two words depending on its type. As a result, ++ // we must find the type to determine where the object is. 
++ Label two_word, valsize_known; ++ __ alsl_d(AT, tmp4, tmp2, Address::times_8 - 1); ++ __ ld_wu(tmp3, AT, in_bytes(cp_base_offset + ++ ConstantPoolCacheEntry::flags_offset())); ++ __ srli_d(tmp3, tmp3, ConstantPoolCacheEntry::tos_state_shift); ++ ++ ConstantPoolCacheEntry::verify_tos_state_shift(); ++ __ move(tmp1, SP); ++ __ li(AT, ltos); ++ __ beq(tmp3, AT, two_word); ++ __ li(AT, dtos); ++ __ beq(tmp3, AT, two_word); ++ __ addi_d(tmp1, tmp1, Interpreter::expr_offset_in_bytes(1) ); ++ __ b(valsize_known); ++ ++ __ bind(two_word); ++ __ addi_d(tmp1, tmp1, Interpreter::expr_offset_in_bytes(2)); ++ ++ __ bind(valsize_known); ++ // setup object pointer ++ __ ld_d(tmp1, tmp1, 0 * wordSize); ++ } ++ // cache entry pointer ++ __ addi_d(tmp2, tmp2, in_bytes(cp_base_offset)); ++ __ alsl_d(tmp2, tmp4, tmp2, LogBytesPerWord - 1); ++ // object (tos) ++ __ move(tmp3, SP); ++ // tmp1: object pointer set up above (null if static) ++ // tmp2: cache entry pointer ++ // tmp3: jvalue object on the stack ++ __ call_VM(NOREG, ++ CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::post_field_modification), ++ tmp1, tmp2, tmp3); ++ __ get_cache_and_index_at_bcp(cache, index, 1); ++ __ bind(L1); ++ } ++} ++ ++// used registers : T0, T1, T2, T3, T8 ++// T1 : flags ++// T2 : off ++// T3 : obj ++// T8 : volatile bit ++// see ConstantPoolCacheEntry::set_field for more info ++void TemplateTable::putfield_or_static(int byte_no, bool is_static, RewriteControl rc) { ++ transition(vtos, vtos); ++ ++ const Register cache = T3; ++ const Register index = T0; ++ const Register obj = T3; ++ const Register off = T2; ++ const Register flags = T1; ++ const Register bc = T3; ++ ++ const Register scratch = T8; ++ ++ resolve_cache_and_index(byte_no, cache, index, sizeof(u2)); ++ jvmti_post_field_mod(cache, index, is_static); ++ load_field_cp_cache_entry(obj, cache, index, off, flags, is_static); ++ ++ Label Done; ++ { ++ __ li(scratch, 1 << ConstantPoolCacheEntry::is_volatile_shift); ++ __ andr(scratch, scratch, flags); ++ ++ Label notVolatile; ++ __ beq(scratch, R0, notVolatile); ++ __ membar(Assembler::Membar_mask_bits(__ StoreStore | __ LoadStore)); ++ __ bind(notVolatile); ++ } ++ ++ ++ Label notByte, notBool, notInt, notShort, notChar, notLong, notFloat, notObj, notDouble; ++ ++ assert(btos == 0, "change code, btos != 0"); ++ ++ // btos ++ __ srli_d(flags, flags, ConstantPoolCacheEntry::tos_state_shift); ++ __ andi(flags, flags, ConstantPoolCacheEntry::tos_state_mask); ++ __ bne(flags, R0, notByte); ++ ++ __ pop(btos); ++ if (!is_static) { ++ pop_and_check_object(obj); ++ } ++ __ add_d(T4, obj, off); ++ __ access_store_at(T_BYTE, IN_HEAP, Address(T4), FSR, noreg, noreg, noreg); ++ ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_bputfield, bc, off, true, byte_no); ++ } ++ __ b(Done); ++ ++ // ztos ++ __ bind(notByte); ++ __ li(AT, ztos); ++ __ bne(flags, AT, notBool); ++ ++ __ pop(ztos); ++ if (!is_static) { ++ pop_and_check_object(obj); ++ } ++ __ add_d(T4, obj, off); ++ __ andi(FSR, FSR, 0x1); ++ __ access_store_at(T_BOOLEAN, IN_HEAP, Address(T4), FSR, noreg, noreg, noreg); ++ ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_zputfield, bc, off, true, byte_no); ++ } ++ __ b(Done); ++ ++ // itos ++ __ bind(notBool); ++ __ li(AT, itos); ++ __ bne(flags, AT, notInt); ++ ++ __ pop(itos); ++ if (!is_static) { ++ pop_and_check_object(obj); ++ } ++ __ add_d(T4, obj, off); ++ __ access_store_at(T_INT, IN_HEAP, Address(T4), FSR, noreg, noreg, noreg); ++ ++ if (!is_static && rc == 
may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_iputfield, bc, off, true, byte_no); ++ } ++ __ b(Done); ++ ++ // atos ++ __ bind(notInt); ++ __ li(AT, atos); ++ __ bne(flags, AT, notObj); ++ ++ __ pop(atos); ++ if (!is_static) { ++ pop_and_check_object(obj); ++ } ++ ++ do_oop_store(_masm, Address(obj, off, Address::no_scale, 0), FSR); ++ ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_aputfield, bc, off, true, byte_no); ++ } ++ __ b(Done); ++ ++ // ctos ++ __ bind(notObj); ++ __ li(AT, ctos); ++ __ bne(flags, AT, notChar); ++ ++ __ pop(ctos); ++ if (!is_static) { ++ pop_and_check_object(obj); ++ } ++ __ add_d(T4, obj, off); ++ __ access_store_at(T_CHAR, IN_HEAP, Address(T4), FSR, noreg, noreg, noreg); ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_cputfield, bc, off, true, byte_no); ++ } ++ __ b(Done); ++ ++ // stos ++ __ bind(notChar); ++ __ li(AT, stos); ++ __ bne(flags, AT, notShort); ++ ++ __ pop(stos); ++ if (!is_static) { ++ pop_and_check_object(obj); ++ } ++ __ add_d(T4, obj, off); ++ __ access_store_at(T_SHORT, IN_HEAP, Address(T4), FSR, noreg, noreg, noreg); ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_sputfield, bc, off, true, byte_no); ++ } ++ __ b(Done); ++ ++ // ltos ++ __ bind(notShort); ++ __ li(AT, ltos); ++ __ bne(flags, AT, notLong); ++ ++ __ pop(ltos); ++ if (!is_static) { ++ pop_and_check_object(obj); ++ } ++ __ add_d(T4, obj, off); ++ __ access_store_at(T_LONG, IN_HEAP, Address(T4), FSR, noreg, noreg, noreg); ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_lputfield, bc, off, true, byte_no); ++ } ++ __ b(Done); ++ ++ // ftos ++ __ bind(notLong); ++ __ li(AT, ftos); ++ __ bne(flags, AT, notFloat); ++ ++ __ pop(ftos); ++ if (!is_static) { ++ pop_and_check_object(obj); ++ } ++ __ add_d(T4, obj, off); ++ __ access_store_at(T_FLOAT, IN_HEAP, Address(T4), noreg, noreg, noreg, noreg); ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_fputfield, bc, off, true, byte_no); ++ } ++ __ b(Done); ++ ++ ++ // dtos ++ __ bind(notFloat); ++ __ li(AT, dtos); ++#ifdef ASSERT ++ __ bne(flags, AT, notDouble); ++#endif ++ ++ __ pop(dtos); ++ if (!is_static) { ++ pop_and_check_object(obj); ++ } ++ __ add_d(T4, obj, off); ++ __ access_store_at(T_DOUBLE, IN_HEAP, Address(T4), noreg, noreg, noreg, noreg); ++ if (!is_static && rc == may_rewrite) { ++ patch_bytecode(Bytecodes::_fast_dputfield, bc, off, true, byte_no); ++ } ++ ++#ifdef ASSERT ++ __ b(Done); ++ ++ __ bind(notDouble); ++ __ stop("Bad state"); ++#endif ++ ++ __ bind(Done); ++ ++ { ++ Label notVolatile; ++ __ beq(scratch, R0, notVolatile); ++ __ membar(Assembler::Membar_mask_bits(__ StoreLoad | __ StoreStore)); ++ __ bind(notVolatile); ++ } ++} ++ ++void TemplateTable::putfield(int byte_no) { ++ putfield_or_static(byte_no, false); ++} ++ ++void TemplateTable::nofast_putfield(int byte_no) { ++ putfield_or_static(byte_no, false, may_not_rewrite); ++} ++ ++void TemplateTable::putstatic(int byte_no) { ++ putfield_or_static(byte_no, true); ++} ++ ++// used registers : T1, T2, T3 ++// T1 : cp_entry ++// T2 : obj ++// T3 : value pointer ++void TemplateTable::jvmti_post_fast_field_mod() { ++ if (JvmtiExport::can_post_field_modification()) { ++ // Check to see if a field modification watch has been set before ++ // we take the time to call into the VM. 
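Editor's note: putfield_or_static above brackets a volatile store with two barriers, StoreStore|LoadStore before the store and StoreLoad|StoreStore after it. A rough C++11 analogue of that ordering for a single int field (this models the intent of the fences only, not the actual HotSpot code path or the Java memory-model mapping):

    #include <atomic>

    std::atomic<int> field_model;   // stand-in for one volatile Java int field

    void volatile_putfield_model(int value) {
      // StoreStore | LoadStore before the store: earlier accesses may not sink below it.
      std::atomic_thread_fence(std::memory_order_release);
      field_model.store(value, std::memory_order_relaxed);   // the field store itself
      // StoreLoad | StoreStore after the store: later loads may not float above it.
      std::atomic_thread_fence(std::memory_order_seq_cst);
    }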
++ Label L2; ++ //kill AT, T1, T2, T3, T4 ++ Register tmp1 = T2; ++ Register tmp2 = T1; ++ Register tmp3 = T3; ++ Register tmp4 = T4; ++ __ li(AT, JvmtiExport::get_field_modification_count_addr()); ++ __ ld_w(tmp3, AT, 0); ++ __ beq(tmp3, R0, L2); ++ __ pop_ptr(tmp1); ++ __ verify_oop(tmp1); ++ __ push_ptr(tmp1); ++ switch (bytecode()) { // load values into the jvalue object ++ case Bytecodes::_fast_aputfield: __ push_ptr(FSR); break; ++ case Bytecodes::_fast_bputfield: // fall through ++ case Bytecodes::_fast_zputfield: // fall through ++ case Bytecodes::_fast_sputfield: // fall through ++ case Bytecodes::_fast_cputfield: // fall through ++ case Bytecodes::_fast_iputfield: __ push_i(FSR); break; ++ case Bytecodes::_fast_dputfield: __ push_d(FSF); break; ++ case Bytecodes::_fast_fputfield: __ push_f(); break; ++ case Bytecodes::_fast_lputfield: __ push_l(FSR); break; ++ default: ShouldNotReachHere(); ++ } ++ __ move(tmp3, SP); ++ // access constant pool cache entry ++ __ get_cache_entry_pointer_at_bcp(tmp2, FSR, 1); ++ __ verify_oop(tmp1); ++ // tmp1: object pointer copied above ++ // tmp2: cache entry pointer ++ // tmp3: jvalue object on the stack ++ __ call_VM(NOREG, ++ CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::post_field_modification), ++ tmp1, tmp2, tmp3); ++ ++ switch (bytecode()) { // restore tos values ++ case Bytecodes::_fast_aputfield: __ pop_ptr(FSR); break; ++ case Bytecodes::_fast_bputfield: // fall through ++ case Bytecodes::_fast_zputfield: // fall through ++ case Bytecodes::_fast_sputfield: // fall through ++ case Bytecodes::_fast_cputfield: // fall through ++ case Bytecodes::_fast_iputfield: __ pop_i(FSR); break; ++ case Bytecodes::_fast_dputfield: __ pop_d(); break; ++ case Bytecodes::_fast_fputfield: __ pop_f(); break; ++ case Bytecodes::_fast_lputfield: __ pop_l(FSR); break; ++ default: break; ++ } ++ __ bind(L2); ++ } ++} ++ ++// used registers : T2, T3, T1 ++// T2 : index & off & field address ++// T3 : cache & obj ++// T1 : flags ++void TemplateTable::fast_storefield(TosState state) { ++ transition(state, vtos); ++ ++ const Register scratch = T8; ++ ++ ByteSize base = ConstantPoolCache::base_offset(); ++ ++ jvmti_post_fast_field_mod(); ++ ++ // access constant pool cache ++ __ get_cache_and_index_at_bcp(T3, T2, 1); ++ ++ // Must prevent reordering of the following cp cache loads with bytecode load ++ __ membar(__ LoadLoad); ++ ++ // test for volatile with T1 ++ __ alsl_d(AT, T2, T3, Address::times_8 - 1); ++ __ ld_d(T1, AT, in_bytes(base + ConstantPoolCacheEntry::flags_offset())); ++ ++ // replace index with field offset from cache entry ++ __ ld_d(T2, AT, in_bytes(base + ConstantPoolCacheEntry::f2_offset())); ++ ++ Label Done; ++ { ++ __ li(scratch, 1 << ConstantPoolCacheEntry::is_volatile_shift); ++ __ andr(scratch, scratch, T1); ++ ++ Label notVolatile; ++ __ beq(scratch, R0, notVolatile); ++ __ membar(Assembler::Membar_mask_bits(__ StoreStore | __ LoadStore)); ++ __ bind(notVolatile); ++ } ++ ++ // Get object from stack ++ pop_and_check_object(T3); ++ ++ if (bytecode() != Bytecodes::_fast_aputfield) { ++ // field address ++ __ add_d(T2, T3, T2); ++ } ++ ++ // access field ++ switch (bytecode()) { ++ case Bytecodes::_fast_zputfield: ++ __ andi(FSR, FSR, 0x1); // boolean is true if LSB is 1 ++ __ access_store_at(T_BOOLEAN, IN_HEAP, Address(T2), FSR, noreg, noreg, noreg); ++ break; ++ case Bytecodes::_fast_bputfield: ++ __ access_store_at(T_BYTE, IN_HEAP, Address(T2), FSR, noreg, noreg, noreg); ++ break; ++ case Bytecodes::_fast_sputfield: ++ __ 
access_store_at(T_SHORT, IN_HEAP, Address(T2), FSR, noreg, noreg, noreg); ++ break; ++ case Bytecodes::_fast_cputfield: ++ __ access_store_at(T_CHAR, IN_HEAP, Address(T2), FSR, noreg, noreg, noreg); ++ break; ++ case Bytecodes::_fast_iputfield: ++ __ access_store_at(T_INT, IN_HEAP, Address(T2), FSR, noreg, noreg, noreg); ++ break; ++ case Bytecodes::_fast_lputfield: ++ __ access_store_at(T_LONG, IN_HEAP, Address(T2), FSR, noreg, noreg, noreg); ++ break; ++ case Bytecodes::_fast_fputfield: ++ __ access_store_at(T_FLOAT, IN_HEAP, Address(T2), noreg, noreg, noreg, noreg); ++ break; ++ case Bytecodes::_fast_dputfield: ++ __ access_store_at(T_DOUBLE, IN_HEAP, Address(T2), noreg, noreg, noreg, noreg); ++ break; ++ case Bytecodes::_fast_aputfield: ++ do_oop_store(_masm, Address(T3, T2, Address::no_scale, 0), FSR); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ ++ { ++ Label notVolatile; ++ __ beq(scratch, R0, notVolatile); ++ __ membar(Assembler::Membar_mask_bits(__ StoreLoad | __ StoreStore)); ++ __ bind(notVolatile); ++ } ++} ++ ++// used registers : T2, T3, T1 ++// T3 : cp_entry & cache ++// T2 : index & offset ++void TemplateTable::fast_accessfield(TosState state) { ++ transition(atos, state); ++ ++ const Register scratch = T8; ++ ++ // do the JVMTI work here to avoid disturbing the register state below ++ if (JvmtiExport::can_post_field_access()) { ++ // Check to see if a field access watch has been set before we take ++ // the time to call into the VM. ++ Label L1; ++ __ li(AT, (intptr_t)JvmtiExport::get_field_access_count_addr()); ++ __ ld_w(T3, AT, 0); ++ __ beq(T3, R0, L1); ++ // access constant pool cache entry ++ __ get_cache_entry_pointer_at_bcp(T3, T1, 1); ++ __ move(TSR, FSR); ++ __ verify_oop(FSR); ++ // FSR: object pointer copied above ++ // T3: cache entry pointer ++ __ call_VM(NOREG, ++ CAST_FROM_FN_PTR(address, InterpreterRuntime::post_field_access), ++ FSR, T3); ++ __ move(FSR, TSR); ++ __ bind(L1); ++ } ++ ++ // access constant pool cache ++ __ get_cache_and_index_at_bcp(T3, T2, 1); ++ ++ // Must prevent reordering of the following cp cache loads with bytecode load ++ __ membar(__ LoadLoad); ++ ++ // replace index with field offset from cache entry ++ __ alsl_d(AT, T2, T3, Address::times_8 - 1); ++ __ ld_d(T2, AT, in_bytes(ConstantPoolCache::base_offset() + ConstantPoolCacheEntry::f2_offset())); ++ ++ { ++ __ ld_d(AT, AT, in_bytes(ConstantPoolCache::base_offset() + ConstantPoolCacheEntry::flags_offset())); ++ __ li(scratch, 1 << ConstantPoolCacheEntry::is_volatile_shift); ++ __ andr(scratch, scratch, AT); ++ ++ Label notVolatile; ++ __ beq(scratch, R0, notVolatile); ++ __ membar(MacroAssembler::AnyAny); ++ __ bind(notVolatile); ++ } ++ ++ // FSR: object ++ __ verify_oop(FSR); ++ __ null_check(FSR); ++ // field addresses ++ __ add_d(FSR, FSR, T2); ++ ++ // access field ++ switch (bytecode()) { ++ case Bytecodes::_fast_bgetfield: ++ __ access_load_at(T_BYTE, IN_HEAP, FSR, Address(FSR), noreg, noreg); ++ break; ++ case Bytecodes::_fast_sgetfield: ++ __ access_load_at(T_SHORT, IN_HEAP, FSR, Address(FSR), noreg, noreg); ++ break; ++ case Bytecodes::_fast_cgetfield: ++ __ access_load_at(T_CHAR, IN_HEAP, FSR, Address(FSR), noreg, noreg); ++ break; ++ case Bytecodes::_fast_igetfield: ++ __ access_load_at(T_INT, IN_HEAP, FSR, Address(FSR), noreg, noreg); ++ break; ++ case Bytecodes::_fast_lgetfield: ++ __ stop("should not be rewritten"); ++ break; ++ case Bytecodes::_fast_fgetfield: ++ __ access_load_at(T_FLOAT, IN_HEAP, noreg, Address(FSR), noreg, noreg); ++ break; ++ case 
Bytecodes::_fast_dgetfield: ++ __ access_load_at(T_DOUBLE, IN_HEAP, noreg, Address(FSR), noreg, noreg); ++ break; ++ case Bytecodes::_fast_agetfield: ++ do_oop_load(_masm, Address(FSR, 0), FSR, IN_HEAP); ++ __ verify_oop(FSR); ++ break; ++ default: ++ ShouldNotReachHere(); ++ } ++ ++ { ++ Label notVolatile; ++ __ beq(scratch, R0, notVolatile); ++ __ membar(Assembler::Membar_mask_bits(__ LoadLoad | __ LoadStore)); ++ __ bind(notVolatile); ++ } ++} ++ ++// generator for _fast_iaccess_0, _fast_aaccess_0, _fast_faccess_0 ++// used registers : T1, T2, T3, T1 ++// T1 : obj & field address ++// T2 : off ++// T3 : cache ++// T1 : index ++void TemplateTable::fast_xaccess(TosState state) { ++ transition(vtos, state); ++ ++ const Register scratch = T8; ++ ++ // get receiver ++ __ ld_d(T1, aaddress(0)); ++ // access constant pool cache ++ __ get_cache_and_index_at_bcp(T3, T2, 2); ++ __ alsl_d(AT, T2, T3, Address::times_8 - 1); ++ __ ld_d(T2, AT, in_bytes(ConstantPoolCache::base_offset() + ConstantPoolCacheEntry::f2_offset())); ++ ++ { ++ __ ld_d(AT, AT, in_bytes(ConstantPoolCache::base_offset() + ConstantPoolCacheEntry::flags_offset())); ++ __ li(scratch, 1 << ConstantPoolCacheEntry::is_volatile_shift); ++ __ andr(scratch, scratch, AT); ++ ++ Label notVolatile; ++ __ beq(scratch, R0, notVolatile); ++ __ membar(MacroAssembler::AnyAny); ++ __ bind(notVolatile); ++ } ++ ++ // make sure exception is reported in correct bcp range (getfield is ++ // next instruction) ++ __ addi_d(BCP, BCP, 1); ++ __ null_check(T1); ++ __ add_d(T1, T1, T2); ++ ++ if (state == itos) { ++ __ access_load_at(T_INT, IN_HEAP, FSR, Address(T1), noreg, noreg); ++ } else if (state == atos) { ++ do_oop_load(_masm, Address(T1, 0), FSR, IN_HEAP); ++ __ verify_oop(FSR); ++ } else if (state == ftos) { ++ __ access_load_at(T_FLOAT, IN_HEAP, noreg, Address(T1), noreg, noreg); ++ } else { ++ ShouldNotReachHere(); ++ } ++ __ addi_d(BCP, BCP, -1); ++ ++ { ++ Label notVolatile; ++ __ beq(scratch, R0, notVolatile); ++ __ membar(Assembler::Membar_mask_bits(__ LoadLoad | __ LoadStore)); ++ __ bind(notVolatile); ++ } ++} ++ ++ ++//----------------------------------------------------------------------------- ++// Calls ++ ++// method, index, recv, flags: T1, T2, T3, T1 ++// byte_no = 2 for _invokevirtual, 1 else ++// T0 : return address ++// get the method & index of the invoke, and push the return address of ++// the invoke(first word in the frame) ++// this address is where the return code jmp to. ++// NOTE : this method will set T3&T1 as recv&flags ++void TemplateTable::prepare_invoke(int byte_no, ++ Register method, // linked method (or i-klass) ++ Register index, // itable index, MethodType, etc. 
++ Register recv, // if caller wants to see it ++ Register flags // if caller wants to test it ++ ) { ++ // determine flags ++ const Bytecodes::Code code = bytecode(); ++ const bool is_invokeinterface = code == Bytecodes::_invokeinterface; ++ const bool is_invokedynamic = code == Bytecodes::_invokedynamic; ++ const bool is_invokehandle = code == Bytecodes::_invokehandle; ++ const bool is_invokevirtual = code == Bytecodes::_invokevirtual; ++ const bool is_invokespecial = code == Bytecodes::_invokespecial; ++ const bool load_receiver = (recv != noreg); ++ const bool save_flags = (flags != noreg); ++ assert(load_receiver == (code != Bytecodes::_invokestatic && code != Bytecodes::_invokedynamic),""); ++ assert(save_flags == (is_invokeinterface || is_invokevirtual), "need flags for vfinal"); ++ assert(flags == noreg || flags == T1, "error flags reg."); ++ assert(recv == noreg || recv == T3, "error recv reg."); ++ ++ // setup registers & access constant pool cache ++ if(recv == noreg) recv = T3; ++ if(flags == noreg) flags = T1; ++ assert_different_registers(method, index, recv, flags); ++ ++ // save 'interpreter return address' ++ __ save_bcp(); ++ ++ load_invoke_cp_cache_entry(byte_no, method, index, flags, is_invokevirtual, false, is_invokedynamic); ++ ++ if (is_invokehandle) { ++ Label L_no_push; ++ __ li(AT, (1 << ConstantPoolCacheEntry::has_appendix_shift)); ++ __ andr(AT, AT, flags); ++ __ beq(AT, R0, L_no_push); ++ // Push the appendix as a trailing parameter. ++ // This must be done before we get the receiver, ++ // since the parameter_size includes it. ++ Register tmp = T6; ++ __ push(tmp); ++ __ move(tmp, index); ++ __ load_resolved_reference_at_index(index, tmp, recv); ++ __ pop(tmp); ++ __ push(index); // push appendix (MethodType, CallSite, etc.) 
++ __ bind(L_no_push); ++ } ++ ++ // load receiver if needed (note: no return address pushed yet) ++ if (load_receiver) { ++ // parameter_size_mask = 1 << 8 ++ __ andi(recv, flags, ConstantPoolCacheEntry::parameter_size_mask); ++ ++ Address recv_addr = __ argument_address(recv, -1); ++ __ ld_d(recv, recv_addr); ++ __ verify_oop(recv); ++ } ++ if(save_flags) { ++ __ move(BCP, flags); ++ } ++ ++ // compute return type ++ __ srli_d(flags, flags, ConstantPoolCacheEntry::tos_state_shift); ++ __ andi(flags, flags, 0xf); ++ ++ // Make sure we don't need to mask flags for tos_state_shift after the above shift ++ ConstantPoolCacheEntry::verify_tos_state_shift(); ++ // load return address ++ { ++ const address table = (address) Interpreter::invoke_return_entry_table_for(code); ++ __ li(AT, (long)table); ++ __ alsl_d(AT, flags, AT, LogBytesPerWord - 1); ++ __ ld_d(RA, AT, 0); ++ } ++ ++ if (save_flags) { ++ __ move(flags, BCP); ++ __ restore_bcp(); ++ } ++} ++ ++// used registers : T0, T3, T1, T2 ++// T3 : recv, this two register using convention is by prepare_invoke ++// T1 : flags, klass ++// Rmethod : method, index must be Rmethod ++void TemplateTable::invokevirtual_helper(Register index, ++ Register recv, ++ Register flags) { ++ ++ assert_different_registers(index, recv, flags, T2); ++ ++ // Test for an invoke of a final method ++ Label notFinal; ++ __ li(AT, (1 << ConstantPoolCacheEntry::is_vfinal_shift)); ++ __ andr(AT, flags, AT); ++ __ beq(AT, R0, notFinal); ++ ++ Register method = index; // method must be Rmethod ++ assert(method == Rmethod, "Method must be Rmethod for interpreter calling convention"); ++ ++ // do the call - the index is actually the method to call ++ // the index is indeed Method*, for this is vfinal, ++ // see ConstantPoolCacheEntry::set_method for more info ++ ++ // It's final, need a null check here! 
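Editor's note: in prepare_invoke above, the receiver is located purely from the cached flags word: the low bits hold parameter_size (receiver included), and argument_address(recv, -1) then addresses the slot parameter_size - 1 words above the expression-stack pointer. A small arithmetic sketch under that reading (illustrative, not the MacroAssembler API):

    #include <cstdint>

    // Sketch: slot 0 is the last pushed argument; the receiver is the deepest
    // argument, so it sits (parameter_size - 1) slots above SP.
    intptr_t* receiver_slot(intptr_t* sp, int parameter_size) {
      return sp + (parameter_size - 1);   // argument_address(recv, -1)
    }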
++ __ null_check(recv); ++ ++ // profile this call ++ __ profile_final_call(T2); ++ ++ // T2: tmp, used for mdp ++ // method: callee ++ // T4: tmp ++ // is_virtual: true ++ __ profile_arguments_type(T2, method, T4, true); ++ ++ __ jump_from_interpreted(method); ++ ++ __ bind(notFinal); ++ ++ // get receiver klass ++ __ load_klass(T2, recv); ++ ++ // profile this call ++ __ profile_virtual_call(T2, T0, T1); ++ ++ // get target Method & entry point ++ __ lookup_virtual_method(T2, index, method); ++ __ profile_arguments_type(T2, method, T4, true); ++ __ jump_from_interpreted(method); ++} ++ ++void TemplateTable::invokevirtual(int byte_no) { ++ transition(vtos, vtos); ++ assert(byte_no == f2_byte, "use this argument"); ++ prepare_invoke(byte_no, Rmethod, NOREG, T3, T1); ++ // now recv & flags in T3, T1 ++ invokevirtual_helper(Rmethod, T3, T1); ++} ++ ++// T4 : entry ++// Rmethod : method ++void TemplateTable::invokespecial(int byte_no) { ++ transition(vtos, vtos); ++ assert(byte_no == f1_byte, "use this argument"); ++ prepare_invoke(byte_no, Rmethod, NOREG, T3); ++ // now recv & flags in T3, T1 ++ __ verify_oop(T3); ++ __ null_check(T3); ++ __ profile_call(T4); ++ ++ // T8: tmp, used for mdp ++ // Rmethod: callee ++ // T4: tmp ++ // is_virtual: false ++ __ profile_arguments_type(T8, Rmethod, T4, false); ++ ++ __ jump_from_interpreted(Rmethod); ++ __ move(T0, T3); ++} ++ ++void TemplateTable::invokestatic(int byte_no) { ++ transition(vtos, vtos); ++ assert(byte_no == f1_byte, "use this argument"); ++ prepare_invoke(byte_no, Rmethod, NOREG); ++ ++ __ profile_call(T4); ++ ++ // T8: tmp, used for mdp ++ // Rmethod: callee ++ // T4: tmp ++ // is_virtual: false ++ __ profile_arguments_type(T8, Rmethod, T4, false); ++ ++ __ jump_from_interpreted(Rmethod); ++} ++ ++// i have no idea what to do here, now. for future change. FIXME. ++void TemplateTable::fast_invokevfinal(int byte_no) { ++ transition(vtos, vtos); ++ assert(byte_no == f2_byte, "use this argument"); ++ __ stop("fast_invokevfinal not used on LoongArch64"); ++} ++ ++// used registers : T0, T1, T2, T3, T1, A7 ++// T0 : itable, vtable, entry ++// T1 : interface ++// T3 : receiver ++// T1 : flags, klass ++// Rmethod : index, method, this is required by interpreter_entry ++void TemplateTable::invokeinterface(int byte_no) { ++ transition(vtos, vtos); ++ //this method will use T1-T4 and T0 ++ assert(byte_no == f1_byte, "use this argument"); ++ prepare_invoke(byte_no, T2, Rmethod, T3, T1); ++ // T2: reference klass (from f1) if interface method ++ // Rmethod: method (from f2) ++ // T3: receiver ++ // T1: flags ++ ++ // First check for Object case, then private interface method, ++ // then regular interface method. ++ ++ // Special case of invokeinterface called for virtual method of ++ // java.lang.Object. See cpCache.cpp for details. 
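Editor's note: the invoke helpers above repeatedly probe single bits of the cached flags word (has_appendix, is_vfinal, and below is_forced_virtual) with the same li / andr / branch-on-zero sequence. In C++ terms the probe is just a mask test (a sketch; the shift value is treated as opaque):

    #include <cstdint>

    bool test_flag_bit(uint64_t flags, int shift) {
      uint64_t mask = uint64_t(1) << shift;   // li(AT, 1 << shift)
      return (flags & mask) != 0;             // andr(AT, flags, AT); beq/bne AT, R0, ...
    }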
++ Label notObjectMethod; ++ __ li(AT, (1 << ConstantPoolCacheEntry::is_forced_virtual_shift)); ++ __ andr(AT, T1, AT); ++ __ beq(AT, R0, notObjectMethod); ++ ++ invokevirtual_helper(Rmethod, T3, T1); ++ // no return from above ++ __ bind(notObjectMethod); ++ ++ Label no_such_interface; // for receiver subtype check ++ Register recvKlass; // used for exception processing ++ ++ // Check for private method invocation - indicated by vfinal ++ Label notVFinal; ++ __ li(AT, (1 << ConstantPoolCacheEntry::is_vfinal_shift)); ++ __ andr(AT, T1, AT); ++ __ beq(AT, R0, notVFinal); ++ ++ // Get receiver klass into FSR ++ __ load_klass(FSR, T3); ++ ++ Label subtype; ++ __ check_klass_subtype(FSR, T2, T0, subtype); ++ // If we get here the typecheck failed ++ recvKlass = T1; ++ __ move(recvKlass, FSR); ++ __ b(no_such_interface); ++ ++ __ bind(subtype); ++ ++ // do the call - rbx is actually the method to call ++ ++ __ profile_final_call(T1); ++ __ profile_arguments_type(T1, Rmethod, T0, true); ++ ++ __ jump_from_interpreted(Rmethod); ++ // no return from above ++ __ bind(notVFinal); ++ ++ // Get receiver klass into T1 ++ __ restore_locals(); ++ __ load_klass(T1, T3); ++ ++ Label no_such_method; ++ ++ // Preserve method for throw_AbstractMethodErrorVerbose. ++ __ move(T3, Rmethod); ++ // Receiver subtype check against REFC. ++ // Superklass in T2. Subklass in T1. ++ __ lookup_interface_method(// inputs: rec. class, interface, itable index ++ T1, T2, noreg, ++ // outputs: scan temp. reg, scan temp. reg ++ T0, FSR, ++ no_such_interface, ++ /*return_method=*/false); ++ ++ ++ // profile this call ++ __ restore_bcp(); ++ __ profile_virtual_call(T1, T0, FSR); ++ ++ // Get declaring interface class from method, and itable index ++ __ load_method_holder(T2, Rmethod); ++ __ ld_w(Rmethod, Rmethod, in_bytes(Method::itable_index_offset())); ++ __ addi_d(Rmethod, Rmethod, (-1) * Method::itable_index_max); ++ __ sub_w(Rmethod, R0, Rmethod); ++ ++ // Preserve recvKlass for throw_AbstractMethodErrorVerbose. ++ __ move(FSR, T1); ++ __ lookup_interface_method(// inputs: rec. class, interface, itable index ++ FSR, T2, Rmethod, ++ // outputs: method, scan temp. reg ++ Rmethod, T0, ++ no_such_interface); ++ ++ // Rmethod: Method* to call ++ // T3: receiver ++ // Check for abstract method error ++ // Note: This should be done more efficiently via a throw_abstract_method_error ++ // interpreter entry point and a conditional jump to it in case of a null ++ // method. ++ __ beq(Rmethod, R0, no_such_method); ++ ++ __ profile_arguments_type(T1, Rmethod, T0, true); ++ ++ // do the call ++ // T3: receiver ++ // Rmethod: Method* ++ __ jump_from_interpreted(Rmethod); ++ __ should_not_reach_here(); ++ ++ // exception handling code follows... ++ // note: must restore interpreter registers to canonical ++ // state for exception handling to work correctly! ++ ++ __ bind(no_such_method); ++ // throw exception ++ __ restore_bcp(); ++ __ restore_locals(); ++ // Pass arguments for generating a verbose error message. ++ recvKlass = A1; ++ Register method = A2; ++ if (recvKlass != T1) { __ move(recvKlass, T1); } ++ if (method != T3) { __ move(method, T3); } ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::throw_AbstractMethodErrorVerbose), recvKlass, method); ++ // the call_VM checks for exception, so we should never return here. ++ __ should_not_reach_here(); ++ ++ __ bind(no_such_interface); ++ // throw exception ++ __ restore_bcp(); ++ __ restore_locals(); ++ // Pass arguments for generating a verbose error message. 
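Editor's note: the two-instruction decode above (addi_d by -Method::itable_index_max, then sub_w from R0) recovers the itable index from the value stored in the Method, assuming the stored form is itable_index_max - index. A one-line arithmetic check of that reading (a sketch; itable_index_max is treated as an opaque constant):

    // t     = stored - max            (addi_d with -itable_index_max)
    // index = 0 - t = max - stored    (sub_w from R0)
    int decode_itable_index(int stored, int itable_index_max) {
      int t = stored - itable_index_max;
      return 0 - t;
    }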
++ if (recvKlass != T1) { __ move(recvKlass, T1); } ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, InterpreterRuntime::throw_IncompatibleClassChangeErrorVerbose), recvKlass, T2); ++ // the call_VM checks for exception, so we should never return here. ++ __ should_not_reach_here(); ++} ++ ++ ++void TemplateTable::invokehandle(int byte_no) { ++ transition(vtos, vtos); ++ assert(byte_no == f1_byte, "use this argument"); ++ const Register T2_method = Rmethod; ++ const Register FSR_mtype = FSR; ++ const Register T3_recv = T3; ++ ++ prepare_invoke(byte_no, T2_method, FSR_mtype, T3_recv); ++ //??__ verify_method_ptr(T2_method); ++ __ verify_oop(T3_recv); ++ __ null_check(T3_recv); ++ ++ // T4: MethodType object (from cpool->resolved_references[f1], if necessary) ++ // T2_method: MH.invokeExact_MT method (from f2) ++ ++ // Note: T4 is already pushed (if necessary) by prepare_invoke ++ ++ // FIXME: profile the LambdaForm also ++ __ profile_final_call(T4); ++ ++ // T8: tmp, used for mdp ++ // T2_method: callee ++ // T4: tmp ++ // is_virtual: true ++ __ profile_arguments_type(T8, T2_method, T4, true); ++ ++ __ jump_from_interpreted(T2_method); ++} ++ ++void TemplateTable::invokedynamic(int byte_no) { ++ transition(vtos, vtos); ++ assert(byte_no == f1_byte, "use this argument"); ++ ++ const Register T2_callsite = T2; ++ ++ load_invokedynamic_entry(Rmethod); ++ ++ // T2: CallSite object (from cpool->resolved_references[f1]) ++ // Rmethod: MH.linkToCallSite method (from f2) ++ ++ // Note: T2_callsite is already pushed by prepare_invoke ++ // %%% should make a type profile for any invokedynamic that takes a ref argument ++ // profile this call ++ __ profile_call(T4); ++ ++ // T8: tmp, used for mdp ++ // Rmethod: callee ++ // T4: tmp ++ // is_virtual: false ++ __ profile_arguments_type(T8, Rmethod, T4, false); ++ ++ __ verify_oop(T2_callsite); ++ ++ __ jump_from_interpreted(Rmethod); ++} ++ ++//----------------------------------------------------------------------------- ++// Allocation ++// T1 : tags & buffer end & thread ++// T2 : object end ++// T3 : klass ++// T1 : object size ++// A1 : cpool ++// A2 : cp index ++// return object in FSR ++void TemplateTable::_new() { ++ transition(vtos, atos); ++ __ get_unsigned_2_byte_index_at_bcp(A2, 1); ++ ++ Label slow_case; ++ Label done; ++ Label initialize_header; ++ ++ __ get_cpool_and_tags(A1, T1); ++ ++ // make sure the class we're about to instantiate has been resolved. ++ // Note: slow_case does a pop of stack, which is why we loaded class/pushed above ++ const int tags_offset = Array::base_offset_in_bytes(); ++ __ add_d(T1, T1, A2); ++ __ ld_b(AT, T1, tags_offset); ++ if(os::is_MP()) { ++ __ membar(Assembler::Membar_mask_bits(__ LoadLoad|__ LoadStore)); ++ } ++ __ addi_d(AT, AT, -(int)JVM_CONSTANT_Class); ++ __ bnez(AT, slow_case); ++ ++ // get InstanceKlass ++ __ load_resolved_klass_at_index(A1, A2, T3); ++ ++ // make sure klass is initialized & doesn't have finalizer ++ // make sure klass is fully initialized ++ __ ld_hu(T1, T3, in_bytes(InstanceKlass::init_state_offset())); ++ __ addi_d(AT, T1, - (int)InstanceKlass::fully_initialized); ++ __ bnez(AT, slow_case); ++ ++ // has_finalizer ++ __ ld_w(T0, T3, in_bytes(Klass::layout_helper_offset()) ); ++ __ andi(AT, T0, Klass::_lh_instance_slow_path_bit); ++ __ bnez(AT, slow_case); ++ ++ // Allocate the instance: ++ // If TLAB is enabled: ++ // Try to allocate in the TLAB. ++ // If fails, go to the slow path. ++ // Initialize the allocation. ++ // Exit. ++ // ++ // Go to slow path. 
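Editor's note: the fast path sketched in the comment above is a bump-pointer allocation out of the thread-local allocation buffer. A schematic C++ model of that path (the struct and field names are illustrative, not the real Thread/TLAB layout):

    #include <cstddef>
    #include <cstring>

    struct TlabModel {   // stand-in for the thread's TLAB bounds
      char* top;
      char* end;
    };

    void* tlab_allocate_model(TlabModel& tlab, size_t size_in_bytes) {
      if (tlab.top + size_in_bytes > tlab.end) {
        return nullptr;                     // take the slow path (shared heap / GC)
      }
      void* obj = tlab.top;
      tlab.top += size_in_bytes;            // bump the pointer
      std::memset(obj, 0, size_in_bytes);   // clear fields (skipped when ZeroTLAB pre-zeroes)
      return obj;                           // header is written afterwards
    }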
++ ++ if (UseTLAB) { ++ __ tlab_allocate(FSR, T0, 0, noreg, T2, slow_case); ++ ++ if (ZeroTLAB) { ++ // the fields have been already cleared ++ __ b(initialize_header); ++ } ++ ++ // The object is initialized before the header. If the object size is ++ // zero, go directly to the header initialization. ++ __ li(AT, - sizeof(oopDesc)); ++ __ add_d(T0, T0, AT); ++ __ beqz(T0, initialize_header); ++ ++ // initialize remaining object fields: T0 is a multiple of 2 ++ { ++ Label loop; ++ __ add_d(T1, FSR, T0); ++ ++ __ bind(loop); ++ __ addi_d(T1, T1, -oopSize); ++ __ st_d(R0, T1, sizeof(oopDesc)); ++ __ bne(T1, FSR, loop); // dont clear header ++ } ++ ++ // klass in T3, ++ // initialize object header only. ++ __ bind(initialize_header); ++ __ li(AT, (long)markWord::prototype().value()); ++ __ st_d(AT, FSR, oopDesc::mark_offset_in_bytes()); ++ ++ __ store_klass_gap(FSR, R0); ++ __ store_klass(FSR, T3); ++ ++ { ++ SkipIfEqual skip_if(_masm, &DTraceAllocProbes, 0); ++ // Trigger dtrace event for fastpath ++ __ push(atos); ++ __ call_VM_leaf(CAST_FROM_FN_PTR(address, static_cast(SharedRuntime::dtrace_object_alloc)), FSR); ++ __ pop(atos); ++ ++ } ++ __ b(done); ++ } ++ ++ // slow case ++ __ bind(slow_case); ++ __ get_constant_pool(A1); ++ __ get_unsigned_2_byte_index_at_bcp(A2, 1); ++ call_VM(FSR, CAST_FROM_FN_PTR(address, InterpreterRuntime::_new), A1, A2); ++ ++ // continue ++ __ bind(done); ++ __ membar(__ StoreStore); ++} ++ ++void TemplateTable::newarray() { ++ transition(itos, atos); ++ __ ld_bu(A1, at_bcp(1)); ++ // type, count ++ call_VM(FSR, CAST_FROM_FN_PTR(address, InterpreterRuntime::newarray), A1, FSR); ++ __ membar(__ StoreStore); ++} ++ ++void TemplateTable::anewarray() { ++ transition(itos, atos); ++ __ get_unsigned_2_byte_index_at_bcp(A2, 1); // big-endian ++ __ get_constant_pool(A1); ++ // cp, index, count ++ call_VM(FSR, CAST_FROM_FN_PTR(address, InterpreterRuntime::anewarray), A1, A2, FSR); ++ __ membar(__ StoreStore); ++} ++ ++void TemplateTable::arraylength() { ++ transition(atos, itos); ++ __ ld_w(FSR, FSR, arrayOopDesc::length_offset_in_bytes()); ++} ++ ++// when invoke gen_subtype_check, super in T3, sub in T2, object in FSR(it's always) ++// T2 : sub klass ++// T3 : cpool ++// T3 : super klass ++void TemplateTable::checkcast() { ++ transition(atos, atos); ++ Label done, is_null, ok_is_subtype, quicked, resolved; ++ __ beq(FSR, R0, is_null); ++ ++ // Get cpool & tags index ++ __ get_cpool_and_tags(T3, T1); ++ __ get_unsigned_2_byte_index_at_bcp(T2, 1); // big-endian ++ ++ // See if bytecode has already been quicked ++ __ add_d(AT, T1, T2); ++ __ ld_b(AT, AT, Array::base_offset_in_bytes()); ++ if(os::is_MP()) { ++ __ membar(Assembler::Membar_mask_bits(__ LoadLoad|__ LoadStore)); ++ } ++ __ addi_d(AT, AT, - (int)JVM_CONSTANT_Class); ++ __ beq(AT, R0, quicked); ++ ++ // In InterpreterRuntime::quicken_io_cc, lots of new classes may be loaded. ++ // Then, GC will move the object in V0 to another places in heap. ++ // Therefore, We should never save such an object in register. ++ // Instead, we should save it in the stack. It can be modified automatically by the GC thread. ++ // After GC, the object address in FSR is changed to a new place. 
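Editor's note: the comment above is why checkcast spills FSR before calling quicken_io_cc: a GC triggered inside the VM call can move the object, and only stack slots the interpreter's oop map covers get fixed up. A tiny model of the spill/reload pattern (illustrative only):

    // Registers are invisible to the GC; expression-stack slots are not.
    struct ExprStackModel {
      void** sp;
      void push(void* oop) { *--sp = oop; }   // slot is scanned and updated by GC
      void* pop()          { return *sp++; }  // reload the (possibly moved) oop
    };

So the pattern is push(obj); call into the VM; obj = pop(); rather than keeping obj in a register across the call.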
++ // ++ __ push(atos); ++ call_VM(NOREG, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc)); ++ __ get_vm_result_2(T3, TREG); ++ __ pop_ptr(FSR); ++ __ b(resolved); ++ ++ // klass already in cp, get superklass in T3 ++ __ bind(quicked); ++ __ load_resolved_klass_at_index(T3, T2, T3); ++ ++ __ bind(resolved); ++ ++ // get subklass in T2 ++ __ load_klass(T2, FSR); ++ // Superklass in T3. Subklass in T2. ++ __ gen_subtype_check(T3, T2, ok_is_subtype); ++ ++ // Come here on failure ++ // object is at FSR ++ __ jmp(Interpreter::_throw_ClassCastException_entry); ++ ++ // Come here on success ++ __ bind(ok_is_subtype); ++ ++ // Collect counts on whether this check-cast sees nulls a lot or not. ++ if (ProfileInterpreter) { ++ __ b(done); ++ __ bind(is_null); ++ __ profile_null_seen(T3); ++ } else { ++ __ bind(is_null); ++ } ++ __ bind(done); ++} ++ ++// T3 as cpool, T1 as tags, T2 as index ++// object always in FSR, superklass in T3, subklass in T2 ++void TemplateTable::instanceof() { ++ transition(atos, itos); ++ Label done, is_null, ok_is_subtype, quicked, resolved; ++ ++ __ beq(FSR, R0, is_null); ++ ++ // Get cpool & tags index ++ __ get_cpool_and_tags(T3, T1); ++ // get index ++ __ get_unsigned_2_byte_index_at_bcp(T2, 1); // big-endian ++ ++ // See if bytecode has already been quicked ++ // quicked ++ __ add_d(AT, T1, T2); ++ __ ld_b(AT, AT, Array::base_offset_in_bytes()); ++ if(os::is_MP()) { ++ __ membar(Assembler::Membar_mask_bits(__ LoadLoad|__ LoadStore)); ++ } ++ __ addi_d(AT, AT, - (int)JVM_CONSTANT_Class); ++ __ beq(AT, R0, quicked); ++ ++ __ push(atos); ++ call_VM(NOREG, CAST_FROM_FN_PTR(address, InterpreterRuntime::quicken_io_cc)); ++ __ get_vm_result_2(T3, TREG); ++ __ pop_ptr(FSR); ++ __ b(resolved); ++ ++ // get superklass in T3, subklass in T2 ++ __ bind(quicked); ++ __ load_resolved_klass_at_index(T3, T2, T3); ++ ++ __ bind(resolved); ++ // get subklass in T2 ++ __ load_klass(T2, FSR); ++ ++ // Superklass in T3. Subklass in T2. ++ __ gen_subtype_check(T3, T2, ok_is_subtype); ++ __ move(FSR, R0); ++ // Come here on failure ++ __ b(done); ++ ++ // Come here on success ++ __ bind(ok_is_subtype); ++ __ li(FSR, 1); ++ ++ // Collect counts on whether this test sees nulls a lot or not. ++ if (ProfileInterpreter) { ++ __ beq(R0, R0, done); ++ __ bind(is_null); ++ __ profile_null_seen(T3); ++ } else { ++ __ bind(is_null); // same as 'done' ++ } ++ __ bind(done); ++ // FSR = 0: obj == nullptr or obj is not an instanceof the specified klass ++ // FSR = 1: obj != nullptr and obj is an instanceof the specified klass ++} ++ ++//-------------------------------------------------------- ++//-------------------------------------------- ++// Breakpoints ++void TemplateTable::_breakpoint() { ++ // Note: We get here even if we are single stepping.. ++ // jbug inists on setting breakpoints at every bytecode ++ // even if we are in single step mode. 
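Editor's note: the 0/1 convention documented at the end of instanceof above boils down to a null check plus the generated subtype test. As a plain C++ predicate, with the subtype check abstracted into a caller-supplied callable (a sketch, not the real Klass API):

    template <typename SubtypeCheck>
    int instanceof_model(const void* obj, SubtypeCheck is_subtype_of_resolved) {
      // null never passes; otherwise the result is exactly the subtype test.
      return (obj != nullptr && is_subtype_of_resolved(obj)) ? 1 : 0;
    }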
++ ++ transition(vtos, vtos); ++ ++ // get the unpatched byte code ++ __ get_method(A1); ++ __ call_VM(NOREG, ++ CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::get_original_bytecode_at), ++ A1, BCP); ++ __ move(Rnext, V0); // Rnext will be used in dispatch_only_normal ++ ++ // post the breakpoint event ++ __ get_method(A1); ++ __ call_VM(NOREG, CAST_FROM_FN_PTR(address, InterpreterRuntime::_breakpoint), A1, BCP); ++ ++ // complete the execution of original bytecode ++ __ dispatch_only_normal(vtos); ++} ++ ++//----------------------------------------------------------------------------- ++// Exceptions ++ ++void TemplateTable::athrow() { ++ transition(atos, vtos); ++ __ null_check(FSR); ++ __ jmp(Interpreter::throw_exception_entry()); ++} ++ ++//----------------------------------------------------------------------------- ++// Synchronization ++// ++// Note: monitorenter & exit are symmetric routines; which is reflected ++// in the assembly code structure as well ++// ++// Stack layout: ++// ++// [expressions ] <--- SP = expression stack top ++// .. ++// [expressions ] ++// [monitor entry] <--- monitor block top = expression stack bot ++// .. ++// [monitor entry] ++// [frame data ] <--- monitor block bot ++// ... ++// [return addr ] <--- FP ++ ++// we use T2 as monitor entry pointer, T3 as monitor top pointer ++// object always in FSR ++void TemplateTable::monitorenter() { ++ transition(atos, vtos); ++ ++ // check for null object ++ __ null_check(FSR); ++ ++ const Address monitor_block_top(FP, frame::interpreter_frame_monitor_block_top_offset ++ * wordSize); ++ const int entry_size = frame::interpreter_frame_monitor_size_in_bytes(); ++ Label allocated; ++ ++ const Register monitor_reg = T0; ++ ++ // initialize entry pointer ++ __ move(monitor_reg, R0); ++ ++ // find a free slot in the monitor block (result in monitor_reg) ++ { ++ Label entry, loop, exit, next; ++ __ ld_d(T2, monitor_block_top); ++ __ addi_d(T3, FP, frame::interpreter_frame_initial_sp_offset * wordSize); ++ __ b(entry); ++ ++ // free slot? ++ __ bind(loop); ++ __ ld_d(AT, Address(T2, BasicObjectLock::obj_offset())); ++ __ bne(AT, R0, next); ++ __ move(monitor_reg, T2); ++ ++ __ bind(next); ++ __ beq(FSR, AT, exit); ++ __ addi_d(T2, T2, entry_size); ++ ++ __ bind(entry); ++ __ bne(T3, T2, loop); ++ __ bind(exit); ++ } ++ ++ __ bnez(monitor_reg, allocated); ++ ++ // allocate one if there's no free slot ++ { ++ Label entry, loop; ++ // 1. compute new pointers // SP: old expression stack top ++ __ ld_d(monitor_reg, monitor_block_top); ++ __ addi_d(SP, SP, -entry_size); ++ __ addi_d(monitor_reg, monitor_reg, -entry_size); ++ __ st_d(monitor_reg, monitor_block_top); ++ __ move(T3, SP); ++ __ b(entry); ++ ++ // 2. move expression stack contents ++ __ bind(loop); ++ __ ld_d(AT, T3, entry_size); ++ __ st_d(AT, T3, 0); ++ __ addi_d(T3, T3, wordSize); ++ __ bind(entry); ++ __ bne(T3, monitor_reg, loop); ++ } ++ ++ __ bind(allocated); ++ // Increment bcp to point to the next bytecode, ++ // so exception handling for async. exceptions work correctly. ++ // The object has already been popped from the stack, so the ++ // expression stack looks correct. ++ __ addi_d(BCP, BCP, 1); ++ __ st_d(FSR, Address(monitor_reg, BasicObjectLock::obj_offset())); ++ __ lock_object(monitor_reg); ++ // check to make sure this monitor doesn't cause stack overflow after locking ++ __ save_bcp(); // in case of exception ++ __ generate_stack_overflow_check(0); ++ // The bcp has already been incremented. Just need to dispatch to next instruction. 
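Editor's note: the slot search in monitorenter above walks the monitor block from the top-most entry towards the block bottom, remembering a free entry and stopping early once it sees an entry that already holds this object. A compact C++ model of that scan (the entry struct is illustrative):

    struct BasicObjectLockModel {   // stand-in for { displaced header, object }
      void* lock_word;
      void* obj;
    };

    BasicObjectLockModel* find_monitor_slot(BasicObjectLockModel* top,
                                            BasicObjectLockModel* bottom,
                                            void* obj) {
      BasicObjectLockModel* free_slot = nullptr;
      for (BasicObjectLockModel* e = top; e != bottom; ++e) {
        if (e->obj == nullptr) free_slot = e;  // remember a free slot
        if (e->obj == obj)     break;          // this object already has an entry
      }
      return free_slot;                        // nullptr => grow the monitor block
    }

If no free slot is found, the block is grown by entry_size and the expression-stack contents are shifted down, as in the allocate branch above.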
++ ++ __ dispatch_next(vtos); ++} ++ ++// T0 : points to current entry, strating with top-most monitor entry ++// T2 : points to word before bottom of monitor block ++void TemplateTable::monitorexit() { ++ transition(atos, vtos); ++ ++ __ null_check(FSR); ++ ++ const int entry_size = frame::interpreter_frame_monitor_size_in_bytes(); ++ ++ const Register monitor_top = T0; ++ const Register monitor_bot = T2; ++ ++ Label found; ++ ++ // find matching slot ++ { ++ Label entry, loop; ++ __ ld_d(monitor_top, FP, frame::interpreter_frame_monitor_block_top_offset * wordSize); ++ __ addi_d(monitor_bot, FP, frame::interpreter_frame_initial_sp_offset * wordSize); ++ __ b(entry); ++ ++ __ bind(loop); ++ __ ld_d(AT, Address(monitor_top, BasicObjectLock::obj_offset())); ++ __ beq(FSR, AT, found); ++ __ addi_d(monitor_top, monitor_top, entry_size); ++ __ bind(entry); ++ __ bne(monitor_bot, monitor_top, loop); ++ } ++ ++ // error handling. Unlocking was not block-structured ++ __ call_VM(NOREG, CAST_FROM_FN_PTR(address, ++ InterpreterRuntime::throw_illegal_monitor_state_exception)); ++ __ should_not_reach_here(); ++ ++ // call run-time routine ++ __ bind(found); ++ __ move(TSR, FSR); ++ __ unlock_object(monitor_top); ++ __ move(FSR, TSR); ++} ++ ++ ++// Wide instructions ++void TemplateTable::wide() { ++ transition(vtos, vtos); ++ __ ld_bu(Rnext, at_bcp(1)); ++ __ li(AT, (long)Interpreter::_wentry_point); ++ __ alsl_d(AT, Rnext, AT, Address::times_8 - 1); ++ __ ld_d(AT, AT, 0); ++ __ jr(AT); ++} ++ ++ ++void TemplateTable::multianewarray() { ++ transition(vtos, atos); ++ // last dim is on top of stack; we want address of first one: ++ // first_addr = last_addr + (ndims - 1) * wordSize ++ __ ld_bu(A1, at_bcp(3)); // dimension ++ __ addi_d(A1, A1, -1); ++ __ alsl_d(A1, A1, SP, Address::times_8 - 1); // now A1 pointer to the count array on the stack ++ call_VM(FSR, CAST_FROM_FN_PTR(address, InterpreterRuntime::multianewarray), A1); ++ __ ld_bu(AT, at_bcp(3)); ++ __ alsl_d(SP, AT, SP, Address::times_8 - 1); ++ __ membar(__ AnyAny);//no membar here for aarch64 ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/templateTable_loongarch.hpp b/src/hotspot/cpu/loongarch/templateTable_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/templateTable_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/templateTable_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,43 @@ ++/* ++ * Copyright (c) 2003, 2012, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_TEMPLATETABLE_LOONGARCH_64_HPP ++#define CPU_LOONGARCH_TEMPLATETABLE_LOONGARCH_64_HPP ++ ++ static void prepare_invoke(int byte_no, ++ Register method, // linked method (or i-klass) ++ Register index = noreg, // itable index, MethodType, etc. ++ Register recv = noreg, // if caller wants to see it ++ Register flags = noreg // if caller wants to test it ++ ); ++ static void invokevirtual_helper(Register index, Register recv, ++ Register flags); ++ static void volatile_barrier(); ++ ++ // Helpers ++ static void index_check(Register array, Register index); ++ static void index_check_without_pop(Register array, Register index); ++ ++#endif // CPU_LOONGARCH_TEMPLATETABLE_LOONGARCH_64_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/upcallLinker_loongarch_64.cpp b/src/hotspot/cpu/loongarch/upcallLinker_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/upcallLinker_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/upcallLinker_loongarch_64.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,347 @@ ++/* ++ * Copyright (c) 2020, Red Hat, Inc. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "logging/logStream.hpp" ++#include "memory/resourceArea.hpp" ++#include "prims/upcallLinker.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/signature.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "utilities/formatBuffer.hpp" ++#include "utilities/globalDefinitions.hpp" ++#include "vmreg_loongarch.inline.hpp" ++ ++#define __ _masm-> ++ ++// for callee saved regs, according to the caller's ABI ++static int compute_reg_save_area_size(const ABIDescriptor& abi) { ++ int size = 0; ++ for (int i = 0; i < Register::number_of_registers; i++) { ++ Register reg = as_Register(i); ++ if (reg == FP || reg == SP || reg == RA) continue; // saved/restored by prologue/epilogue ++ if (!abi.is_volatile_reg(reg)) { ++ size += 8; // bytes ++ } ++ } ++ ++ for (int i = 0; i < FloatRegister::number_of_registers; i++) { ++ FloatRegister reg = as_FloatRegister(i); ++ if (!abi.is_volatile_reg(reg)) { ++ size += 8; ++ } ++ } ++ ++ return size; ++} ++ ++static void preserve_callee_saved_registers(MacroAssembler* _masm, const ABIDescriptor& abi, int reg_save_area_offset) { ++ // 1. iterate all registers in the architecture ++ // - check if they are volatile or not for the given abi ++ // - if NOT, we need to save it here ++ ++ int offset = reg_save_area_offset; ++ ++ __ block_comment("{ preserve_callee_saved_regs "); ++ for (int i = 0; i < Register::number_of_registers; i++) { ++ Register reg = as_Register(i); ++ if (reg == FP || reg == SP || reg == RA) continue; // saved/restored by prologue/epilogue ++ if (!abi.is_volatile_reg(reg)) { ++ __ st_d(reg, SP, offset); ++ offset += 8; ++ } ++ } ++ ++ for (int i = 0; i < FloatRegister::number_of_registers; i++) { ++ FloatRegister reg = as_FloatRegister(i); ++ if (!abi.is_volatile_reg(reg)) { ++ __ fst_d(reg, SP, offset); ++ offset += 8; ++ } ++ } ++ ++ __ block_comment("} preserve_callee_saved_regs "); ++} ++ ++static void restore_callee_saved_registers(MacroAssembler* _masm, const ABIDescriptor& abi, int reg_save_area_offset) { ++ // 1. 
iterate all registers in the architecture ++ // - check if they are volatile or not for the given abi ++ // - if NOT, we need to restore it here ++ ++ int offset = reg_save_area_offset; ++ ++ __ block_comment("{ restore_callee_saved_regs "); ++ for (int i = 0; i < Register::number_of_registers; i++) { ++ Register reg = as_Register(i); ++ if (reg == FP || reg == SP || reg == RA) continue; // saved/restored by prologue/epilogue ++ if (!abi.is_volatile_reg(reg)) { ++ __ ld_d(reg, SP, offset); ++ offset += 8; ++ } ++ } ++ ++ for (int i = 0; i < FloatRegister::number_of_registers; i++) { ++ FloatRegister reg = as_FloatRegister(i); ++ if (!abi.is_volatile_reg(reg)) { ++ __ fld_d(reg, SP, offset); ++ offset += 8; ++ } ++ } ++ ++ __ block_comment("} restore_callee_saved_regs "); ++} ++ ++static const int upcall_stub_code_base_size = 2048; ++static const int upcall_stub_size_per_arg = 16; ++ ++address UpcallLinker::make_upcall_stub(jobject receiver, Method* entry, ++ BasicType* in_sig_bt, int total_in_args, ++ BasicType* out_sig_bt, int total_out_args, ++ BasicType ret_type, ++ jobject jabi, jobject jconv, ++ bool needs_return_buffer, int ret_buf_size) { ++ ResourceMark rm; ++ const ABIDescriptor abi = ForeignGlobals::parse_abi_descriptor(jabi); ++ const CallRegs call_regs = ForeignGlobals::parse_call_regs(jconv); ++ int code_size = upcall_stub_code_base_size + (total_in_args * upcall_stub_size_per_arg); ++ CodeBuffer buffer("upcall_stub", code_size, /* locs_size = */ 1); ++ ++ Register shuffle_reg = S0; ++ JavaCallingConvention out_conv; ++ NativeCallingConvention in_conv(call_regs._arg_regs); ++ ArgumentShuffle arg_shuffle(in_sig_bt, total_in_args, out_sig_bt, total_out_args, &in_conv, &out_conv, as_VMStorage(shuffle_reg)); ++ int preserved_bytes = SharedRuntime::out_preserve_stack_slots() * VMRegImpl::stack_slot_size; ++ int stack_bytes = preserved_bytes + arg_shuffle.out_arg_bytes(); ++ int out_arg_area = align_up(stack_bytes, StackAlignmentInBytes); ++ ++#ifndef PRODUCT ++ LogTarget(Trace, foreign, upcall) lt; ++ if (lt.is_enabled()) { ++ ResourceMark rm; ++ LogStream ls(lt); ++ arg_shuffle.print_on(&ls); ++ } ++#endif ++ ++ // out_arg_area (for stack arguments) doubles as shadow space for native calls. ++ // make sure it is big enough. 
++ if (out_arg_area < frame::arg_reg_save_area_bytes) { ++ out_arg_area = frame::arg_reg_save_area_bytes; ++ } ++ ++ int reg_save_area_size = compute_reg_save_area_size(abi); ++ RegSpiller arg_spilller(call_regs._arg_regs); ++ RegSpiller result_spiller(call_regs._ret_regs); ++ ++ int shuffle_area_offset = 0; ++ int res_save_area_offset = shuffle_area_offset + out_arg_area; ++ int arg_save_area_offset = res_save_area_offset + result_spiller.spill_size_bytes(); ++ int reg_save_area_offset = arg_save_area_offset + arg_spilller.spill_size_bytes(); ++ int frame_data_offset = reg_save_area_offset + reg_save_area_size; ++ int frame_bottom_offset = frame_data_offset + sizeof(UpcallStub::FrameData); ++ ++ StubLocations locs; ++ int ret_buf_offset = -1; ++ if (needs_return_buffer) { ++ ret_buf_offset = frame_bottom_offset; ++ frame_bottom_offset += ret_buf_size; ++ // use a free register for shuffling code to pick up return ++ // buffer address from ++ locs.set(StubLocations::RETURN_BUFFER, abi._scratch1); ++ } ++ ++ int frame_size = frame_bottom_offset; ++ frame_size = align_up(frame_size, StackAlignmentInBytes); ++ ++ // The space we have allocated will look like: ++ // ++ // ++ // FP-> | 2 slots RA | ++ // | 2 slots FP | ++ // |---------------------| = frame_bottom_offset = frame_size ++ // | (optional) | ++ // | ret_buf | ++ // |---------------------| = ret_buf_offset ++ // | | ++ // | FrameData | ++ // |---------------------| = frame_data_offset ++ // | | ++ // | reg_save_area | ++ // |---------------------| = reg_save_are_offset ++ // | | ++ // | arg_save_area | ++ // |---------------------| = arg_save_are_offset ++ // | | ++ // | res_save_area | ++ // |---------------------| = res_save_are_offset ++ // | | ++ // SP-> | out_arg_area | needs to be at end for shadow space ++ // ++ // ++ ++ ////////////////////////////////////////////////////////////////////////////// ++ ++ MacroAssembler* _masm = new MacroAssembler(&buffer); ++ address start = __ pc(); ++ __ enter(); // set up frame ++ assert((abi._stack_alignment_bytes % 16) == 0, "must be 16 byte aligned"); ++ // allocate frame (frame_size is also aligned, so stack is still aligned) ++ __ addi_d(SP, SP, -frame_size); ++ ++ // we have to always spill args since we need to do a call to get the thread ++ // (and maybe attach it). 
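Editor's note: the offsets above are assigned by stacking the areas upward from SP; each area begins where the previous one ends, and the total is rounded up to the stack alignment. A worked example with assumed, purely illustrative sizes:

    #include <cstddef>

    constexpr size_t align_up_to(size_t x, size_t a) { return (x + a - 1) & ~(a - 1); }

    // Assumed sizes for one hypothetical upcall stub (not measured values):
    constexpr size_t out_arg_area      = 32;                            // stack args + shadow space
    constexpr size_t res_save_offset   = 0 + out_arg_area;              // 32
    constexpr size_t arg_save_offset   = res_save_offset + 16;          // 48  (result spill area = 16)
    constexpr size_t reg_save_offset   = arg_save_offset + 64;          // 112 (argument spill area = 64)
    constexpr size_t frame_data_offset = reg_save_offset + 200;         // 312 (callee-saved area = 200)
    constexpr size_t frame_bottom      = frame_data_offset + 48;        // 360 (sizeof(FrameData) = 48)
    constexpr size_t frame_size        = align_up_to(frame_bottom, 16); // 368

    static_assert(frame_size % 16 == 0, "stack stays 16-byte aligned");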
++ arg_spilller.generate_spill(_masm, arg_save_area_offset); ++ preserve_callee_saved_registers(_masm, abi, reg_save_area_offset); ++ ++ __ block_comment("{ on_entry"); ++ __ lea(c_rarg0, Address(SP, frame_data_offset)); ++ __ call(CAST_FROM_FN_PTR(address, UpcallLinker::on_entry), relocInfo::runtime_call_type); ++ __ move(TREG, V0); ++ __ reinit_heapbase(); ++ __ block_comment("} on_entry"); ++ ++ __ block_comment("{ argument shuffle"); ++ arg_spilller.generate_fill(_masm, arg_save_area_offset); ++ if (needs_return_buffer) { ++ assert(ret_buf_offset != -1, "no return buffer allocated"); ++ __ lea(as_Register(locs.get(StubLocations::RETURN_BUFFER)), Address(SP, ret_buf_offset)); ++ } ++ arg_shuffle.generate(_masm, as_VMStorage(shuffle_reg), abi._shadow_space_bytes, 0, locs); ++ __ block_comment("} argument shuffle"); ++ ++ __ block_comment("{ receiver "); ++ __ li(shuffle_reg, (intptr_t)receiver); ++ __ resolve_jobject(shuffle_reg, SCR2, SCR1); ++ __ move(j_rarg0, shuffle_reg); ++ __ block_comment("} receiver "); ++ ++ __ mov_metadata(Rmethod, entry); ++ __ st_d(Rmethod, TREG, in_bytes(JavaThread::callee_target_offset())); // just in case callee is deoptimized ++ ++ __ ld_d(T4, Rmethod, in_bytes(Method::from_compiled_offset())); ++ __ jalr(T4); ++ ++ // return value shuffle ++ if (!needs_return_buffer) { ++#ifdef ASSERT ++ if (call_regs._ret_regs.length() == 1) { // 0 or 1 ++ VMStorage j_expected_result_reg; ++ switch (ret_type) { ++ case T_BOOLEAN: ++ case T_BYTE: ++ case T_SHORT: ++ case T_CHAR: ++ case T_INT: ++ case T_LONG: ++ j_expected_result_reg = as_VMStorage(V0); ++ break; ++ case T_FLOAT: ++ case T_DOUBLE: ++ j_expected_result_reg = as_VMStorage(FA0); ++ break; ++ default: ++ fatal("unexpected return type: %s", type2name(ret_type)); ++ } ++ // No need to move for now, since CallArranger can pick a return type ++ // that goes in the same reg for both CCs. But, at least assert they are the same ++ assert(call_regs._ret_regs.at(0) == j_expected_result_reg, "unexpected result register"); ++ } ++#endif ++ } else { ++ assert(ret_buf_offset != -1, "no return buffer allocated"); ++ __ lea(SCR1, Address(SP, ret_buf_offset)); ++ int offset = 0; ++ for (int i = 0; i < call_regs._ret_regs.length(); i++) { ++ VMStorage reg = call_regs._ret_regs.at(i); ++ if (reg.type() == StorageType::INTEGER) { ++ __ ld_d(as_Register(reg), SCR1, offset); ++ offset += 8; ++ } else if (reg.type() == StorageType::FLOAT) { ++ __ fld_d(as_FloatRegister(reg), SCR1, offset); ++ offset += 8; // needs to match VECTOR_REG_SIZE in LoongArch64Architecture (Java) ++ } else { ++ ShouldNotReachHere(); ++ } ++ } ++ } ++ ++ result_spiller.generate_spill(_masm, res_save_area_offset); ++ ++ __ block_comment("{ on_exit"); ++ __ lea(c_rarg0, Address(SP, frame_data_offset)); ++ // stack already aligned ++ __ call(CAST_FROM_FN_PTR(address, UpcallLinker::on_exit), relocInfo::runtime_call_type); ++ __ block_comment("} on_exit"); ++ ++ restore_callee_saved_registers(_masm, abi, reg_save_area_offset); ++ ++ result_spiller.generate_fill(_masm, res_save_area_offset); ++ ++ __ leave(); ++ __ jr(RA); ++ ++ ////////////////////////////////////////////////////////////////////////////// ++ ++ __ block_comment("{ exception handler"); ++ ++ intptr_t exception_handler_offset = __ pc() - start; ++ ++ // Native caller has no idea how to handle exceptions, ++ // so we just crash here. Up to callee to catch exceptions. 
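Editor's note: when a return buffer is in use, the loop above reloads each return register from consecutive 8-byte slots of the buffer (ld_d for integer storage, fld_d for float storage). A simplified C++ model of that walk, with the register file reduced to an array of 64-bit slots (illustrative only):

    #include <cstdint>
    #include <cstring>

    void fill_return_registers_model(const unsigned char* ret_buf,
                                     uint64_t* ret_reg_slots, int n_ret_regs) {
      int offset = 0;
      for (int i = 0; i < n_ret_regs; i++) {
        std::memcpy(&ret_reg_slots[i], ret_buf + offset, 8);  // one ld_d / fld_d
        offset += 8;                                          // 8-byte stride per register
      }
    }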
++ __ verify_oop(V0); ++ __ call(CAST_FROM_FN_PTR(address, UpcallLinker::handle_uncaught_exception), relocInfo::runtime_call_type); ++ __ should_not_reach_here(); ++ ++ __ block_comment("} exception handler"); ++ ++ _masm->flush(); ++ ++#ifndef PRODUCT ++ stringStream ss; ++ ss.print("upcall_stub_%s", entry->signature()->as_C_string()); ++ const char* name = _masm->code_string(ss.as_string()); ++#else // PRODUCT ++ const char* name = "upcall_stub"; ++#endif // PRODUCT ++ ++ buffer.log_section_sizes(name); ++ ++ UpcallStub* blob ++ = UpcallStub::create(name, ++ &buffer, ++ exception_handler_offset, ++ receiver, ++ in_ByteSize(frame_data_offset)); ++ ++#ifndef PRODUCT ++ if (lt.is_enabled()) { ++ ResourceMark rm; ++ LogStream ls(lt); ++ blob->print_on(&ls); ++ } ++#endif ++ ++ return blob->code_begin(); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/vmreg_loongarch.cpp b/src/hotspot/cpu/loongarch/vmreg_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/vmreg_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/vmreg_loongarch.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,54 @@ ++/* ++ * Copyright (c) 2006, 2012, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/assembler.hpp" ++#include "code/vmreg.hpp" ++#include "vmreg_loongarch.inline.hpp" ++ ++ ++ ++void VMRegImpl::set_regName() { ++ Register reg = ::as_Register(0); ++ int i; ++ for (i = 0; i < ConcreteRegisterImpl::max_gpr ; ) { ++ for (int j = 0 ; j < Register::max_slots_per_register ; j++) { ++ regName[i++] = reg->name(); ++ } ++ reg = reg->successor(); ++ } ++ ++ FloatRegister freg = ::as_FloatRegister(0); ++ for ( ; i < ConcreteRegisterImpl::max_fpr ; ) { ++ for (int j = 0 ; j < FloatRegister::max_slots_per_register ; j++) { ++ regName[i++] = freg->name(); ++ } ++ freg = freg->successor(); ++ } ++ ++ for ( ; i < ConcreteRegisterImpl::number_of_registers ; i ++ ) { ++ regName[i] = "NON-GPR-FPR"; ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/vmreg_loongarch.hpp b/src/hotspot/cpu/loongarch/vmreg_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/vmreg_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/vmreg_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,58 @@ ++/* ++ * Copyright (c) 2006, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_VMREG_LOONGARCH_HPP ++#define CPU_LOONGARCH_VMREG_LOONGARCH_HPP ++ ++inline bool is_Register() { ++ return (unsigned int) value() < (unsigned int) ConcreteRegisterImpl::max_gpr; ++} ++ ++inline Register as_Register() { ++ assert( is_Register(), "must be"); ++ return ::as_Register(value() / Register::max_slots_per_register); ++} ++ ++inline bool is_FloatRegister() { ++ return value() >= ConcreteRegisterImpl::max_gpr && value() < ConcreteRegisterImpl::max_fpr; ++} ++ ++inline FloatRegister as_FloatRegister() { ++ assert( is_FloatRegister() && is_even(value()), "must be" ); ++ return ::as_FloatRegister((value() - ConcreteRegisterImpl::max_gpr) / ++ FloatRegister::max_slots_per_register); ++} ++ ++inline bool is_concrete() { ++ assert(is_reg(), "must be"); ++ if (is_FloatRegister()) { ++ int base = value() - ConcreteRegisterImpl::max_gpr; ++ return base % FloatRegister::max_slots_per_register == 0; ++ } else { ++ return is_even(value()); ++ } ++} ++ ++#endif // CPU_LOONGARCH_VMREG_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/vmreg_loongarch.inline.hpp b/src/hotspot/cpu/loongarch/vmreg_loongarch.inline.hpp +--- a/src/hotspot/cpu/loongarch/vmreg_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/vmreg_loongarch.inline.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,38 @@ ++/* ++ * Copyright (c) 2006, 2012, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_VMREG_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_VMREG_LOONGARCH_INLINE_HPP ++ ++inline VMReg Register::RegisterImpl::as_VMReg() const { ++ return VMRegImpl::as_VMReg(encoding() * Register::max_slots_per_register); ++} ++ ++inline VMReg FloatRegister::FloatRegisterImpl::as_VMReg() const { ++ return VMRegImpl::as_VMReg((encoding() * FloatRegister::max_slots_per_register) + ++ ConcreteRegisterImpl::max_gpr); ++} ++ ++#endif // CPU_LOONGARCH_VMREG_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/vmstorage_loongarch.hpp b/src/hotspot/cpu/loongarch/vmstorage_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/vmstorage_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/vmstorage_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,87 @@ ++/* ++ * Copyright (c) 2023, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ */ ++ ++#ifndef CPU_LOONGARCH_VMSTORAGE_LOONGARCH_INLINE_HPP ++#define CPU_LOONGARCH_VMSTORAGE_LOONGARCH_INLINE_HPP ++ ++#include ++ ++#include "asm/register.hpp" ++ ++// keep in sync with jdk/internal/foreign/abi/aarch64/AArch64Architecture ++enum class StorageType : int8_t { ++ INTEGER = 0, ++ FLOAT = 1, ++ STACK = 2, ++ PLACEHOLDER = 3, ++// special locations used only by native code ++ FRAME_DATA = PLACEHOLDER + 1, ++ INVALID = -1 ++}; ++ ++// need to define this before constructing VMStorage (below) ++constexpr inline bool VMStorage::is_reg(StorageType type) { ++ return type == StorageType::INTEGER || type == StorageType::FLOAT; ++} ++constexpr inline StorageType VMStorage::stack_type() { return StorageType::STACK; } ++constexpr inline StorageType VMStorage::placeholder_type() { return StorageType::PLACEHOLDER; } ++constexpr inline StorageType VMStorage::frame_data_type() { return StorageType::FRAME_DATA; } ++ ++constexpr uint16_t REG64_MASK = 0b0000000000000001; ++constexpr uint16_t FLOAT64_MASK = 0b0000000000000001; ++ ++inline Register as_Register(VMStorage vms) { ++ assert(vms.type() == StorageType::INTEGER, "not the right type"); ++ return ::as_Register(vms.index()); ++} ++ ++inline FloatRegister as_FloatRegister(VMStorage vms) { ++ assert(vms.type() == StorageType::FLOAT, "not the right type"); ++ return ::as_FloatRegister(vms.index()); ++} ++ ++constexpr inline VMStorage as_VMStorage(Register reg) { ++ return VMStorage::reg_storage(StorageType::INTEGER, REG64_MASK, reg->encoding()); ++} ++ ++constexpr inline VMStorage as_VMStorage(FloatRegister reg) { ++ return VMStorage::reg_storage(StorageType::FLOAT, FLOAT64_MASK, reg->encoding()); ++} ++ ++inline VMStorage as_VMStorage(VMReg reg, BasicType bt) { ++ if (reg->is_Register()) { ++ return as_VMStorage(reg->as_Register()); ++ } else if (reg->is_FloatRegister()) { ++ return as_VMStorage(reg->as_FloatRegister()); ++ } else if (reg->is_stack()) { ++ return VMStorage::stack_storage(reg); ++ } else if (!reg->is_valid()) { ++ return VMStorage::invalid(); ++ } ++ ++ ShouldNotReachHere(); ++ return VMStorage::invalid(); ++} ++ ++#endif // CPU_LOONGARCH_VMSTORAGE_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/vmStructs_loongarch.hpp b/src/hotspot/cpu/loongarch/vmStructs_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/vmStructs_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/vmStructs_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,61 @@ ++/* ++ * Copyright (c) 2001, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_VMSTRUCTS_LOONGARCH_HPP ++#define CPU_LOONGARCH_VMSTRUCTS_LOONGARCH_HPP ++ ++// These are the CPU-specific fields, types and integer ++// constants required by the Serviceability Agent. This file is ++// referenced by vmStructs.cpp. ++ ++#define VM_STRUCTS_CPU(nonstatic_field, static_field, unchecked_nonstatic_field, volatile_nonstatic_field, nonproduct_nonstatic_field, c2_nonstatic_field, unchecked_c1_static_field, unchecked_c2_static_field) \ ++ volatile_nonstatic_field(JavaFrameAnchor, _last_Java_fp, intptr_t*) \ ++ \ ++ ++ /* NOTE that we do not use the last_entry() macro here; it is used */ ++ /* in vmStructs__.hpp's VM_STRUCTS_OS_CPU macro (and must */ ++ /* be present there) */ ++ ++ ++#define VM_TYPES_CPU(declare_type, declare_toplevel_type, declare_oop_type, declare_integer_type, declare_unsigned_integer_type, declare_c1_toplevel_type, declare_c2_type, declare_c2_toplevel_type) \ ++ ++ /* NOTE that we do not use the last_entry() macro here; it is used */ ++ /* in vmStructs__.hpp's VM_TYPES_OS_CPU macro (and must */ ++ /* be present there) */ ++ ++ ++#define VM_INT_CONSTANTS_CPU(declare_constant, declare_preprocessor_constant, declare_c1_constant, declare_c2_constant, declare_c2_preprocessor_constant) \ ++ ++ /* NOTE that we do not use the last_entry() macro here; it is used */ ++ /* in vmStructs__.hpp's VM_INT_CONSTANTS_OS_CPU macro (and must */ ++ /* be present there) */ ++ ++#define VM_LONG_CONSTANTS_CPU(declare_constant, declare_preprocessor_constant, declare_c1_constant, declare_c2_constant, declare_c2_preprocessor_constant) \ ++ ++ /* NOTE that we do not use the last_entry() macro here; it is used */ ++ /* in vmStructs__.hpp's VM_LONG_CONSTANTS_OS_CPU macro (and must */ ++ /* be present there) */ ++ ++#endif // CPU_LOONGARCH_VMSTRUCTS_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/vm_version_loongarch.cpp b/src/hotspot/cpu/loongarch/vm_version_loongarch.cpp +--- a/src/hotspot/cpu/loongarch/vm_version_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/vm_version_loongarch.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,511 @@ ++/* ++ * Copyright (c) 1997, 2014, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "asm/macroAssembler.inline.hpp" ++#include "classfile/vmIntrinsics.hpp" ++#include "memory/resourceArea.hpp" ++#include "runtime/arguments.hpp" ++#include "runtime/java.hpp" ++#include "runtime/stubCodeGenerator.hpp" ++#include "runtime/vm_version.hpp" ++#include "os_linux.hpp" ++#ifdef TARGET_OS_FAMILY_linux ++# include "os_linux.inline.hpp" ++#endif ++ ++VM_Version::CpuidInfo VM_Version::_cpuid_info = { 0, }; ++bool VM_Version::_cpu_info_is_initialized = false; ++ ++static BufferBlob* stub_blob; ++static const int stub_size = 600; ++ ++extern "C" { ++ typedef void (*get_cpu_info_stub_t)(void*); ++} ++static get_cpu_info_stub_t get_cpu_info_stub = nullptr; ++ ++ ++class VM_Version_StubGenerator: public StubCodeGenerator { ++ public: ++ ++ VM_Version_StubGenerator(CodeBuffer *c) : StubCodeGenerator(c) {} ++ ++ address generate_get_cpu_info() { ++ assert(!VM_Version::cpu_info_is_initialized(), "VM_Version should not be initialized"); ++ StubCodeMark mark(this, "VM_Version", "get_cpu_info_stub"); ++# define __ _masm-> ++ ++ address start = __ pc(); ++ ++ __ enter(); ++ __ push(AT); ++ __ push(T5); ++ ++ __ li(AT, (long)0); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id0_offset())); ++ ++ __ li(AT, 1); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id1_offset())); ++ ++ __ li(AT, 2); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id2_offset())); ++ ++ __ li(AT, 3); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id3_offset())); ++ ++ __ li(AT, 4); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id4_offset())); ++ ++ __ li(AT, 5); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id5_offset())); ++ ++ __ li(AT, 6); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id6_offset())); ++ ++ __ li(AT, 10); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id10_offset())); ++ ++ __ li(AT, 11); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id11_offset())); ++ ++ __ li(AT, 12); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id12_offset())); ++ ++ __ li(AT, 13); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id13_offset())); ++ ++ __ li(AT, 14); ++ __ cpucfg(T5, AT); ++ __ st_w(T5, A0, in_bytes(VM_Version::Loongson_Cpucfg_id14_offset())); ++ ++ __ pop(T5); ++ __ pop(AT); ++ __ leave(); ++ __ jr(RA); ++# undef __ ++ return start; ++ }; ++}; ++ ++uint32_t VM_Version::get_feature_flags_by_cpucfg() { ++ uint32_t result = 0; ++ if (_cpuid_info.cpucfg_info_id1.bits.ARCH == 0b00 || _cpuid_info.cpucfg_info_id1.bits.ARCH == 0b01 ) { ++ result |= CPU_LA32; ++ } else if (_cpuid_info.cpucfg_info_id1.bits.ARCH == 0b10 ) { ++ result |= CPU_LA64; ++ } ++ ++ if (_cpuid_info.cpucfg_info_id2.bits.FP_CFG != 0) ++ result |= CPU_FP; ++ if (_cpuid_info.cpucfg_info_id2.bits.LAM_BH != 0) ++ result |= CPU_LAM_BH; ++ if (_cpuid_info.cpucfg_info_id2.bits.LAMCAS != 0) ++ result |= CPU_LAMCAS; ++ ++ if (_cpuid_info.cpucfg_info_id3.bits.CCDMA != 0) ++ result |= CPU_CCDMA; ++ if (_cpuid_info.cpucfg_info_id3.bits.LLDBAR != 0) ++ result |= 
CPU_LLDBAR; ++ if (_cpuid_info.cpucfg_info_id3.bits.SCDLY != 0) ++ result |= CPU_SCDLY; ++ if (_cpuid_info.cpucfg_info_id3.bits.LLEXC != 0) ++ result |= CPU_LLEXC; ++ ++ result |= CPU_ULSYNC; ++ ++ return result; ++} ++ ++void VM_Version::get_processor_features() { ++ ++ clean_cpuFeatures(); ++ ++ get_os_cpu_info(); ++ ++ get_cpu_info_stub(&_cpuid_info); ++ _features |= get_feature_flags_by_cpucfg(); ++ ++ _supports_cx8 = true; ++ ++ if (UseG1GC && FLAG_IS_DEFAULT(MaxGCPauseMillis)) { ++ FLAG_SET_DEFAULT(MaxGCPauseMillis, 150); ++ } ++ ++ if (supports_lsx()) { ++ if (FLAG_IS_DEFAULT(UseLSX)) { ++ FLAG_SET_DEFAULT(UseLSX, true); ++ } ++ } else if (UseLSX) { ++ warning("LSX instructions are not available on this CPU"); ++ FLAG_SET_DEFAULT(UseLSX, false); ++ } ++ ++ if (supports_lasx()) { ++ if (FLAG_IS_DEFAULT(UseLASX)) { ++ FLAG_SET_DEFAULT(UseLASX, true); ++ } ++ } else if (UseLASX) { ++ warning("LASX instructions are not available on this CPU"); ++ FLAG_SET_DEFAULT(UseLASX, false); ++ } ++ ++ if (UseLASX && !UseLSX) { ++ warning("LASX instructions depends on LSX, setting UseLASX to false"); ++ FLAG_SET_DEFAULT(UseLASX, false); ++ } ++ ++ if (supports_lam_bh()) { ++ if (FLAG_IS_DEFAULT(UseAMBH)) { ++ FLAG_SET_DEFAULT(UseAMBH, true); ++ } ++ } else if (UseAMBH) { ++ warning("AM{SWAP/ADD}{_DB}.{B/H} instructions are not available on this CPU"); ++ FLAG_SET_DEFAULT(UseAMBH, false); ++ } ++ ++ if (supports_lamcas()) { ++ if (FLAG_IS_DEFAULT(UseAMCAS)) { ++ FLAG_SET_DEFAULT(UseAMCAS, true); ++ } ++ } else if (UseAMCAS) { ++ warning("AMCAS{_DB}.{B/H/W/D} instructions are not available on this CPU"); ++ FLAG_SET_DEFAULT(UseAMCAS, false); ++ } ++#ifdef COMPILER2 ++ int max_vector_size = 0; ++ int min_vector_size = 0; ++ if (UseLASX) { ++ max_vector_size = 32; ++ min_vector_size = 4; ++ } ++ else if (UseLSX) { ++ max_vector_size = 16; ++ min_vector_size = 4; ++ } ++ ++ if (!FLAG_IS_DEFAULT(MaxVectorSize)) { ++ if (MaxVectorSize == 0) { ++ // do nothing ++ } else if (MaxVectorSize > max_vector_size) { ++ warning("MaxVectorSize must be at most %i on this platform", max_vector_size); ++ FLAG_SET_DEFAULT(MaxVectorSize, max_vector_size); ++ } else if (MaxVectorSize < min_vector_size) { ++ warning("MaxVectorSize must be at least %i or 0 on this platform, setting to: %i", min_vector_size, min_vector_size); ++ FLAG_SET_DEFAULT(MaxVectorSize, min_vector_size); ++ } else if (!is_power_of_2(MaxVectorSize)) { ++ warning("MaxVectorSize must be a power of 2, setting to default: %i", max_vector_size); ++ FLAG_SET_DEFAULT(MaxVectorSize, max_vector_size); ++ } ++ } else { ++ // If default, use highest supported configuration ++ FLAG_SET_DEFAULT(MaxVectorSize, max_vector_size); ++ } ++#endif ++ ++ char buf[256]; ++ ++ // A note on the _features_string format: ++ // There are jtreg tests checking the _features_string for various properties. ++ // For some strange reason, these tests require the string to contain ++ // only _lowercase_ characters. Keep that in mind when being surprised ++ // about the unusual notation of features - and when adding new ones. ++ // Features may have one comma at the end. ++ // Furthermore, use one, and only one, separator space between features. ++ // Multiple spaces are considered separate tokens, messing up everything. ++ jio_snprintf(buf, sizeof(buf), "%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s%s, " ++ "0x%lx, fp_ver: %d, lvz_ver: %d, ", ++ (is_la64() ? "la64" : ""), ++ (is_la32() ? "la32" : ""), ++ (supports_lsx() ? ", lsx" : ""), ++ (supports_lasx() ? 
", lasx" : ""), ++ (supports_crypto() ? ", crypto" : ""), ++ (supports_lam() ? ", am" : ""), ++ (supports_ual() ? ", ual" : ""), ++ (supports_lldbar() ? ", lldbar" : ""), ++ (supports_scdly() ? ", scdly" : ""), ++ (supports_llexc() ? ", llexc" : ""), ++ (supports_lbt_x86() ? ", lbt_x86" : ""), ++ (supports_lbt_arm() ? ", lbt_arm" : ""), ++ (supports_lbt_mips() ? ", lbt_mips" : ""), ++ (needs_llsync() ? ", needs_llsync" : ""), ++ (needs_tgtsync() ? ", needs_tgtsync": ""), ++ (needs_ulsync() ? ", needs_ulsync": ""), ++ (supports_lam_bh() ? ", lam_bh" : ""), ++ (supports_lamcas() ? ", lamcas" : ""), ++ _cpuid_info.cpucfg_info_id0.bits.PRID, ++ _cpuid_info.cpucfg_info_id2.bits.FP_VER, ++ _cpuid_info.cpucfg_info_id2.bits.LVZ_VER); ++ _features_string = os::strdup(buf); ++ ++ assert(!is_la32(), "Should Not Reach Here, what is the cpu type?"); ++ assert( is_la64(), "Should be LoongArch64"); ++ ++ if (FLAG_IS_DEFAULT(AllocatePrefetchStyle)) { ++ FLAG_SET_DEFAULT(AllocatePrefetchStyle, 1); ++ } ++ ++ if (FLAG_IS_DEFAULT(AllocatePrefetchLines)) { ++ FLAG_SET_DEFAULT(AllocatePrefetchLines, 3); ++ } ++ ++ if (FLAG_IS_DEFAULT(AllocatePrefetchStepSize)) { ++ FLAG_SET_DEFAULT(AllocatePrefetchStepSize, 64); ++ } ++ ++ if (FLAG_IS_DEFAULT(AllocatePrefetchDistance)) { ++ FLAG_SET_DEFAULT(AllocatePrefetchDistance, 192); ++ } ++ ++ if (FLAG_IS_DEFAULT(AllocateInstancePrefetchLines)) { ++ FLAG_SET_DEFAULT(AllocateInstancePrefetchLines, 1); ++ } ++ ++ // Basic instructions are used to implement SHA Intrinsics on LA, so sha ++ // instructions support is not needed. ++ if (/*supports_crypto()*/ 1) { ++ if (FLAG_IS_DEFAULT(UseSHA)) { ++ FLAG_SET_DEFAULT(UseSHA, true); ++ } ++ } else if (UseSHA) { ++ warning("SHA instructions are not available on this CPU"); ++ FLAG_SET_DEFAULT(UseSHA, false); ++ } ++ ++ if (UseSHA/* && supports_crypto()*/) { ++ if (FLAG_IS_DEFAULT(UseSHA1Intrinsics)) { ++ FLAG_SET_DEFAULT(UseSHA1Intrinsics, true); ++ } ++ } else if (UseSHA1Intrinsics) { ++ warning("Intrinsics for SHA-1 crypto hash functions not available on this CPU."); ++ FLAG_SET_DEFAULT(UseSHA1Intrinsics, false); ++ } ++ ++ if (UseSHA/* && supports_crypto()*/) { ++ if (FLAG_IS_DEFAULT(UseSHA256Intrinsics)) { ++ FLAG_SET_DEFAULT(UseSHA256Intrinsics, true); ++ } ++ } else if (UseSHA256Intrinsics) { ++ warning("Intrinsics for SHA-224 and SHA-256 crypto hash functions not available on this CPU."); ++ FLAG_SET_DEFAULT(UseSHA256Intrinsics, false); ++ } ++ ++ if (UseSHA512Intrinsics) { ++ warning("Intrinsics for SHA-384 and SHA-512 crypto hash functions not available on this CPU."); ++ FLAG_SET_DEFAULT(UseSHA512Intrinsics, false); ++ } ++ ++ if (UseSHA3Intrinsics) { ++ warning("Intrinsics for SHA3-224, SHA3-256, SHA3-384 and SHA3-512 crypto hash functions not available on this CPU."); ++ FLAG_SET_DEFAULT(UseSHA3Intrinsics, false); ++ } ++ ++ if (!(UseSHA1Intrinsics || UseSHA256Intrinsics || UseSHA3Intrinsics || UseSHA512Intrinsics)) { ++ FLAG_SET_DEFAULT(UseSHA, false); ++ } ++ ++ if (FLAG_IS_DEFAULT(UseMD5Intrinsics)) { ++ FLAG_SET_DEFAULT(UseMD5Intrinsics, true); ++ } ++ ++ // Basic instructions are used to implement AES Intrinsics on LA, so AES ++ // instructions support is not needed. 
++ if (/*supports_crypto()*/ 1) { ++ if (FLAG_IS_DEFAULT(UseAES)) { ++ FLAG_SET_DEFAULT(UseAES, true); ++ } ++ } else if (UseAES) { ++ if (!FLAG_IS_DEFAULT(UseAES)) ++ warning("AES instructions are not available on this CPU"); ++ FLAG_SET_DEFAULT(UseAES, false); ++ } ++ ++ if (UseAES/* && supports_crypto()*/) { ++ if (FLAG_IS_DEFAULT(UseAESIntrinsics)) { ++ FLAG_SET_DEFAULT(UseAESIntrinsics, true); ++ } ++ } else if (UseAESIntrinsics) { ++ if (!FLAG_IS_DEFAULT(UseAESIntrinsics)) ++ warning("AES intrinsics are not available on this CPU"); ++ FLAG_SET_DEFAULT(UseAESIntrinsics, false); ++ } ++ ++ if (UseAESCTRIntrinsics) { ++ warning("AES/CTR intrinsics are not available on this CPU"); ++ FLAG_SET_DEFAULT(UseAESCTRIntrinsics, false); ++ } ++ ++ if (FLAG_IS_DEFAULT(UseCRC32)) { ++ FLAG_SET_DEFAULT(UseCRC32, true); ++ } ++ ++ if (UseCRC32) { ++ if (FLAG_IS_DEFAULT(UseCRC32Intrinsics)) { ++ UseCRC32Intrinsics = true; ++ } ++ ++ if (FLAG_IS_DEFAULT(UseCRC32CIntrinsics)) { ++ UseCRC32CIntrinsics = true; ++ } ++ } ++ ++ if (UseLSX) { ++ if (FLAG_IS_DEFAULT(UseChaCha20Intrinsics)) { ++ UseChaCha20Intrinsics = true; ++ } ++ } else if (UseChaCha20Intrinsics) { ++ if (!FLAG_IS_DEFAULT(UseChaCha20Intrinsics)) ++ warning("ChaCha20 intrinsic requires LSX instructions"); ++ FLAG_SET_DEFAULT(UseChaCha20Intrinsics, false); ++ } ++ ++#ifdef COMPILER2 ++ if (FLAG_IS_DEFAULT(UseMulAddIntrinsic)) { ++ FLAG_SET_DEFAULT(UseMulAddIntrinsic, true); ++ } ++ ++ if (FLAG_IS_DEFAULT(UseMontgomeryMultiplyIntrinsic)) { ++ UseMontgomeryMultiplyIntrinsic = true; ++ } ++ if (FLAG_IS_DEFAULT(UseMontgomerySquareIntrinsic)) { ++ UseMontgomerySquareIntrinsic = true; ++ } ++ ++ if (UseFPUForSpilling && !FLAG_IS_DEFAULT(UseFPUForSpilling)) { ++ if (UseCompressedOops || UseCompressedClassPointers) { ++ warning("UseFPUForSpilling not supported when UseCompressedOops or UseCompressedClassPointers is on"); ++ UseFPUForSpilling = false; ++ } ++ } ++#endif ++ ++ // This machine allows unaligned memory accesses ++ if (FLAG_IS_DEFAULT(UseUnalignedAccesses)) { ++ FLAG_SET_DEFAULT(UseUnalignedAccesses, true); ++ } ++ ++ if (FLAG_IS_DEFAULT(UseFMA)) { ++ FLAG_SET_DEFAULT(UseFMA, true); ++ } ++ ++ if (FLAG_IS_DEFAULT(UseCopySignIntrinsic)) { ++ FLAG_SET_DEFAULT(UseCopySignIntrinsic, true); ++ } ++ ++ if (UseLSX) { ++ if (FLAG_IS_DEFAULT(UsePopCountInstruction)) { ++ FLAG_SET_DEFAULT(UsePopCountInstruction, true); ++ } ++ } else if (UsePopCountInstruction) { ++ if (!FLAG_IS_DEFAULT(UsePopCountInstruction)) ++ warning("PopCountI/L/VI(4) employs LSX whereas PopCountVI(8) hinges on LASX."); ++ FLAG_SET_DEFAULT(UsePopCountInstruction, false); ++ } ++ ++ if (UseLASX) { ++ if (FLAG_IS_DEFAULT(UseBigIntegerShiftIntrinsic)) { ++ FLAG_SET_DEFAULT(UseBigIntegerShiftIntrinsic, true); ++ } ++ } else if (UseBigIntegerShiftIntrinsic) { ++ if (!FLAG_IS_DEFAULT(UseBigIntegerShiftIntrinsic)) ++ warning("Intrinsic for BigInteger.shiftLeft/Right() employs LASX."); ++ FLAG_SET_DEFAULT(UseBigIntegerShiftIntrinsic, false); ++ } ++ ++ if (UseActiveCoresMP) { ++ if (os::Linux::sched_active_processor_count() != 1) { ++ if (!FLAG_IS_DEFAULT(UseActiveCoresMP)) ++ warning("UseActiveCoresMP disabled because active processors are more than one."); ++ FLAG_SET_DEFAULT(UseActiveCoresMP, false); ++ } ++ } else { // !UseActiveCoresMP ++ if (FLAG_IS_DEFAULT(UseActiveCoresMP) && !os::is_MP()) { ++ FLAG_SET_DEFAULT(UseActiveCoresMP, true); ++ } ++ } ++ ++#ifdef COMPILER2 ++ if (FLAG_IS_DEFAULT(AlignVector)) { ++ AlignVector = false; ++ } ++#endif // COMPILER2 ++} ++ ++void 
VM_Version::initialize() { ++ ResourceMark rm; ++ // Making this stub must be FIRST use of assembler ++ ++ stub_blob = BufferBlob::create("get_cpu_info_stub", stub_size); ++ if (stub_blob == nullptr) { ++ vm_exit_during_initialization("Unable to allocate get_cpu_info_stub"); ++ } ++ CodeBuffer c(stub_blob); ++ VM_Version_StubGenerator g(&c); ++ get_cpu_info_stub = CAST_TO_FN_PTR(get_cpu_info_stub_t, ++ g.generate_get_cpu_info()); ++ ++ get_processor_features(); ++} ++ ++void VM_Version::initialize_cpu_information(void) { ++ // do nothing if cpu info has been initialized ++ if (_initialized) { ++ return; ++ } ++ ++ _no_of_cores = os::processor_count(); ++ _no_of_threads = _no_of_cores; ++ _no_of_sockets = _no_of_cores; ++ snprintf(_cpu_name, CPU_TYPE_DESC_BUF_SIZE - 1, "LoongArch"); ++ snprintf(_cpu_desc, CPU_DETAILED_DESC_BUF_SIZE, "LoongArch %s", features_string()); ++ _initialized = true; ++} ++ ++bool VM_Version::is_intrinsic_supported(vmIntrinsicID id) { ++ assert(id != vmIntrinsics::_none, "must be a VM intrinsic"); ++ switch (id) { ++ case vmIntrinsics::_floatToFloat16: ++ case vmIntrinsics::_float16ToFloat: ++ if (!supports_float16()) { ++ return false; ++ } ++ break; ++ default: ++ break; ++ } ++ return true; ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/vm_version_loongarch.hpp b/src/hotspot/cpu/loongarch/vm_version_loongarch.hpp +--- a/src/hotspot/cpu/loongarch/vm_version_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/vm_version_loongarch.hpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,308 @@ ++/* ++ * Copyright (c) 1997, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef CPU_LOONGARCH_VM_VERSION_LOONGARCH_HPP ++#define CPU_LOONGARCH_VM_VERSION_LOONGARCH_HPP ++ ++#include "runtime/abstract_vm_version.hpp" ++#include "runtime/globals_extension.hpp" ++#include "utilities/sizes.hpp" ++ ++class VM_Version: public Abstract_VM_Version { ++ friend class JVMCIVMStructs; ++ ++public: ++ ++ union LoongArch_Cpucfg_Id0 { ++ uint32_t value; ++ struct { ++ uint32_t PRID : 32; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id1 { ++ uint32_t value; ++ struct { ++ uint32_t ARCH : 2, ++ PGMMU : 1, ++ IOCSR : 1, ++ PALEN : 8, ++ VALEN : 8, ++ UAL : 1, // unaligned access ++ RI : 1, ++ EP : 1, ++ RPLV : 1, ++ HP : 1, ++ IOCSR_BRD : 1, ++ MSG_INT : 1, ++ : 5; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id2 { ++ uint32_t value; ++ struct { ++ uint32_t FP_CFG : 1, // FP is used, use FP_CFG instead ++ FP_SP : 1, ++ FP_DP : 1, ++ FP_VER : 3, ++ LSX : 1, ++ LASX : 1, ++ COMPLEX : 1, ++ CRYPTO : 1, ++ LVZ : 1, ++ LVZ_VER : 3, ++ LLFTP : 1, ++ LLFTP_VER : 3, ++ LBT_X86 : 1, ++ LBT_ARM : 1, ++ LBT_MIPS : 1, ++ LSPW : 1, ++ LAM : 1, ++ UNUSED : 4, ++ LAM_BH : 1, ++ LAMCAS : 1, ++ : 3; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id3 { ++ uint32_t value; ++ struct { ++ uint32_t CCDMA : 1, ++ SFB : 1, ++ UCACC : 1, ++ LLEXC : 1, ++ SCDLY : 1, ++ LLDBAR : 1, ++ ITLBHMC : 1, ++ ICHMC : 1, ++ SPW_LVL : 3, ++ SPW_HP_HF : 1, ++ RVA : 1, ++ RVAMAXM1 : 4, ++ : 15; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id4 { ++ uint32_t value; ++ struct { ++ uint32_t CC_FREQ : 32; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id5 { ++ uint32_t value; ++ struct { ++ uint32_t CC_MUL : 16, ++ CC_DIV : 16; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id6 { ++ uint32_t value; ++ struct { ++ uint32_t PMP : 1, ++ PMVER : 3, ++ PMNUM : 4, ++ PMBITS : 6, ++ UPM : 1, ++ : 17; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id10 { ++ uint32_t value; ++ struct { ++ uint32_t L1IU_PRESENT : 1, ++ L1IU_UNIFY : 1, ++ L1D_PRESENT : 1, ++ L2IU_PRESENT : 1, ++ L2IU_UNIFY : 1, ++ L2IU_PRIVATE : 1, ++ L2IU_INCLUSIVE : 1, ++ L2D_PRESENT : 1, ++ L2D_PRIVATE : 1, ++ L2D_INCLUSIVE : 1, ++ L3IU_PRESENT : 1, ++ L3IU_UNIFY : 1, ++ L3IU_PRIVATE : 1, ++ L3IU_INCLUSIVE : 1, ++ L3D_PRESENT : 1, ++ L3D_PRIVATE : 1, ++ L3D_INCLUSIVE : 1, ++ : 15; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id11 { ++ uint32_t value; ++ struct { ++ uint32_t WAYM1 : 16, ++ INDEXMLOG2 : 8, ++ LINESIZELOG2 : 7, ++ : 1; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id12 { ++ uint32_t value; ++ struct { ++ uint32_t WAYM1 : 16, ++ INDEXMLOG2 : 8, ++ LINESIZELOG2 : 7, ++ : 1; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id13 { ++ uint32_t value; ++ struct { ++ uint32_t WAYM1 : 16, ++ INDEXMLOG2 : 8, ++ LINESIZELOG2 : 7, ++ : 1; ++ } bits; ++ }; ++ ++ union LoongArch_Cpucfg_Id14 { ++ uint32_t value; ++ struct { ++ uint32_t WAYM1 : 16, ++ INDEXMLOG2 : 8, ++ LINESIZELOG2 : 7, ++ : 1; ++ } bits; ++ }; ++ ++#define CPU_FEATURE_FLAGS(decl) \ ++ decl(LAM, lam, 1) \ ++ decl(UAL, ual, 2) \ ++ decl(LSX, lsx, 4) \ ++ decl(LASX, lasx, 5) \ ++ decl(COMPLEX, complex, 7) \ ++ decl(CRYPTO, crypto, 8) \ ++ decl(LBT_X86, lbt_x86, 10) \ ++ decl(LBT_ARM, lbt_arm, 11) \ ++ decl(LBT_MIPS, lbt_mips, 12) \ ++ /* flags above must follow Linux HWCAP */ \ ++ decl(LA32, la32, 13) \ ++ decl(LA64, la64, 14) \ ++ decl(FP, fp, 15) \ ++ decl(LLEXC, llexc, 16) \ ++ decl(SCDLY, scdly, 17) \ ++ decl(LLDBAR, lldbar, 18) \ ++ decl(CCDMA, ccdma, 19) \ ++ decl(LLSYNC, llsync, 20) \ ++ decl(TGTSYNC, tgtsync, 21) \ ++ decl(ULSYNC, ulsync, 22) \ ++ decl(LAM_BH, lam_bh, 23) \ ++ 
decl(LAMCAS, lamcas, 24) \ ++ ++ enum Feature_Flag { ++#define DECLARE_CPU_FEATURE_FLAG(id, name, bit) CPU_##id = (1 << bit), ++ CPU_FEATURE_FLAGS(DECLARE_CPU_FEATURE_FLAG) ++#undef DECLARE_CPU_FEATURE_FLAG ++ }; ++ ++protected: ++ ++ static bool _cpu_info_is_initialized; ++ ++ struct CpuidInfo { ++ LoongArch_Cpucfg_Id0 cpucfg_info_id0; ++ LoongArch_Cpucfg_Id1 cpucfg_info_id1; ++ LoongArch_Cpucfg_Id2 cpucfg_info_id2; ++ LoongArch_Cpucfg_Id3 cpucfg_info_id3; ++ LoongArch_Cpucfg_Id4 cpucfg_info_id4; ++ LoongArch_Cpucfg_Id5 cpucfg_info_id5; ++ LoongArch_Cpucfg_Id6 cpucfg_info_id6; ++ LoongArch_Cpucfg_Id10 cpucfg_info_id10; ++ LoongArch_Cpucfg_Id11 cpucfg_info_id11; ++ LoongArch_Cpucfg_Id12 cpucfg_info_id12; ++ LoongArch_Cpucfg_Id13 cpucfg_info_id13; ++ LoongArch_Cpucfg_Id14 cpucfg_info_id14; ++ }; ++ ++ // The actual cpuid info block ++ static CpuidInfo _cpuid_info; ++ ++ static uint32_t get_feature_flags_by_cpucfg(); ++ static void get_processor_features(); ++ static void get_os_cpu_info(); ++ ++public: ++ // Offsets for cpuid asm stub ++ static ByteSize Loongson_Cpucfg_id0_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id0); } ++ static ByteSize Loongson_Cpucfg_id1_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id1); } ++ static ByteSize Loongson_Cpucfg_id2_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id2); } ++ static ByteSize Loongson_Cpucfg_id3_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id3); } ++ static ByteSize Loongson_Cpucfg_id4_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id4); } ++ static ByteSize Loongson_Cpucfg_id5_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id5); } ++ static ByteSize Loongson_Cpucfg_id6_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id6); } ++ static ByteSize Loongson_Cpucfg_id10_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id10); } ++ static ByteSize Loongson_Cpucfg_id11_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id11); } ++ static ByteSize Loongson_Cpucfg_id12_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id12); } ++ static ByteSize Loongson_Cpucfg_id13_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id13); } ++ static ByteSize Loongson_Cpucfg_id14_offset() { return byte_offset_of(CpuidInfo, cpucfg_info_id14); } ++ ++ static void clean_cpuFeatures() { _features = 0; } ++ ++ // Initialization ++ static void initialize(); ++ ++ static bool cpu_info_is_initialized() { return _cpu_info_is_initialized; } ++ ++ static bool is_la32() { return _features & CPU_LA32; } ++ static bool is_la64() { return _features & CPU_LA64; } ++ static bool supports_crypto() { return _features & CPU_CRYPTO; } ++ static bool supports_lsx() { return _features & CPU_LSX; } ++ static bool supports_lasx() { return _features & CPU_LASX; } ++ static bool supports_lam() { return _features & CPU_LAM; } ++ static bool supports_llexc() { return _features & CPU_LLEXC; } ++ static bool supports_scdly() { return _features & CPU_SCDLY; } ++ static bool supports_lldbar() { return _features & CPU_LLDBAR; } ++ static bool supports_ual() { return _features & CPU_UAL; } ++ static bool supports_lbt_x86() { return _features & CPU_LBT_X86; } ++ static bool supports_lbt_arm() { return _features & CPU_LBT_ARM; } ++ static bool supports_lbt_mips() { return _features & CPU_LBT_MIPS; } ++ static bool needs_llsync() { return !supports_lldbar(); } ++ static bool needs_tgtsync() { return 1; } ++ static bool needs_ulsync() { return 1; } ++ static bool supports_lam_bh() { return _features & CPU_LAM_BH; } 
++ static bool supports_lamcas() { return _features & CPU_LAMCAS; } ++ ++ static bool supports_fast_class_init_checks() { return true; } ++ static bool supports_float16() { return UseLSX; } ++ constexpr static bool supports_stack_watermark_barrier() { return true; } ++ ++ // Check intrinsic support ++ static bool is_intrinsic_supported(vmIntrinsicID id); ++ ++ static void initialize_cpu_information(void); ++}; ++ ++#endif // CPU_LOONGARCH_VM_VERSION_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/cpu/loongarch/vtableStubs_loongarch_64.cpp b/src/hotspot/cpu/loongarch/vtableStubs_loongarch_64.cpp +--- a/src/hotspot/cpu/loongarch/vtableStubs_loongarch_64.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/cpu/loongarch/vtableStubs_loongarch_64.cpp 2024-02-20 10:42:36.162196780 +0800 +@@ -0,0 +1,312 @@ ++/* ++ * Copyright (c) 2003, 2014, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/macroAssembler.hpp" ++#include "code/vtableStubs.hpp" ++#include "interp_masm_loongarch.hpp" ++#include "memory/resourceArea.hpp" ++#include "oops/compiledICHolder.hpp" ++#include "oops/klass.inline.hpp" ++#include "oops/klassVtable.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "vmreg_loongarch.inline.hpp" ++#ifdef COMPILER2 ++#include "opto/runtime.hpp" ++#endif ++ ++// machine-dependent part of VtableStubs: create VtableStub of correct size and ++// initialize its code ++ ++#define __ masm-> ++ ++#ifndef PRODUCT ++extern "C" void bad_compiled_vtable_index(JavaThread* thread, oop receiver, int index); ++#endif ++ ++// used by compiler only; receiver in T0. ++// used registers : ++// Rmethod : receiver klass & method ++// NOTE: If this code is used by the C1, the receiver_location is always 0. ++// when reach here, receiver in T0, klass in T8 ++VtableStub* VtableStubs::create_vtable_stub(int vtable_index) { ++ // Read "A word on VtableStub sizing" in share/code/vtableStubs.hpp for details on stub sizing. ++ const int stub_code_length = code_size_limit(true); ++ VtableStub* s = new(stub_code_length) VtableStub(true, vtable_index); ++ // Can be null if there is no free space in the code cache. ++ if (s == nullptr) { ++ return nullptr; ++ } ++ ++ // Count unused bytes in instruction sequences of variable size. ++ // We add them to the computed buffer size in order to avoid ++ // overflow in subsequently generated stubs. 
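The "slop bytes" accounting described in the comment above is just the difference between a worst-case size estimate and what was actually emitted, accumulated across the variable-length sequences in a stub. A standalone sketch (not HotSpot code; the byte counts are invented):

#include <cassert>
#include <cstdio>

int main() {
  int slop_bytes = 0;

  // A variable-length sequence (e.g. materializing a 64-bit constant) is
  // budgeted at 4 instructions of 4 bytes but may emit fewer.
  const int estimate_bytes = 4 * 4;
  const int emitted_bytes  = 3 * 4;   // pretend only 3 instructions were needed

  int slop_delta = estimate_bytes - emitted_bytes;
  assert(slop_delta >= 0 && "negative slop: the code size estimate is too small");
  slop_bytes += slop_delta;

  printf("unused bytes carried forward: %d\n", slop_bytes);
  return 0;
}

The generators below apply the same delta check after each variable-length emission (li, call_VM, lookup_virtual_method) and hand the accumulated total to bookkeeping().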
++ address start_pc; ++ int slop_bytes = 0; ++ int slop_delta = 0; ++ int load_const_maxLen = 4*BytesPerInstWord; // load_const generates 4 instructions. Assume that as max size for li ++ // No variance was detected in vtable stub sizes. Setting index_dependent_slop == 0 will unveil any deviation from this observation. ++ const int index_dependent_slop = 0; ++ ++ ResourceMark rm; ++ CodeBuffer cb(s->entry_point(), stub_code_length); ++ MacroAssembler* masm = new MacroAssembler(&cb); ++ Register t1 = T8, t2 = Rmethod; ++#if (!defined(PRODUCT) && defined(COMPILER2)) ++ if (CountCompiledCalls) { ++ start_pc = __ pc(); ++ __ li(AT, SharedRuntime::nof_megamorphic_calls_addr()); ++ slop_delta = load_const_maxLen - (__ pc() - start_pc); ++ slop_bytes += slop_delta; ++ assert(slop_delta >= 0, "negative slop(%d) encountered, adjust code size estimate!", slop_delta); ++ __ ld_w(t1, AT , 0); ++ __ addi_w(t1, t1, 1); ++ __ st_w(t1, AT,0); ++ } ++#endif ++ ++ // get receiver (need to skip return address on top of stack) ++ //assert(receiver_location == T0->as_VMReg(), "receiver expected in T0"); ++ ++ // get receiver klass ++ address npe_addr = __ pc(); ++ __ load_klass(t1, T0); ++ ++#ifndef PRODUCT ++ if (DebugVtables) { ++ Label L; ++ // check offset vs vtable length ++ __ ld_w(t2, t1, in_bytes(Klass::vtable_length_offset())); ++ assert(Assembler::is_simm16(vtable_index*vtableEntry::size()), "change this code"); ++ __ li(AT, vtable_index*vtableEntry::size()); ++ __ blt(AT, t2, L); ++ __ li(A2, vtable_index); ++ __ move(A1, A0); ++ ++ // VTABLE TODO: find upper bound for call_VM length. ++ start_pc = __ pc(); ++ __ call_VM(noreg, CAST_FROM_FN_PTR(address, bad_compiled_vtable_index), A1, A2); ++ const ptrdiff_t estimate = 512; ++ const ptrdiff_t codesize = __ pc() - start_pc; ++ slop_delta = estimate - codesize; // call_VM varies in length, depending on data ++ assert(slop_delta >= 0, "vtable #%d: Code size estimate (%d) for DebugVtables too small, required: %d", vtable_index, (int)estimate, (int)codesize); ++ __ bind(L); ++ } ++#endif // PRODUCT ++ const Register method = Rmethod; ++ ++ // load Method* and target address ++ start_pc = __ pc(); ++ // lookup_virtual_method generates 6 instructions (worst case) ++ __ lookup_virtual_method(t1, vtable_index, method); ++ slop_delta = 6*BytesPerInstWord - (int)(__ pc() - start_pc); ++ slop_bytes += slop_delta; ++ assert(slop_delta >= 0, "negative slop(%d) encountered, adjust code size estimate!", slop_delta); ++ ++#ifndef PRODUCT ++ if (DebugVtables) { ++ Label L; ++ __ beq(method, R0, L); ++ __ ld_d(AT, method,in_bytes(Method::from_compiled_offset())); ++ __ bne(AT, R0, L); ++ __ stop("Vtable entry is null"); ++ __ bind(L); ++ } ++#endif // PRODUCT ++ ++ // T8: receiver klass ++ // T0: receiver ++ // Rmethod: Method* ++ // T4: entry ++ address ame_addr = __ pc(); ++ __ ld_d(T4, Address(method, Method::from_compiled_offset())); ++ __ jr(T4); ++ masm->flush(); ++ slop_bytes += index_dependent_slop; // add'l slop for size variance due to large itable offsets ++ bookkeeping(masm, tty, s, npe_addr, ame_addr, true, vtable_index, slop_bytes, index_dependent_slop); ++ ++ return s; ++} ++ ++ ++// used registers : ++// T1 T2 ++// when reach here, the receiver in T0, klass in T1 ++VtableStub* VtableStubs::create_itable_stub(int itable_index) { ++ // Read "A word on VtableStub sizing" in share/code/vtableStubs.hpp for details on stub sizing. 
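Stripped of the assembler and slop bookkeeping, the vtable stub generated above performs a fixed-index table dispatch: load the receiver's klass, load the entry at vtable_index, jump to its compiled entry point. A standalone sketch of that shape (not HotSpot code; the FakeKlass/FakeOop layout is invented and far simpler than the real Klass/oop layout):

#include <cstdio>

typedef void (*method_entry)(void* receiver);

static void greet_en(void*) { printf("hello\n");   }
static void greet_fr(void*) { printf("bonjour\n"); }

struct FakeKlass {
  method_entry vtable[4];   // per-class table of compiled entry points
};

struct FakeOop {
  FakeKlass* klass;         // what "load_klass" conceptually reads
};

// The stub body: load klass, index the table, jump.
static void dispatch(FakeOop* receiver, int vtable_index) {
  method_entry target = receiver->klass->vtable[vtable_index];
  target(receiver);
}

int main() {
  FakeKlass english = { { nullptr, greet_en, nullptr, nullptr } };
  FakeKlass french  = { { nullptr, greet_fr, nullptr, nullptr } };
  FakeOop a = { &english };
  FakeOop b = { &french  };
  dispatch(&a, 1);   // hello
  dispatch(&b, 1);   // bonjour -- same index, different receiver class
  return 0;
}

The itable stub that follows differs in that the index is not fixed: it first scans the itable for the resolved interface klass, then for the holder klass, and only then loads the Method* at the interface's method offset, falling through to the "no such interface" path when the scan hits a null entry.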
++ const int stub_code_length = code_size_limit(false); ++ VtableStub* s = new(stub_code_length) VtableStub(false, itable_index); ++ // Can be null if there is no free space in the code cache. ++ if (s == nullptr) { ++ return nullptr; ++ } ++ // Count unused bytes in instruction sequences of variable size. ++ // We add them to the computed buffer size in order to avoid ++ // overflow in subsequently generated stubs. ++ address start_pc; ++ int slop_bytes = 0; ++ int slop_delta = 0; ++ int load_const_maxLen = 4*BytesPerInstWord; // load_const generates 4 instructions. Assume that as max size for li ++ ++ ResourceMark rm; ++ CodeBuffer cb(s->entry_point(), stub_code_length); ++ MacroAssembler *masm = new MacroAssembler(&cb); ++ ++ // we use T8, T4, T2 as temporary register, they are free from register allocator ++ Register t1 = T8, t2 = T2, t3 = T4; ++ // Entry arguments: ++ // T1: Interface ++ // T0: Receiver ++ ++#if (!defined(PRODUCT) && defined(COMPILER2)) ++ if (CountCompiledCalls) { ++ start_pc = __ pc(); ++ __ li(AT, SharedRuntime::nof_megamorphic_calls_addr()); ++ slop_delta = load_const_maxLen - (__ pc() - start_pc); ++ slop_bytes += slop_delta; ++ assert(slop_delta >= 0, "negative slop(%d) encountered, adjust code size estimate!", slop_delta); ++ __ ld_w(T8, AT, 0); ++ __ addi_w(T8, T8, 1); ++ __ st_w(T8, AT, 0); ++ } ++#endif // PRODUCT ++ ++ const Register holder_klass_reg = T1; // declaring interface klass (DECC) ++ const Register resolved_klass_reg = Rmethod; // resolved interface klass (REFC) ++ const Register icholder_reg = T1; ++ ++ Label L_no_such_interface; ++ ++ __ ld_d(resolved_klass_reg, Address(icholder_reg, CompiledICHolder::holder_klass_offset())); ++ __ ld_d(holder_klass_reg, Address(icholder_reg, CompiledICHolder::holder_metadata_offset())); ++ ++ // get receiver klass (also an implicit null-check) ++ address npe_addr = __ pc(); ++ __ load_klass(t1, T0); ++ ++ // x86 use lookup_interface_method, but lookup_interface_method makes more instructions. ++ // No dynamic code size variance here, so slop_bytes is not needed. ++ const int base = in_bytes(Klass::vtable_start_offset()); ++ assert(vtableEntry::size() * wordSize == 8, "adjust the scaling in the code below"); ++ assert(Assembler::is_simm16(base), "change this code"); ++ __ addi_d(t2, t1, base); ++ __ ld_w(AT, t1, in_bytes(Klass::vtable_length_offset())); ++ __ alsl_d(t2, AT, t2, Address::times_8 - 1); ++ ++ __ move(t3, t2); ++ { ++ Label hit, entry; ++ ++ __ ld_d(AT, Address(t3, itableOffsetEntry::interface_offset())); ++ __ beq(AT, resolved_klass_reg, hit); ++ ++ __ bind(entry); ++ // Check that the entry is non-null. A null entry means that ++ // the receiver class doesn't implement the interface, and wasn't the ++ // same as when the caller was compiled. ++ __ beqz(AT, L_no_such_interface); ++ ++ __ addi_d(t3, t3, itableOffsetEntry::size() * wordSize); ++ __ ld_d(AT, Address(t3, itableOffsetEntry::interface_offset())); ++ __ bne(AT, resolved_klass_reg, entry); ++ ++ __ bind(hit); ++ } ++ ++ { ++ Label hit, entry; ++ ++ __ ld_d(AT, Address(t2, itableOffsetEntry::interface_offset())); ++ __ beq(AT, holder_klass_reg, hit); ++ ++ __ bind(entry); ++ // Check that the entry is non-null. A null entry means that ++ // the receiver class doesn't implement the interface, and wasn't the ++ // same as when the caller was compiled. 
++ __ beqz(AT, L_no_such_interface); ++ ++ __ addi_d(t2, t2, itableOffsetEntry::size() * wordSize); ++ __ ld_d(AT, Address(t2, itableOffsetEntry::interface_offset())); ++ __ bne(AT, holder_klass_reg, entry); ++ ++ __ bind(hit); ++ } ++ ++ // We found a hit, move offset into T4 ++ __ ld_wu(t2, Address(t2, itableOffsetEntry::offset_offset())); ++ ++ // Compute itableMethodEntry. ++ const int method_offset = (itableMethodEntry::size() * wordSize * itable_index) + ++ in_bytes(itableMethodEntry::method_offset()); ++ ++ // Get Method* and entrypoint for compiler ++ const Register method = Rmethod; ++ ++ start_pc = __ pc(); ++ __ li(AT, method_offset); ++ slop_delta = load_const_maxLen - (__ pc() - start_pc); ++ slop_bytes += slop_delta; ++ assert(slop_delta >= 0, "negative slop(%d) encountered, adjust code size estimate!", slop_delta); ++ __ add_d(AT, AT, t2); ++ __ ldx_d(method, t1, AT); ++ ++#ifdef ASSERT ++ if (DebugVtables) { ++ Label L1; ++ __ beq(method, R0, L1); ++ __ ld_d(AT, method,in_bytes(Method::from_compiled_offset())); ++ __ bne(AT, R0, L1); ++ __ stop("compiler entrypoint is null"); ++ __ bind(L1); ++ } ++#endif // ASSERT ++ ++ // Rmethod: Method* ++ // T0: receiver ++ // T4: entry point ++ address ame_addr = __ pc(); ++ __ ld_d(T4, Address(method, Method::from_compiled_offset())); ++ __ jr(T4); ++ ++ __ bind(L_no_such_interface); ++ // Handle IncompatibleClassChangeError in itable stubs. ++ // More detailed error message. ++ // We force resolving of the call site by jumping to the "handle ++ // wrong method" stub, and so let the interpreter runtime do all the ++ // dirty work. ++ assert(SharedRuntime::get_handle_wrong_method_stub() != nullptr, "check initialization order"); ++ __ jmp((address)SharedRuntime::get_handle_wrong_method_stub(), relocInfo::runtime_call_type); ++ ++ masm->flush(); ++ bookkeeping(masm, tty, s, npe_addr, ame_addr, false, itable_index, slop_bytes, 0); ++ ++ return s; ++} ++ ++// NOTE : whenever you change the code above, dont forget to change the const here ++int VtableStub::pd_code_alignment() { ++ const unsigned int icache_line_size = wordSize; ++ return icache_line_size; ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp +--- a/src/hotspot/os/linux/os_linux.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/os/linux/os_linux.cpp 2024-02-20 10:42:36.235530055 +0800 +@@ -23,6 +23,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2021, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + // no precompiled headers + #include "classfile/vmSymbols.hpp" + #include "code/icBuffer.hpp" +@@ -2186,6 +2192,12 @@ + return false; + } + ++int os::Linux::sched_active_processor_count() { ++ if (OSContainer::is_containerized()) ++ return OSContainer::active_processor_count(); ++ return os::Linux::active_processor_count(); ++} ++ + #ifdef __GLIBC__ + // For Glibc, print a one-liner with the malloc tunables. + // Most important and popular is MALLOC_ARENA_MAX, but we are +@@ -2402,7 +2414,7 @@ + // before "flags" so if we find a second "model name", then the + // "flags" field is considered missing. 
+ static bool print_model_name_and_flags(outputStream* st, char* buf, size_t buflen) { +-#if defined(IA32) || defined(AMD64) ++#if defined(IA32) || defined(AMD64) || defined(LOONGARCH64) + // Other platforms have less repetitive cpuinfo files + FILE *fp = os::fopen("/proc/cpuinfo", "r"); + if (fp) { +@@ -2514,7 +2526,7 @@ + + #endif // INCLUDE_JFR + +-#if defined(AMD64) || defined(IA32) || defined(X32) ++#if defined(AMD64) || defined(IA32) || defined(X32) || defined(LOONGARCH64) + const char* search_string = "model name"; + #elif defined(M68K) + const char* search_string = "CPU"; +@@ -4483,6 +4495,44 @@ + // If there's only one node (they start from 0) or if the process + // is bound explicitly to a single node using membind, disable NUMA + UseNUMA = false; ++#if defined(LOONGARCH64) && !defined(ZERO) ++ } else if (InitialHeapSize < NUMAMinHeapSizePerNode * os::numa_get_groups_num()) { ++ // The MaxHeapSize is not actually used by the JVM unless your program ++ // creates enough objects to require it. A much smaller amount, called ++ // the InitialHeapSize, is allocated during JVM initialization. ++ // ++ // Setting the minimum and maximum heap size to the same value is typically ++ // not a good idea because garbage collection is delayed until the heap is ++ // full. Therefore, the first time that the GC runs, the process can take ++ // longer. Also, the heap is more likely to be fragmented and require a heap ++ // compaction. Start your application with the minimum heap size that your ++ // application requires. When the GC starts up, it runs frequently and ++ // efficiently because the heap is small. ++ // ++ // If the GC cannot find enough garbage, it runs compaction. If the GC finds ++ // enough garbage, or any of the other conditions for heap expansion are met, ++ // the GC expands the heap. ++ // ++ // Therefore, an application typically runs until the heap is full. Then, ++ // successive garbage collection cycles recover garbage. When the heap is ++ // full of live objects, the GC compacts the heap. If sufficient garbage ++ // is still not recovered, the GC expands the heap. ++ if (FLAG_IS_DEFAULT(UseNUMA)) { ++ FLAG_SET_ERGO(UseNUMA, false); ++ } else if (UseNUMA) { ++ log_info(os)("UseNUMA is disabled since insufficient initial heap size."); ++ UseNUMA = false; ++ } ++ } else if (FLAG_IS_CMDLINE(NewSize) && ++ (NewSize < ScaleForWordSize(1*M) * os::numa_get_groups_num())) { ++ if (FLAG_IS_DEFAULT(UseNUMA)) { ++ FLAG_SET_ERGO(UseNUMA, false); ++ } else if (UseNUMA) { ++ log_info(os)("Handcrafted MaxNewSize should be large enough " ++ "to avoid GC trigger before VM initialization completed."); ++ UseNUMA = false; ++ } ++#endif + } else { + LogTarget(Info,os) log; + LogStream ls(log); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os/linux/os_linux.hpp b/src/hotspot/os/linux/os_linux.hpp +--- a/src/hotspot/os/linux/os_linux.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/os/linux/os_linux.hpp 2024-02-20 10:42:36.235530055 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + #ifndef OS_LINUX_OS_LINUX_HPP + #define OS_LINUX_OS_LINUX_HPP + +@@ -193,6 +199,8 @@ + + // none present + ++ static int sched_active_processor_count(); ++ + private: + static void numa_init(); + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os/linux/systemMemoryBarrier_linux.cpp b/src/hotspot/os/linux/systemMemoryBarrier_linux.cpp +--- a/src/hotspot/os/linux/systemMemoryBarrier_linux.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/os/linux/systemMemoryBarrier_linux.cpp 2024-02-20 10:42:36.235530055 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "logging/log.hpp" + #include "runtime/os.hpp" +@@ -43,6 +49,8 @@ + #define SYS_membarrier 283 + #elif defined(ALPHA) + #define SYS_membarrier 517 ++ #elif defined(LOONGARCH) ++ #define SYS_membarrier 283 + #else + #error define SYS_membarrier for the arch + #endif +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/amcas_asm.h b/src/hotspot/os_cpu/linux_loongarch/amcas_asm.h +--- a/src/hotspot/os_cpu/linux_loongarch/amcas_asm.h 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/amcas_asm.h 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,167 @@ ++/* ++ * Copyright (c) 1999, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef __AMCAS_ASM_H__ ++#define __AMCAS_ASM_H__ ++ asm( ++ ".macro parse_r var r \n\t" ++ "\\var = -1 \n\t" ++ ".ifc \\r, $r0 \n\t" ++ "\\var = 0 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r1 \n\t" ++ "\\var = 1 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r2 \n\t" ++ "\\var = 2 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r3 \n\t" ++ "\\var = 3 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r4 \n\t" ++ "\\var = 4 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r5 \n\t" ++ "\\var = 5 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r6 \n\t" ++ "\\var = 6 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r7 \n\t" ++ "\\var = 7 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r8 \n\t" ++ "\\var = 8 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r9 \n\t" ++ "\\var = 9 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r10 \n\t" ++ "\\var = 10 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r11 \n\t" ++ "\\var = 11 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r12 \n\t" ++ "\\var = 12 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r13 \n\t" ++ "\\var = 13 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r14 \n\t" ++ "\\var = 14 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r15 \n\t" ++ "\\var = 15 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r16 \n\t" ++ "\\var = 16 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r17 \n\t" ++ "\\var = 17 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r18 \n\t" ++ "\\var = 18 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r19 \n\t" ++ "\\var = 19 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r20 \n\t" ++ "\\var = 20 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r21 \n\t" ++ "\\var = 21 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r22 \n\t" ++ "\\var = 22 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r23 \n\t" ++ "\\var = 23 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r24 \n\t" ++ "\\var = 24 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r25 \n\t" ++ "\\var = 25 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r26 \n\t" ++ "\\var = 26 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r27 \n\t" ++ "\\var = 27 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r28 \n\t" ++ "\\var = 28 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r29 \n\t" ++ "\\var = 29 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r30 \n\t" ++ "\\var = 30 \n\t" ++ ".endif \n\t" ++ ".ifc \\r, $r31 \n\t" ++ "\\var = 31 \n\t" ++ ".endif \n\t" ++ ".iflt \\var \n\t" ++ ".error \n\t" ++ ".endif \n\t" ++ ".endm \n\t" ++ ++ ".macro amcas_w rd, rk, rj \n\t" ++ "parse_r d, \\rd \n\t" ++ "parse_r j, \\rj \n\t" ++ "parse_r k, \\rk \n\t" ++ ".word ((0b00111000010110010 << 15) | (k << 10) | (j << 5) | d) \n\t" ++ ".endm \n\t" ++ ++ ".macro amcas_d rd, rk, rj \n\t" ++ "parse_r d, \\rd \n\t" ++ "parse_r j, \\rj \n\t" ++ "parse_r k, \\rk \n\t" ++ ".word ((0b00111000010110011 << 15) | (k << 10) | (j << 5) | d) \n\t" ++ ".endm \n\t" ++ ++ ".macro amcas_db_b rd, rk, rj \n\t" ++ "parse_r d, \\rd \n\t" ++ "parse_r j, \\rj \n\t" ++ "parse_r k, \\rk \n\t" ++ ".word ((0b00111000010110100 << 15) | (k << 10) | (j << 5) | d) \n\t" ++ ".endm \n\t" ++ ++ ".macro amcas_db_w rd, rk, rj \n\t" ++ "parse_r d, \\rd \n\t" ++ "parse_r j, \\rj \n\t" ++ "parse_r k, \\rk \n\t" ++ ".word ((0b00111000010110110 << 15) | (k << 10) | (j << 5) | d) \n\t" ++ ".endm \n\t" ++ ++ ".macro amcas_db_d rd, rk, rj \n\t" ++ "parse_r d, \\rd \n\t" ++ "parse_r j, \\rj \n\t" ++ "parse_r k, \\rk \n\t" ++ ".word ((0b00111000010110111 << 15) | (k << 10) | (j << 5) | d) \n\t" ++ ".endm \n\t" ++ ); ++#endif /* __AMCAS_ASM_H__ */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/assembler_linux_loongarch.cpp b/src/hotspot/os_cpu/linux_loongarch/assembler_linux_loongarch.cpp +--- a/src/hotspot/os_cpu/linux_loongarch/assembler_linux_loongarch.cpp 
1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/assembler_linux_loongarch.cpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,24 @@ ++/* ++ * Copyright (c) 1999, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/atomic_linux_loongarch.hpp b/src/hotspot/os_cpu/linux_loongarch/atomic_linux_loongarch.hpp +--- a/src/hotspot/os_cpu/linux_loongarch/atomic_linux_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/atomic_linux_loongarch.hpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,361 @@ ++/* ++ * Copyright (c) 1999, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef OS_CPU_LINUX_LOONGARCH_ATOMIC_LINUX_LOONGARCH_HPP ++#define OS_CPU_LINUX_LOONGARCH_ATOMIC_LINUX_LOONGARCH_HPP ++ ++#include "runtime/vm_version.hpp" ++#include "amcas_asm.h" ++ ++// Implementation of class atomic ++ ++template ++struct Atomic::PlatformAdd { ++ template ++ D fetch_then_add(D volatile* dest, I add_value, atomic_memory_order order) const; ++ ++ template ++ D add_then_fetch(D volatile* dest, I add_value, atomic_memory_order order) const { ++ return fetch_then_add(dest, add_value, order) + add_value; ++ } ++}; ++ ++template<> ++template ++inline D Atomic::PlatformAdd<4>::fetch_then_add(D volatile* dest, I add_value, ++ atomic_memory_order order) const { ++ STATIC_ASSERT(4 == sizeof(I)); ++ STATIC_ASSERT(4 == sizeof(D)); ++ D old_value; ++ ++ switch (order) { ++ case memory_order_relaxed: ++ asm volatile ( ++ "amadd.w %[old], %[add], %[dest] \n\t" ++ : [old] "=&r" (old_value) ++ : [add] "r" (add_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ default: ++ asm volatile ( ++ "amadd_db.w %[old], %[add], %[dest] \n\t" ++ : [old] "=&r" (old_value) ++ : [add] "r" (add_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ } ++ ++ return old_value; ++} ++ ++template<> ++template ++inline D Atomic::PlatformAdd<8>::fetch_then_add(D volatile* dest, I add_value, ++ atomic_memory_order order) const { ++ STATIC_ASSERT(8 == sizeof(I)); ++ STATIC_ASSERT(8 == sizeof(D)); ++ D old_value; ++ ++ switch (order) { ++ case memory_order_relaxed: ++ asm volatile ( ++ "amadd.d %[old], %[add], %[dest] \n\t" ++ : [old] "=&r" (old_value) ++ : [add] "r" (add_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ default: ++ asm volatile ( ++ "amadd_db.d %[old], %[add], %[dest] \n\t" ++ : [old] "=&r" (old_value) ++ : [add] "r" (add_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ } ++ ++ return old_value; ++} ++ ++template<> ++template ++inline T Atomic::PlatformXchg<4>::operator()(T volatile* dest, ++ T exchange_value, ++ atomic_memory_order order) const { ++ STATIC_ASSERT(4 == sizeof(T)); ++ T old_value; ++ ++ switch (order) { ++ case memory_order_relaxed: ++ asm volatile ( ++ "amswap.w %[_old], %[_new], %[dest] \n\t" ++ : [_old] "=&r" (old_value) ++ : [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ default: ++ asm volatile ( ++ "amswap_db.w %[_old], %[_new], %[dest] \n\t" ++ : [_old] "=&r" (old_value) ++ : [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ } ++ ++ return old_value; ++} ++ ++template<> ++template ++inline T Atomic::PlatformXchg<8>::operator()(T volatile* dest, ++ T exchange_value, ++ atomic_memory_order order) const { ++ STATIC_ASSERT(8 == sizeof(T)); ++ T old_value; ++ ++ switch (order) { ++ case memory_order_relaxed: ++ asm volatile ( ++ "amswap.d %[_old], %[_new], %[dest] \n\t" ++ : [_old] "=&r" (old_value) ++ : [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ default: ++ asm volatile ( ++ "amswap_db.d %[_old], %[_new], %[dest] \n\t" ++ : [_old] "=&r" (old_value) ++ : [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ } ++ ++ return old_value; ++} ++ ++template<> ++struct Atomic::PlatformCmpxchg<1> : Atomic::CmpxchgByteUsingInt {}; ++ ++template<> ++template ++inline T Atomic::PlatformCmpxchg<4>::operator()(T volatile* dest, ++ T compare_value, ++ T exchange_value, ++ atomic_memory_order order) const { ++ STATIC_ASSERT(4 == sizeof(T)); ++ T prev, temp; ++ ++ if (UseAMCAS) { ++ switch (order) { ++ case memory_order_relaxed: ++ asm volatile ( ++ " 
move %[prev], %[_old] \n\t" ++ " amcas_w %[prev], %[_new], %[dest] \n\t" ++ : [prev] "+&r" (prev) ++ : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ case memory_order_acquire: ++ asm volatile ( ++ " move %[prev], %[_old] \n\t" ++ " amcas_w %[prev], %[_new], %[dest] \n\t" ++ " dbar 0x14 \n\t" ++ : [prev] "+&r" (prev) ++ : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ case memory_order_release: ++ asm volatile ( ++ " move %[prev], %[_old] \n\t" ++ " dbar 0x12 \n\t" ++ " amcas_w %[prev], %[_new], %[dest] \n\t" ++ : [prev] "+&r" (prev) ++ : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ default: ++ asm volatile ( ++ " move %[prev], %[_old] \n\t" ++ " amcas_db_w %[prev], %[_new], %[dest] \n\t" ++ : [prev] "+&r" (prev) ++ : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ } ++ } else { ++ switch (order) { ++ case memory_order_relaxed: ++ case memory_order_release: ++ asm volatile ( ++ "1: ll.w %[prev], %[dest] \n\t" ++ " bne %[prev], %[_old], 2f \n\t" ++ " move %[temp], %[_new] \n\t" ++ " sc.w %[temp], %[dest] \n\t" ++ " beqz %[temp], 1b \n\t" ++ " b 3f \n\t" ++ "2: dbar 0x700 \n\t" ++ "3: \n\t" ++ : [prev] "=&r" (prev), [temp] "=&r" (temp) ++ : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "ZC" (*dest) ++ : "memory"); ++ break; ++ default: ++ asm volatile ( ++ "1: ll.w %[prev], %[dest] \n\t" ++ " bne %[prev], %[_old], 2f \n\t" ++ " move %[temp], %[_new] \n\t" ++ " sc.w %[temp], %[dest] \n\t" ++ " beqz %[temp], 1b \n\t" ++ " b 3f \n\t" ++ "2: dbar 0x14 \n\t" ++ "3: \n\t" ++ : [prev] "=&r" (prev), [temp] "=&r" (temp) ++ : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "ZC" (*dest) ++ : "memory"); ++ break; ++ } ++ } ++ ++ return prev; ++} ++ ++template<> ++template ++inline T Atomic::PlatformCmpxchg<8>::operator()(T volatile* dest, ++ T compare_value, ++ T exchange_value, ++ atomic_memory_order order) const { ++ STATIC_ASSERT(8 == sizeof(T)); ++ T prev, temp; ++ ++ if (UseAMCAS) { ++ switch (order) { ++ case memory_order_relaxed: ++ asm volatile ( ++ " move %[prev], %[_old] \n\t" ++ " amcas_d %[prev], %[_new], %[dest] \n\t" ++ : [prev] "+&r" (prev) ++ : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ case memory_order_acquire: ++ asm volatile ( ++ " move %[prev], %[_old] \n\t" ++ " amcas_d %[prev], %[_new], %[dest] \n\t" ++ " dbar 0x14 \n\t" ++ : [prev] "+&r" (prev) ++ : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ case memory_order_release: ++ asm volatile ( ++ " move %[prev], %[_old] \n\t" ++ " dbar 0x12 \n\t" ++ " amcas_d %[prev], %[_new], %[dest] \n\t" ++ : [prev] "+&r" (prev) ++ : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ default: ++ asm volatile ( ++ " move %[prev], %[_old] \n\t" ++ " amcas_db_d %[prev], %[_new], %[dest] \n\t" ++ : [prev] "+&r" (prev) ++ : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "r" (dest) ++ : "memory"); ++ break; ++ } ++ } else { ++ switch (order) { ++ case memory_order_relaxed: ++ case memory_order_release: ++ asm volatile ( ++ "1: ll.d %[prev], %[dest] \n\t" ++ " bne %[prev], %[_old], 2f \n\t" ++ " move %[temp], %[_new] \n\t" ++ " sc.d %[temp], %[dest] \n\t" ++ " beqz %[temp], 1b \n\t" ++ " b 3f \n\t" ++ "2: dbar 
0x700   \n\t" ++       "3:                       \n\t" ++       : [prev] "=&r" (prev), [temp] "=&r" (temp) ++       : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "ZC" (*dest) ++       : "memory"); ++      break; ++    default: ++      asm volatile ( ++        "1: ll.d %[prev], %[dest]     \n\t" ++        "   bne  %[prev], %[_old], 2f \n\t" ++        "   move %[temp], %[_new]     \n\t" ++        "   sc.d %[temp], %[dest]     \n\t" ++        "   beqz %[temp], 1b          \n\t" ++        "   b    3f                   \n\t" ++        "2: dbar 0x14                 \n\t" ++        "3:                           \n\t" ++        : [prev] "=&r" (prev), [temp] "=&r" (temp) ++        : [_old] "r" (compare_value), [_new] "r" (exchange_value), [dest] "ZC" (*dest) ++        : "memory"); ++      break; ++    } ++  } ++ ++  return prev; ++} ++ ++template<size_t byte_size> ++struct Atomic::PlatformOrderedLoad<byte_size, X_ACQUIRE> ++{ ++  template <typename T> ++  T operator()(const volatile T* p) const { T data; __atomic_load(const_cast<T*>(p), &data, __ATOMIC_ACQUIRE); return data; } ++}; ++ ++template<> ++struct Atomic::PlatformOrderedStore<4, RELEASE_X> ++{ ++  template <typename T> ++  void operator()(volatile T* p, T v) const { xchg(p, v, memory_order_release); } ++}; ++ ++template<> ++struct Atomic::PlatformOrderedStore<8, RELEASE_X> ++{ ++  template <typename T> ++  void operator()(volatile T* p, T v) const { xchg(p, v, memory_order_release); } ++}; ++ ++template<> ++struct Atomic::PlatformOrderedStore<4, RELEASE_X_FENCE> ++{ ++  template <typename T> ++  void operator()(volatile T* p, T v) const { xchg(p, v, memory_order_conservative); } ++}; ++ ++template<> ++struct Atomic::PlatformOrderedStore<8, RELEASE_X_FENCE> ++{ ++  template <typename T> ++  void operator()(volatile T* p, T v) const { xchg(p, v, memory_order_conservative); } ++}; ++ ++#endif // OS_CPU_LINUX_LOONGARCH_ATOMIC_LINUX_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/copy_linux_loongarch.inline.hpp b/src/hotspot/os_cpu/linux_loongarch/copy_linux_loongarch.inline.hpp +--- a/src/hotspot/os_cpu/linux_loongarch/copy_linux_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/copy_linux_loongarch.inline.hpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,145 @@ ++/* ++ * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code).
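// ---------------------------------------------------------------------------
// Illustrative sketch, not part of the patch above: shared HotSpot code does
// not instantiate the PlatformAdd/PlatformXchg/PlatformCmpxchg templates from
// atomic_linux_loongarch.hpp directly; it calls the Atomic:: wrappers, which
// dispatch on operand size and memory order. A minimal usage sketch, assuming
// the JDK 21-era Atomic API (destination pointer first); _claimed, try_claim
// and bump_counter are hypothetical names, not symbols from the patch.
#include "runtime/atomic.hpp"

static volatile int _claimed = 0;

static bool try_claim() {
  // Routed to PlatformCmpxchg<4> above: amcas_db.w when UseAMCAS is set,
  // otherwise the ll.w/sc.w retry loop; the default order is conservative.
  return Atomic::cmpxchg(&_claimed, 0, 1) == 0;
}

static jlong bump_counter(volatile jlong* counter) {
  // Routed to PlatformAdd<8> above: amadd_db.d, the fully fenced variant,
  // since the default memory order is memory_order_conservative.
  return Atomic::add(counter, (jlong)1);
}
// ---------------------------------------------------------------------------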
++ * ++ */ ++ ++#ifndef OS_CPU_LINUX_LOONGARCH_COPY_LINUX_LOONGARCH_INLINE_HPP ++#define OS_CPU_LINUX_LOONGARCH_COPY_LINUX_LOONGARCH_INLINE_HPP ++ ++static void pd_conjoint_words(const HeapWord* from, HeapWord* to, size_t count) { ++ _conjoint_words(from, to, count); ++} ++ ++static void pd_disjoint_words(const HeapWord* from, HeapWord* to, size_t count) { ++ if (__builtin_expect(count <= 8, 1)) { ++ switch (count) { ++ case 8: to[7] = from[7]; ++ case 7: to[6] = from[6]; ++ case 6: to[5] = from[5]; ++ case 5: to[4] = from[4]; ++ case 4: to[3] = from[3]; ++ case 3: to[2] = from[2]; ++ case 2: to[1] = from[1]; ++ case 1: to[0] = from[0]; ++ case 0: break; ++ } ++ return; ++ } ++ ++ _disjoint_words(from, to, count); ++} ++ ++static void pd_disjoint_words_atomic(const HeapWord* from, HeapWord* to, size_t count) { ++ if (__builtin_expect(count <= 8, 1)) { ++ switch (count) { ++ case 8: to[7] = from[7]; ++ case 7: to[6] = from[6]; ++ case 6: to[5] = from[5]; ++ case 5: to[4] = from[4]; ++ case 4: to[3] = from[3]; ++ case 3: to[2] = from[2]; ++ case 2: to[1] = from[1]; ++ case 1: to[0] = from[0]; ++ case 0: break; ++ } ++ return; ++ } ++ ++ _disjoint_words_atomic(from, to, count); ++} ++ ++static void pd_aligned_conjoint_words(const HeapWord* from, HeapWord* to, size_t count) { ++ _aligned_conjoint_words(from, to, count); ++} ++ ++static void pd_aligned_disjoint_words(const HeapWord* from, HeapWord* to, size_t count) { ++ _aligned_disjoint_words(from, to, count); ++} ++ ++static void pd_conjoint_bytes(const void* from, void* to, size_t count) { ++ _conjoint_bytes(from, to, count); ++} ++ ++static void pd_conjoint_bytes_atomic(const void* from, void* to, size_t count) { ++ _conjoint_bytes_atomic(from, to, count); ++} ++ ++static void pd_conjoint_jshorts_atomic(const jshort* from, jshort* to, size_t count) { ++ _conjoint_jshorts_atomic(from, to, count); ++} ++ ++static void pd_conjoint_jints_atomic(const jint* from, jint* to, size_t count) { ++ _conjoint_jints_atomic(from, to, count); ++} ++ ++static void pd_conjoint_jlongs_atomic(const jlong* from, jlong* to, size_t count) { ++ _conjoint_jlongs_atomic(from, to, count); ++} ++ ++static void pd_conjoint_oops_atomic(const oop* from, oop* to, size_t count) { ++ _conjoint_oops_atomic(from, to, count); ++} ++ ++static void pd_arrayof_conjoint_bytes(const HeapWord* from, HeapWord* to, size_t count) { ++ _arrayof_conjoint_bytes(from, to, count); ++} ++ ++static void pd_arrayof_conjoint_jshorts(const HeapWord* from, HeapWord* to, size_t count) { ++ _arrayof_conjoint_jshorts(from, to, count); ++} ++ ++static void pd_arrayof_conjoint_jints(const HeapWord* from, HeapWord* to, size_t count) { ++ _arrayof_conjoint_jints(from, to, count); ++} ++ ++static void pd_arrayof_conjoint_jlongs(const HeapWord* from, HeapWord* to, size_t count) { ++ _arrayof_conjoint_jlongs(from, to, count); ++} ++ ++static void pd_arrayof_conjoint_oops(const HeapWord* from, HeapWord* to, size_t count) { ++ _arrayof_conjoint_oops(from, to, count); ++} ++ ++static void pd_fill_to_words(HeapWord* tohw, size_t count, juint value) { ++ julong val = ((julong)value << 32) | value; ++ _fill_to_words(tohw, val, count); ++} ++ ++static void pd_fill_to_aligned_words(HeapWord* tohw, size_t count, juint value) { ++ julong val = ((julong)value << 32) | value; ++ _fill_to_aligned_words(tohw, val, count); ++} ++ ++static void pd_fill_to_bytes(void* to, size_t count, jubyte value) { ++ _fill_to_bytes(to, value, count); ++} ++ ++static void pd_zero_to_words(HeapWord* tohw, size_t count) { ++ 
pd_fill_to_words(tohw, count, 0); ++} ++ ++static void pd_zero_to_bytes(void* to, size_t count) { ++ pd_fill_to_bytes(to, count, 0); ++} ++ ++#endif // OS_CPU_LINUX_LOONGARCH_COPY_LINUX_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/gc/x/xSyscall_linux_loongarch.hpp b/src/hotspot/os_cpu/linux_loongarch/gc/x/xSyscall_linux_loongarch.hpp +--- a/src/hotspot/os_cpu/linux_loongarch/gc/x/xSyscall_linux_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/gc/x/xSyscall_linux_loongarch.hpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,41 @@ ++/* ++ * Copyright (c) 2019, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++#ifndef OS_CPU_LINUX_LOONGARCH_GC_X_XSYSCALL_LINUX_LOONGARCH_HPP ++#define OS_CPU_LINUX_LOONGARCH_GC_X_XSYSCALL_LINUX_LOONGARCH_HPP ++ ++#include ++ ++// ++// Support for building on older Linux systems ++// ++ ++#ifndef SYS_memfd_create ++#define SYS_memfd_create 279 ++#endif ++#ifndef SYS_fallocate ++#define SYS_fallocate 47 ++#endif ++ ++#endif // OS_CPU_LINUX_LOONGARCH_GC_X_XSYSCALL_LINUX_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/gc/z/zSyscall_linux_loongarch.hpp b/src/hotspot/os_cpu/linux_loongarch/gc/z/zSyscall_linux_loongarch.hpp +--- a/src/hotspot/os_cpu/linux_loongarch/gc/z/zSyscall_linux_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/gc/z/zSyscall_linux_loongarch.hpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,41 @@ ++/* ++ * Copyright (c) 2019, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
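// ---------------------------------------------------------------------------
// Illustrative sketch, not part of the patch above: the SYS_memfd_create and
// SYS_fallocate fallbacks exist so that ZGC's backing-file code can issue the
// raw syscalls on systems whose libc predates the corresponding wrappers. On
// LoongArch the numbers come from the asm-generic syscall table, hence 279
// and 47. create_backing_fd below is a hypothetical helper, not code from
// the patch.
#include <sys/syscall.h>
#include <unistd.h>

static int create_backing_fd(const char* name) {
  // Uses the fallback definition above when <sys/syscall.h> is too old to
  // provide SYS_memfd_create itself.
  return (int)syscall(SYS_memfd_create, name, 0 /* no flags */);
}
// ---------------------------------------------------------------------------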
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++#ifndef OS_CPU_LINUX_LOONGARCH_GC_Z_ZSYSCALL_LINUX_LOONGARCH_HPP ++#define OS_CPU_LINUX_LOONGARCH_GC_Z_ZSYSCALL_LINUX_LOONGARCH_HPP ++ ++#include ++ ++// ++// Support for building on older Linux systems ++// ++ ++#ifndef SYS_memfd_create ++#define SYS_memfd_create 279 ++#endif ++#ifndef SYS_fallocate ++#define SYS_fallocate 47 ++#endif ++ ++#endif // OS_CPU_LINUX_LOONGARCH_GC_Z_ZSYSCALL_LINUX_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/globals_linux_loongarch.hpp b/src/hotspot/os_cpu/linux_loongarch/globals_linux_loongarch.hpp +--- a/src/hotspot/os_cpu/linux_loongarch/globals_linux_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/globals_linux_loongarch.hpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,43 @@ ++/* ++ * Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef OS_CPU_LINUX_LOONGARCH_GLOBALS_LINUX_LOONGARCH_HPP ++#define OS_CPU_LINUX_LOONGARCH_GLOBALS_LINUX_LOONGARCH_HPP ++ ++// Sets the default values for platform dependent flags used by the runtime system. 
++// (see globals.hpp) ++ ++define_pd_global(bool, DontYieldALot, false); ++define_pd_global(intx, ThreadStackSize, 2048); // 0 => use system default ++define_pd_global(intx, VMThreadStackSize, 2048); ++ ++define_pd_global(intx, CompilerThreadStackSize, 2048); ++ ++define_pd_global(uintx,JVMInvokeMethodSlack, 8192); ++ ++// Used on 64 bit platforms for UseCompressedOops base address ++define_pd_global(uintx,HeapBaseMinAddress, 2*G); ++ ++#endif // OS_CPU_LINUX_LOONGARCH_GLOBALS_LINUX_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/javaThread_linux_loongarch.cpp b/src/hotspot/os_cpu/linux_loongarch/javaThread_linux_loongarch.cpp +--- a/src/hotspot/os_cpu/linux_loongarch/javaThread_linux_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/javaThread_linux_loongarch.cpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,105 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "compiler/compileBroker.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/javaThread.hpp" ++#include "runtime/sharedRuntime.hpp" ++ ++void JavaThread::pd_initialize() ++{ ++ _anchor.clear(); ++} ++ ++frame JavaThread::pd_last_frame() { ++ assert(has_last_Java_frame(), "must have last_Java_sp() when suspended"); ++ if (_anchor.last_Java_pc() != nullptr) { ++ return frame(_anchor.last_Java_sp(), _anchor.last_Java_fp(), _anchor.last_Java_pc()); ++ } else { ++ // This will pick up pc from sp ++ return frame(_anchor.last_Java_sp(), _anchor.last_Java_fp()); ++ } ++} ++ ++// For Forte Analyzer AsyncGetCallTrace profiling support - thread is ++// currently interrupted by SIGPROF ++bool JavaThread::pd_get_top_frame_for_signal_handler(frame* fr_addr, ++ void* ucontext, bool isInJava) { ++ ++ assert(Thread::current() == this, "caller must be current thread"); ++ return pd_get_top_frame(fr_addr, ucontext, isInJava); ++} ++ ++bool JavaThread::pd_get_top_frame_for_profiling(frame* fr_addr, void* ucontext, bool isInJava) { ++ return pd_get_top_frame(fr_addr, ucontext, isInJava); ++} ++ ++bool JavaThread::pd_get_top_frame(frame* fr_addr, void* ucontext, bool isInJava) { ++ // If we have a last_Java_frame, then we should use it even if ++ // isInJava == true. It should be more reliable than ucontext info. 
++ if (has_last_Java_frame() && frame_anchor()->walkable()) { ++ *fr_addr = pd_last_frame(); ++ return true; ++ } ++ ++ // At this point, we don't have a last_Java_frame, so ++ // we try to glean some information out of the ucontext ++ // if we were running Java code when SIGPROF came in. ++ if (isInJava) { ++ ucontext_t* uc = (ucontext_t*) ucontext; ++ ++ intptr_t* ret_fp; ++ intptr_t* ret_sp; ++ address addr = os::fetch_frame_from_context(uc, &ret_sp, &ret_fp); ++ if (addr == nullptr || ret_sp == nullptr) { ++ // ucontext wasn't useful ++ return false; ++ } ++ ++ frame ret_frame(ret_sp, ret_fp, addr); ++ if (!ret_frame.safe_for_sender(this)) { ++#ifdef COMPILER2 ++ // C2 and JVMCI use ebp as a general register see if nullptr fp helps ++ frame ret_frame2(ret_sp, nullptr, addr); ++ if (!ret_frame2.safe_for_sender(this)) { ++ // nothing else to try if the frame isn't good ++ return false; ++ } ++ ret_frame = ret_frame2; ++#else ++ // nothing else to try if the frame isn't good ++ return false; ++#endif // COMPILER2_OR_JVMCI ++ } ++ *fr_addr = ret_frame; ++ return true; ++ } ++ ++ // nothing else to try ++ return false; ++} ++ ++void JavaThread::cache_global_variables() { } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/javaThread_linux_loongarch.hpp b/src/hotspot/os_cpu/linux_loongarch/javaThread_linux_loongarch.hpp +--- a/src/hotspot/os_cpu/linux_loongarch/javaThread_linux_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/javaThread_linux_loongarch.hpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,66 @@ ++/* ++ * Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef OS_CPU_LINUX_LOONGARCH_JAVATHREAD_LINUX_LOONGARCH_HPP ++#define OS_CPU_LINUX_LOONGARCH_JAVATHREAD_LINUX_LOONGARCH_HPP ++ ++ private: ++ void pd_initialize(); ++ ++ frame pd_last_frame(); ++ ++ public: ++ // Mutators are highly dangerous.... 
++ intptr_t* last_Java_fp() { return _anchor.last_Java_fp(); } ++ void set_last_Java_fp(intptr_t* fp) { _anchor.set_last_Java_fp(fp); } ++ ++ void set_base_of_stack_pointer(intptr_t* base_sp) { ++ } ++ ++ static ByteSize last_Java_fp_offset() { ++ return byte_offset_of(JavaThread, _anchor) + JavaFrameAnchor::last_Java_fp_offset(); ++ } ++ ++ intptr_t* base_of_stack_pointer() { ++ return nullptr; ++ } ++ void record_base_of_stack_pointer() { ++ } ++ ++ bool pd_get_top_frame_for_signal_handler(frame* fr_addr, void* ucontext, ++ bool isInJava); ++ ++ bool pd_get_top_frame_for_profiling(frame* fr_addr, void* ucontext, bool isInJava); ++private: ++ bool pd_get_top_frame(frame* fr_addr, void* ucontext, bool isInJava); ++public: ++ ++ // These routines are only used on cpu architectures that ++ // have separate register stacks (Itanium). ++ static bool register_stack_overflow() { return false; } ++ static void enable_register_stack_guard() {} ++ static void disable_register_stack_guard() {} ++ ++#endif // OS_CPU_LINUX_LOONGARCH_JAVATHREAD_LINUX_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/linux_loongarch.s b/src/hotspot/os_cpu/linux_loongarch/linux_loongarch.s +--- a/src/hotspot/os_cpu/linux_loongarch/linux_loongarch.s 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/linux_loongarch.s 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,25 @@ ++# ++# Copyright (c) 2004, 2013, Oracle and/or its affiliates. All rights reserved. ++# Copyright (c) 2015, 2021, Loongson Technology. All rights reserved. ++# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++# ++# This code is free software; you can redistribute it and/or modify it ++# under the terms of the GNU General Public License version 2 only, as ++# published by the Free Software Foundation. ++# ++# This code is distributed in the hope that it will be useful, but WITHOUT ++# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++# version 2 for more details (a copy is included in the LICENSE file that ++# accompanied this code). ++# ++# You should have received a copy of the GNU General Public License version ++# 2 along with this work; if not, write to the Free Software Foundation, ++# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++# ++# Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++# or visit www.oracle.com if you need additional information or have any ++# questions. ++# ++ ++ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/orderAccess_linux_loongarch.hpp b/src/hotspot/os_cpu/linux_loongarch/orderAccess_linux_loongarch.hpp +--- a/src/hotspot/os_cpu/linux_loongarch/orderAccess_linux_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/orderAccess_linux_loongarch.hpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,50 @@ ++/* ++ * Copyright (c) 2003, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. 
++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef OS_CPU_LINUX_LOONGARCH_ORDERACCESS_LINUX_LOONGARCH_HPP ++#define OS_CPU_LINUX_LOONGARCH_ORDERACCESS_LINUX_LOONGARCH_HPP ++ ++#include "runtime/os.hpp" ++ ++// Included in orderAccess.hpp header file. ++ ++// Implementation of class OrderAccess. ++#define inlasm_sync(v) if (!UseActiveCoresMP) \ ++ __asm__ __volatile__ ("dbar %0" : :"K"(v) : "memory"); ++#define inlasm_synci() __asm__ __volatile__ ("ibar 0" : : : "memory"); ++ ++inline void OrderAccess::loadload() { inlasm_sync(0x15); } ++inline void OrderAccess::storestore() { inlasm_sync(0x1a); } ++inline void OrderAccess::loadstore() { inlasm_sync(0x16); } ++inline void OrderAccess::storeload() { inlasm_sync(0x19); } ++ ++inline void OrderAccess::acquire() { inlasm_sync(0x14); } ++inline void OrderAccess::release() { inlasm_sync(0x12); } ++inline void OrderAccess::fence() { inlasm_sync(0x10); } ++inline void OrderAccess::cross_modify_fence_impl() { inlasm_synci(); } ++ ++#undef inlasm_sync ++ ++#endif // OS_CPU_LINUX_LOONGARCH_ORDERACCESS_LINUX_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/os_linux_loongarch.cpp b/src/hotspot/os_cpu/linux_loongarch/os_linux_loongarch.cpp +--- a/src/hotspot/os_cpu/linux_loongarch/os_linux_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/os_linux_loongarch.cpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,491 @@ ++/* ++ * Copyright (c) 1999, 2014, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
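// ---------------------------------------------------------------------------
// Illustrative sketch, not part of the patch above: the dbar hints wired up in
// orderAccess_linux_loongarch.hpp map onto HotSpot's usual barrier vocabulary
// (release() == dbar 0x12, acquire() == dbar 0x14, fence() == dbar 0x10). A
// minimal publish/consume pattern built on those primitives; the variables and
// functions below are hypothetical.
#include "runtime/orderAccess.hpp"

static volatile int _payload = 0;
static volatile int _ready   = 0;

static void publish(int v) {
  _payload = v;
  OrderAccess::release();   // keep the _payload store before the _ready store
  _ready = 1;
}

static int consume() {
  while (_ready == 0) { /* spin until published */ }
  OrderAccess::acquire();   // keep the _ready load before the _payload read
  return _payload;
}
// ---------------------------------------------------------------------------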
++ * ++ */ ++ ++// no precompiled headers ++#include "asm/macroAssembler.hpp" ++#include "classfile/classLoader.hpp" ++#include "classfile/systemDictionary.hpp" ++#include "classfile/vmSymbols.hpp" ++#include "code/icBuffer.hpp" ++#include "code/vtableStubs.hpp" ++#include "interpreter/interpreter.hpp" ++#include "memory/allocation.inline.hpp" ++#include "os_linux.hpp" ++#include "os_posix.hpp" ++#include "prims/jniFastGetField.hpp" ++#include "prims/jvm_misc.hpp" ++#include "runtime/arguments.hpp" ++#include "runtime/frame.inline.hpp" ++#include "runtime/interfaceSupport.inline.hpp" ++#include "runtime/java.hpp" ++#include "runtime/javaCalls.hpp" ++#include "runtime/javaThread.hpp" ++#include "runtime/mutexLocker.hpp" ++#include "runtime/osThread.hpp" ++#include "runtime/safepointMechanism.hpp" ++#include "runtime/sharedRuntime.hpp" ++#include "runtime/stubRoutines.hpp" ++#include "runtime/timer.hpp" ++#include "signals_posix.hpp" ++#include "utilities/events.hpp" ++#include "utilities/vmError.hpp" ++#include "compiler/disassembler.hpp" ++ ++// put OS-includes here ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++# include ++ ++#define REG_SP 3 ++#define REG_FP 22 ++ ++NOINLINE address os::current_stack_pointer() { ++ register void *sp __asm__ ("$r3"); ++ return (address) sp; ++} ++ ++char* os::non_memory_address_word() { ++ // Must never look like an address returned by reserve_memory, ++ // even in its subfields (as defined by the CPU immediate fields, ++ // if the CPU splits constants across multiple instructions). ++ ++ return (char*) -1; ++} ++ ++address os::Posix::ucontext_get_pc(const ucontext_t * uc) { ++ return (address)uc->uc_mcontext.__pc; ++} ++ ++void os::Posix::ucontext_set_pc(ucontext_t * uc, address pc) { ++ uc->uc_mcontext.__pc = (intptr_t)pc; ++} ++ ++intptr_t* os::Linux::ucontext_get_sp(const ucontext_t * uc) { ++ return (intptr_t*)uc->uc_mcontext.__gregs[REG_SP]; ++} ++ ++intptr_t* os::Linux::ucontext_get_fp(const ucontext_t * uc) { ++ return (intptr_t*)uc->uc_mcontext.__gregs[REG_FP]; ++} ++ ++address os::fetch_frame_from_context(const void* ucVoid, ++ intptr_t** ret_sp, intptr_t** ret_fp) { ++ ++ address epc; ++ ucontext_t* uc = (ucontext_t*)ucVoid; ++ ++ if (uc != nullptr) { ++ epc = os::Posix::ucontext_get_pc(uc); ++ if (ret_sp) *ret_sp = os::Linux::ucontext_get_sp(uc); ++ if (ret_fp) *ret_fp = os::Linux::ucontext_get_fp(uc); ++ } else { ++ epc = nullptr; ++ if (ret_sp) *ret_sp = (intptr_t *)nullptr; ++ if (ret_fp) *ret_fp = (intptr_t *)nullptr; ++ } ++ ++ return epc; ++} ++ ++frame os::fetch_frame_from_context(const void* ucVoid) { ++ intptr_t* sp; ++ intptr_t* fp; ++ address epc = fetch_frame_from_context(ucVoid, &sp, &fp); ++ if (!is_readable_pointer(epc)) { ++ // Try to recover from calling into bad memory ++ // Assume new frame has not been set up, the same as ++ // compiled frame stack bang ++ return fetch_compiled_frame_from_context(ucVoid); ++ } ++ return frame(sp, fp, epc); ++} ++ ++frame os::fetch_compiled_frame_from_context(const void* ucVoid) { ++ const ucontext_t* uc = (const ucontext_t*)ucVoid; ++ // In compiled code, the stack banging is performed before RA ++ // has been saved in the frame. RA is live, and SP and FP ++ // belong to the caller. 
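// ($ra is __gregs[1] in the LoongArch signal context, so reading it below
// yields the caller's pc for the frame being constructed.)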
++ intptr_t* fp = os::Linux::ucontext_get_fp(uc); ++ intptr_t* sp = os::Linux::ucontext_get_sp(uc); ++ address pc = (address)(uc->uc_mcontext.__gregs[1]); ++ return frame(sp, fp, pc); ++} ++ ++// By default, gcc always save frame pointer on stack. It may get ++// turned off by -fomit-frame-pointer, ++frame os::get_sender_for_C_frame(frame* fr) { ++ return frame(fr->sender_sp(), fr->link(), fr->sender_pc()); ++} ++ ++frame os::current_frame() { ++ intptr_t *fp = ((intptr_t **)__builtin_frame_address(0))[frame::link_offset]; ++ frame myframe((intptr_t*)os::current_stack_pointer(), ++ (intptr_t*)fp, ++ CAST_FROM_FN_PTR(address, os::current_frame)); ++ if (os::is_first_C_frame(&myframe)) { ++ // stack is not walkable ++ return frame(); ++ } else { ++ return os::get_sender_for_C_frame(&myframe); ++ } ++} ++ ++bool PosixSignals::pd_hotspot_signal_handler(int sig, siginfo_t* info, ++ ucontext_t* uc, JavaThread* thread) { ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print_cr("Signal: signo=%d, sicode=%d, sierrno=%d, siaddr=%lx", ++ info->si_signo, ++ info->si_code, ++ info->si_errno, ++ info->si_addr); ++#endif ++ ++ // decide if this trap can be handled by a stub ++ address stub = nullptr; ++ address pc = nullptr; ++ ++ pc = (address) os::Posix::ucontext_get_pc(uc); ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print_cr("pc=%lx", pc); ++ os::print_context(tty, uc); ++#endif ++ //%note os_trap_1 ++ if (info != nullptr && uc != nullptr && thread != nullptr) { ++ pc = (address) os::Posix::ucontext_get_pc(uc); ++ ++ // Handle ALL stack overflow variations here ++ if (sig == SIGSEGV) { ++ address addr = (address) info->si_addr; ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print("handle all stack overflow variations: "); ++ /*tty->print("addr = %lx, stack base = %lx, stack top = %lx\n", ++ addr, ++ thread->stack_base(), ++ thread->stack_base() - thread->stack_size()); ++ */ ++#endif ++ ++ // check if fault address is within thread stack ++ if (thread->is_in_full_stack(addr)) { ++ // stack overflow ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print("stack exception check \n"); ++#endif ++ if (os::Posix::handle_stack_overflow(thread, addr, pc, uc, &stub)) { ++ return true; // continue ++ } ++ } ++ } // sig == SIGSEGV ++ ++ if (thread->thread_state() == _thread_in_Java) { ++ // Java thread running in Java code => find exception handler if any ++ // a fault inside compiled code, the interpreter, or a stub ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print("java thread running in java code\n"); ++#endif ++ ++ // Handle signal from NativeJump::patch_verified_entry(). ++ if (sig == SIGILL && nativeInstruction_at(pc)->is_sigill_not_entrant()) { ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print_cr("verified entry = %lx, sig=%d", nativeInstruction_at(pc), sig); ++#endif ++ stub = SharedRuntime::get_handle_wrong_method_stub(); ++ } else if (sig == SIGSEGV && SafepointMechanism::is_poll_address((address)info->si_addr)) { ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print_cr("polling address = %lx, sig=%d", os::get_polling_page(), sig); ++#endif ++ stub = SharedRuntime::get_poll_stub(pc); ++ } else if (sig == SIGBUS /* && info->si_code == BUS_OBJERR */) { ++ // BugId 4454115: A read from a MappedByteBuffer can fault ++ // here if the underlying file has been truncated. ++ // Do not crash the VM in such a case. ++ CodeBlob* cb = CodeCache::find_blob(pc); ++ CompiledMethod* nm = (cb != nullptr) ? 
cb->as_compiled_method_or_null() : nullptr; ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print("cb = %lx, nm = %lx\n", cb, nm); ++#endif ++ bool is_unsafe_arraycopy = (thread->doing_unsafe_access() && UnsafeCopyMemory::contains_pc(pc)); ++ if ((nm != nullptr && nm->has_unsafe_access()) || is_unsafe_arraycopy) { ++ address next_pc = pc + NativeInstruction::nop_instruction_size; ++ if (is_unsafe_arraycopy) { ++ next_pc = UnsafeCopyMemory::page_error_continue_pc(pc); ++ } ++ stub = SharedRuntime::handle_unsafe_access(thread, next_pc); ++ } ++ } else if (sig == SIGFPE && ++ (info->si_code == FPE_INTDIV || info->si_code == FPE_FLTDIV)) { ++ stub = SharedRuntime::continuation_for_implicit_exception(thread, ++ pc, ++ SharedRuntime::IMPLICIT_DIVIDE_BY_ZERO); ++ } else if (sig == SIGSEGV && ++ MacroAssembler::uses_implicit_null_check(info->si_addr)) { ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print("continuation for implicit exception\n"); ++#endif ++ // Determination of interpreter/vtable stub/compiled code null exception ++ stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::IMPLICIT_NULL); ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print_cr("continuation_for_implicit_exception stub: %lx", stub); ++#endif ++ } else if (sig == SIGILL && nativeInstruction_at(pc)->is_stop()) { ++ // Pull a pointer to the error message out of the instruction ++ // stream. ++ const uint64_t *detail_msg_ptr ++ = (uint64_t*)(pc + 4/*NativeInstruction::instruction_size*/); ++ const char *detail_msg = (const char *)*detail_msg_ptr; ++ const char *msg = "stop"; ++ if (TraceTraps) { ++ tty->print_cr("trap: %s: (SIGILL)", msg); ++ } ++ ++ // End life with a fatal error, message and detail message and the context. ++ // Note: no need to do any post-processing here (e.g. signal chaining) ++ va_list va_dummy; ++ VMError::report_and_die(thread, uc, nullptr, 0, msg, detail_msg, va_dummy); ++ va_end(va_dummy); ++ ++ ShouldNotReachHere(); ++ } ++ } else if ((thread->thread_state() == _thread_in_vm || ++ thread->thread_state() == _thread_in_native) && ++ sig == SIGBUS && /* info->si_code == BUS_OBJERR && */ ++ thread->doing_unsafe_access()) { ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print_cr("SIGBUS in vm thread \n"); ++#endif ++ address next_pc = pc + NativeInstruction::nop_instruction_size; ++ if (UnsafeCopyMemory::contains_pc(pc)) { ++ next_pc = UnsafeCopyMemory::page_error_continue_pc(pc); ++ } ++ stub = SharedRuntime::handle_unsafe_access(thread, next_pc); ++ } ++ ++ // jni_fast_GetField can trap at certain pc's if a GC kicks in ++ // and the heap gets shrunk before the field access. 
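// JNI_FastGetField::find_slowcase_pc(pc) returns the entry of the slow-path
// stub generated alongside the fast accessor containing pc, or (address)-1 if
// pc is not inside any fast accessor; resuming there redoes the field access
// through the ordinary, safe JNI path.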
++ if ((sig == SIGSEGV) || (sig == SIGBUS)) { ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print("jni fast get trap: "); ++#endif ++ address addr = JNI_FastGetField::find_slowcase_pc(pc); ++ if (addr != (address)-1) { ++ stub = addr; ++ } ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print_cr("addr = %d, stub = %lx", addr, stub); ++#endif ++ } ++ } ++ ++ if (stub != nullptr) { ++#ifdef PRINT_SIGNAL_HANDLE ++ tty->print_cr("resolved stub=%lx\n",stub); ++#endif ++ // save all thread context in case we need to restore it ++ if (thread != nullptr) thread->set_saved_exception_pc(pc); ++ ++ os::Posix::ucontext_set_pc(uc, stub); ++ return true; ++ } ++ ++ return false; ++} ++ ++void os::Linux::init_thread_fpu_state(void) { ++} ++ ++int os::Linux::get_fpu_control_word(void) { ++ return 0; // mute compiler ++} ++ ++void os::Linux::set_fpu_control_word(int fpu_control) { ++} ++ ++//////////////////////////////////////////////////////////////////////////////// ++// thread stack ++ ++// Minimum usable stack sizes required to get to user code. Space for ++// HotSpot guard pages is added later. ++size_t os::_compiler_thread_min_stack_allowed = 48 * K; ++size_t os::_java_thread_min_stack_allowed = 40 * K; ++size_t os::_vm_internal_thread_min_stack_allowed = 64 * K; ++ ++// Return default stack size for thr_type ++size_t os::Posix::default_stack_size(os::ThreadType thr_type) { ++ // Default stack size (compiler thread needs larger stack) ++ size_t s = (thr_type == os::compiler_thread ? 2 * M : 512 * K); ++ return s; ++} ++ ++///////////////////////////////////////////////////////////////////////////// ++// helper functions for fatal error handler ++void os::print_register_info(outputStream *st, const void *context, int& continuation) { ++ const int register_count = 32; ++ int n = continuation; ++ assert(n >= 0 && n <= register_count, "Invalid continuation value"); ++ if (context == nullptr || n == register_count) { ++ return; ++ } ++ ++ const ucontext_t *uc = (const ucontext_t*)context; ++ while (n < register_count) { ++ // Update continuation with next index before printing location ++ continuation = n + 1; ++# define CASE_PRINT_REG(n, str) case n: st->print(str); print_location(st, (intptr_t)uc->uc_mcontext.__gregs[n]); ++ switch (n) { ++ CASE_PRINT_REG( 0, "ZERO="); break; ++ CASE_PRINT_REG( 1, "RA="); break; ++ CASE_PRINT_REG( 2, "TP="); break; ++ CASE_PRINT_REG( 3, "SP="); break; ++ CASE_PRINT_REG( 4, "A0="); break; ++ CASE_PRINT_REG( 5, "A1="); break; ++ CASE_PRINT_REG( 6, "A2="); break; ++ CASE_PRINT_REG( 7, "A3="); break; ++ CASE_PRINT_REG( 8, "A4="); break; ++ CASE_PRINT_REG( 9, "A5="); break; ++ CASE_PRINT_REG(10, "A6="); break; ++ CASE_PRINT_REG(11, "A7="); break; ++ CASE_PRINT_REG(12, "T0="); break; ++ CASE_PRINT_REG(13, "T1="); break; ++ CASE_PRINT_REG(14, "T2="); break; ++ CASE_PRINT_REG(15, "T3="); break; ++ CASE_PRINT_REG(16, "T4="); break; ++ CASE_PRINT_REG(17, "T5="); break; ++ CASE_PRINT_REG(18, "T6="); break; ++ CASE_PRINT_REG(19, "T7="); break; ++ CASE_PRINT_REG(20, "T8="); break; ++ CASE_PRINT_REG(21, "RX="); break; ++ CASE_PRINT_REG(22, "FP="); break; ++ CASE_PRINT_REG(23, "S0="); break; ++ CASE_PRINT_REG(24, "S1="); break; ++ CASE_PRINT_REG(25, "S2="); break; ++ CASE_PRINT_REG(26, "S3="); break; ++ CASE_PRINT_REG(27, "S4="); break; ++ CASE_PRINT_REG(28, "S5="); break; ++ CASE_PRINT_REG(29, "S6="); break; ++ CASE_PRINT_REG(30, "S7="); break; ++ CASE_PRINT_REG(31, "S8="); break; ++ } ++# undef CASE_PRINT_REG ++ ++n; ++ } ++} ++ ++void os::print_context(outputStream *st, const void *context) { ++ if 
(context == nullptr) return; ++ ++ const ucontext_t *uc = (const ucontext_t*)context; ++ ++ st->print_cr("Registers:"); ++ st->print( "ZERO=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[0]); ++ st->print(", RA=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[1]); ++ st->print(", TP=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[2]); ++ st->print(", SP=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[3]); ++ st->cr(); ++ st->print( "A0=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[4]); ++ st->print(", A1=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[5]); ++ st->print(", A2=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[6]); ++ st->print(", A3=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[7]); ++ st->cr(); ++ st->print( "A4=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[8]); ++ st->print(", A5=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[9]); ++ st->print(", A6=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[10]); ++ st->print(", A7=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[11]); ++ st->cr(); ++ st->print( "T0=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[12]); ++ st->print(", T1=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[13]); ++ st->print(", T2=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[14]); ++ st->print(", T3=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[15]); ++ st->cr(); ++ st->print( "T4=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[16]); ++ st->print(", T5=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[17]); ++ st->print(", T6=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[18]); ++ st->print(", T7=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[19]); ++ st->cr(); ++ st->print( "T8=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[20]); ++ st->print(", RX=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[21]); ++ st->print(", FP=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[22]); ++ st->print(", S0=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[23]); ++ st->cr(); ++ st->print( "S1=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[24]); ++ st->print(", S2=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[25]); ++ st->print(", S3=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[26]); ++ st->print(", S4=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[27]); ++ st->cr(); ++ st->print( "S5=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[28]); ++ st->print(", S6=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[29]); ++ st->print(", S7=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[30]); ++ st->print(", S8=" INTPTR_FORMAT, (intptr_t)uc->uc_mcontext.__gregs[31]); ++ st->cr(); ++ st->cr(); ++} ++ ++void os::print_tos_pc(outputStream *st, const void *context) { ++ if (context == nullptr) return; ++ ++ const ucontext_t* uc = (const ucontext_t*)context; ++ ++ address sp = (address)os::Linux::ucontext_get_sp(uc); ++ print_tos(st, sp); ++ st->cr(); ++ ++ // Note: it may be unsafe to inspect memory near pc. For example, pc may ++ // point to garbage if entry point in an nmethod is corrupted. Leave ++ // this at the end, and hope for the best. ++ address pc = os::fetch_frame_from_context(uc).pc(); ++ print_instructions(st, pc); ++ st->cr(); ++} ++ ++void os::setup_fpu() { ++ // no use for LA ++} ++ ++#ifndef PRODUCT ++void os::verify_stack_alignment() { ++ assert(((intptr_t)os::current_stack_pointer() & (StackAlignmentInBytes-1)) == 0, "incorrect stack alignment"); ++} ++#endif ++ ++int os::extra_bang_size_in_bytes() { ++ // LA does not require the additional stack bang. 
++ return 0; ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/os_linux_loongarch.inline.hpp b/src/hotspot/os_cpu/linux_loongarch/os_linux_loongarch.inline.hpp +--- a/src/hotspot/os_cpu/linux_loongarch/os_linux_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/os_linux_loongarch.inline.hpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,29 @@ ++/* ++ * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef OS_CPU_LINUX_LOONGARCH_OS_LINUX_LOONGARCH_INLINE_HPP ++#define OS_CPU_LINUX_LOONGARCH_OS_LINUX_LOONGARCH_INLINE_HPP ++ ++#endif // OS_CPU_LINUX_LOONGARCH_OS_LINUX_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/prefetch_linux_loongarch.inline.hpp b/src/hotspot/os_cpu/linux_loongarch/prefetch_linux_loongarch.inline.hpp +--- a/src/hotspot/os_cpu/linux_loongarch/prefetch_linux_loongarch.inline.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/prefetch_linux_loongarch.inline.hpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,56 @@ ++/* ++ * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#ifndef OS_CPU_LINUX_LOONGARCH_PREFETCH_LINUX_LOONGARCH_INLINE_HPP ++#define OS_CPU_LINUX_LOONGARCH_PREFETCH_LINUX_LOONGARCH_INLINE_HPP ++ ++ ++inline void Prefetch::read (const void *loc, intx interval) { ++// According to previous and present SPECjbb2015 score, ++// comment prefetch is better than if (interval >= 0) prefetch branch. ++// So choose comment prefetch as the base line. ++#if 0 ++ __asm__ __volatile__ ( ++ " preld 0, %[__loc] \n" ++ : ++ : [__loc] "m"( *((address)loc + interval) ) ++ : "memory" ++ ); ++#endif ++} ++ ++inline void Prefetch::write(void *loc, intx interval) { ++// Ditto ++#if 0 ++ __asm__ __volatile__ ( ++ " preld 8, %[__loc] \n" ++ : ++ : [__loc] "m"( *((address)loc + interval) ) ++ : "memory" ++ ); ++#endif ++} ++ ++#endif // OS_CPU_LINUX_LOONGARCH_PREFETCH_LINUX_LOONGARCH_INLINE_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/safefetch_linux_loongarch64.S b/src/hotspot/os_cpu/linux_loongarch/safefetch_linux_loongarch64.S +--- a/src/hotspot/os_cpu/linux_loongarch/safefetch_linux_loongarch64.S 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/safefetch_linux_loongarch64.S 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,56 @@ ++/* ++ * Copyright (c) 2022 SAP SE. All rights reserved. ++ * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
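
The Prefetch::read and Prefetch::write bodies above are deliberately compiled out with #if 0: per the comment, SPECjbb2015 scored better with no prefetch than with a conditional one, so the empty body is the baseline. If the hint were ever re-enabled, a standalone equivalent would look roughly like the sketch below. This is an illustrative assumption, not part of the patch; it uses the register-plus-immediate form of preld, with hint 0 (prefetch for load) and hint 8 (prefetch for store) as in the disabled code above.

#include <cstdint>

// Hypothetical re-enabled prefetch helpers (sketch only).
inline void prefetch_read_sketch(const void* loc, intptr_t interval) {
  const char* p = (const char*)loc + interval;
  __asm__ __volatile__ ("preld 0, %0, 0" : : "r"(p) : "memory");   // load hint
}

inline void prefetch_write_sketch(void* loc, intptr_t interval) {
  char* p = (char*)loc + interval;
  __asm__ __volatile__ ("preld 8, %0, 0" : : "r"(p) : "memory");   // store hint
}
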
++ * ++ */ ++ ++ .globl SafeFetchN_impl ++ .globl _SafeFetchN_fault ++ .globl _SafeFetchN_continuation ++ .globl SafeFetch32_impl ++ .globl _SafeFetch32_fault ++ .globl _SafeFetch32_continuation ++ ++ # Support for int SafeFetch32(int* address, int defaultval); ++ # ++ # a0 : address ++ # a1 : defaultval ++SafeFetch32_impl: ++_SafeFetch32_fault: ++ ld.w $r4, $r4, 0 ++ jr $r1 ++_SafeFetch32_continuation: ++ or $r4, $r5, $r0 ++ jr $r1 ++ ++ # Support for intptr_t SafeFetchN(intptr_t* address, intptr_t defaultval); ++ # ++ # a0 : address ++ # a1 : defaultval ++SafeFetchN_impl: ++_SafeFetchN_fault: ++ ld.d $r4, $r4, 0 ++ jr $r1 ++_SafeFetchN_continuation: ++ or $r4, $r5, $r0 ++ jr $r1 +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/vmStructs_linux_loongarch.hpp b/src/hotspot/os_cpu/linux_loongarch/vmStructs_linux_loongarch.hpp +--- a/src/hotspot/os_cpu/linux_loongarch/vmStructs_linux_loongarch.hpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/vmStructs_linux_loongarch.hpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,55 @@ ++/* ++ * Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++#ifndef OS_CPU_LINUX_LOONGARCH_VMSTRUCTS_LINUX_LOONGARCH_HPP ++#define OS_CPU_LINUX_LOONGARCH_VMSTRUCTS_LINUX_LOONGARCH_HPP ++ ++// These are the OS and CPU-specific fields, types and integer ++// constants required by the Serviceability Agent. This file is ++// referenced by vmStructs.cpp. 
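
The SafeFetch stubs above use HotSpot's usual fault/continuation pattern: if the ld.w or ld.d at _SafeFetch32_fault / _SafeFetchN_fault traps, the platform signal handler redirects the pc to the matching _continuation label, which copies the default value (a1) into the return register (a0). C++ code reaches them through the SafeFetch32/SafeFetchN wrappers; a typical probe looks like the hedged sketch below (the sentinel choice is arbitrary and not from the patch).

// Check whether a word is readable without risking a crash (sketch).
static bool is_readable_int(int* addr) {
  const int sentinel = -1;                        // arbitrary marker, not from the patch
  return SafeFetch32(addr, sentinel) != sentinel; // false positive if *addr happens to be -1
}
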
++ ++#define VM_STRUCTS_OS_CPU(nonstatic_field, static_field, unchecked_nonstatic_field, volatile_nonstatic_field, nonproduct_nonstatic_field, c2_nonstatic_field, unchecked_c1_static_field, unchecked_c2_static_field) \ ++ \ ++ /******************************/ \ ++ /* Threads (NOTE: incomplete) */ \ ++ /******************************/ \ ++ nonstatic_field(OSThread, _thread_id, pid_t) \ ++ nonstatic_field(OSThread, _pthread_id, pthread_t) ++ ++ ++#define VM_TYPES_OS_CPU(declare_type, declare_toplevel_type, declare_oop_type, declare_integer_type, declare_unsigned_integer_type, declare_c1_toplevel_type, declare_c2_type, declare_c2_toplevel_type) \ ++ \ ++ /**********************/ \ ++ /* Posix Thread IDs */ \ ++ /**********************/ \ ++ \ ++ declare_integer_type(pid_t) \ ++ declare_unsigned_integer_type(pthread_t) ++ ++#define VM_INT_CONSTANTS_OS_CPU(declare_constant, declare_preprocessor_constant, declare_c1_constant, declare_c2_constant, declare_c2_preprocessor_constant) ++ ++#define VM_LONG_CONSTANTS_OS_CPU(declare_constant, declare_preprocessor_constant, declare_c1_constant, declare_c2_constant, declare_c2_preprocessor_constant) ++ ++#endif // OS_CPU_LINUX_LOONGARCH_VMSTRUCTS_LINUX_LOONGARCH_HPP +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/os_cpu/linux_loongarch/vm_version_linux_loongarch.cpp b/src/hotspot/os_cpu/linux_loongarch/vm_version_linux_loongarch.cpp +--- a/src/hotspot/os_cpu/linux_loongarch/vm_version_linux_loongarch.cpp 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/hotspot/os_cpu/linux_loongarch/vm_version_linux_loongarch.cpp 2024-02-20 10:42:36.245530048 +0800 +@@ -0,0 +1,95 @@ ++/* ++ * Copyright (c) 2006, 2021, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++#include "precompiled.hpp" ++#include "asm/register.hpp" ++#include "runtime/os.hpp" ++#include "runtime/os.inline.hpp" ++#include "runtime/vm_version.hpp" ++ ++#include ++#include ++ ++#ifndef HWCAP_LOONGARCH_LAM ++#define HWCAP_LOONGARCH_LAM (1 << 1) ++#endif ++ ++#ifndef HWCAP_LOONGARCH_UAL ++#define HWCAP_LOONGARCH_UAL (1 << 2) ++#endif ++ ++#ifndef HWCAP_LOONGARCH_LSX ++#define HWCAP_LOONGARCH_LSX (1 << 4) ++#endif ++ ++#ifndef HWCAP_LOONGARCH_LASX ++#define HWCAP_LOONGARCH_LASX (1 << 5) ++#endif ++ ++#ifndef HWCAP_LOONGARCH_COMPLEX ++#define HWCAP_LOONGARCH_COMPLEX (1 << 7) ++#endif ++ ++#ifndef HWCAP_LOONGARCH_CRYPTO ++#define HWCAP_LOONGARCH_CRYPTO (1 << 8) ++#endif ++ ++#ifndef HWCAP_LOONGARCH_LBT_X86 ++#define HWCAP_LOONGARCH_LBT_X86 (1 << 10) ++#endif ++ ++#ifndef HWCAP_LOONGARCH_LBT_ARM ++#define HWCAP_LOONGARCH_LBT_ARM (1 << 11) ++#endif ++ ++#ifndef HWCAP_LOONGARCH_LBT_MIPS ++#define HWCAP_LOONGARCH_LBT_MIPS (1 << 12) ++#endif ++ ++void VM_Version::get_os_cpu_info() { ++ ++ uint64_t auxv = getauxval(AT_HWCAP); ++ ++ static_assert(CPU_LAM == HWCAP_LOONGARCH_LAM, "Flag CPU_LAM must follow Linux HWCAP"); ++ static_assert(CPU_UAL == HWCAP_LOONGARCH_UAL, "Flag CPU_UAL must follow Linux HWCAP"); ++ static_assert(CPU_LSX == HWCAP_LOONGARCH_LSX, "Flag CPU_LSX must follow Linux HWCAP"); ++ static_assert(CPU_LASX == HWCAP_LOONGARCH_LASX, "Flag CPU_LASX must follow Linux HWCAP"); ++ static_assert(CPU_COMPLEX == HWCAP_LOONGARCH_COMPLEX, "Flag CPU_COMPLEX must follow Linux HWCAP"); ++ static_assert(CPU_CRYPTO == HWCAP_LOONGARCH_CRYPTO, "Flag CPU_CRYPTO must follow Linux HWCAP"); ++ static_assert(CPU_LBT_X86 == HWCAP_LOONGARCH_LBT_X86, "Flag CPU_LBT_X86 must follow Linux HWCAP"); ++ static_assert(CPU_LBT_ARM == HWCAP_LOONGARCH_LBT_ARM, "Flag CPU_LBT_ARM must follow Linux HWCAP"); ++ static_assert(CPU_LBT_MIPS == HWCAP_LOONGARCH_LBT_MIPS, "Flag CPU_LBT_MIPS must follow Linux HWCAP"); ++ ++ _features = auxv & ( ++ HWCAP_LOONGARCH_LAM | ++ HWCAP_LOONGARCH_UAL | ++ HWCAP_LOONGARCH_LSX | ++ HWCAP_LOONGARCH_LASX | ++ HWCAP_LOONGARCH_COMPLEX | ++ HWCAP_LOONGARCH_CRYPTO | ++ HWCAP_LOONGARCH_LBT_X86 | ++ HWCAP_LOONGARCH_LBT_ARM | ++ HWCAP_LOONGARCH_LBT_MIPS); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/adlc/formssel.cpp b/src/hotspot/share/adlc/formssel.cpp +--- a/src/hotspot/share/adlc/formssel.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/adlc/formssel.cpp 2024-02-20 10:42:36.252196709 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
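
VM_Version::get_os_cpu_info() in the vm_version_linux_loongarch.cpp hunk above takes the feature word straight from the kernel's auxiliary vector, which is why each CPU_* flag is static_assert-ed to equal the corresponding Linux HWCAP bit. A minimal standalone query has the same shape (sketch; it assumes <sys/auxv.h> for getauxval/AT_HWCAP, which is presumably what the two bare #include lines above pull in, and it reuses the LSX fallback value from the hunk).

#include <sys/auxv.h>

#ifndef HWCAP_LOONGARCH_LSX
#define HWCAP_LOONGARCH_LSX (1 << 4)   // same fallback value as the hunk above
#endif

// True if the kernel reports the 128-bit LSX vector extension.
static bool cpu_has_lsx() {
  unsigned long hwcap = getauxval(AT_HWCAP);
  return (hwcap & HWCAP_LOONGARCH_LSX) != 0;
}
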
++ */ ++ + // FORMS.CPP - Definitions for ADL Parser Forms Classes + #include "adlc.hpp" + +@@ -4144,6 +4150,7 @@ + !strcmp(_opType,"MemBarVolatile") || + !strcmp(_opType,"MemBarCPUOrder") || + !strcmp(_opType,"MemBarStoreStore") || ++ !strcmp(_opType,"SameAddrLoadFence" ) || + !strcmp(_opType,"OnSpinWait"); + } + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/c1/c1_Compiler.cpp b/src/hotspot/share/c1/c1_Compiler.cpp +--- a/src/hotspot/share/c1/c1_Compiler.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/c1/c1_Compiler.cpp 2024-02-20 10:42:36.255530038 +0800 +@@ -43,6 +43,12 @@ + #include "utilities/bitMap.inline.hpp" + #include "utilities/macros.hpp" + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + + Compiler::Compiler() : AbstractCompiler(compiler_c1) { + } +@@ -219,7 +225,7 @@ + case vmIntrinsics::_updateCRC32: + case vmIntrinsics::_updateBytesCRC32: + case vmIntrinsics::_updateByteBufferCRC32: +-#if defined(S390) || defined(PPC64) || defined(AARCH64) ++#if defined(S390) || defined(PPC64) || defined(AARCH64) || defined(LOONGARCH64) + case vmIntrinsics::_updateBytesCRC32C: + case vmIntrinsics::_updateDirectByteBufferCRC32C: + #endif +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/c1/c1_LinearScan.cpp b/src/hotspot/share/c1/c1_LinearScan.cpp +--- a/src/hotspot/share/c1/c1_LinearScan.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/c1/c1_LinearScan.cpp 2024-02-20 10:42:36.258863370 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "c1/c1_CFGPrinter.hpp" + #include "c1/c1_CodeStubs.hpp" +@@ -35,6 +41,12 @@ + #include "runtime/timerTrace.hpp" + #include "utilities/bitMap.inline.hpp" + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #ifndef PRODUCT + + static LinearScanStatistic _stat_before_alloc; +@@ -3141,9 +3153,6 @@ + } + } + +-#ifndef RISCV +- // Disable these optimizations on riscv temporarily, because it does not +- // work when the comparison operands are bound to branches or cmoves. + { TIME_LINEAR_SCAN(timer_optimize_lir); + + EdgeMoveOptimizer::optimize(ir()->code()); +@@ -3151,7 +3160,6 @@ + // check that cfg is still correct after optimizations + ir()->verify(); + } +-#endif + + NOT_PRODUCT(print_lir(1, "Before Code Generation", false)); + NOT_PRODUCT(LinearScanStatistic::compute(this, _stat_final)); +@@ -5957,9 +5965,13 @@ + if (block->number_of_preds() > 1 && !block->is_set(BlockBegin::exception_entry_flag)) { + optimizer.optimize_moves_at_block_end(block); + } ++#if !defined(RISCV) && !defined(LOONGARCH) ++ // Disable this optimization on riscv and loongarch temporarily, because it does not ++ // work when the comparison operands are bound to branches or cmoves. 
+ if (block->number_of_sux() == 2) { + optimizer.optimize_moves_at_block_begin(block); + } ++#endif + } + } + +@@ -6376,7 +6388,16 @@ + LIR_OpBranch* prev_branch = (LIR_OpBranch*)prev_op; + + if (prev_branch->stub() == nullptr) { ++#if defined(RISCV) || defined(LOONGARCH) ++ if (prev_branch->block() == code->at(i + 1) && prev_branch->info() == nullptr) { ++ TRACE_LINEAR_SCAN(3, tty->print_cr("Negating conditional branch and deleting unconditional branch at end of block B%d", block->block_id())); + ++ // eliminate a conditional branch to the immediate successor ++ prev_branch->change_block(last_branch->block()); ++ prev_branch->negate_cond(); ++ instructions->trunc_to(instructions->length() - 1); ++ } ++#else + LIR_Op2* prev_cmp = nullptr; + // There might be a cmove inserted for profiling which depends on the same + // compare. If we change the condition of the respective compare, we have +@@ -6416,6 +6437,7 @@ + prev_cmove->set_in_opr2(t); + } + } ++#endif + } + } + } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/c1/c1_LIR.cpp b/src/hotspot/share/c1/c1_LIR.cpp +--- a/src/hotspot/share/c1/c1_LIR.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/c1/c1_LIR.cpp 2024-02-20 10:42:36.258863370 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "c1/c1_CodeStubs.hpp" + #include "c1/c1_InstructionPrinter.hpp" +@@ -491,6 +497,7 @@ + assert(opConvert->_info == nullptr, "must be"); + if (opConvert->_opr->is_valid()) do_input(opConvert->_opr); + if (opConvert->_result->is_valid()) do_output(opConvert->_result); ++ if (opConvert->_tmp->is_valid()) do_temp(opConvert->_tmp); + do_stub(opConvert->_stub); + + break; +@@ -1101,7 +1108,7 @@ + , _file(nullptr) + , _line(0) + #endif +-#ifdef RISCV ++#if defined(RISCV) || defined(LOONGARCH) + , _cmp_opr1(LIR_OprFact::illegalOpr) + , _cmp_opr2(LIR_OprFact::illegalOpr) + #endif +@@ -1122,7 +1129,7 @@ + } + #endif + +-#ifdef RISCV ++#if defined(RISCV) || defined(LOONGARCH) + void LIR_List::set_cmp_oprs(LIR_Op* op) { + switch (op->code()) { + case lir_cmp: +@@ -1151,7 +1158,7 @@ + break; + #if INCLUDE_ZGC + case lir_xloadbarrier_test: +- _cmp_opr1 = FrameMap::as_opr(t1); ++ _cmp_opr1 = FrameMap::as_opr(RISCV_ONLY(t1) LOONGARCH64_ONLY(SCR1)); + _cmp_opr2 = LIR_OprFact::intConst(0); + break; + #endif +@@ -1406,11 +1413,12 @@ + tmp)); + } + +-void LIR_List::fcmp2int(LIR_Opr left, LIR_Opr right, LIR_Opr dst, bool is_unordered_less) { ++void LIR_List::fcmp2int(LIR_Opr left, LIR_Opr right, LIR_Opr dst, bool is_unordered_less, LIR_Opr tmp) { + append(new LIR_Op2(is_unordered_less ? 
lir_ucmp_fd2i : lir_cmp_fd2i, + left, + right, +- dst)); ++ dst, ++ tmp)); + } + + void LIR_List::lock_object(LIR_Opr hdr, LIR_Opr obj, LIR_Opr lock, LIR_Opr scratch, CodeStub* stub, CodeEmitInfo* info) { +@@ -1927,6 +1935,9 @@ + print_bytecode(out, bytecode()); + in_opr()->print(out); out->print(" "); + result_opr()->print(out); out->print(" "); ++ if(tmp()->is_valid()) { ++ tmp()->print(out); out->print(" "); ++ } + } + + void LIR_OpConvert::print_bytecode(outputStream* out, Bytecodes::Code code) { +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/c1/c1_LIR.hpp b/src/hotspot/share/c1/c1_LIR.hpp +--- a/src/hotspot/share/c1/c1_LIR.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/c1/c1_LIR.hpp 2024-02-20 10:42:36.258863370 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #ifndef SHARE_C1_C1_LIR_HPP + #define SHARE_C1_C1_LIR_HPP + +@@ -1453,15 +1459,18 @@ + private: + Bytecodes::Code _bytecode; + ConversionStub* _stub; ++ LIR_Opr _tmp; + + public: +- LIR_OpConvert(Bytecodes::Code code, LIR_Opr opr, LIR_Opr result, ConversionStub* stub) ++ LIR_OpConvert(Bytecodes::Code code, LIR_Opr opr, LIR_Opr result, ConversionStub* stub, LIR_Opr tmp) + : LIR_Op1(lir_convert, opr, result) + , _bytecode(code) +- , _stub(stub) {} ++ , _stub(stub) ++ , _tmp(tmp) {} + + Bytecodes::Code bytecode() const { return _bytecode; } + ConversionStub* stub() const { return _stub; } ++ LIR_Opr tmp() const { return _tmp; } + + virtual void emit_code(LIR_Assembler* masm); + virtual LIR_OpConvert* as_OpConvert() { return this; } +@@ -2097,7 +2106,7 @@ + const char * _file; + int _line; + #endif +-#ifdef RISCV ++#if defined(RISCV) || defined(LOONGARCH) + LIR_Opr _cmp_opr1; + LIR_Opr _cmp_opr2; + #endif +@@ -2113,7 +2122,7 @@ + } + #endif // PRODUCT + +-#ifdef RISCV ++#if defined(RISCV) || defined(LOONGARCH) + set_cmp_oprs(op); + // lir_cmp set cmp oprs only on riscv + if (op->code() == lir_cmp) return; +@@ -2135,7 +2144,7 @@ + void set_file_and_line(const char * file, int line); + #endif + +-#ifdef RISCV ++#if defined(RISCV) || defined(LOONGARCH) + void set_cmp_oprs(LIR_Op* op); + #endif + +@@ -2228,7 +2237,9 @@ + void safepoint(LIR_Opr tmp, CodeEmitInfo* info) { append(new LIR_Op1(lir_safepoint, tmp, info)); } + void return_op(LIR_Opr result) { append(new LIR_OpReturn(result)); } + +- void convert(Bytecodes::Code code, LIR_Opr left, LIR_Opr dst, ConversionStub* stub = nullptr/*, bool is_32bit = false*/) { append(new LIR_OpConvert(code, left, dst, stub)); } ++ void convert(Bytecodes::Code code, LIR_Opr left, LIR_Opr dst, ConversionStub* stub = nullptr, LIR_Opr tmp = LIR_OprFact::illegalOpr) { ++ append(new LIR_OpConvert(code, left, dst, stub, tmp)); ++ } + + void logical_and (LIR_Opr left, LIR_Opr right, LIR_Opr dst) { append(new LIR_Op2(lir_logic_and, left, right, dst)); } + void logical_or (LIR_Opr left, LIR_Opr right, LIR_Opr dst) { append(new LIR_Op2(lir_logic_or, left, right, dst)); } +@@ -2336,7 +2347,7 @@ + void unsigned_shift_right(LIR_Opr value, int count, LIR_Opr dst) { unsigned_shift_right(value, LIR_OprFact::intConst(count), dst, LIR_OprFact::illegalOpr); } + + void lcmp2int(LIR_Opr left, LIR_Opr right, LIR_Opr dst) { append(new LIR_Op2(lir_cmp_l2i, left, right, dst)); } +- void fcmp2int(LIR_Opr left, LIR_Opr right, LIR_Opr dst, bool 
is_unordered_less); ++ void fcmp2int(LIR_Opr left, LIR_Opr right, LIR_Opr dst, bool is_unordered_less, LIR_Opr tmp = LIR_OprFact::illegalOpr); + + void call_runtime_leaf(address routine, LIR_Opr tmp, LIR_Opr result, LIR_OprList* arguments) { + append(new LIR_OpRTCall(routine, tmp, result, arguments)); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/code/vtableStubs.cpp b/src/hotspot/share/code/vtableStubs.cpp +--- a/src/hotspot/share/code/vtableStubs.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/code/vtableStubs.cpp 2024-02-20 10:42:36.292196678 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2019, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "code/vtableStubs.hpp" + #include "compiler/compileBroker.hpp" +@@ -102,7 +108,11 @@ + + #if defined(PRODUCT) + // These values are good for the PRODUCT case (no tracing). ++#if defined LOONGARCH64 ++ static const int first_vtableStub_size = 128; ++#else + static const int first_vtableStub_size = 64; ++#endif + static const int first_itableStub_size = 256; + #else + // These values are good for the non-PRODUCT case (when tracing can be switched on). +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/gc/g1/g1Arguments.cpp b/src/hotspot/share/gc/g1/g1Arguments.cpp +--- a/src/hotspot/share/gc/g1/g1Arguments.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/gc/g1/g1Arguments.cpp 2024-02-20 10:42:36.298863339 +0800 +@@ -23,6 +23,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "gc/g1/g1Arguments.hpp" + #include "gc/g1/g1CardSet.hpp" +@@ -164,6 +170,20 @@ + void G1Arguments::initialize() { + GCArguments::initialize(); + assert(UseG1GC, "Error"); ++ ++#if defined(LOONGARCH64) && !defined(ZERO) ++ if (FLAG_IS_DEFAULT(UseNUMA)) { ++ FLAG_SET_DEFAULT(UseNUMA, true); ++ } ++ ++ if (UseNUMA && FLAG_IS_CMDLINE(G1HeapRegionSize)) { ++ uintx min_g1_heap_size_per_node = G1HeapRegionSize * NUMAMinG1RegionNumberPerNode; ++ if (min_g1_heap_size_per_node > NUMAMinHeapSizePerNode) { ++ FLAG_SET_ERGO(NUMAMinHeapSizePerNode, min_g1_heap_size_per_node); ++ } ++ } ++#endif ++ + FLAG_SET_DEFAULT(ParallelGCThreads, WorkerPolicy::parallel_worker_threads()); + if (ParallelGCThreads == 0) { + assert(!FLAG_IS_DEFAULT(ParallelGCThreads), "The default value for ParallelGCThreads should not be 0."); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/gc/g1/g1ParScanThreadState.inline.hpp b/src/hotspot/share/gc/g1/g1ParScanThreadState.inline.hpp +--- a/src/hotspot/share/gc/g1/g1ParScanThreadState.inline.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/gc/g1/g1ParScanThreadState.inline.hpp 2024-02-20 10:42:36.312196661 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2021, These ++ * modifications are Copyright (c) 2021, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
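
The c1_LIR changes earlier in the patch thread an optional temporary operand through LIR_OpConvert, convert() and fcmp2int() so the LoongArch backend has a scratch register available when lowering conversions and float-to-int compares. A backend could use the new overloads roughly as below; this fragment is hypothetical (not taken from the patch) and assumes the standard LIRGenerator::new_register helper is reachable from the call site.

// Hypothetical backend fragment using the new 5-argument convert().
static void do_convert_sketch(LIRGenerator* gen, LIR_List* lir,
                              LIR_Opr in, LIR_Opr result, ConversionStub* stub) {
  LIR_Opr tmp = gen->new_register(T_INT);               // scratch for the backend
  lir->convert(Bytecodes::_f2i, in, result, stub, tmp); // tmp defaults to illegalOpr elsewhere
}
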
++ */ ++ + #ifndef SHARE_GC_G1_G1PARSCANTHREADSTATE_INLINE_HPP + #define SHARE_GC_G1_G1PARSCANTHREADSTATE_INLINE_HPP + +@@ -59,6 +65,9 @@ + void G1ParScanThreadState::trim_queue() { + trim_queue_to_threshold(0); + assert(_task_queue->overflow_empty(), "invariant"); ++ // Load of _age._fields._top in trim_queue_to_threshold must not pass ++ // the load of _age._fields._top in assert _task_queue->taskqueue_empty(). ++ DEBUG_ONLY(OrderAccess::loadload();) + assert(_task_queue->taskqueue_empty(), "invariant"); + } + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/gc/parallel/parallelArguments.cpp b/src/hotspot/share/gc/parallel/parallelArguments.cpp +--- a/src/hotspot/share/gc/parallel/parallelArguments.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/gc/parallel/parallelArguments.cpp 2024-02-20 10:42:36.322196654 +0800 +@@ -23,6 +23,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "gc/parallel/parallelArguments.hpp" + #include "gc/parallel/parallelScavengeHeap.hpp" +@@ -45,6 +51,12 @@ + GCArguments::initialize(); + assert(UseParallelGC, "Error"); + ++#if defined(LOONGARCH64) && !defined(ZERO) ++ if (FLAG_IS_DEFAULT(UseNUMA)) { ++ FLAG_SET_DEFAULT(UseNUMA, true); ++ } ++#endif ++ + // If no heap maximum was requested explicitly, use some reasonable fraction + // of the physical memory, up to a maximum of 1GB. + FLAG_SET_DEFAULT(ParallelGCThreads, +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/gc/shared/barrierSetNMethod.cpp b/src/hotspot/share/gc/shared/barrierSetNMethod.cpp +--- a/src/hotspot/share/gc/shared/barrierSetNMethod.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/gc/shared/barrierSetNMethod.cpp 2024-02-20 10:42:36.328863315 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "code/codeCache.hpp" + #include "code/nmethod.hpp" +@@ -158,7 +164,7 @@ + BarrierSetNMethodArmClosure cl(_current_phase); + Threads::threads_do(&cl); + +-#if (defined(AARCH64) || defined(RISCV64)) && !defined(ZERO) ++#if (defined(AARCH64) || defined(RISCV64) || defined(LOONGARCH64)) && !defined(ZERO) + // We clear the patching epoch when disarming nmethods, so that + // the counter won't overflow. + BarrierSetAssembler::clear_patching_epoch(); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/gc/shared/c2/barrierSetC2.cpp b/src/hotspot/share/gc/shared/c2/barrierSetC2.cpp +--- a/src/hotspot/share/gc/shared/c2/barrierSetC2.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/gc/shared/c2/barrierSetC2.cpp 2024-02-20 10:42:36.332196645 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
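
The G1 and Parallel argument hunks above switch UseNUMA on by default on LoongArch64 builds (unless the user set it explicitly), and for G1 they additionally raise NUMAMinHeapSizePerNode when a command-line G1HeapRegionSize would otherwise leave a NUMA node with fewer than NUMAMinG1RegionNumberPerNode regions. Stripped of the flag machinery, the G1 rule is just the computation below (restated with plain integers; the flag names are the ones used in the hunk).

// Minimum heap per NUMA node implied by a fixed region size (sketch).
static size_t min_heap_per_node(size_t region_size, size_t min_regions_per_node,
                                size_t current_min_per_node) {
  size_t required = region_size * min_regions_per_node;
  return required > current_min_per_node ? required : current_min_per_node;
}
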
++ */ ++ + #include "precompiled.hpp" + #include "gc/shared/tlab_globals.hpp" + #include "gc/shared/c2/barrierSetC2.hpp" +@@ -263,6 +269,8 @@ + + bool is_volatile = (decorators & MO_SEQ_CST) != 0; + bool is_acquire = (decorators & MO_ACQUIRE) != 0; ++ bool is_relaxed = (decorators & MO_RELAXED) != 0; ++ bool is_unsafe = (decorators & C2_UNSAFE_ACCESS) != 0; + + // If reference is volatile, prevent following volatiles ops from + // floating up before the volatile access. +@@ -296,6 +304,13 @@ + assert(_leading_membar == nullptr || support_IRIW_for_not_multiple_copy_atomic_cpu, "no leading membar expected"); + Node* mb = kit->insert_mem_bar(Op_MemBarAcquire, n); + mb->as_MemBar()->set_trailing_load(); ++ } else if (is_relaxed && is_unsafe) { ++#ifdef LOONGARCH64 ++ assert(kit != nullptr, "unsupported at optimization time"); ++ Node* n = _access.raw_access(); ++ Node* mb = kit->insert_mem_bar(Op_SameAddrLoadFence, n); ++ mb->as_MemBar()->set_trailing_load(); ++#endif + } + } + } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp b/src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp +--- a/src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/gc/shenandoah/shenandoahArguments.cpp 2024-02-20 10:42:36.345529969 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "gc/shared/gcArguments.hpp" + #include "gc/shared/tlab_globals.hpp" +@@ -35,7 +41,7 @@ + #include "utilities/defaultStream.hpp" + + void ShenandoahArguments::initialize() { +-#if !(defined AARCH64 || defined AMD64 || defined IA32 || defined PPC64 || defined RISCV64) ++#if !(defined AARCH64 || defined AMD64 || defined IA32 || defined PPC64 || defined RISCV64 || defined LOONGARCH64) + vm_exit_during_initialization("Shenandoah GC is not supported on this platform."); + #endif + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/gc/z/c1/zBarrierSetC1.cpp b/src/hotspot/share/gc/z/c1/zBarrierSetC1.cpp +--- a/src/hotspot/share/gc/z/c1/zBarrierSetC1.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/gc/z/c1/zBarrierSetC1.cpp 2024-02-20 10:42:36.365529953 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
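
In the barrierSetC2.cpp hunk above, a relaxed (MO_RELAXED) unsafe load gets a trailing SameAddrLoadFence on LoongArch64, the new barrier that a later memnode.hpp hunk describes as preventing LoadLoad reordering for the same address. The insertion mirrors the existing acquire path; its shape, pulled out of the surrounding conditionals, is simply:

// After the raw load of a relaxed unsafe access has been emitted (LoongArch64 only):
Node* raw = _access.raw_access();                           // the load just built
Node* mb  = kit->insert_mem_bar(Op_SameAddrLoadFence, raw); // orders same-address loads
mb->as_MemBar()->set_trailing_load();                       // ties the barrier to that load
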
++ */ ++ + #include "precompiled.hpp" + #include "c1/c1_FrameMap.hpp" + #include "c1/c1_LIR.hpp" +@@ -453,6 +459,10 @@ + #endif + access.gen()->lir()->move(cmp_value.result(), cmp_value_opr); + ++#if defined(LOONGARCH) ++ LIR_Opr result = access.gen()->new_register(T_OBJECT); ++#endif ++ + __ cas_obj(access.resolved_addr()->as_address_ptr()->base(), + cmp_value_opr, + new_value_zpointer, +@@ -460,12 +470,19 @@ + access.gen()->new_register(T_OBJECT), + access.gen()->new_register(T_OBJECT), + access.gen()->new_register(T_OBJECT)); ++#elif defined(LOONGARCH) ++ access.gen()->new_register(T_OBJECT), ++ access.gen()->new_register(T_OBJECT), ++ result); + #else + LIR_OprFact::illegalOpr, LIR_OprFact::illegalOpr); + #endif ++ ++#if !defined(LOONGARCH) + LIR_Opr result = access.gen()->new_register(T_INT); + __ cmove(lir_cond_equal, LIR_OprFact::intConst(1), LIR_OprFact::intConst(0), + result, T_INT); ++#endif + + return result; + } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/gc/z/zStoreBarrierBuffer.hpp b/src/hotspot/share/gc/z/zStoreBarrierBuffer.hpp +--- a/src/hotspot/share/gc/z/zStoreBarrierBuffer.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/gc/z/zStoreBarrierBuffer.hpp 2024-02-20 10:42:36.372196614 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #ifndef SHARE_GC_Z_ZSTOREBARRIERBUFFER_HPP + #define SHARE_GC_Z_ZSTOREBARRIERBUFFER_HPP + +@@ -42,7 +48,9 @@ + friend class ZVerify; + + private: +- static const size_t _buffer_length = 32; ++ // Tune ZStoreBarrierBuffer length to decrease the opportunity goto ++ // copy_store_at slow-path. ++ static const size_t _buffer_length = 32 LOONGARCH64_ONLY(+32); + static const size_t _buffer_size_bytes = _buffer_length * sizeof(ZStoreBarrierEntry); + + ZStoreBarrierEntry _buffer[_buffer_length]; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/interpreter/interpreterRuntime.cpp b/src/hotspot/share/interpreter/interpreterRuntime.cpp +--- a/src/hotspot/share/interpreter/interpreterRuntime.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/interpreter/interpreterRuntime.cpp 2024-02-20 10:42:36.375529945 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2018, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "classfile/javaClasses.inline.hpp" + #include "classfile/symbolTable.hpp" +@@ -1471,7 +1477,7 @@ + // preparing the same method will be sure to see non-null entry & mirror. 
+ JRT_END + +-#if defined(IA32) || defined(AMD64) || defined(ARM) ++#if defined(IA32) || defined(AMD64) || defined(ARM) || defined(LOONGARCH64) + JRT_LEAF(void, InterpreterRuntime::popframe_move_outgoing_args(JavaThread* current, void* src_address, void* dest_address)) + assert(current == JavaThread::current(), "pre-condition"); + if (src_address == dest_address) { +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/interpreter/interpreterRuntime.hpp b/src/hotspot/share/interpreter/interpreterRuntime.hpp +--- a/src/hotspot/share/interpreter/interpreterRuntime.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/interpreter/interpreterRuntime.hpp 2024-02-20 10:42:36.375529945 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2018, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #ifndef SHARE_INTERPRETER_INTERPRETERRUNTIME_HPP + #define SHARE_INTERPRETER_INTERPRETERRUNTIME_HPP + +@@ -133,7 +139,7 @@ + Method* method, + intptr_t* from, intptr_t* to); + +-#if defined(IA32) || defined(AMD64) || defined(ARM) ++#if defined(IA32) || defined(AMD64) || defined(ARM) || defined(LOONGARCH64) + // Popframe support (only needed on x86, AMD64 and ARM) + static void popframe_move_outgoing_args(JavaThread* current, void* src_address, void* dest_address); + #endif +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/interpreter/templateInterpreterGenerator.hpp b/src/hotspot/share/interpreter/templateInterpreterGenerator.hpp +--- a/src/hotspot/share/interpreter/templateInterpreterGenerator.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/interpreter/templateInterpreterGenerator.hpp 2024-02-20 10:42:36.378863275 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2021, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #ifndef SHARE_INTERPRETER_TEMPLATEINTERPRETERGENERATOR_HPP + #define SHARE_INTERPRETER_TEMPLATEINTERPRETERGENERATOR_HPP + +@@ -115,9 +121,9 @@ + + void generate_fixed_frame(bool native_call); + +-#ifdef AARCH64 ++#if defined(AARCH64) || defined(LOONGARCH64) + void generate_transcendental_entry(AbstractInterpreter::MethodKind kind, int fpargs); +-#endif // AARCH64 ++#endif // AARCH64 || LOONGARCH64 + + #ifdef ARM32 + void generate_math_runtime_call(AbstractInterpreter::MethodKind kind); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/jfr/utilities/jfrBigEndian.hpp b/src/hotspot/share/jfr/utilities/jfrBigEndian.hpp +--- a/src/hotspot/share/jfr/utilities/jfrBigEndian.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/jfr/utilities/jfrBigEndian.hpp 2024-02-20 10:42:36.392196597 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2019, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + #ifndef SHARE_JFR_UTILITIES_JFRBIGENDIAN_HPP + #define SHARE_JFR_UTILITIES_JFRBIGENDIAN_HPP + +@@ -103,7 +109,7 @@ + inline bool JfrBigEndian::platform_supports_unaligned_reads(void) { + #if defined(IA32) || defined(AMD64) || defined(PPC) || defined(S390) + return true; +-#elif defined(ARM) || defined(AARCH64) || defined(RISCV) ++#elif defined(ARM) || defined(AARCH64) || defined(RISCV) || defined(LOONGARCH) + return false; + #else + #warning "Unconfigured platform" +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp +--- a/src/hotspot/share/jvmci/vmStructs_jvmci.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/jvmci/vmStructs_jvmci.cpp 2024-02-20 10:42:36.398863261 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "code/codeCache.hpp" + #include "compiler/compileBroker.hpp" +@@ -857,6 +863,17 @@ + + #endif + ++#ifdef LOONGARCH64 ++ ++#define VM_STRUCTS_CPU(nonstatic_field, static_field, unchecked_nonstatic_field, volatile_nonstatic_field, nonproduct_nonstatic_field, c2_nonstatic_field, unchecked_c1_static_field, unchecked_c2_static_field) \ ++ volatile_nonstatic_field(JavaFrameAnchor, _last_Java_fp, intptr_t*) ++ ++#define DECLARE_INT_CPU_FEATURE_CONSTANT(id, name, bit) GENERATE_VM_INT_CONSTANT_ENTRY(VM_Version::CPU_##id) ++#define VM_INT_CPU_FEATURE_CONSTANTS CPU_FEATURE_FLAGS(DECLARE_INT_CPU_FEATURE_CONSTANT) ++ ++#endif ++ ++ + #ifdef X86 + + #define VM_STRUCTS_CPU(nonstatic_field, static_field, unchecked_nonstatic_field, volatile_nonstatic_field, nonproduct_nonstatic_field, c2_nonstatic_field, unchecked_c1_static_field, unchecked_c2_static_field) \ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/memory/metaspace.cpp b/src/hotspot/share/memory/metaspace.cpp +--- a/src/hotspot/share/memory/metaspace.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/memory/metaspace.cpp 2024-02-20 10:42:36.402196590 +0800 +@@ -23,6 +23,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2019, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "cds/metaspaceShared.hpp" + #include "classfile/classLoaderData.hpp" +@@ -582,12 +588,13 @@ + // On error, returns an unreserved space. + ReservedSpace Metaspace::reserve_address_space_for_compressed_classes(size_t size) { + +-#if defined(AARCH64) || defined(PPC64) ++#if defined(AARCH64) || defined(PPC64) || defined(LOONGARCH64) + const size_t alignment = Metaspace::reserve_alignment(); + + // AArch64: Try to align metaspace class space so that we can decode a + // compressed klass with a single MOVK instruction. We can do this iff the + // compressed class base is a multiple of 4G. ++ + // Additionally, above 32G, ensure the lower LogKlassAlignmentInBytes bits + // of the upper 32-bits of the address are zero so we can handle a shift + // when decoding. 
+@@ -644,16 +651,16 @@ + return rs; + } + } +-#endif // defined(AARCH64) || defined(PPC64) ++#endif // defined(AARCH64) || defined(PPC64) || defined(LOONGARCH64) + +-#ifdef AARCH64 ++#if defined(AARCH64) || defined(LOONGARCH64) + // Note: on AARCH64, if the code above does not find any good placement, we + // have no recourse. We return an empty space and the VM will exit. + return ReservedSpace(); + #else + // Default implementation: Just reserve anywhere. + return ReservedSpace(size, Metaspace::reserve_alignment(), os::vm_page_size(), (char*)nullptr); +-#endif // AARCH64 ++#endif // defined(AARCH64) || defined(LOONGARCH64) + } + + #endif // _LP64 +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/oops/stackChunkOop.inline.hpp b/src/hotspot/share/oops/stackChunkOop.inline.hpp +--- a/src/hotspot/share/oops/stackChunkOop.inline.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/oops/stackChunkOop.inline.hpp 2024-02-20 10:42:36.415529912 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2022, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #ifndef SHARE_OOPS_STACKCHUNKOOP_INLINE_HPP + #define SHARE_OOPS_STACKCHUNKOOP_INLINE_HPP + +@@ -335,7 +341,7 @@ + assert(to >= start_address(), "Chunk underflow"); + assert(to + size <= end_address(), "Chunk overflow"); + +-#if !(defined(AMD64) || defined(AARCH64) || defined(RISCV64) || defined(PPC64)) || defined(ZERO) ++#if !(defined(AMD64) || defined(AARCH64) || defined(RISCV64) || defined(PPC64) || defined(LOONGARCH64)) || defined(ZERO) + // Suppress compilation warning-as-error on unimplemented architectures + // that stub out arch-specific methods. Some compilers are smart enough + // to figure out the argument is always null and then warn about it. +@@ -354,7 +360,7 @@ + assert(from >= start_address(), ""); + assert(from + size <= end_address(), ""); + +-#if !(defined(AMD64) || defined(AARCH64) || defined(RISCV64) || defined(PPC64)) || defined(ZERO) ++#if !(defined(AMD64) || defined(AARCH64) || defined(RISCV64) || defined(PPC64) || defined(LOONGARCH64)) || defined(ZERO) + // Suppress compilation warning-as-error on unimplemented architectures + // that stub out arch-specific methods. Some compilers are smart enough + // to figure out the argument is always null and then warn about it. +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/opto/classes.hpp b/src/hotspot/share/opto/classes.hpp +--- a/src/hotspot/share/opto/classes.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/opto/classes.hpp 2024-02-20 10:42:36.422196574 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "utilities/macros.hpp" + + // The giant table of Node classes. 
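
The metaspace.cpp change above gives LoongArch64 the same compressed-class-space policy as AArch64: prefer a reservation whose base has the low 32 bits clear, so the klass base can be materialized in a single instruction, and return an empty ReservedSpace rather than fall back to an arbitrary address. The property being sought reduces to the check below (illustrative only; the single-instruction claim for LoongArch64 assumes a lu32i.d-style materialization).

#include <cstdint>

// A 4 GiB-aligned base can be encoded without a multi-instruction sequence.
static bool single_instruction_base(uint64_t base) {
  return (base & 0xFFFFFFFFull) == 0;   // low 32 bits clear => multiple of 4 GiB
}
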
+@@ -233,6 +239,7 @@ + macro(MemBarReleaseLock) + macro(MemBarVolatile) + macro(MemBarStoreStore) ++macro(SameAddrLoadFence) + macro(MergeMem) + macro(MinI) + macro(MinL) +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/opto/compile.cpp b/src/hotspot/share/opto/compile.cpp +--- a/src/hotspot/share/opto/compile.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/opto/compile.cpp 2024-02-20 10:42:36.422196574 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "asm/macroAssembler.hpp" + #include "asm/macroAssembler.inline.hpp" +@@ -3746,6 +3752,7 @@ + n->set_req(MemBarNode::Precedent, top()); + } + break; ++ case Op_SameAddrLoadFence: + case Op_MemBarAcquire: { + if (n->as_MemBar()->trailing_load() && n->req() > MemBarNode::Precedent) { + // At parse time, the trailing MemBarAcquire for a volatile load +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/opto/memnode.cpp b/src/hotspot/share/opto/memnode.cpp +--- a/src/hotspot/share/opto/memnode.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/opto/memnode.cpp 2024-02-20 10:42:36.432196566 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "classfile/javaClasses.hpp" + #include "compiler/compileLog.hpp" +@@ -3281,6 +3287,7 @@ + case Op_MemBarReleaseLock: return new MemBarReleaseLockNode(C, atp, pn); + case Op_MemBarVolatile: return new MemBarVolatileNode(C, atp, pn); + case Op_MemBarCPUOrder: return new MemBarCPUOrderNode(C, atp, pn); ++ case Op_SameAddrLoadFence: return new SameAddrLoadFenceNode(C, atp, pn); + case Op_OnSpinWait: return new OnSpinWaitNode(C, atp, pn); + case Op_Initialize: return new InitializeNode(C, atp, pn); + default: ShouldNotReachHere(); return nullptr; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/opto/memnode.hpp b/src/hotspot/share/opto/memnode.hpp +--- a/src/hotspot/share/opto/memnode.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/opto/memnode.hpp 2024-02-20 10:42:36.432196566 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #ifndef SHARE_OPTO_MEMNODE_HPP + #define SHARE_OPTO_MEMNODE_HPP + +@@ -1289,6 +1295,14 @@ + virtual uint ideal_reg() const { return 0; } // not matched in the AD file + }; + ++// Used to prevent LoadLoad reorder for same address. 
++class SameAddrLoadFenceNode: public MemBarNode { ++public: ++ SameAddrLoadFenceNode(Compile* C, int alias_idx, Node* precedent) ++ : MemBarNode(C, alias_idx, precedent) {} ++ virtual int Opcode() const; ++}; ++ + class OnSpinWaitNode: public MemBarNode { + public: + OnSpinWaitNode(Compile* C, int alias_idx, Node* precedent) +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/opto/output.cpp b/src/hotspot/share/opto/output.cpp +--- a/src/hotspot/share/opto/output.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/opto/output.cpp 2024-02-20 10:42:36.435529898 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2019, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "asm/assembler.inline.hpp" + #include "asm/macroAssembler.inline.hpp" +@@ -1616,6 +1622,22 @@ + DEBUG_ONLY(uint instr_offset = cb->insts_size()); + n->emit(*cb, C->regalloc()); + current_offset = cb->insts_size(); ++#if defined(LOONGARCH) ++ if (!n->is_Proj() && (cb->insts()->end() != badAddress)) { ++ // For LOONGARCH, the first instruction of the previous node (usually a instruction sequence) sometime ++ // is not the instruction which access memory. adjust is needed. previous_offset points to the ++ // instruction which access memory. Instruction size is 4. cb->insts_size() and ++ // cb->insts()->end() are the location of current instruction. ++ int adjust = 4; ++ NativeInstruction* inst = (NativeInstruction*) (cb->insts()->end() - 4); ++ if (inst->is_sync()) { ++ // a sync may be the last instruction, see store_B_immI_enc_sync ++ adjust += 4; ++ inst = (NativeInstruction*) (cb->insts()->end() - 8); ++ } ++ previous_offset = current_offset - adjust; ++ } ++#endif + + // Above we only verified that there is enough space in the instruction section. + // However, the instruction may emit stubs that cause code buffer expansion. +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/runtime/arguments.cpp b/src/hotspot/share/runtime/arguments.cpp +--- a/src/hotspot/share/runtime/arguments.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/runtime/arguments.cpp 2024-02-20 10:42:36.455529881 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2022, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
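
The output.cpp hunk above adjusts previous_offset after each node is emitted because on LoongArch the last instruction emitted for a node is not always the one that accesses memory: a trailing dbar (recognized by is_sync(), as in store_B_immI_enc_sync) may follow the access. Since every instruction is 4 bytes, the fix just steps back one or two slots; restated as a small helper (names are illustrative, is_sync() is the port helper the hunk itself uses):

// Offset of the instruction that actually touches memory (sketch of the hunk's arithmetic).
static int memory_access_offset(address insts_end, int current_offset) {
  int adjust = 4;                                              // last 4-byte instruction
  NativeInstruction* inst = (NativeInstruction*)(insts_end - 4);
  if (inst->is_sync()) {                                       // trailing dbar?
    adjust += 4;                                               // step over it as well
  }
  return current_offset - adjust;
}
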
++ */ ++ + #include "precompiled.hpp" + #include "cds/cds_globals.hpp" + #include "cds/filemap.hpp" +@@ -1904,14 +1910,14 @@ + } + #endif + +-#if !defined(X86) && !defined(AARCH64) && !defined(RISCV64) && !defined(ARM) && !defined(PPC64) && !defined(S390) ++#if !defined(X86) && !defined(AARCH64) && !defined(RISCV64) && !defined(ARM) && !defined(PPC64) && !defined(S390) && !defined(LOONGARCH64) + if (LockingMode == LM_LIGHTWEIGHT) { + FLAG_SET_CMDLINE(LockingMode, LM_LEGACY); + warning("New lightweight locking not supported on this platform"); + } + #endif + +-#if !defined(X86) && !defined(AARCH64) && !defined(PPC64) && !defined(RISCV64) && !defined(S390) ++#if !defined(X86) && !defined(AARCH64) && !defined(PPC64) && !defined(RISCV64) && !defined(S390) && !defined(LOONGARCH64) + if (LockingMode == LM_MONITOR) { + jio_fprintf(defaultStream::error_stream(), + "LockingMode == 0 (LM_MONITOR) is not fully implemented on this architecture"); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/runtime/continuation.cpp b/src/hotspot/share/runtime/continuation.cpp +--- a/src/hotspot/share/runtime/continuation.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/runtime/continuation.cpp 2024-02-20 10:42:36.455529881 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2022, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "classfile/vmSymbols.hpp" + #include "gc/shared/barrierSetNMethod.hpp" +@@ -224,7 +230,7 @@ + + map->set_stack_chunk(nullptr); + +-#if (defined(X86) || defined(AARCH64) || defined(RISCV64) || defined(PPC64)) && !defined(ZERO) ++#if (defined(X86) || defined(AARCH64) || defined(RISCV64) || defined(PPC64) || defined(LOONGARCH64)) && !defined(ZERO) + frame sender(cont.entrySP(), cont.entryFP(), cont.entryPC()); + #else + frame sender = frame(); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/runtime/continuationFreezeThaw.cpp b/src/hotspot/share/runtime/continuationFreezeThaw.cpp +--- a/src/hotspot/share/runtime/continuationFreezeThaw.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/runtime/continuationFreezeThaw.cpp 2024-02-20 10:42:36.455529881 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "classfile/javaClasses.inline.hpp" + #include "classfile/vmSymbols.hpp" +@@ -774,7 +780,7 @@ + } + + frame FreezeBase::freeze_start_frame_safepoint_stub(frame f) { +-#if (defined(X86) || defined(AARCH64) || defined(RISCV64)) && !defined(ZERO) ++#if (defined(X86) || defined(AARCH64) || defined(RISCV64) || defined(LOONGARCH64)) && !defined(ZERO) + f.set_fp(f.real_fp()); // f.set_fp(*Frame::callee_link_address(f)); // ???? 
+ #else + Unimplemented(); +@@ -832,7 +838,7 @@ + _freeze_size += fsize; + NOT_PRODUCT(_frames++;) + +- assert(FKind::frame_bottom(f) <= _bottom_address, ""); ++ NOT_LOONGARCH64(assert(FKind::frame_bottom(f) <= _bottom_address, "");) + + // We don't use FKind::frame_bottom(f) == _bottom_address because on x64 there's sometimes an extra word between + // enterSpecial and an interpreted frame +@@ -1604,7 +1610,7 @@ + if (!safepoint) { + f = f.sender(&map); // this is the yield frame + } else { // safepoint yield +-#if (defined(X86) || defined(AARCH64) || defined(RISCV64)) && !defined(ZERO) ++#if (defined(X86) || defined(AARCH64) || defined(RISCV64) || defined(LOONGARCH64)) && !defined(ZERO) + f.set_fp(f.real_fp()); // Instead of this, maybe in ContinuationWrapper::set_last_frame always use the real_fp? + #else + Unimplemented(); +@@ -2224,8 +2230,8 @@ + + // If we're the bottom-most thawed frame, we're writing to within one word from entrySP + // (we might have one padding word for alignment) +- assert(!is_bottom_frame || (_cont.entrySP() - 1 <= to + sz && to + sz <= _cont.entrySP()), ""); +- assert(!is_bottom_frame || hf.compiled_frame_stack_argsize() != 0 || (to + sz && to + sz == _cont.entrySP()), ""); ++ NOT_LOONGARCH64(assert(!is_bottom_frame || (_cont.entrySP() - 1 <= to + sz && to + sz <= _cont.entrySP()), "");) ++ NOT_LOONGARCH64(assert(!is_bottom_frame || hf.compiled_frame_stack_argsize() != 0 || (to + sz && to + sz == _cont.entrySP()), "");) + + copy_from_chunk(from, to, sz); // copying good oops because we invoked barriers above + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/runtime/javaThread.inline.hpp b/src/hotspot/share/runtime/javaThread.inline.hpp +--- a/src/hotspot/share/runtime/javaThread.inline.hpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/runtime/javaThread.inline.hpp 2024-02-20 10:42:36.462196543 +0800 +@@ -23,6 +23,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2018, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #ifndef SHARE_RUNTIME_JAVATHREAD_INLINE_HPP + #define SHARE_RUNTIME_JAVATHREAD_INLINE_HPP + +@@ -138,7 +144,7 @@ + } + + inline JavaThreadState JavaThread::thread_state() const { +-#if defined(PPC64) || defined (AARCH64) || defined(RISCV64) ++#if defined(PPC64) || defined (AARCH64) || defined(RISCV64) || defined(LOONGARCH64) + // Use membars when accessing volatile _thread_state. See + // Threads::create_vm() for size checks. + return Atomic::load_acquire(&_thread_state); +@@ -150,7 +156,7 @@ + inline void JavaThread::set_thread_state(JavaThreadState s) { + assert(current_or_null() == nullptr || current_or_null() == this, + "state change should only be called by the current thread"); +-#if defined(PPC64) || defined (AARCH64) || defined(RISCV64) ++#if defined(PPC64) || defined (AARCH64) || defined(RISCV64) || defined(LOONGARCH64) + // Use membars when accessing volatile _thread_state. See + // Threads::create_vm() for size checks. 
+ Atomic::release_store(&_thread_state, s); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/runtime/objectMonitor.cpp b/src/hotspot/share/runtime/objectMonitor.cpp +--- a/src/hotspot/share/runtime/objectMonitor.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/runtime/objectMonitor.cpp 2024-02-20 10:42:36.462196543 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "classfile/vmSymbols.hpp" + #include "gc/shared/oopStorage.hpp" +@@ -365,6 +371,9 @@ + } + + assert(owner_raw() != current, "invariant"); ++ // Thread _succ != current assertion load reording before Thread if (_succ == current) _succ = nullptr. ++ // But expect order is firstly if (_succ == current) _succ = nullptr then _succ != current assertion. ++ LOONGARCH64_ONLY(DEBUG_ONLY(__asm__ __volatile__ ("dbar 0x700\n");)) + assert(_succ != current, "invariant"); + assert(!SafepointSynchronize::is_at_safepoint(), "invariant"); + assert(current->thread_state() != _thread_blocked, "invariant"); +@@ -729,6 +738,7 @@ + } + + // The Spin failed -- Enqueue and park the thread ... ++ LOONGARCH64_ONLY(DEBUG_ONLY(__asm__ __volatile__ ("dbar 0x700\n");)) + assert(_succ != current, "invariant"); + assert(owner_raw() != current, "invariant"); + assert(_Responsible != current, "invariant"); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/runtime/os.cpp b/src/hotspot/share/runtime/os.cpp +--- a/src/hotspot/share/runtime/os.cpp 2024-01-17 09:43:20.000000000 +0800 ++++ b/src/hotspot/share/runtime/os.cpp 2024-02-20 10:42:36.465529874 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2019, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "classfile/javaClasses.hpp" + #include "classfile/moduleEntry.hpp" +@@ -1259,7 +1265,8 @@ + if ((uintptr_t)fr->sender_sp() == (uintptr_t)-1 || is_pointer_bad(fr->sender_sp())) return true; + + uintptr_t old_fp = (uintptr_t)fr->link_or_null(); +- if (old_fp == 0 || old_fp == (uintptr_t)-1 || old_fp == ufp || ++ // The check for old_fp and ufp is harmful on LoongArch due to their special ABIs. ++ if (old_fp == 0 || old_fp == (uintptr_t)-1 NOT_LOONGARCH64(|| old_fp == ufp) || + is_pointer_bad(fr->link_or_null())) return true; + + // stack grows downwards; if old_fp is below current fp or if the stack +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/runtime/sharedRuntime.cpp b/src/hotspot/share/runtime/sharedRuntime.cpp +--- a/src/hotspot/share/runtime/sharedRuntime.cpp 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/hotspot/share/runtime/sharedRuntime.cpp 2024-02-20 10:42:36.468863204 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2018, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + #include "precompiled.hpp" + #include "classfile/javaClasses.inline.hpp" + #include "classfile/stringTable.hpp" +@@ -3063,7 +3069,7 @@ + struct { double data[20]; } locs_buf; + struct { double data[20]; } stubs_locs_buf; + buffer.insts()->initialize_shared_locs((relocInfo*)&locs_buf, sizeof(locs_buf) / sizeof(relocInfo)); +-#if defined(AARCH64) || defined(PPC64) ++#if defined(AARCH64) || defined(PPC64) || defined(LOONGARCH64) + // On AArch64 with ZGC and nmethod entry barriers, we need all oops to be + // in the constant pool to ensure ordering between the barrier and oops + // accesses. For native_wrappers we need a constant. +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/runtime/sharedRuntimeTrig.cpp b/src/hotspot/share/runtime/sharedRuntimeTrig.cpp +--- a/src/hotspot/share/runtime/sharedRuntimeTrig.cpp 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/hotspot/share/runtime/sharedRuntimeTrig.cpp 2024-02-20 10:42:36.468863204 +0800 +@@ -22,6 +22,13 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2015, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++ + #include "precompiled.hpp" + #include "jni.h" + #include "runtime/interfaceSupport.inline.hpp" +@@ -503,6 +510,14 @@ + * sin(x) = x + (S1*x + (x *(r-y/2)+y)) + */ + ++#if defined(LOONGARCH) ++#define S1 -1.66666666666666324348e-01 ++#define S2 8.33333333332248946124e-03 ++#define S3 -1.98412698298579493134e-04 ++#define S4 2.75573137070700676789e-06 ++#define S5 -2.50507602534068634195e-08 ++#define S6 1.58969099521155010221e-10 ++#else + static const double + S1 = -1.66666666666666324348e-01, /* 0xBFC55555, 0x55555549 */ + S2 = 8.33333333332248946124e-03, /* 0x3F811111, 0x1110F8A6 */ +@@ -510,6 +525,7 @@ + S4 = 2.75573137070700676789e-06, /* 0x3EC71DE3, 0x57B1FE7D */ + S5 = -2.50507602534068634195e-08, /* 0xBE5AE5E6, 0x8A2B9CEB */ + S6 = 1.58969099521155010221e-10; /* 0x3DE5D93A, 0x5ACFD57C */ ++#endif + + static double __kernel_sin(double x, double y, int iy) + { +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/runtime/synchronizer.cpp b/src/hotspot/share/runtime/synchronizer.cpp +--- a/src/hotspot/share/runtime/synchronizer.cpp 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/hotspot/share/runtime/synchronizer.cpp 2024-02-20 10:42:36.468863204 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + #include "precompiled.hpp" + #include "classfile/vmSymbols.hpp" + #include "gc/shared/collectedHeap.hpp" +@@ -487,7 +493,7 @@ + } + + static bool useHeavyMonitors() { +-#if defined(X86) || defined(AARCH64) || defined(PPC64) || defined(RISCV64) || defined(S390) ++#if defined(X86) || defined(AARCH64) || defined(PPC64) || defined(RISCV64) || defined(S390) || defined(LOONGARCH64) + return LockingMode == LM_MONITOR; + #else + return false; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/runtime/vmStructs.cpp b/src/hotspot/share/runtime/vmStructs.cpp +--- a/src/hotspot/share/runtime/vmStructs.cpp 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/hotspot/share/runtime/vmStructs.cpp 2024-02-20 10:42:36.472196536 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include "precompiled.hpp" + #include "cds/filemap.hpp" + #include "ci/ciField.hpp" +@@ -1567,6 +1573,7 @@ + declare_c2_type(StoreFenceNode, MemBarNode) \ + declare_c2_type(MemBarVolatileNode, MemBarNode) \ + declare_c2_type(MemBarCPUOrderNode, MemBarNode) \ ++ declare_c2_type(SameAddrLoadFenceNode, MemBarNode) \ + declare_c2_type(OnSpinWaitNode, MemBarNode) \ + declare_c2_type(BlackholeNode, MultiNode) \ + declare_c2_type(InitializeNode, MemBarNode) \ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/hotspot/share/utilities/macros.hpp b/src/hotspot/share/utilities/macros.hpp +--- a/src/hotspot/share/utilities/macros.hpp 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/hotspot/share/utilities/macros.hpp 2024-02-20 10:42:36.482196528 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2018, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #ifndef SHARE_UTILITIES_MACROS_HPP + #define SHARE_UTILITIES_MACROS_HPP + +@@ -484,6 +490,18 @@ + #define NOT_S390(code) code + #endif + ++#ifdef LOONGARCH64 ++#ifndef LOONGARCH ++#define LOONGARCH ++#endif ++#define LOONGARCH64_ONLY(code) code ++#define NOT_LOONGARCH64(code) ++#else ++#undef LOONGARCH ++#define LOONGARCH64_ONLY(code) ++#define NOT_LOONGARCH64(code) code ++#endif ++ + #if defined(PPC32) || defined(PPC64) + #ifndef PPC + #define PPC +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/java.base/share/classes/jdk/internal/foreign/abi/AbstractLinker.java b/src/java.base/share/classes/jdk/internal/foreign/abi/AbstractLinker.java +--- a/src/java.base/share/classes/jdk/internal/foreign/abi/AbstractLinker.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/java.base/share/classes/jdk/internal/foreign/abi/AbstractLinker.java 2024-02-20 10:42:36.685529702 +0800 +@@ -22,6 +22,12 @@ + * or visit www.oracle.com if you need additional information or have any + * questions. + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2022, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ + package jdk.internal.foreign.abi; + + import jdk.internal.foreign.SystemLookup; +@@ -30,6 +36,7 @@ + import jdk.internal.foreign.abi.aarch64.macos.MacOsAArch64Linker; + import jdk.internal.foreign.abi.aarch64.windows.WindowsAArch64Linker; + import jdk.internal.foreign.abi.fallback.FallbackLinker; ++import jdk.internal.foreign.abi.loongarch64.linux.LinuxLoongArch64Linker; + import jdk.internal.foreign.abi.ppc64.linux.LinuxPPC64leLinker; + import jdk.internal.foreign.abi.riscv64.linux.LinuxRISCV64Linker; + import jdk.internal.foreign.abi.s390.linux.LinuxS390Linker; +@@ -62,6 +69,7 @@ + SysVx64Linker, WindowsAArch64Linker, + Windowsx64Linker, LinuxPPC64leLinker, + LinuxRISCV64Linker, LinuxS390Linker, ++ LinuxLoongArch64Linker, + FallbackLinker { + + public interface UpcallStubFactory { +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/LinuxLoongArch64CallArranger.java b/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/LinuxLoongArch64CallArranger.java +--- a/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/LinuxLoongArch64CallArranger.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/LinuxLoongArch64CallArranger.java 2024-02-20 10:42:36.688863032 +0800 +@@ -0,0 +1,472 @@ ++/* ++ * Copyright (c) 2020, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. Oracle designates this ++ * particular file as subject to the "Classpath" exception as provided ++ * by Oracle in the LICENSE file that accompanied this code. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ */ ++ ++package jdk.internal.foreign.abi.loongarch64.linux; ++ ++import java.lang.foreign.AddressLayout; ++import java.lang.foreign.FunctionDescriptor; ++import java.lang.foreign.GroupLayout; ++import java.lang.foreign.MemoryLayout; ++import java.lang.foreign.MemorySegment; ++import jdk.internal.foreign.abi.ABIDescriptor; ++import jdk.internal.foreign.abi.AbstractLinker.UpcallStubFactory; ++import jdk.internal.foreign.abi.Binding; ++import jdk.internal.foreign.abi.CallingSequence; ++import jdk.internal.foreign.abi.CallingSequenceBuilder; ++import jdk.internal.foreign.abi.DowncallLinker; ++import jdk.internal.foreign.abi.LinkerOptions; ++import jdk.internal.foreign.abi.UpcallLinker; ++import jdk.internal.foreign.abi.SharedUtils; ++import jdk.internal.foreign.abi.VMStorage; ++import jdk.internal.foreign.Utils; ++ ++import java.lang.foreign.ValueLayout; ++import java.lang.invoke.MethodHandle; ++import java.lang.invoke.MethodType; ++import java.util.List; ++import java.util.Map; ++import java.util.Optional; ++ ++import static jdk.internal.foreign.abi.loongarch64.linux.TypeClass.*; ++import static jdk.internal.foreign.abi.loongarch64.LoongArch64Architecture.*; ++import static jdk.internal.foreign.abi.loongarch64.LoongArch64Architecture.Regs.*; ++ ++/** ++ * For the LoongArch64 C ABI specifically, this class uses CallingSequenceBuilder ++ * to translate a C FunctionDescriptor into a CallingSequence, which can then be turned into a MethodHandle. ++ * ++ * This includes taking care of synthetic arguments like pointers to return buffers for 'in-memory' returns. ++ */ ++public class LinuxLoongArch64CallArranger { ++ private static final int STACK_SLOT_SIZE = 8; ++ public static final int MAX_REGISTER_ARGUMENTS = 8; ++ private static final ABIDescriptor CLinux = abiFor( ++ new VMStorage[]{a0, a1, a2, a3, a4, a5, a6, a7}, ++ new VMStorage[]{f0, f1, f2, f3, f4, f5, f6, f7}, ++ new VMStorage[]{a0, a1}, ++ new VMStorage[]{f0, f1}, ++ new VMStorage[]{t0, t1, t2, t3, t4, t5, t6, t7, t8}, ++ new VMStorage[]{f8, f9, f10, f11, f12, f13, f14, f15, f16, ++ f17, f18, f19, f20, f21, f22, f23}, ++ 16, // stackAlignment ++ 0, // no shadow space ++ t4, t7 // scratch 1 & 2 ++ ); ++ ++ public record Bindings(CallingSequence callingSequence, ++ boolean isInMemoryReturn) { ++ } ++ ++ public static Bindings getBindings(MethodType mt, FunctionDescriptor cDesc, boolean forUpcall) { ++ return getBindings(mt, cDesc, forUpcall, LinkerOptions.empty()); ++ } ++ ++ public static Bindings getBindings(MethodType mt, FunctionDescriptor cDesc, boolean forUpcall, LinkerOptions options) { ++ CallingSequenceBuilder csb = new CallingSequenceBuilder(CLinux, forUpcall, options); ++ BindingCalculator argCalc = forUpcall ? new BoxBindingCalculator(true) : new UnboxBindingCalculator(true); ++ BindingCalculator retCalc = forUpcall ? 
new UnboxBindingCalculator(false) : new BoxBindingCalculator(false); ++ ++ boolean returnInMemory = isInMemoryReturn(cDesc.returnLayout()); ++ if (returnInMemory) { ++ Class carrier = MemorySegment.class; ++ MemoryLayout layout = SharedUtils.C_POINTER; ++ csb.addArgumentBindings(carrier, layout, argCalc.getBindings(carrier, layout, false)); ++ } else if (cDesc.returnLayout().isPresent()) { ++ Class carrier = mt.returnType(); ++ MemoryLayout layout = cDesc.returnLayout().get(); ++ csb.setReturnBindings(carrier, layout, retCalc.getBindings(carrier, layout, false)); ++ } ++ ++ for (int i = 0; i < mt.parameterCount(); i++) { ++ Class carrier = mt.parameterType(i); ++ MemoryLayout layout = cDesc.argumentLayouts().get(i); ++ boolean isVar = options.isVarargsIndex(i); ++ csb.addArgumentBindings(carrier, layout, argCalc.getBindings(carrier, layout, isVar)); ++ } ++ ++ return new Bindings(csb.build(), returnInMemory); ++ } ++ ++ public static MethodHandle arrangeDowncall(MethodType mt, FunctionDescriptor cDesc, LinkerOptions options) { ++ Bindings bindings = getBindings(mt, cDesc, false, options); ++ ++ MethodHandle handle = new DowncallLinker(CLinux, bindings.callingSequence).getBoundMethodHandle(); ++ ++ if (bindings.isInMemoryReturn) { ++ handle = SharedUtils.adaptDowncallForIMR(handle, cDesc, bindings.callingSequence); ++ } ++ ++ return handle; ++ } ++ ++ public static UpcallStubFactory arrangeUpcall(MethodType mt, FunctionDescriptor cDesc, LinkerOptions options) { ++ Bindings bindings = getBindings(mt, cDesc, true, options); ++ final boolean dropReturn = true; /* drop return, since we don't have bindings for it */ ++ return SharedUtils.arrangeUpcallHelper(mt, bindings.isInMemoryReturn, dropReturn, CLinux, ++ bindings.callingSequence); ++ } ++ ++ private static boolean isInMemoryReturn(Optional returnLayout) { ++ return returnLayout ++ .filter(GroupLayout.class::isInstance) ++ .filter(g -> TypeClass.classifyLayout(g) == TypeClass.STRUCT_REFERENCE) ++ .isPresent(); ++ } ++ ++ static class StorageCalculator { ++ private final boolean forArguments; ++ // next available register index. 0=integerRegIdx, 1=floatRegIdx ++ private final int IntegerRegIdx = 0; ++ private final int FloatRegIdx = 1; ++ private final int[] nRegs = {0, 0}; ++ ++ private long stackOffset = 0; ++ ++ public StorageCalculator(boolean forArguments) { ++ this.forArguments = forArguments; ++ } ++ ++ // Aggregates or scalars passed on the stack are aligned to the greater of ++ // the type alignment and XLEN bits, but never more than the stack alignment. ++ void alignStack(long alignment) { ++ alignment = Utils.alignUp(Math.clamp(alignment, STACK_SLOT_SIZE, 16), STACK_SLOT_SIZE); ++ stackOffset = Utils.alignUp(stackOffset, alignment); ++ } ++ ++ VMStorage stackAlloc() { ++ assert forArguments : "no stack returns"; ++ VMStorage storage = stackStorage((short) STACK_SLOT_SIZE, (int) stackOffset); ++ stackOffset += STACK_SLOT_SIZE; ++ return storage; ++ } ++ ++ Optional regAlloc(int storageClass) { ++ if (nRegs[storageClass] < MAX_REGISTER_ARGUMENTS) { ++ VMStorage[] source = (forArguments ? 
CLinux.inputStorage : CLinux.outputStorage)[storageClass]; ++ Optional result = Optional.of(source[nRegs[storageClass]]); ++ nRegs[storageClass] += 1; ++ return result; ++ } ++ return Optional.empty(); ++ } ++ ++ VMStorage getStorage(int storageClass) { ++ Optional storage = regAlloc(storageClass); ++ if (storage.isPresent()) { ++ return storage.get(); ++ } ++ // If storageClass is StorageType.FLOAT, and no floating-point register is available, ++ // try to allocate an integer register. ++ if (storageClass == StorageType.FLOAT) { ++ storage = regAlloc(StorageType.INTEGER); ++ if (storage.isPresent()) { ++ return storage.get(); ++ } ++ } ++ return stackAlloc(); ++ } ++ ++ VMStorage[] getStorages(MemoryLayout layout, boolean isVariadicArg) { ++ int regCnt = (int) SharedUtils.alignUp(layout.byteSize(), 8) / 8; ++ if (isVariadicArg && layout.byteAlignment() == 16 && layout.byteSize() <= 16) { ++ alignStorage(); ++ // Two registers or stack slots will be allocated, even layout.byteSize <= 8B. ++ regCnt = 2; ++ } ++ VMStorage[] storages = new VMStorage[regCnt]; ++ for (int i = 0; i < regCnt; i++) { ++ // use integer calling convention. ++ storages[i] = getStorage(StorageType.INTEGER); ++ } ++ return storages; ++ } ++ ++ boolean regsAvailable(int integerRegs, int floatRegs) { ++ return nRegs[IntegerRegIdx] + integerRegs <= MAX_REGISTER_ARGUMENTS && ++ nRegs[FloatRegIdx] + floatRegs <= MAX_REGISTER_ARGUMENTS; ++ } ++ ++ // Variadic arguments with 2 * XLEN-bit alignment and size at most 2 * XLEN bits ++ // are passed in an aligned register pair (i.e., the first register in the pair ++ // is even-numbered), or on the stack by value if none is available. ++ // After a variadic argument has been passed on the stack, all future arguments ++ // will also be passed on the stack. ++ void alignStorage() { ++ if (nRegs[IntegerRegIdx] + 2 <= MAX_REGISTER_ARGUMENTS) { ++ nRegs[IntegerRegIdx] = (nRegs[IntegerRegIdx] + 1) & -2; ++ } else { ++ nRegs[IntegerRegIdx] = MAX_REGISTER_ARGUMENTS; ++ stackOffset = Utils.alignUp(stackOffset, 16); ++ } ++ } ++ ++ @Override ++ public String toString() { ++ String nReg = "iReg: " + nRegs[IntegerRegIdx] + ", fReg: " + nRegs[FloatRegIdx]; ++ String stack = ", stackOffset: " + stackOffset; ++ return "{" + nReg + stack + "}"; ++ } ++ } ++ ++ abstract static class BindingCalculator { ++ protected final StorageCalculator storageCalculator; ++ ++ @Override ++ public String toString() { ++ return storageCalculator.toString(); ++ } ++ ++ protected BindingCalculator(boolean forArguments) { ++ this.storageCalculator = new LinuxLoongArch64CallArranger.StorageCalculator(forArguments); ++ } ++ ++ abstract List getBindings(Class carrier, MemoryLayout layout, boolean isVariadicArg); ++ ++ // When handling variadic part, integer calling convention should be used. 
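// Illustration (not part of the patch): the alignStorage() rule above keeps a variadic
// argument with 16-byte alignment in an even-numbered register pair, or pushes it to a
// 16-byte-aligned stack slot once registers run out. A standalone restatement of that
// arithmetic (all names invented for this sketch):

final class PairAlignmentSketch {
    static final int MAX_REGISTER_ARGUMENTS = 8;

    // Next integer-register index, rounded up so the pair starts on an even register.
    static int alignToRegisterPair(int nextIntReg) {
        return (nextIntReg + 1) & -2;           // e.g. 3 -> 4, 4 -> 4
    }

    // Stack offset to use once the argument spills, aligned to 16 bytes.
    static long alignStackTo16(long stackOffset) {
        return (stackOffset + 15) & -16L;       // e.g. 8 -> 16, 16 -> 16
    }

    public static void main(String[] args) {
        System.out.println(alignToRegisterPair(3));   // 4
        System.out.println(alignStackTo16(24));       // 32
    }
}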
++ static final Map conventionConverterMap = ++ Map.ofEntries(Map.entry(FLOAT, INTEGER), ++ Map.entry(STRUCT_REGISTER_F, STRUCT_REGISTER_X), ++ Map.entry(STRUCT_REGISTER_XF, STRUCT_REGISTER_X)); ++ } ++ ++ static class UnboxBindingCalculator extends BindingCalculator { ++ boolean forArguments; ++ ++ UnboxBindingCalculator(boolean forArguments) { ++ super(forArguments); ++ this.forArguments = forArguments; ++ } ++ ++ @Override ++ List getBindings(Class carrier, MemoryLayout layout, boolean isVariadicArg) { ++ TypeClass typeClass = TypeClass.classifyLayout(layout); ++ if (isVariadicArg) { ++ typeClass = BindingCalculator.conventionConverterMap.getOrDefault(typeClass, typeClass); ++ } ++ return getBindings(carrier, layout, typeClass, isVariadicArg); ++ } ++ ++ List getBindings(Class carrier, MemoryLayout layout, TypeClass argumentClass, boolean isVariadicArg) { ++ Binding.Builder bindings = Binding.builder(); ++ switch (argumentClass) { ++ case INTEGER -> { ++ VMStorage storage = storageCalculator.getStorage(StorageType.INTEGER); ++ bindings.vmStore(storage, carrier); ++ } ++ case FLOAT -> { ++ VMStorage storage = storageCalculator.getStorage(StorageType.FLOAT); ++ bindings.vmStore(storage, carrier); ++ } ++ case POINTER -> { ++ bindings.unboxAddress(); ++ VMStorage storage = storageCalculator.getStorage(StorageType.INTEGER); ++ bindings.vmStore(storage, long.class); ++ } ++ case STRUCT_REGISTER_X -> { ++ assert carrier == MemorySegment.class; ++ ++ // When no register is available, struct will be passed by stack. ++ // Before allocation, stack must be aligned. ++ if (!storageCalculator.regsAvailable(1, 0)) { ++ storageCalculator.alignStack(layout.byteAlignment()); ++ } ++ VMStorage[] locations = storageCalculator.getStorages(layout, isVariadicArg); ++ int locIndex = 0; ++ long offset = 0; ++ while (offset < layout.byteSize()) { ++ final long copy = Math.min(layout.byteSize() - offset, 8); ++ VMStorage storage = locations[locIndex++]; ++ Class type = SharedUtils.primitiveCarrierForSize(copy, false); ++ if (offset + copy < layout.byteSize()) { ++ bindings.dup(); ++ } ++ bindings.bufferLoad(offset, type, (int) copy) ++ .vmStore(storage, type); ++ offset += copy; ++ } ++ } ++ case STRUCT_REGISTER_F -> { ++ assert carrier == MemorySegment.class; ++ List descs = getFlattenedFields((GroupLayout) layout); ++ if (storageCalculator.regsAvailable(0, descs.size())) { ++ for (int i = 0; i < descs.size(); i++) { ++ FlattenedFieldDesc desc = descs.get(i); ++ Class type = desc.layout().carrier(); ++ VMStorage storage = storageCalculator.getStorage(StorageType.FLOAT); ++ if (i < descs.size() - 1) { ++ bindings.dup(); ++ } ++ bindings.bufferLoad(desc.offset(), type) ++ .vmStore(storage, type); ++ } ++ } else { ++ // If there is not enough register can be used, then fall back to integer calling convention. 
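// Illustration (not part of the patch): seen from user code, a by-value struct of two
// floats fits in 2 * XLEN bits and contains only floating-point fields, so the arranger
// above classifies it as STRUCT_REGISTER_F and tries two FP argument registers before
// falling back to the integer convention. The native function "norm2" and the field
// offsets (0 and 4, standard C layout of two floats) are assumptions for this sketch.

import java.lang.foreign.*;
import java.lang.invoke.MethodHandle;

public class StructByValueSketch {
    public static void main(String[] args) throws Throwable {
        GroupLayout VEC2 = MemoryLayout.structLayout(
                ValueLayout.JAVA_FLOAT.withName("x"),
                ValueLayout.JAVA_FLOAT.withName("y"));

        Linker linker = Linker.nativeLinker();
        // float norm2(struct vec2 v);  -- hypothetical native function
        MethodHandle norm2 = linker.downcallHandle(
                linker.defaultLookup().find("norm2").orElseThrow(),
                FunctionDescriptor.of(ValueLayout.JAVA_FLOAT, VEC2));

        try (Arena arena = Arena.ofConfined()) {
            MemorySegment v = arena.allocate(VEC2);
            v.set(ValueLayout.JAVA_FLOAT, 0, 3.0f);   // x
            v.set(ValueLayout.JAVA_FLOAT, 4, 4.0f);   // y
            float r = (float) norm2.invokeExact(v);
            System.out.println(r);                    // 5.0 for the assumed norm2
        }
    }
}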
++ return getBindings(carrier, layout, STRUCT_REGISTER_X, isVariadicArg); ++ } ++ } ++ case STRUCT_REGISTER_XF -> { ++ assert carrier == MemorySegment.class; ++ if (storageCalculator.regsAvailable(1, 1)) { ++ List descs = getFlattenedFields((GroupLayout) layout); ++ for (int i = 0; i < 2; i++) { ++ FlattenedFieldDesc desc = descs.get(i); ++ int storageClass; ++ if (desc.typeClass() == INTEGER) { ++ storageClass = StorageType.INTEGER; ++ } else { ++ storageClass = StorageType.FLOAT; ++ } ++ VMStorage storage = storageCalculator.getStorage(storageClass); ++ Class type = desc.layout().carrier(); ++ if (i < 1) { ++ bindings.dup(); ++ } ++ bindings.bufferLoad(desc.offset(), type) ++ .vmStore(storage, type); ++ } ++ } else { ++ return getBindings(carrier, layout, STRUCT_REGISTER_X, isVariadicArg); ++ } ++ } ++ case STRUCT_REFERENCE -> { ++ assert carrier == MemorySegment.class; ++ bindings.copy(layout) ++ .unboxAddress(); ++ VMStorage storage = storageCalculator.getStorage(StorageType.INTEGER); ++ bindings.vmStore(storage, long.class); ++ } ++ default -> throw new UnsupportedOperationException("Unhandled class " + argumentClass); ++ } ++ ++ return bindings.build(); ++ } ++ } ++ ++ static class BoxBindingCalculator extends BindingCalculator { ++ ++ BoxBindingCalculator(boolean forArguments) { ++ super(forArguments); ++ } ++ ++ @Override ++ List getBindings(Class carrier, MemoryLayout layout, boolean isVariadicArg) { ++ TypeClass typeClass = TypeClass.classifyLayout(layout); ++ if (isVariadicArg) { ++ typeClass = BindingCalculator.conventionConverterMap.getOrDefault(typeClass, typeClass); ++ } ++ return getBindings(carrier, layout, typeClass, isVariadicArg); ++ } ++ ++ List getBindings(Class carrier, MemoryLayout layout, TypeClass argumentClass, boolean isVariadicArg) { ++ Binding.Builder bindings = Binding.builder(); ++ switch (argumentClass) { ++ case INTEGER -> { ++ VMStorage storage = storageCalculator.getStorage(StorageType.INTEGER); ++ bindings.vmLoad(storage, carrier); ++ } ++ case FLOAT -> { ++ VMStorage storage = storageCalculator.getStorage(StorageType.FLOAT); ++ bindings.vmLoad(storage, carrier); ++ } ++ case POINTER -> { ++ AddressLayout addressLayout = (AddressLayout) layout; ++ VMStorage storage = storageCalculator.getStorage(StorageType.INTEGER); ++ bindings.vmLoad(storage, long.class) ++ .boxAddressRaw(Utils.pointeeByteSize(addressLayout), Utils.pointeeByteAlign(addressLayout)); ++ } ++ case STRUCT_REGISTER_X -> { ++ assert carrier == MemorySegment.class; ++ ++ // When no register is available, struct will be passed by stack. ++ // Before allocation, stack must be aligned. 
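// Illustration (not part of the patch): the BoxBindingCalculator path above runs for
// upcalls, loading values back out of the LoongArch64 argument registers into Java
// carriers. A generic user-level sketch of creating an upcall stub, which on
// linux-loongarch64 would be arranged through LinuxLoongArch64CallArranger.arrangeUpcall
// (the callback itself is invented for this sketch):

import java.lang.foreign.*;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class UpcallSketch {
    static int compare(int a, int b) { return Integer.compare(a, b); }

    public static void main(String[] args) throws Throwable {
        Linker linker = Linker.nativeLinker();
        MethodHandle target = MethodHandles.lookup().findStatic(
                UpcallSketch.class, "compare",
                MethodType.methodType(int.class, int.class, int.class));
        FunctionDescriptor desc = FunctionDescriptor.of(
                ValueLayout.JAVA_INT, ValueLayout.JAVA_INT, ValueLayout.JAVA_INT);
        try (Arena arena = Arena.ofConfined()) {
            // Both int arguments arrive in integer registers; the boxing bindings turn
            // them back into Java ints before invoking compare().
            MemorySegment stub = linker.upcallStub(target, desc, arena);
            System.out.println("upcall stub at 0x" + Long.toHexString(stub.address()));
        }
    }
}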
++ if (!storageCalculator.regsAvailable(1, 0)) { ++ storageCalculator.alignStack(layout.byteAlignment()); ++ } ++ bindings.allocate(layout); ++ VMStorage[] locations = storageCalculator.getStorages(layout, isVariadicArg); ++ int locIndex = 0; ++ long offset = 0; ++ while (offset < layout.byteSize()) { ++ final long copy = Math.min(layout.byteSize() - offset, 8); ++ VMStorage storage = locations[locIndex++]; ++ Class type = SharedUtils.primitiveCarrierForSize(copy, false); ++ bindings.dup().vmLoad(storage, type) ++ .bufferStore(offset, type, (int) copy); ++ offset += copy; ++ } ++ } ++ case STRUCT_REGISTER_F -> { ++ assert carrier == MemorySegment.class; ++ bindings.allocate(layout); ++ List descs = getFlattenedFields((GroupLayout) layout); ++ if (storageCalculator.regsAvailable(0, descs.size())) { ++ for (FlattenedFieldDesc desc : descs) { ++ Class type = desc.layout().carrier(); ++ VMStorage storage = storageCalculator.getStorage(StorageType.FLOAT); ++ bindings.dup() ++ .vmLoad(storage, type) ++ .bufferStore(desc.offset(), type); ++ } ++ } else { ++ return getBindings(carrier, layout, STRUCT_REGISTER_X, isVariadicArg); ++ } ++ } ++ case STRUCT_REGISTER_XF -> { ++ assert carrier == MemorySegment.class; ++ bindings.allocate(layout); ++ if (storageCalculator.regsAvailable(1, 1)) { ++ List descs = getFlattenedFields((GroupLayout) layout); ++ for (int i = 0; i < 2; i++) { ++ FlattenedFieldDesc desc = descs.get(i); ++ int storageClass; ++ if (desc.typeClass() == INTEGER) { ++ storageClass = StorageType.INTEGER; ++ } else { ++ storageClass = StorageType.FLOAT; ++ } ++ VMStorage storage = storageCalculator.getStorage(storageClass); ++ Class type = desc.layout().carrier(); ++ bindings.dup() ++ .vmLoad(storage, type) ++ .bufferStore(desc.offset(), type); ++ } ++ } else { ++ return getBindings(carrier, layout, STRUCT_REGISTER_X, isVariadicArg); ++ } ++ } ++ case STRUCT_REFERENCE -> { ++ assert carrier == MemorySegment.class; ++ VMStorage storage = storageCalculator.getStorage(StorageType.INTEGER); ++ bindings.vmLoad(storage, long.class) ++ .boxAddress(layout); ++ } ++ default -> throw new UnsupportedOperationException("Unhandled class " + argumentClass); ++ } ++ ++ return bindings.build(); ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/LinuxLoongArch64Linker.java b/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/LinuxLoongArch64Linker.java +--- a/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/LinuxLoongArch64Linker.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/LinuxLoongArch64Linker.java 2024-02-20 10:42:36.688863032 +0800 +@@ -0,0 +1,65 @@ ++/* ++ * Copyright (c) 2020, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. Oracle designates this ++ * particular file as subject to the "Classpath" exception as provided ++ * by Oracle in the LICENSE file that accompanied this code. 
++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++package jdk.internal.foreign.abi.loongarch64.linux; ++ ++import jdk.internal.foreign.abi.AbstractLinker; ++import jdk.internal.foreign.abi.LinkerOptions; ++ ++import java.lang.foreign.FunctionDescriptor; ++import java.lang.invoke.MethodHandle; ++import java.lang.invoke.MethodType; ++import java.nio.ByteOrder; ++ ++public final class LinuxLoongArch64Linker extends AbstractLinker { ++ ++ public static LinuxLoongArch64Linker getInstance() { ++ final class Holder { ++ private static final LinuxLoongArch64Linker INSTANCE = new LinuxLoongArch64Linker(); ++ } ++ ++ return Holder.INSTANCE; ++ } ++ ++ private LinuxLoongArch64Linker() { ++ // Ensure there is only one instance ++ } ++ ++ @Override ++ protected MethodHandle arrangeDowncall(MethodType inferredMethodType, FunctionDescriptor function, LinkerOptions options) { ++ return LinuxLoongArch64CallArranger.arrangeDowncall(inferredMethodType, function, options); ++ } ++ ++ @Override ++ protected UpcallStubFactory arrangeUpcall(MethodType targetType, FunctionDescriptor function, LinkerOptions options) { ++ return LinuxLoongArch64CallArranger.arrangeUpcall(targetType, function, options); ++ } ++ ++ @Override ++ protected ByteOrder linkerByteOrder() { ++ return ByteOrder.LITTLE_ENDIAN; ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/TypeClass.java b/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/TypeClass.java +--- a/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/TypeClass.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/linux/TypeClass.java 2024-02-20 10:42:36.688863032 +0800 +@@ -0,0 +1,217 @@ ++/* ++ * Copyright (c) 2020, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. Oracle designates this ++ * particular file as subject to the "Classpath" exception as provided ++ * by Oracle in the LICENSE file that accompanied this code. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++package jdk.internal.foreign.abi.loongarch64.linux; ++ ++import java.lang.foreign.GroupLayout; ++import java.lang.foreign.MemoryLayout; ++import java.lang.foreign.MemorySegment; ++import java.lang.foreign.PaddingLayout; ++import java.lang.foreign.SequenceLayout; ++import java.lang.foreign.UnionLayout; ++import java.lang.foreign.ValueLayout; ++import java.util.ArrayList; ++import java.util.List; ++ ++public enum TypeClass { ++ /* ++ * STRUCT_REFERENCE: Aggregates larger than 2 * XLEN bits are passed by reference and are replaced ++ * in the argument list with the address. The address will be passed in a register if at least ++ * one register is available, otherwise it will be passed on the stack. ++ * ++ * STRUCT_REGISTER_F: A struct containing just one floating-point real is passed as though it were ++ * a standalone floating-point real. A struct containing two floating-point reals is passed in two ++ * floating-point registers, if neither real is more than ABI_FLEN bits wide and at least two ++ * floating-point argument registers are available. (The registers need not be an aligned pair.) ++ * Otherwise, it is passed according to the integer calling convention. ++ * ++ * STRUCT_REGISTER_XF: A struct containing one floating-point real and one integer (or bitfield), in either ++ * order, is passed in a floating-point register and an integer register, provided the floating-point real ++ * is no more than ABI_FLEN bits wide and the integer is no more than XLEN bits wide, and at least one ++ * floating-point argument register and at least one integer argument register is available. If the struct ++ * is not passed in this manner, then it is passed according to the integer calling convention. ++ * ++ * STRUCT_REGISTER_X: Aggregates whose total size is no more than XLEN bits are passed in a register, with the ++ * fields laid out as though they were passed in memory. If no register is available, the aggregate is ++ * passed on the stack. Aggregates whose total size is no more than 2 * XLEN bits are passed in a pair of ++ * registers; if only one register is available, the first XLEN bits are passed in a register and the ++ * remaining bits are passed on the stack. If no registers are available, the aggregate is passed on the stack. ++ * ++ * */ ++ INTEGER, ++ FLOAT, ++ POINTER, ++ STRUCT_REFERENCE, ++ STRUCT_REGISTER_F, ++ STRUCT_REGISTER_XF, ++ STRUCT_REGISTER_X; ++ ++ private static final int MAX_AGGREGATE_REGS_SIZE = 2; ++ ++ /* ++ * Struct will be flattened while classifying. That is, struct{struct{int, double}} will be treated ++ * same as struct{int, double} and struct{int[2]} will be treated same as struct{int, int}. 
++ * */ ++ private static record FieldCounter(long integerCnt, long floatCnt, long pointerCnt) { ++ static final FieldCounter EMPTY = new FieldCounter(0, 0, 0); ++ static final FieldCounter SINGLE_INTEGER = new FieldCounter(1, 0, 0); ++ static final FieldCounter SINGLE_FLOAT = new FieldCounter(0, 1, 0); ++ static final FieldCounter SINGLE_POINTER = new FieldCounter(0, 0, 1); ++ ++ static FieldCounter flatten(MemoryLayout layout) { ++ switch (layout) { ++ case ValueLayout valueLayout -> { ++ return switch (classifyValueType(valueLayout)) { ++ case INTEGER -> FieldCounter.SINGLE_INTEGER; ++ case FLOAT -> FieldCounter.SINGLE_FLOAT; ++ case POINTER -> FieldCounter.SINGLE_POINTER; ++ default -> throw new IllegalStateException("Should not reach here."); ++ }; ++ } ++ case GroupLayout groupLayout -> { ++ FieldCounter currCounter = FieldCounter.EMPTY; ++ for (MemoryLayout memberLayout : groupLayout.memberLayouts()) { ++ if (memberLayout instanceof PaddingLayout) { ++ continue; ++ } ++ currCounter = currCounter.add(flatten(memberLayout)); ++ } ++ return currCounter; ++ } ++ case SequenceLayout sequenceLayout -> { ++ long elementCount = sequenceLayout.elementCount(); ++ if (elementCount == 0) { ++ return FieldCounter.EMPTY; ++ } ++ return flatten(sequenceLayout.elementLayout()).mul(elementCount); ++ } ++ default -> throw new IllegalStateException("Cannot get here: " + layout); ++ } ++ } ++ ++ FieldCounter mul(long m) { ++ return new FieldCounter(integerCnt * m, ++ floatCnt * m, ++ pointerCnt * m); ++ } ++ ++ FieldCounter add(FieldCounter other) { ++ return new FieldCounter(integerCnt + other.integerCnt, ++ floatCnt + other.floatCnt, ++ pointerCnt + other.pointerCnt); ++ } ++ } ++ ++ public static record FlattenedFieldDesc(TypeClass typeClass, long offset, ValueLayout layout) { ++ ++ } ++ ++ private static List getFlattenedFieldsInner(long offset, MemoryLayout layout) { ++ if (layout instanceof ValueLayout valueLayout) { ++ TypeClass typeClass = classifyValueType(valueLayout); ++ return List.of(switch (typeClass) { ++ case INTEGER, FLOAT -> new FlattenedFieldDesc(typeClass, offset, valueLayout); ++ default -> throw new IllegalStateException("Should not reach here."); ++ }); ++ } else if (layout instanceof GroupLayout groupLayout) { ++ List fields = new ArrayList<>(); ++ for (MemoryLayout memberLayout : groupLayout.memberLayouts()) { ++ if (memberLayout instanceof PaddingLayout) { ++ offset += memberLayout.byteSize(); ++ continue; ++ } ++ fields.addAll(getFlattenedFieldsInner(offset, memberLayout)); ++ offset += memberLayout.byteSize(); ++ } ++ return fields; ++ } else if (layout instanceof SequenceLayout sequenceLayout) { ++ List fields = new ArrayList<>(); ++ MemoryLayout elementLayout = sequenceLayout.elementLayout(); ++ for (long i = 0; i < sequenceLayout.elementCount(); i++) { ++ fields.addAll(getFlattenedFieldsInner(offset, elementLayout)); ++ offset += elementLayout.byteSize(); ++ } ++ return fields; ++ } else { ++ throw new IllegalStateException("Cannot get here: " + layout); ++ } ++ } ++ ++ public static List getFlattenedFields(GroupLayout layout) { ++ return getFlattenedFieldsInner(0, layout); ++ } ++ ++ // ValueLayout will be classified by its carrier type. 
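// Illustration (not part of the patch): a simplified restatement of the classification
// rules documented above, limited to flat structs of value layouts (no unions, nesting,
// padding or sequences). Class and method names are invented for this sketch.

import java.lang.foreign.GroupLayout;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

final class FlatStructClassifier {
    static String classify(GroupLayout struct) {
        if (struct.byteSize() > 16) {
            return "STRUCT_REFERENCE";        // larger than 2 * XLEN bits: passed by reference
        }
        int ints = 0, floats = 0, pointers = 0;
        for (MemoryLayout member : struct.memberLayouts()) {
            Class<?> carrier = ((ValueLayout) member).carrier();
            if (carrier == float.class || carrier == double.class) {
                floats++;
            } else if (carrier == MemorySegment.class) {
                pointers++;
            } else {
                ints++;                       // boolean/byte/char/short/int/long carriers
            }
        }
        if (pointers == 0 && ints == 0 && (floats == 1 || floats == 2)) return "STRUCT_REGISTER_F";
        if (pointers == 0 && ints == 1 && floats == 1)                  return "STRUCT_REGISTER_XF";
        return "STRUCT_REGISTER_X";
    }

    public static void main(String[] args) {
        GroupLayout floatInt   = MemoryLayout.structLayout(ValueLayout.JAVA_FLOAT, ValueLayout.JAVA_INT);
        GroupLayout twoDoubles = MemoryLayout.structLayout(ValueLayout.JAVA_DOUBLE, ValueLayout.JAVA_DOUBLE);
        GroupLayout threeLongs = MemoryLayout.structLayout(ValueLayout.JAVA_LONG, ValueLayout.JAVA_LONG,
                                                           ValueLayout.JAVA_LONG);
        System.out.println(classify(floatInt));    // STRUCT_REGISTER_XF
        System.out.println(classify(twoDoubles));  // STRUCT_REGISTER_F
        System.out.println(classify(threeLongs));  // STRUCT_REFERENCE (24 bytes > 16)
    }
}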
++ private static TypeClass classifyValueType(ValueLayout type) { ++ Class carrier = type.carrier(); ++ if (carrier == boolean.class || carrier == byte.class || carrier == char.class || ++ carrier == short.class || carrier == int.class || carrier == long.class) { ++ return INTEGER; ++ } else if (carrier == float.class || carrier == double.class) { ++ return FLOAT; ++ } else if (carrier == MemorySegment.class) { ++ return POINTER; ++ } else { ++ throw new IllegalStateException("Cannot get here: " + carrier.getName()); ++ } ++ } ++ ++ private static boolean isRegisterAggregate(MemoryLayout type) { ++ return type.byteSize() <= MAX_AGGREGATE_REGS_SIZE * 8; ++ } ++ ++ private static TypeClass classifyStructType(GroupLayout layout) { ++ if (layout instanceof UnionLayout) { ++ return isRegisterAggregate(layout) ? STRUCT_REGISTER_X : STRUCT_REFERENCE; ++ } ++ ++ if (!isRegisterAggregate(layout)) { ++ return STRUCT_REFERENCE; ++ } ++ ++ // classify struct by its fields. ++ FieldCounter counter = FieldCounter.flatten(layout); ++ if (counter.integerCnt == 0 && counter.pointerCnt == 0 && ++ (counter.floatCnt == 1 || counter.floatCnt == 2)) { ++ return STRUCT_REGISTER_F; ++ } else if (counter.integerCnt == 1 && counter.floatCnt == 1 && ++ counter.pointerCnt == 0) { ++ return STRUCT_REGISTER_XF; ++ } else { ++ return STRUCT_REGISTER_X; ++ } ++ } ++ ++ public static TypeClass classifyLayout(MemoryLayout type) { ++ if (type instanceof ValueLayout vt) { ++ return classifyValueType(vt); ++ } else if (type instanceof GroupLayout gt) { ++ return classifyStructType(gt); ++ } else { ++ throw new IllegalArgumentException("Unsupported layout: " + type); ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/LoongArch64Architecture.java b/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/LoongArch64Architecture.java +--- a/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/LoongArch64Architecture.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/java.base/share/classes/jdk/internal/foreign/abi/loongarch64/LoongArch64Architecture.java 2024-02-20 10:42:36.688863032 +0800 +@@ -0,0 +1,179 @@ ++/* ++ * Copyright (c) 2020, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. Oracle designates this ++ * particular file as subject to the "Classpath" exception as provided ++ * by Oracle in the LICENSE file that accompanied this code. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++package jdk.internal.foreign.abi.loongarch64; ++ ++import jdk.internal.foreign.abi.ABIDescriptor; ++import jdk.internal.foreign.abi.Architecture; ++import jdk.internal.foreign.abi.StubLocations; ++import jdk.internal.foreign.abi.VMStorage; ++import jdk.internal.foreign.abi.loongarch64.linux.TypeClass; ++ ++public final class LoongArch64Architecture implements Architecture { ++ public static final Architecture INSTANCE = new LoongArch64Architecture(); ++ ++ private static final short REG64_MASK = 0b0000_0000_0000_0001; ++ private static final short FLOAT64_MASK = 0b0000_0000_0000_0001; ++ ++ private static final int INTEGER_REG_SIZE = 8; ++ private static final int FLOAT_REG_SIZE = 8; ++ ++ // Suppresses default constructor, ensuring non-instantiability. ++ private LoongArch64Architecture() {} ++ ++ @Override ++ public boolean isStackType(int cls) { ++ return cls == StorageType.STACK; ++ } ++ ++ @Override ++ public int typeSize(int cls) { ++ return switch (cls) { ++ case StorageType.INTEGER -> INTEGER_REG_SIZE; ++ case StorageType.FLOAT -> FLOAT_REG_SIZE; ++ // STACK is deliberately omitted ++ default -> throw new IllegalArgumentException("Invalid Storage Class: " + cls); ++ }; ++ } ++ ++ public interface StorageType { ++ byte INTEGER = 0; ++ byte FLOAT = 1; ++ byte STACK = 2; ++ byte PLACEHOLDER = 3; ++ } ++ ++ public static class Regs { // break circular dependency ++ public static final VMStorage r0 = integerRegister(0, "zero"); ++ public static final VMStorage ra = integerRegister(1, "ra"); ++ public static final VMStorage tp = integerRegister(2, "tp"); ++ public static final VMStorage sp = integerRegister(3, "sp"); ++ public static final VMStorage a0 = integerRegister(4, "a0"); ++ public static final VMStorage a1 = integerRegister(5, "a1"); ++ public static final VMStorage a2 = integerRegister(6, "a2"); ++ public static final VMStorage a3 = integerRegister(7, "a3"); ++ public static final VMStorage a4 = integerRegister(8, "a4"); ++ public static final VMStorage a5 = integerRegister(9, "a5"); ++ public static final VMStorage a6 = integerRegister(10, "a6"); ++ public static final VMStorage a7 = integerRegister(11, "a7"); ++ public static final VMStorage t0 = integerRegister(12, "t0"); ++ public static final VMStorage t1 = integerRegister(13, "t1"); ++ public static final VMStorage t2 = integerRegister(14, "t2"); ++ public static final VMStorage t3 = integerRegister(15, "t3"); ++ public static final VMStorage t4 = integerRegister(16, "t4"); ++ public static final VMStorage t5 = integerRegister(17, "t5"); ++ public static final VMStorage t6 = integerRegister(18, "t6"); ++ public static final VMStorage t7 = integerRegister(19, "t7"); ++ public static final VMStorage t8 = integerRegister(20, "t8"); ++ public static final VMStorage rx = integerRegister(21, "rx"); ++ public static final VMStorage fp = integerRegister(22, "fp"); ++ public static final VMStorage s0 = integerRegister(23, "s0"); ++ public static final VMStorage s1 = integerRegister(24, "s1"); ++ public static final VMStorage s2 = integerRegister(25, "s2"); ++ public static final VMStorage s3 = integerRegister(26, "s3"); ++ public static final VMStorage s4 = integerRegister(27, "s4"); ++ public static final VMStorage s5 = integerRegister(28, "s5"); ++ public static final VMStorage s6 = integerRegister(29, "s6"); ++ public static final VMStorage s7 = 
integerRegister(30, "s7"); ++ public static final VMStorage s8 = integerRegister(31, "s8"); ++ ++ public static final VMStorage f0 = floatRegister(0, "f0"); ++ public static final VMStorage f1 = floatRegister(1, "f1"); ++ public static final VMStorage f2 = floatRegister(2, "f2"); ++ public static final VMStorage f3 = floatRegister(3, "f3"); ++ public static final VMStorage f4 = floatRegister(4, "f4"); ++ public static final VMStorage f5 = floatRegister(5, "f5"); ++ public static final VMStorage f6 = floatRegister(6, "f6"); ++ public static final VMStorage f7 = floatRegister(7, "f7"); ++ public static final VMStorage f8 = floatRegister(8, "f8"); ++ public static final VMStorage f9 = floatRegister(9, "f9"); ++ public static final VMStorage f10 = floatRegister(10, "f10"); ++ public static final VMStorage f11 = floatRegister(11, "f11"); ++ public static final VMStorage f12 = floatRegister(12, "f12"); ++ public static final VMStorage f13 = floatRegister(13, "f13"); ++ public static final VMStorage f14 = floatRegister(14, "f14"); ++ public static final VMStorage f15 = floatRegister(15, "f15"); ++ public static final VMStorage f16 = floatRegister(16, "f16"); ++ public static final VMStorage f17 = floatRegister(17, "f17"); ++ public static final VMStorage f18 = floatRegister(18, "f18"); ++ public static final VMStorage f19 = floatRegister(19, "f19"); ++ public static final VMStorage f20 = floatRegister(20, "f20"); ++ public static final VMStorage f21 = floatRegister(21, "f21"); ++ public static final VMStorage f22 = floatRegister(22, "f22"); ++ public static final VMStorage f23 = floatRegister(23, "f23"); ++ public static final VMStorage f24 = floatRegister(24, "f24"); ++ public static final VMStorage f25 = floatRegister(25, "f25"); ++ public static final VMStorage f26 = floatRegister(26, "f26"); ++ public static final VMStorage f27 = floatRegister(27, "f27"); ++ public static final VMStorage f28 = floatRegister(28, "f28"); ++ public static final VMStorage f29 = floatRegister(29, "f29"); ++ public static final VMStorage f30 = floatRegister(30, "f30"); ++ public static final VMStorage f31 = floatRegister(31, "f31"); ++ } ++ ++ private static VMStorage integerRegister(int index, String debugName) { ++ return new VMStorage(StorageType.INTEGER, REG64_MASK, index, debugName); ++ } ++ ++ private static VMStorage floatRegister(int index, String debugName) { ++ return new VMStorage(StorageType.FLOAT, FLOAT64_MASK, index, debugName); ++ } ++ ++ public static VMStorage stackStorage(short size, int byteOffset) { ++ return new VMStorage(StorageType.STACK, size, byteOffset); ++ } ++ ++ public static ABIDescriptor abiFor(VMStorage[] inputIntRegs, ++ VMStorage[] inputFloatRegs, ++ VMStorage[] outputIntRegs, ++ VMStorage[] outputFloatRegs, ++ VMStorage[] volatileIntRegs, ++ VMStorage[] volatileFloatRegs, ++ int stackAlignment, ++ int shadowSpace, ++ VMStorage scratch1, VMStorage scratch2) { ++ return new ABIDescriptor( ++ INSTANCE, ++ new VMStorage[][]{ ++ inputIntRegs, ++ inputFloatRegs, ++ }, ++ new VMStorage[][]{ ++ outputIntRegs, ++ outputFloatRegs, ++ }, ++ new VMStorage[][]{ ++ volatileIntRegs, ++ volatileFloatRegs, ++ }, ++ stackAlignment, ++ shadowSpace, ++ scratch1, scratch2, ++ StubLocations.TARGET_ADDRESS.storage(StorageType.PLACEHOLDER), ++ StubLocations.RETURN_BUFFER.storage(StorageType.PLACEHOLDER), ++ StubLocations.CAPTURED_STATE_BUFFER.storage(StorageType.PLACEHOLDER)); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck 
a/src/java.base/share/classes/jdk/internal/foreign/abi/SharedUtils.java b/src/java.base/share/classes/jdk/internal/foreign/abi/SharedUtils.java +--- a/src/java.base/share/classes/jdk/internal/foreign/abi/SharedUtils.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/java.base/share/classes/jdk/internal/foreign/abi/SharedUtils.java 2024-02-20 10:42:36.685529702 +0800 +@@ -22,6 +22,12 @@ + * or visit www.oracle.com if you need additional information or have any + * questions. + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2022, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ + package jdk.internal.foreign.abi; + + import jdk.internal.access.JavaLangAccess; +@@ -33,6 +39,7 @@ + import jdk.internal.foreign.abi.aarch64.macos.MacOsAArch64Linker; + import jdk.internal.foreign.abi.aarch64.windows.WindowsAArch64Linker; + import jdk.internal.foreign.abi.fallback.FallbackLinker; ++import jdk.internal.foreign.abi.loongarch64.linux.LinuxLoongArch64Linker; + import jdk.internal.foreign.abi.ppc64.linux.LinuxPPC64leLinker; + import jdk.internal.foreign.abi.riscv64.linux.LinuxRISCV64Linker; + import jdk.internal.foreign.abi.s390.linux.LinuxS390Linker; +@@ -244,6 +251,7 @@ + case LINUX_PPC_64_LE -> LinuxPPC64leLinker.getInstance(); + case LINUX_RISCV_64 -> LinuxRISCV64Linker.getInstance(); + case LINUX_S390 -> LinuxS390Linker.getInstance(); ++ case LINUX_LOONGARCH_64 -> LinuxLoongArch64Linker.getInstance(); + case FALLBACK -> FallbackLinker.getInstance(); + case UNSUPPORTED -> throw new UnsupportedOperationException("Platform does not support native linker"); + }; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/java.base/share/classes/jdk/internal/foreign/CABI.java b/src/java.base/share/classes/jdk/internal/foreign/CABI.java +--- a/src/java.base/share/classes/jdk/internal/foreign/CABI.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/java.base/share/classes/jdk/internal/foreign/CABI.java 2024-02-20 10:42:36.685529702 +0800 +@@ -23,6 +23,12 @@ + * questions. + * + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2022, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ + package jdk.internal.foreign; + + import jdk.internal.foreign.abi.fallback.FallbackLinker; +@@ -42,6 +48,7 @@ + LINUX_PPC_64_LE, + LINUX_RISCV_64, + LINUX_S390, ++ LINUX_LOONGARCH_64, + FALLBACK, + UNSUPPORTED; + +@@ -86,7 +93,11 @@ + if (OperatingSystem.isLinux()) { + return LINUX_S390; + } +- } ++ } else if (arch.equals("loongarch64")) { ++ if (OperatingSystem.isLinux()) { ++ return LINUX_LOONGARCH_64; ++ } ++ } + } else if (FallbackLinker.isSupported()) { + return FALLBACK; // fallback linker + } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/java.base/share/classes/jdk/internal/util/Architecture.java b/src/java.base/share/classes/jdk/internal/util/Architecture.java +--- a/src/java.base/share/classes/jdk/internal/util/Architecture.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/java.base/share/classes/jdk/internal/util/Architecture.java 2024-02-20 10:42:36.725529669 +0800 +@@ -22,6 +22,14 @@ + * or visit www.oracle.com if you need additional information or have any + * questions. + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023. 
These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ * ++ */ ++ + package jdk.internal.util; + + import jdk.internal.vm.annotation.ForceInline; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/java.base/share/classes/jdk/internal/util/PlatformProps.java.template b/src/java.base/share/classes/jdk/internal/util/PlatformProps.java.template +--- a/src/java.base/share/classes/jdk/internal/util/PlatformProps.java.template 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/java.base/share/classes/jdk/internal/util/PlatformProps.java.template 2024-02-20 10:42:36.725529669 +0800 +@@ -22,6 +22,14 @@ + * or visit www.oracle.com if you need additional information or have any + * questions. + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ * ++ */ ++ + package jdk.internal.util; + + /** +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/linux/native/libsaproc/libproc.h b/src/jdk.hotspot.agent/linux/native/libsaproc/libproc.h +--- a/src/jdk.hotspot.agent/linux/native/libsaproc/libproc.h 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.hotspot.agent/linux/native/libsaproc/libproc.h 2024-02-20 10:42:37.642195613 +0800 +@@ -22,6 +22,13 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ * ++ */ ++ + #ifndef _LIBPROC_H_ + #define _LIBPROC_H_ + +@@ -37,7 +44,7 @@ + #include + #define user_regs_struct pt_regs + #endif +-#if defined(aarch64) || defined(arm64) ++#if defined(aarch64) || defined(arm64) || defined(loongarch64) + #include + #define user_regs_struct user_pt_regs + #elif defined(arm) +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp b/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp +--- a/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp 2024-02-20 10:42:37.642195613 +0800 +@@ -23,6 +23,13 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2021, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ * ++ */ ++ + #include + #include "libproc.h" + #include "proc_service.h" +@@ -60,6 +67,10 @@ + #include "sun_jvm_hotspot_debugger_aarch64_AARCH64ThreadContext.h" + #endif + ++#ifdef loongarch64 ++#include "sun_jvm_hotspot_debugger_loongarch64_LOONGARCH64ThreadContext.h" ++#endif ++ + #ifdef riscv64 + #include "sun_jvm_hotspot_debugger_riscv64_RISCV64ThreadContext.h" + #endif +@@ -411,7 +422,7 @@ + return (err == PS_OK)? 
array : 0; + } + +-#if defined(i586) || defined(amd64) || defined(ppc64) || defined(ppc64le) || defined(aarch64) || defined(riscv64) ++#if defined(i586) || defined(amd64) || defined(ppc64) || defined(ppc64le) || defined(aarch64) || defined(riscv64) || defined(loongarch64) + extern "C" + JNIEXPORT jlongArray JNICALL Java_sun_jvm_hotspot_debugger_linux_LinuxDebuggerLocal_getThreadIntegerRegisterSet0 + (JNIEnv *env, jobject this_obj, jint lwp_id) { +@@ -443,6 +454,9 @@ + #ifdef aarch64 + #define NPRGREG sun_jvm_hotspot_debugger_aarch64_AARCH64ThreadContext_NPRGREG + #endif ++#ifdef loongarch64 ++#define NPRGREG sun_jvm_hotspot_debugger_loongarch64_LOONGARCH64ThreadContext_NPRGREG ++#endif + #ifdef riscv64 + #define NPRGREG sun_jvm_hotspot_debugger_riscv64_RISCV64ThreadContext_NPRGREG + #endif +@@ -522,6 +536,18 @@ + } + #endif /* aarch64 */ + ++#if defined(loongarch64) ++ ++#define REG_INDEX(reg) sun_jvm_hotspot_debugger_loongarch64_LOONGARCH64ThreadContext_##reg ++ ++ { ++ int i; ++ for (i = 0; i < 31; i++) ++ regs[i] = gregs.regs[i]; ++ regs[REG_INDEX(PC)] = gregs.csr_era; ++ } ++#endif /* loongarch64 */ ++ + #if defined(riscv64) + #define REG_INDEX(reg) sun_jvm_hotspot_debugger_riscv64_RISCV64ThreadContext_##reg + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c b/src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c +--- a/src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c 2024-02-20 10:42:37.642195613 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + #include + #include + #include +@@ -143,7 +149,7 @@ + return false; + } + return true; +-#elif defined(PTRACE_GETREGS_REQ) ++#elif defined(PTRACE_GETREGS_REQ) && !defined(loongarch64) + if (ptrace(PTRACE_GETREGS_REQ, pid, NULL, user) < 0) { + print_debug("ptrace(PTRACE_GETREGS, ...) failed for lwp(%d) errno(%d) \"%s\"\n", pid, + errno, strerror(errno)); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java 2024-02-20 10:42:37.655528938 +0800 +@@ -23,6 +23,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2019, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + package sun.jvm.hotspot.debugger.linux; + + import java.io.*; +@@ -33,12 +39,14 @@ + import sun.jvm.hotspot.debugger.x86.*; + import sun.jvm.hotspot.debugger.amd64.*; + import sun.jvm.hotspot.debugger.aarch64.*; ++import sun.jvm.hotspot.debugger.loongarch64.*; + import sun.jvm.hotspot.debugger.riscv64.*; + import sun.jvm.hotspot.debugger.ppc64.*; + import sun.jvm.hotspot.debugger.linux.x86.*; + import sun.jvm.hotspot.debugger.linux.amd64.*; + import sun.jvm.hotspot.debugger.linux.ppc64.*; + import sun.jvm.hotspot.debugger.linux.aarch64.*; ++import sun.jvm.hotspot.debugger.linux.loongarch64.*; + import sun.jvm.hotspot.debugger.linux.riscv64.*; + import sun.jvm.hotspot.utilities.*; + +@@ -93,7 +101,14 @@ + Address pc = context.getRegisterAsAddress(AMD64ThreadContext.RIP); + if (pc == null) return null; + return LinuxAMD64CFrame.getTopFrame(dbg, pc, context); +- } else if (cpu.equals("ppc64")) { ++ } else if (cpu.equals("loongarch64")) { ++ LOONGARCH64ThreadContext context = (LOONGARCH64ThreadContext) thread.getContext(); ++ Address fp = context.getRegisterAsAddress(LOONGARCH64ThreadContext.FP); ++ if (fp == null) return null; ++ Address pc = context.getRegisterAsAddress(LOONGARCH64ThreadContext.PC); ++ if (pc == null) return null; ++ return new LinuxLOONGARCH64CFrame(dbg, fp, pc); ++ } else if (cpu.equals("ppc64")) { + PPC64ThreadContext context = (PPC64ThreadContext) thread.getContext(); + Address sp = context.getRegisterAsAddress(PPC64ThreadContext.SP); + if (sp == null) return null; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxThreadContextFactory.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxThreadContextFactory.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxThreadContextFactory.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxThreadContextFactory.java 2024-02-20 10:42:37.655528938 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2019, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + package sun.jvm.hotspot.debugger.linux; + + import java.lang.reflect.*; +@@ -29,6 +35,7 @@ + import sun.jvm.hotspot.debugger.linux.amd64.*; + import sun.jvm.hotspot.debugger.linux.x86.*; + import sun.jvm.hotspot.debugger.linux.ppc64.*; ++import sun.jvm.hotspot.debugger.linux.loongarch64.*; + + class LinuxThreadContextFactory { + static ThreadContext createThreadContext(LinuxDebugger dbg) { +@@ -37,7 +44,9 @@ + return new LinuxX86ThreadContext(dbg); + } else if (cpu.equals("amd64")) { + return new LinuxAMD64ThreadContext(dbg); +- } else if (cpu.equals("ppc64")) { ++ } else if (cpu.equals("loongarch64")) { ++ return new LinuxLOONGARCH64ThreadContext(dbg); ++ } else if (cpu.equals("ppc64")) { + return new LinuxPPC64ThreadContext(dbg); + } else { + try { +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/loongarch64/LinuxLOONGARCH64CFrame.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/loongarch64/LinuxLOONGARCH64CFrame.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/loongarch64/LinuxLOONGARCH64CFrame.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/loongarch64/LinuxLOONGARCH64CFrame.java 2024-02-20 10:42:37.655528938 +0800 +@@ -0,0 +1,92 @@ ++/* ++ * Copyright (c) 2003, 2012, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++package sun.jvm.hotspot.debugger.linux.loongarch64; ++ ++import sun.jvm.hotspot.debugger.*; ++import sun.jvm.hotspot.debugger.linux.*; ++import sun.jvm.hotspot.debugger.cdbg.*; ++import sun.jvm.hotspot.debugger.cdbg.basic.*; ++import sun.jvm.hotspot.debugger.loongarch64.*; ++ ++final public class LinuxLOONGARCH64CFrame extends BasicCFrame { ++ // package/class internals only ++ public LinuxLOONGARCH64CFrame(LinuxDebugger dbg, Address fp, Address pc) { ++ super(dbg.getCDebugger()); ++ this.fp = fp; ++ this.pc = pc; ++ this.dbg = dbg; ++ } ++ ++ // override base class impl to avoid ELF parsing ++ public ClosestSymbol closestSymbolToPC() { ++ // try native lookup in debugger. 
++ return dbg.lookup(dbg.getAddressValue(pc())); ++ } ++ ++ public Address pc() { ++ return pc; ++ } ++ ++ public Address localVariableBase() { ++ return fp; ++ } ++ ++ public CFrame sender(ThreadProxy thread) { ++ LOONGARCH64ThreadContext context = (LOONGARCH64ThreadContext) thread.getContext(); ++ Address sp = context.getRegisterAsAddress(LOONGARCH64ThreadContext.SP); ++ Address nextFP; ++ Address nextPC; ++ ++ if ((fp == null) || fp.lessThan(sp)) { ++ return null; ++ } ++ ++ try { ++ nextFP = fp.getAddressAt(-2 * ADDRESS_SIZE); ++ } catch (Exception e) { ++ return null; ++ } ++ if (nextFP == null) { ++ return null; ++ } ++ ++ try { ++ nextPC = fp.getAddressAt(-1 * ADDRESS_SIZE); ++ } catch (Exception e) { ++ return null; ++ } ++ if (nextPC == null) { ++ return null; ++ } ++ ++ return new LinuxLOONGARCH64CFrame(dbg, nextFP, nextPC); ++ } ++ ++ private static final int ADDRESS_SIZE = 8; ++ private Address pc; ++ private Address fp; ++ private LinuxDebugger dbg; ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/loongarch64/LinuxLOONGARCH64ThreadContext.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/loongarch64/LinuxLOONGARCH64ThreadContext.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/loongarch64/LinuxLOONGARCH64ThreadContext.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/loongarch64/LinuxLOONGARCH64ThreadContext.java 2024-02-20 10:42:37.655528938 +0800 +@@ -0,0 +1,47 @@ ++/* ++ * Copyright (c) 2003, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2015, 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ *
++ */
++
++package sun.jvm.hotspot.debugger.loongarch64;
++
++import java.lang.annotation.Native;
++
++import sun.jvm.hotspot.debugger.*;
++import sun.jvm.hotspot.debugger.cdbg.*;
++
++/** Specifies the thread context on loongarch64 platforms; only a sub-portion
++    of the context is guaranteed to be present on all operating
++    systems. */
++
++public abstract class LOONGARCH64ThreadContext implements ThreadContext {
++
++    // NOTE: the indices for the various registers must be maintained as
++    // listed across various operating systems. However, only a small
++    // subset of the registers' values are guaranteed to be present (and
++    // must be present for the SA's stack walking to work): notably SP,
++    // FP, and PC.
++
++    // One instance of the Native annotation is enough to trigger header generation
++    // for this file.
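++
++    // The indices below follow the LoongArch integer register numbering
++    // ($r0..$r31: zero, ra, tp, sp, a0-a7, t0-t8, the reserved register, fp,
++    // s0-s8), with PC kept in one extra slot; the LinuxDebuggerLocal.cpp hunk
++    // above copies the first 31 slots from user_pt_regs and fills PC from
++    // csr_era.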
++ @Native ++ public static final int ZERO = 0; ++ public static final int RA = 1; ++ public static final int TP = 2; ++ public static final int SP = 3; ++ public static final int A0 = 4; ++ public static final int A1 = 5; ++ public static final int A2 = 6; ++ public static final int A3 = 7; ++ public static final int A4 = 8; ++ public static final int A5 = 9; ++ public static final int A6 = 10; ++ public static final int A7 = 11; ++ public static final int T0 = 12; ++ public static final int T1 = 13; ++ public static final int T2 = 14; ++ public static final int T3 = 15; ++ public static final int T4 = 16; ++ public static final int T5 = 17; ++ public static final int T6 = 18; ++ public static final int T7 = 19; ++ public static final int T8 = 20; ++ public static final int RX = 21; ++ public static final int FP = 22; ++ public static final int S0 = 23; ++ public static final int S1 = 24; ++ public static final int S2 = 25; ++ public static final int S3 = 26; ++ public static final int S4 = 27; ++ public static final int S5 = 28; ++ public static final int S6 = 29; ++ public static final int S7 = 30; ++ public static final int S8 = 31; ++ public static final int PC = 32; ++ public static final int NPRGREG = 33; ++ ++ private static final String[] regNames = { ++ "ZERO", "RA", "TP", "SP", ++ "A0", "A1", "A2", "A3", ++ "A4", "A5", "A6", "A7", ++ "T0", "T1", "T2", "T3", ++ "T4", "T5", "T6", "T7", ++ "T8", "RX", "FP", "S0", ++ "S1", "S2", "S3", "S4", ++ "S5", "S6", "S7", "S8", ++ "PC" ++ }; ++ ++ private long[] data; ++ ++ public LOONGARCH64ThreadContext() { ++ data = new long[NPRGREG]; ++ } ++ ++ public int getNumRegisters() { ++ return NPRGREG; ++ } ++ ++ public String getRegisterName(int index) { ++ return regNames[index]; ++ } ++ ++ public void setRegister(int index, long value) { ++ data[index] = value; ++ } ++ ++ public long getRegister(int index) { ++ return data[index]; ++ } ++ ++ public CFrame getTopFrame(Debugger dbg) { ++ return null; ++ } ++ ++ /** This can't be implemented in this class since we would have to ++ tie the implementation to, for example, the debugging system */ ++ public abstract void setRegisterAsAddress(int index, Address value); ++ ++ /** This can't be implemented in this class since we would have to ++ tie the implementation to, for example, the debugging system */ ++ public abstract Address getRegisterAsAddress(int index); ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/MachineDescriptionLOONGARCH64.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/MachineDescriptionLOONGARCH64.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/MachineDescriptionLOONGARCH64.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/MachineDescriptionLOONGARCH64.java 2024-02-20 10:42:37.652195606 +0800 +@@ -0,0 +1,41 @@ ++/* ++ * Copyright (c) 2000, 2008, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. 
++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++package sun.jvm.hotspot.debugger; ++ ++public class MachineDescriptionLOONGARCH64 extends MachineDescriptionTwosComplement implements MachineDescription { ++ public long getAddressSize() { ++ return 8; ++ } ++ ++ ++ public boolean isBigEndian() { ++ return false; ++ } ++ ++ public boolean isLP64() { ++ return true; ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/posix/elf/ELFHeader.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/posix/elf/ELFHeader.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/posix/elf/ELFHeader.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/posix/elf/ELFHeader.java 2024-02-20 10:42:37.658862267 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2021, These ++ * modifications are Copyright (c) 2019, 2021, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package sun.jvm.hotspot.debugger.posix.elf; + + import java.io.FileInputStream; +@@ -63,6 +69,8 @@ + public static final int ARCH_i860 = 7; + /** MIPS architecture type. */ + public static final int ARCH_MIPS = 8; ++ /** LOONGARCH architecture type. */ ++ public static final int ARCH_LOONGARCH = 9; + + /** Returns a file type which is defined by the file type constants. */ + public short getFileType(); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64ThreadContext.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64ThreadContext.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64ThreadContext.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64ThreadContext.java 2024-02-20 10:42:37.658862267 +0800 +@@ -0,0 +1,51 @@ ++/* ++ * Copyright (c) 2002, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++package sun.jvm.hotspot.debugger.remote.loongarch64; ++ ++import sun.jvm.hotspot.debugger.*; ++import sun.jvm.hotspot.debugger.loongarch64.*; ++import sun.jvm.hotspot.debugger.remote.*; ++ ++public class RemoteLOONGARCH64ThreadContext extends LOONGARCH64ThreadContext { ++ private RemoteDebuggerClient debugger; ++ ++ public RemoteLOONGARCH64ThreadContext(RemoteDebuggerClient debugger) { ++ super(); ++ this.debugger = debugger; ++ } ++ ++ /** This can't be implemented in this class since we would have to ++ tie the implementation to, for example, the debugging system */ ++ public void setRegisterAsAddress(int index, Address value) { ++ setRegister(index, debugger.getAddressValue(value)); ++ } ++ ++ /** This can't be implemented in this class since we would have to ++ tie the implementation to, for example, the debugging system */ ++ public Address getRegisterAsAddress(int index) { ++ return debugger.newAddress(getRegister(index)); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64ThreadFactory.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64ThreadFactory.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64ThreadFactory.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64ThreadFactory.java 2024-02-20 10:42:37.658862267 +0800 +@@ -0,0 +1,45 @@ ++/* ++ * Copyright (c) 2002, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++package sun.jvm.hotspot.debugger.remote.loongarch64; ++ ++import sun.jvm.hotspot.debugger.*; ++import sun.jvm.hotspot.debugger.remote.*; ++ ++public class RemoteLOONGARCH64ThreadFactory implements RemoteThreadFactory { ++ private RemoteDebuggerClient debugger; ++ ++ public RemoteLOONGARCH64ThreadFactory(RemoteDebuggerClient debugger) { ++ this.debugger = debugger; ++ } ++ ++ public ThreadProxy createThreadWrapper(Address threadIdentifierAddr) { ++ return new RemoteLOONGARCH64Thread(debugger, threadIdentifierAddr); ++ } ++ ++ public ThreadProxy createThreadWrapper(long id) { ++ return new RemoteLOONGARCH64Thread(debugger, id); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64Thread.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64Thread.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64Thread.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/loongarch64/RemoteLOONGARCH64Thread.java 2024-02-20 10:42:37.658862267 +0800 +@@ -0,0 +1,54 @@ ++/* ++ * Copyright (c) 2002, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++package sun.jvm.hotspot.debugger.remote.loongarch64; ++ ++import sun.jvm.hotspot.debugger.*; ++import sun.jvm.hotspot.debugger.loongarch64.*; ++import sun.jvm.hotspot.debugger.remote.*; ++import sun.jvm.hotspot.utilities.*; ++ ++public class RemoteLOONGARCH64Thread extends RemoteThread { ++ public RemoteLOONGARCH64Thread(RemoteDebuggerClient debugger, Address addr) { ++ super(debugger, addr); ++ } ++ ++ public RemoteLOONGARCH64Thread(RemoteDebuggerClient debugger, long id) { ++ super(debugger, id); ++ } ++ ++ public ThreadContext getContext() throws IllegalThreadStateException { ++ RemoteLOONGARCH64ThreadContext context = new RemoteLOONGARCH64ThreadContext(debugger); ++ long[] regs = (addr != null)? 
debugger.getThreadIntegerRegisterSet(addr) : ++ debugger.getThreadIntegerRegisterSet(id); ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(regs.length == LOONGARCH64ThreadContext.NPRGREG, "size of register set must match"); ++ } ++ for (int i = 0; i < regs.length; i++) { ++ context.setRegister(i, regs[i]); ++ } ++ return context; ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/RemoteDebuggerClient.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/RemoteDebuggerClient.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/RemoteDebuggerClient.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/remote/RemoteDebuggerClient.java 2024-02-20 10:42:37.658862267 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2019, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package sun.jvm.hotspot.debugger.remote; + + import java.rmi.*; +@@ -33,6 +39,7 @@ + import sun.jvm.hotspot.debugger.remote.x86.*; + import sun.jvm.hotspot.debugger.remote.amd64.*; + import sun.jvm.hotspot.debugger.remote.ppc64.*; ++import sun.jvm.hotspot.debugger.remote.loongarch64.*; + + /** An implementation of Debugger which wraps a + RemoteDebugger, providing remote debugging via RMI. +@@ -61,6 +68,10 @@ + threadFactory = new RemoteAMD64ThreadFactory(this); + } else if (cpu.equals("ppc64")) { + threadFactory = new RemotePPC64ThreadFactory(this); ++ } else if (cpu.equals("loongarch64")) { ++ threadFactory = new RemoteLOONGARCH64ThreadFactory(this); ++ cachePageSize = 4096; ++ cacheNumPages = parseCacheNumPagesProperty(cacheSize / cachePageSize); + } else { + try { + Class tf = Class.forName("sun.jvm.hotspot.debugger.remote." + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/HotSpotAgent.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/HotSpotAgent.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/HotSpotAgent.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/HotSpotAgent.java 2024-02-20 10:42:37.645528945 +0800 +@@ -23,6 +23,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2018, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ * ++ */ + package sun.jvm.hotspot; + + import java.rmi.RemoteException; +@@ -38,6 +44,7 @@ + import sun.jvm.hotspot.debugger.MachineDescriptionAArch64; + import sun.jvm.hotspot.debugger.MachineDescriptionRISCV64; + import sun.jvm.hotspot.debugger.MachineDescriptionIntelX86; ++import sun.jvm.hotspot.debugger.MachineDescriptionLOONGARCH64; + import sun.jvm.hotspot.debugger.NoSuchSymbolException; + import sun.jvm.hotspot.debugger.bsd.BsdDebuggerLocal; + import sun.jvm.hotspot.debugger.linux.LinuxDebuggerLocal; +@@ -557,6 +564,8 @@ + machDesc = new MachineDescriptionAArch64(); + } else if (cpu.equals("riscv64")) { + machDesc = new MachineDescriptionRISCV64(); ++ } else if (cpu.equals("loongarch64")) { ++ machDesc = new MachineDescriptionLOONGARCH64(); + } else { + try { + machDesc = (MachineDescription) +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/linux_loongarch64/LinuxLOONGARCH64JavaThreadPDAccess.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/linux_loongarch64/LinuxLOONGARCH64JavaThreadPDAccess.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/linux_loongarch64/LinuxLOONGARCH64JavaThreadPDAccess.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/linux_loongarch64/LinuxLOONGARCH64JavaThreadPDAccess.java 2024-02-20 10:42:37.678862251 +0800 +@@ -0,0 +1,139 @@ ++/* ++ * Copyright (c) 2014, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++package sun.jvm.hotspot.runtime.linux_loongarch64; ++ ++import java.io.*; ++import java.util.*; ++ ++import sun.jvm.hotspot.debugger.*; ++import sun.jvm.hotspot.debugger.loongarch64.*; ++import sun.jvm.hotspot.runtime.*; ++import sun.jvm.hotspot.runtime.loongarch64.*; ++import sun.jvm.hotspot.types.*; ++import sun.jvm.hotspot.utilities.*; ++import sun.jvm.hotspot.utilities.Observable; ++import sun.jvm.hotspot.utilities.Observer; ++ ++public class LinuxLOONGARCH64JavaThreadPDAccess implements JavaThreadPDAccess { ++ private static AddressField lastJavaFPField; ++ private static AddressField osThreadField; ++ ++ // Field from OSThread ++ private static CIntegerField osThreadThreadIDField; ++ ++ // This is currently unneeded but is being kept in case we change ++ // the currentFrameGuess algorithm ++ private static final long GUESS_SCAN_RANGE = 128 * 1024; ++ ++ static { ++ VM.registerVMInitializedObserver(new Observer() { ++ public void update(Observable o, Object data) { ++ initialize(VM.getVM().getTypeDataBase()); ++ } ++ }); ++ } ++ ++ private static synchronized void initialize(TypeDataBase db) { ++ Type type = db.lookupType("JavaThread"); ++ osThreadField = type.getAddressField("_osthread"); ++ ++ Type anchorType = db.lookupType("JavaFrameAnchor"); ++ lastJavaFPField = anchorType.getAddressField("_last_Java_fp"); ++ ++ Type osThreadType = db.lookupType("OSThread"); ++ osThreadThreadIDField = osThreadType.getCIntegerField("_thread_id"); ++ } ++ ++ public Address getLastJavaFP(Address addr) { ++ return lastJavaFPField.getValue(addr.addOffsetTo(sun.jvm.hotspot.runtime.JavaThread.getAnchorField().getOffset())); ++ } ++ ++ public Address getLastJavaPC(Address addr) { ++ return null; ++ } ++ ++ public Address getBaseOfStackPointer(Address addr) { ++ return null; ++ } ++ ++ public Frame getLastFramePD(JavaThread thread, Address addr) { ++ Address fp = thread.getLastJavaFP(); ++ if (fp == null) { ++ return null; // no information ++ } ++ return new LOONGARCH64Frame(thread.getLastJavaSP(), fp); ++ } ++ ++ public RegisterMap newRegisterMap(JavaThread thread, boolean updateMap) { ++ return new LOONGARCH64RegisterMap(thread, updateMap); ++ } ++ ++ public Frame getCurrentFrameGuess(JavaThread thread, Address addr) { ++ ThreadProxy t = getThreadProxy(addr); ++ LOONGARCH64ThreadContext context = (LOONGARCH64ThreadContext) t.getContext(); ++ LOONGARCH64CurrentFrameGuess guesser = new LOONGARCH64CurrentFrameGuess(context, thread); ++ if (!guesser.run(GUESS_SCAN_RANGE)) { ++ return null; ++ } ++ if (guesser.getPC() == null) { ++ return new LOONGARCH64Frame(guesser.getSP(), guesser.getFP()); ++ } else if (VM.getVM().getInterpreter().contains(guesser.getPC())) { ++ // pass the value of S0 which contains the bcp for the top level frame ++ Address bcp = context.getRegisterAsAddress(LOONGARCH64ThreadContext.S0); ++ return new LOONGARCH64Frame(guesser.getSP(), guesser.getFP(), guesser.getPC(), null, bcp); ++ } else { ++ return new LOONGARCH64Frame(guesser.getSP(), guesser.getFP(), guesser.getPC()); ++ } ++ } ++ ++ public void printThreadIDOn(Address addr, PrintStream tty) { ++ tty.print(getThreadProxy(addr)); ++ } ++ ++ public void printInfoOn(Address threadAddr, PrintStream tty) { ++ tty.print("Thread id: "); ++ printThreadIDOn(threadAddr, tty); ++ // tty.println("\nPostJavaState: " + getPostJavaState(threadAddr)); ++ } ++ ++ public Address getLastSP(Address addr) { ++ ThreadProxy t = getThreadProxy(addr); ++ LOONGARCH64ThreadContext context = (LOONGARCH64ThreadContext) t.getContext(); ++ 
return context.getRegisterAsAddress(LOONGARCH64ThreadContext.SP); ++ } ++ ++ public ThreadProxy getThreadProxy(Address addr) { ++ // Addr is the address of the JavaThread. ++ // Fetch the OSThread (for now and for simplicity, not making a ++ // separate "OSThread" class in this package) ++ Address osThreadAddr = osThreadField.getValue(addr); ++ // Get the address of the _thread_id from the OSThread ++ Address threadIdAddr = osThreadAddr.addOffsetTo(osThreadThreadIDField.getOffset()); ++ ++ JVMDebugger debugger = VM.getVM().getDebugger(); ++ return debugger.getThreadForIdentifierAddress(threadIdAddr); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64CurrentFrameGuess.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64CurrentFrameGuess.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64CurrentFrameGuess.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64CurrentFrameGuess.java 2024-02-20 10:42:37.678862251 +0800 +@@ -0,0 +1,250 @@ ++/* ++ * Copyright (c) 2001, 2006, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++package sun.jvm.hotspot.runtime.loongarch64; ++ ++import sun.jvm.hotspot.debugger.*; ++import sun.jvm.hotspot.debugger.loongarch64.*; ++import sun.jvm.hotspot.code.*; ++import sun.jvm.hotspot.interpreter.*; ++import sun.jvm.hotspot.runtime.*; ++ ++/**

Should be able to be used on all loongarch64 platforms we support
++    (currently Linux) to implement JavaThread's
++    "currentFrameGuess()" functionality. Input is an LOONGARCH64ThreadContext;
++    output is SP, FP, and PC for an LOONGARCH64Frame. Instantiation of the
++    LOONGARCH64Frame is left to the caller, since we may need to subclass
++    LOONGARCH64Frame to support signal handler frames on Unix platforms.
++
++    Algorithm is to walk up the stack within a given range (say,
++    512K at most) looking for a plausible PC and SP for a Java frame,
++    also considering those coming in from the context. If we find a PC
++    that belongs to the VM (i.e., in generated code like the
++    interpreter or CodeCache) then we try to find an associated FP.
++    We repeat this until we either find a complete frame or run out of
++    stack to look at.
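++
++    A minimal usage sketch (mirroring getCurrentFrameGuess in
++    LinuxLOONGARCH64JavaThreadPDAccess elsewhere in this patch; "context" and
++    "thread" stand for the caller's LOONGARCH64ThreadContext and JavaThread):
++
++      LOONGARCH64CurrentFrameGuess guesser =
++          new LOONGARCH64CurrentFrameGuess(context, thread);
++      if (guesser.run(128 * 1024)) {           // scan at most 128K of stack
++        Address pc = guesser.getPC();          // may be null for runtime frames
++        Frame top = (pc == null)
++            ? new LOONGARCH64Frame(guesser.getSP(), guesser.getFP())
++            : new LOONGARCH64Frame(guesser.getSP(), guesser.getFP(), pc);
++      }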

*/ ++ ++public class LOONGARCH64CurrentFrameGuess { ++ private LOONGARCH64ThreadContext context; ++ private JavaThread thread; ++ private Address spFound; ++ private Address fpFound; ++ private Address pcFound; ++ ++ private static final boolean DEBUG = System.getProperty("sun.jvm.hotspot.runtime.loongarch64.LOONGARCH64Frame.DEBUG") ++ != null; ++ ++ public LOONGARCH64CurrentFrameGuess(LOONGARCH64ThreadContext context, ++ JavaThread thread) { ++ this.context = context; ++ this.thread = thread; ++ } ++ ++ /** Returns false if not able to find a frame within a reasonable range. */ ++ public boolean run(long regionInBytesToSearch) { ++ Address sp = context.getRegisterAsAddress(LOONGARCH64ThreadContext.SP); ++ Address pc = context.getRegisterAsAddress(LOONGARCH64ThreadContext.PC); ++ Address fp = context.getRegisterAsAddress(LOONGARCH64ThreadContext.FP); ++ if (sp == null) { ++ // Bail out if no last java frame eithe ++ if (thread.getLastJavaSP() != null) { ++ setValues(thread.getLastJavaSP(), thread.getLastJavaFP(), null); ++ return true; ++ } ++ // Bail out ++ return false; ++ } ++ Address end = sp.addOffsetTo(regionInBytesToSearch); ++ VM vm = VM.getVM(); ++ ++ setValues(null, null, null); // Assume we're not going to find anything ++ ++ if (vm.isJavaPCDbg(pc)) { ++ if (vm.isClientCompiler()) { ++ // If the topmost frame is a Java frame, we are (pretty much) ++ // guaranteed to have a viable EBP. We should be more robust ++ // than this (we have the potential for losing entire threads' ++ // stack traces) but need to see how much work we really have ++ // to do here. Searching the stack for an (SP, FP) pair is ++ // hard since it's easy to misinterpret inter-frame stack ++ // pointers as base-of-frame pointers; we also don't know the ++ // sizes of C1 frames (not registered in the nmethod) so can't ++ // derive them from ESP. ++ ++ setValues(sp, fp, pc); ++ return true; ++ } else { ++ if (vm.getInterpreter().contains(pc)) { ++ if (DEBUG) { ++ System.out.println("CurrentFrameGuess: choosing interpreter frame: sp = " + ++ sp + ", fp = " + fp + ", pc = " + pc); ++ } ++ setValues(sp, fp, pc); ++ return true; ++ } ++ ++ // For the server compiler, EBP is not guaranteed to be valid ++ // for compiled code. In addition, an earlier attempt at a ++ // non-searching algorithm (see below) failed because the ++ // stack pointer from the thread context was pointing ++ // (considerably) beyond the ostensible end of the stack, into ++ // garbage; walking from the topmost frame back caused a crash. ++ // ++ // This algorithm takes the current PC as a given and tries to ++ // find the correct corresponding SP by walking up the stack ++ // and repeatedly performing stackwalks (very inefficient). ++ // ++ // FIXME: there is something wrong with stackwalking across ++ // adapter frames...this is likely to be the root cause of the ++ // failure with the simpler algorithm below. ++ ++ for (long offset = 0; ++ offset < regionInBytesToSearch; ++ offset += vm.getAddressSize()) { ++ try { ++ Address curSP = sp.addOffsetTo(offset); ++ Frame frame = new LOONGARCH64Frame(curSP, null, pc); ++ RegisterMap map = thread.newRegisterMap(false); ++ while (frame != null) { ++ if (frame.isEntryFrame() && frame.entryFrameIsFirst()) { ++ // We were able to traverse all the way to the ++ // bottommost Java frame. ++ // This sp looks good. Keep it. 
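++                // (No FP is recovered on this path; the caller can still build
++                // a frame from SP and PC alone.)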
++ if (DEBUG) { ++ System.out.println("CurrentFrameGuess: Choosing sp = " + curSP + ", pc = " + pc); ++ } ++ setValues(curSP, null, pc); ++ return true; ++ } ++ frame = frame.sender(map); ++ } ++ } catch (Exception e) { ++ if (DEBUG) { ++ System.out.println("CurrentFrameGuess: Exception " + e + " at offset " + offset); ++ } ++ // Bad SP. Try another. ++ } ++ } ++ ++ // Were not able to find a plausible SP to go with this PC. ++ // Bail out. ++ return false; ++ ++ /* ++ // Original algorithm which does not work because SP was ++ // pointing beyond where it should have: ++ ++ // For the server compiler, EBP is not guaranteed to be valid ++ // for compiled code. We see whether the PC is in the ++ // interpreter and take care of that, otherwise we run code ++ // (unfortunately) duplicated from LOONGARCH64Frame.senderForCompiledFrame. ++ ++ CodeCache cc = vm.getCodeCache(); ++ if (cc.contains(pc)) { ++ CodeBlob cb = cc.findBlob(pc); ++ ++ // See if we can derive a frame pointer from SP and PC ++ // NOTE: This is the code duplicated from LOONGARCH64Frame ++ Address saved_fp = null; ++ int llink_offset = cb.getLinkOffset(); ++ if (llink_offset >= 0) { ++ // Restore base-pointer, since next frame might be an interpreter frame. ++ Address fp_addr = sp.addOffsetTo(VM.getVM().getAddressSize() * llink_offset); ++ saved_fp = fp_addr.getAddressAt(0); ++ } ++ ++ setValues(sp, saved_fp, pc); ++ return true; ++ } ++ */ ++ } ++ } else { ++ // If the current program counter was not known to us as a Java ++ // PC, we currently assume that we are in the run-time system ++ // and attempt to look to thread-local storage for saved ESP and ++ // EBP. Note that if these are null (because we were, in fact, ++ // in Java code, i.e., vtable stubs or similar, and the SA ++ // didn't have enough insight into the target VM to understand ++ // that) then we are going to lose the entire stack trace for ++ // the thread, which is sub-optimal. FIXME. ++ ++ if (DEBUG) { ++ System.out.println("CurrentFrameGuess: choosing last Java frame: sp = " + ++ thread.getLastJavaSP() + ", fp = " + thread.getLastJavaFP()); ++ } ++ if (thread.getLastJavaSP() == null) { ++ return false; // No known Java frames on stack ++ } ++ ++ // The runtime has a nasty habit of not saving fp in the frame ++ // anchor, leaving us to grovel about in the stack to find a ++ // plausible address. Fortunately, this only happens in ++ // compiled code; there we always have a valid PC, and we always ++ // push LR and FP onto the stack as a pair, with FP at the lower ++ // address. ++ pc = thread.getLastJavaPC(); ++ fp = thread.getLastJavaFP(); ++ sp = thread.getLastJavaSP(); ++ ++ if (fp == null) { ++ CodeCache cc = vm.getCodeCache(); ++ if (cc.contains(pc)) { ++ CodeBlob cb = cc.findBlob(pc); ++ if (DEBUG) { ++ System.out.println("FP is null. Found blob frame size " + cb.getFrameSize()); ++ } ++ // See if we can derive a frame pointer from SP and PC ++ long link_offset = cb.getFrameSize() - 2 * VM.getVM().getAddressSize(); ++ if (link_offset >= 0) { ++ fp = sp.addOffsetTo(link_offset); ++ } ++ } ++ } ++ ++ // We found a PC in the frame anchor. Check that it's plausible, and ++ // if it is, use it. 
++ if (vm.isJavaPCDbg(pc)) { ++ setValues(sp, fp, pc); ++ } else { ++ setValues(sp, fp, null); ++ } ++ ++ return true; ++ } ++ } ++ ++ public Address getSP() { return spFound; } ++ public Address getFP() { return fpFound; } ++ /** May be null if getting values from thread-local storage; take ++ care to call the correct LOONGARCH64Frame constructor to recover this if ++ necessary */ ++ public Address getPC() { return pcFound; } ++ ++ private void setValues(Address sp, Address fp, Address pc) { ++ spFound = sp; ++ fpFound = fp; ++ pcFound = pc; ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64Frame.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64Frame.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64Frame.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64Frame.java 2024-02-20 10:42:37.678862251 +0800 +@@ -0,0 +1,540 @@ ++/* ++ * Copyright (c) 2001, 2015, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++package sun.jvm.hotspot.runtime.loongarch64; ++ ++import java.util.*; ++import sun.jvm.hotspot.code.*; ++import sun.jvm.hotspot.compiler.*; ++import sun.jvm.hotspot.debugger.*; ++import sun.jvm.hotspot.oops.*; ++import sun.jvm.hotspot.runtime.*; ++import sun.jvm.hotspot.types.*; ++import sun.jvm.hotspot.utilities.*; ++import sun.jvm.hotspot.utilities.Observable; ++import sun.jvm.hotspot.utilities.Observer; ++ ++/** Specialization of and implementation of abstract methods of the ++ Frame class for the loongarch64 family of CPUs. 
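++
++    As a rough picture, the FP-relative slot layout that the offset constants
++    below encode is (one word per slot, higher addresses first):
++
++      fp[ 0]  sender SP                  (SENDER_SP_OFFSET)
++      fp[-1]  return address             (RETURN_ADDR_OFFSET)
++      fp[-2]  saved FP / link            (LINK_OFFSET)
++      fp[-3]  interpreter sender SP      (INTERPRETER_FRAME_SENDER_SP_OFFSET)
++      ...
++      fp[-11] interpreter initial SP, also the monitor block
++              top/bottom                 (INTERPRETER_FRAME_INITIAL_SP_OFFSET)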
*/ ++ ++public class LOONGARCH64Frame extends Frame { ++ private static final boolean DEBUG; ++ static { ++ DEBUG = System.getProperty("sun.jvm.hotspot.runtime.loongarch64.LOONGARCH64Frame.DEBUG") != null; ++ } ++ ++ private static final int LINK_OFFSET = -2; ++ private static final int RETURN_ADDR_OFFSET = -1; ++ private static final int SENDER_SP_OFFSET = 0; ++ ++ // Interpreter frames ++ private static final int INTERPRETER_FRAME_SENDER_SP_OFFSET = -3; ++ private static final int INTERPRETER_FRAME_LAST_SP_OFFSET = INTERPRETER_FRAME_SENDER_SP_OFFSET - 1; ++ private static final int INTERPRETER_FRAME_LOCALS_OFFSET = INTERPRETER_FRAME_LAST_SP_OFFSET - 1; ++ private static final int INTERPRETER_FRAME_METHOD_OFFSET = INTERPRETER_FRAME_LOCALS_OFFSET - 1; ++ private static final int INTERPRETER_FRAME_MIRROR_OFFSET = INTERPRETER_FRAME_METHOD_OFFSET - 1; ++ private static final int INTERPRETER_FRAME_MDX_OFFSET = INTERPRETER_FRAME_MIRROR_OFFSET - 1; ++ private static final int INTERPRETER_FRAME_CACHE_OFFSET = INTERPRETER_FRAME_MDX_OFFSET - 1; ++ private static final int INTERPRETER_FRAME_BCX_OFFSET = INTERPRETER_FRAME_CACHE_OFFSET - 1; ++ private static final int INTERPRETER_FRAME_INITIAL_SP_OFFSET = INTERPRETER_FRAME_BCX_OFFSET - 1; ++ private static final int INTERPRETER_FRAME_MONITOR_BLOCK_TOP_OFFSET = INTERPRETER_FRAME_INITIAL_SP_OFFSET; ++ private static final int INTERPRETER_FRAME_MONITOR_BLOCK_BOTTOM_OFFSET = INTERPRETER_FRAME_INITIAL_SP_OFFSET; ++ ++ // Entry frames ++ private static final int ENTRY_FRAME_CALL_WRAPPER_OFFSET = -3; ++ ++ private static VMReg fp = new VMReg(22 << 1); ++ ++ // an additional field beyond sp and pc: ++ Address raw_fp; // frame pointer ++ private Address raw_unextendedSP; ++ private Address live_bcp; ++ ++ private LOONGARCH64Frame() { ++ } ++ ++ private void adjustForDeopt() { ++ if ( pc != null) { ++ // Look for a deopt pc and if it is deopted convert to original pc ++ CodeBlob cb = VM.getVM().getCodeCache().findBlob(pc); ++ if (cb != null && cb.isJavaMethod()) { ++ NMethod nm = (NMethod) cb; ++ if (pc.equals(nm.deoptHandlerBegin())) { ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(this.getUnextendedSP() != null, "null SP in Java frame"); ++ } ++ // adjust pc if frame is deoptimized. 
++ pc = this.getUnextendedSP().getAddressAt(nm.origPCOffset()); ++ deoptimized = true; ++ } ++ } ++ } ++ } ++ ++ private void initFrame(Address raw_sp, Address raw_fp, Address pc, Address raw_unextendedSp, Address live_bcp) { ++ this.raw_sp = raw_sp; ++ this.raw_fp = raw_fp; ++ if (raw_unextendedSp == null) { ++ this.raw_unextendedSP = raw_sp; ++ } else { ++ this.raw_unextendedSP = raw_unextendedSp; ++ } ++ if (pc == null) { ++ this.pc = raw_sp.getAddressAt(-1 * VM.getVM().getAddressSize()); ++ } else { ++ this.pc = pc; ++ } ++ this.live_bcp = live_bcp; ++ adjustUnextendedSP(); ++ ++ // Frame must be fully constructed before this call ++ adjustForDeopt(); ++ } ++ ++ public LOONGARCH64Frame(Address raw_sp, Address raw_fp, Address pc) { ++ initFrame(raw_sp, raw_fp, pc, null, null); ++ ++ if (DEBUG) { ++ System.out.println("LOONGARCH64Frame(sp, fp, pc): " + this); ++ dumpStack(); ++ } ++ } ++ ++ public LOONGARCH64Frame(Address raw_sp, Address raw_fp) { ++ initFrame(raw_sp, raw_fp, null, null, null); ++ ++ if (DEBUG) { ++ System.out.println("LOONGARCH64Frame(sp, fp): " + this); ++ dumpStack(); ++ } ++ } ++ ++ public LOONGARCH64Frame(Address raw_sp, Address raw_unextendedSp, Address raw_fp, Address pc) { ++ initFrame(raw_sp, raw_fp, pc, raw_unextendedSp, null); ++ ++ if (DEBUG) { ++ System.out.println("LOONGARCH64Frame(sp, unextendedSP, fp, pc): " + this); ++ dumpStack(); ++ } ++ ++ } ++ ++ public LOONGARCH64Frame(Address raw_sp, Address raw_fp, Address pc, Address raw_unextendedSp, Address live_bcp) { ++ initFrame(raw_sp, raw_fp, pc, raw_unextendedSp, live_bcp); ++ ++ if (DEBUG) { ++ System.out.println("LOONGARCH64Frame(sp, fp, pc, unextendedSP, live_bcp): " + this); ++ dumpStack(); ++ } ++ } ++ ++ public Object clone() { ++ LOONGARCH64Frame frame = new LOONGARCH64Frame(); ++ frame.raw_sp = raw_sp; ++ frame.raw_unextendedSP = raw_unextendedSP; ++ frame.raw_fp = raw_fp; ++ frame.pc = pc; ++ frame.deoptimized = deoptimized; ++ frame.live_bcp = live_bcp; ++ return frame; ++ } ++ ++ public boolean equals(Object arg) { ++ if (arg == null) { ++ return false; ++ } ++ ++ if (!(arg instanceof LOONGARCH64Frame)) { ++ return false; ++ } ++ ++ LOONGARCH64Frame other = (LOONGARCH64Frame) arg; ++ ++ return (AddressOps.equal(getSP(), other.getSP()) && ++ AddressOps.equal(getUnextendedSP(), other.getUnextendedSP()) && ++ AddressOps.equal(getFP(), other.getFP()) && ++ AddressOps.equal(getPC(), other.getPC())); ++ } ++ ++ public int hashCode() { ++ if (raw_sp == null) { ++ return 0; ++ } ++ ++ return raw_sp.hashCode(); ++ } ++ ++ public String toString() { ++ return "sp: " + (getSP() == null? "null" : getSP().toString()) + ++ ", unextendedSP: " + (getUnextendedSP() == null? "null" : getUnextendedSP().toString()) + ++ ", fp: " + (getFP() == null? "null" : getFP().toString()) + ++ ", pc: " + (pc == null? 
"null" : pc.toString()); ++ } ++ ++ // accessors for the instance variables ++ public Address getFP() { return raw_fp; } ++ public Address getSP() { return raw_sp; } ++ public Address getID() { return raw_sp; } ++ ++ // FIXME: not implemented yet (should be done for Solaris/LOONGARCH) ++ public boolean isSignalHandlerFrameDbg() { return false; } ++ public int getSignalNumberDbg() { return 0; } ++ public String getSignalNameDbg() { return null; } ++ ++ public boolean isInterpretedFrameValid() { ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(isInterpretedFrame(), "Not an interpreted frame"); ++ } ++ ++ // These are reasonable sanity checks ++ if (getFP() == null || getFP().andWithMask(0x3) != null) { ++ return false; ++ } ++ ++ if (getSP() == null || getSP().andWithMask(0x3) != null) { ++ return false; ++ } ++ ++ if (getFP().addOffsetTo(INTERPRETER_FRAME_INITIAL_SP_OFFSET * VM.getVM().getAddressSize()).lessThan(getSP())) { ++ return false; ++ } ++ ++ // These are hacks to keep us out of trouble. ++ // The problem with these is that they mask other problems ++ if (getFP().lessThanOrEqual(getSP())) { ++ // this attempts to deal with unsigned comparison above ++ return false; ++ } ++ ++ if (getFP().minus(getSP()) > 4096 * VM.getVM().getAddressSize()) { ++ // stack frames shouldn't be large. ++ return false; ++ } ++ ++ return true; ++ } ++ ++ // FIXME: not applicable in current system ++ // void patch_pc(Thread* thread, address pc); ++ ++ public Frame sender(RegisterMap regMap, CodeBlob cb) { ++ LOONGARCH64RegisterMap map = (LOONGARCH64RegisterMap) regMap; ++ ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(map != null, "map must be set"); ++ } ++ ++ // Default is we done have to follow them. The sender_for_xxx will ++ // update it accordingly ++ map.setIncludeArgumentOops(false); ++ ++ if (isEntryFrame()) return senderForEntryFrame(map); ++ if (isInterpretedFrame()) return senderForInterpreterFrame(map); ++ ++ if(cb == null) { ++ cb = VM.getVM().getCodeCache().findBlob(getPC()); ++ } else { ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(cb.equals(VM.getVM().getCodeCache().findBlob(getPC())), "Must be the same"); ++ } ++ } ++ ++ if (cb != null) { ++ return senderForCompiledFrame(map, cb); ++ } ++ ++ // Must be native-compiled frame, i.e. the marshaling code for native ++ // methods that exists in the core system. 
++ return new LOONGARCH64Frame(getSenderSP(), getLink(), getSenderPC()); ++ } ++ ++ private Frame senderForEntryFrame(LOONGARCH64RegisterMap map) { ++ if (DEBUG) { ++ System.out.println("senderForEntryFrame"); ++ } ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(map != null, "map must be set"); ++ } ++ // Java frame called from C; skip all C frames and return top C ++ // frame of that chunk as the sender ++ LOONGARCH64JavaCallWrapper jcw = (LOONGARCH64JavaCallWrapper) getEntryFrameCallWrapper(); ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(!entryFrameIsFirst(), "next Java fp must be non zero"); ++ Assert.that(jcw.getLastJavaSP().greaterThan(getSP()), "must be above this frame on stack"); ++ } ++ LOONGARCH64Frame fr; ++ if (jcw.getLastJavaPC() != null) { ++ fr = new LOONGARCH64Frame(jcw.getLastJavaSP(), jcw.getLastJavaFP(), jcw.getLastJavaPC()); ++ } else { ++ fr = new LOONGARCH64Frame(jcw.getLastJavaSP(), jcw.getLastJavaFP()); ++ } ++ map.clear(); ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(map.getIncludeArgumentOops(), "should be set by clear"); ++ } ++ return fr; ++ } ++ ++ //------------------------------------------------------------------------------ ++ // frame::adjust_unextended_sp ++ private void adjustUnextendedSP() { ++ // On loongarch, sites calling method handle intrinsics and lambda forms are treated ++ // as any other call site. Therefore, no special action is needed when we are ++ // returning to any of these call sites. ++ ++ CodeBlob cb = cb(); ++ NMethod senderNm = (cb == null) ? null : cb.asNMethodOrNull(); ++ if (senderNm != null) { ++ // If the sender PC is a deoptimization point, get the original PC. ++ if (senderNm.isDeoptEntry(getPC()) || ++ senderNm.isDeoptMhEntry(getPC())) { ++ // DEBUG_ONLY(verifyDeoptriginalPc(senderNm, raw_unextendedSp)); ++ } ++ } ++ } ++ ++ private Frame senderForInterpreterFrame(LOONGARCH64RegisterMap map) { ++ if (DEBUG) { ++ System.out.println("senderForInterpreterFrame"); ++ } ++ Address unextendedSP = addressOfStackSlot(INTERPRETER_FRAME_SENDER_SP_OFFSET).getAddressAt(0); ++ Address sp = getSenderSP(); ++ // We do not need to update the callee-save register mapping because above ++ // us is either another interpreter frame or a converter-frame, but never ++ // directly a compiled frame. ++ // 11/24/04 SFG. With the removal of adapter frames this is no longer true. ++ // However c2 no longer uses callee save register for java calls so there ++ // are no callee register to find. 
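++    // We do, however, record where this frame saved its caller's FP, so the
++    // register map can relocate it while walking further up the stack.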
++ ++ if (map.getUpdateMap()) ++ updateMapWithSavedLink(map, addressOfStackSlot(LINK_OFFSET)); ++ ++ return new LOONGARCH64Frame(sp, unextendedSP, getLink(), getSenderPC()); ++ } ++ ++ private void updateMapWithSavedLink(RegisterMap map, Address savedFPAddr) { ++ map.setLocation(fp, savedFPAddr); ++ } ++ ++ private Frame senderForCompiledFrame(LOONGARCH64RegisterMap map, CodeBlob cb) { ++ if (DEBUG) { ++ System.out.println("senderForCompiledFrame"); ++ } ++ ++ // ++ // NOTE: some of this code is (unfortunately) duplicated in LOONGARCH64CurrentFrameGuess ++ // ++ ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(map != null, "map must be set"); ++ } ++ ++ // frame owned by optimizing compiler ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(cb.getFrameSize() > 0, "must have non-zero frame size"); ++ } ++ Address senderSP = getUnextendedSP().addOffsetTo(cb.getFrameSize()); ++ ++ // On Intel the return_address is always the word on the stack ++ Address senderPC = senderSP.getAddressAt(-1 * VM.getVM().getAddressSize()); ++ ++ // This is the saved value of EBP which may or may not really be an FP. ++ // It is only an FP if the sender is an interpreter frame (or C1?). ++ Address savedFPAddr = senderSP.addOffsetTo(-2 * VM.getVM().getAddressSize()); ++ ++ if (map.getUpdateMap()) { ++ // Tell GC to use argument oopmaps for some runtime stubs that need it. ++ // For C1, the runtime stub might not have oop maps, so set this flag ++ // outside of update_register_map. ++ map.setIncludeArgumentOops(cb.callerMustGCArguments()); ++ ++ if (cb.getOopMaps() != null) { ++ ImmutableOopMapSet.updateRegisterMap(this, cb, map, true); ++ } ++ ++ // Since the prolog does the save and restore of EBP there is no oopmap ++ // for it so we must fill in its location as if there was an oopmap entry ++ // since if our caller was compiled code there could be live jvm state in it. ++ updateMapWithSavedLink(map, savedFPAddr); ++ } ++ ++ return new LOONGARCH64Frame(senderSP, savedFPAddr.getAddressAt(0), senderPC); ++ } ++ ++ protected boolean hasSenderPD() { ++ // FIXME ++ // Check for null ebp? Need to do some tests. ++ return true; ++ } ++ ++ public long frameSize() { ++ return (getSenderSP().minus(getSP()) / VM.getVM().getAddressSize()); ++ } ++ ++ public Address getLink() { ++ return addressOfStackSlot(LINK_OFFSET).getAddressAt(0); ++ } ++ ++ public Address getUnextendedSP() { return raw_unextendedSP; } ++ ++ // Return address: ++ public Address getSenderPCAddr() { ++ return addressOfStackSlot(RETURN_ADDR_OFFSET); ++ } ++ ++ public Address getSenderPC() { return getSenderPCAddr().getAddressAt(0); } ++ ++ public Address getSenderSP() { ++ return addressOfStackSlot(SENDER_SP_OFFSET); ++ } ++ ++ public Address addressOfInterpreterFrameLocals() { ++ long n = addressOfStackSlot(INTERPRETER_FRAME_LOCALS_OFFSET).getCIntegerAt(0, VM.getVM().getAddressSize(), false); ++ return getFP().addOffsetTo(n * VM.getVM().getAddressSize()); ++ } ++ ++ private Address addressOfInterpreterFrameBCX() { ++ return addressOfStackSlot(INTERPRETER_FRAME_BCX_OFFSET); ++ } ++ ++ public int getInterpreterFrameBCI() { ++ // FIXME: this is not atomic with respect to GC and is unsuitable ++ // for use in a non-debugging, or reflective, system. Need to ++ // figure out how to express this. 
++ Address methodHandle = addressOfInterpreterFrameMethod().getAddressAt(0); ++ Method method = (Method)Metadata.instantiateWrapperFor(methodHandle); ++ Address bcp = addressOfInterpreterFrameBCX().getAddressAt(0); ++ ++ // If we are in the top level frame then the bcp may have been set for us. If so then let it ++ // take priority. If we are in a top level interpreter frame, the bcp is live in S0 (on LA) ++ // and not saved in the BCX stack slot. ++ if (live_bcp != null) { ++ // Only use live_bcp if it points within the Method's bytecodes. Sometimes S0 is used ++ // for scratch purposes and is not a valid BCP. If it is not valid, then we stick with ++ // the bcp stored in the frame, which S0 should have been flushed to. ++ if (method.getConstMethod().isAddressInMethod(live_bcp)) { ++ bcp = live_bcp; ++ } ++ } ++ ++ return bcpToBci(bcp, method); ++ } ++ ++ public Address addressOfInterpreterFrameMDX() { ++ return addressOfStackSlot(INTERPRETER_FRAME_MDX_OFFSET); ++ } ++ ++ // FIXME ++ //inline int frame::interpreter_frame_monitor_size() { ++ // return BasicObjectLock::size(); ++ //} ++ ++ // expression stack ++ // (the max_stack arguments are used by the GC; see class FrameClosure) ++ ++ public Address addressOfInterpreterFrameExpressionStack() { ++ Address monitorEnd = interpreterFrameMonitorEnd().address(); ++ return monitorEnd.addOffsetTo(-1 * VM.getVM().getAddressSize()); ++ } ++ ++ public int getInterpreterFrameExpressionStackDirection() { return -1; } ++ ++ // top of expression stack ++ public Address addressOfInterpreterFrameTOS() { ++ return getSP(); ++ } ++ ++ /** Expression stack from top down */ ++ public Address addressOfInterpreterFrameTOSAt(int slot) { ++ return addressOfInterpreterFrameTOS().addOffsetTo(slot * VM.getVM().getAddressSize()); ++ } ++ ++ public Address getInterpreterFrameSenderSP() { ++ if (Assert.ASSERTS_ENABLED) { ++ Assert.that(isInterpretedFrame(), "interpreted frame expected"); ++ } ++ return addressOfStackSlot(INTERPRETER_FRAME_SENDER_SP_OFFSET).getAddressAt(0); ++ } ++ ++ // Monitors ++ public BasicObjectLock interpreterFrameMonitorBegin() { ++ return new BasicObjectLock(addressOfStackSlot(INTERPRETER_FRAME_MONITOR_BLOCK_BOTTOM_OFFSET)); ++ } ++ ++ public BasicObjectLock interpreterFrameMonitorEnd() { ++ Address result = addressOfStackSlot(INTERPRETER_FRAME_MONITOR_BLOCK_TOP_OFFSET).getAddressAt(0); ++ if (Assert.ASSERTS_ENABLED) { ++ // make sure the pointer points inside the frame ++ Assert.that(AddressOps.gt(getFP(), result), "result must < than frame pointer"); ++ Assert.that(AddressOps.lte(getSP(), result), "result must >= than stack pointer"); ++ } ++ return new BasicObjectLock(result); ++ } ++ ++ public int interpreterFrameMonitorSize() { ++ return BasicObjectLock.size(); ++ } ++ ++ // Method ++ public Address addressOfInterpreterFrameMethod() { ++ return addressOfStackSlot(INTERPRETER_FRAME_METHOD_OFFSET); ++ } ++ ++ // Constant pool cache ++ public Address addressOfInterpreterFrameCPCache() { ++ return addressOfStackSlot(INTERPRETER_FRAME_CACHE_OFFSET); ++ } ++ ++ // Entry frames ++ public JavaCallWrapper getEntryFrameCallWrapper() { ++ return new LOONGARCH64JavaCallWrapper(addressOfStackSlot(ENTRY_FRAME_CALL_WRAPPER_OFFSET).getAddressAt(0)); ++ } ++ ++ protected Address addressOfSavedOopResult() { ++ // offset is 2 for compiler2 and 3 for compiler1 ++ return getSP().addOffsetTo((VM.getVM().isClientCompiler() ? 
2 : 3) * ++ VM.getVM().getAddressSize()); ++ } ++ ++ protected Address addressOfSavedReceiver() { ++ return getSP().addOffsetTo(-4 * VM.getVM().getAddressSize()); ++ } ++ ++ private void dumpStack() { ++ if (getFP() != null) { ++ for (Address addr = getSP().addOffsetTo(-5 * VM.getVM().getAddressSize()); ++ AddressOps.lte(addr, getFP().addOffsetTo(5 * VM.getVM().getAddressSize())); ++ addr = addr.addOffsetTo(VM.getVM().getAddressSize())) { ++ System.out.println(addr + ": " + addr.getAddressAt(0)); ++ } ++ } else { ++ for (Address addr = getSP().addOffsetTo(-5 * VM.getVM().getAddressSize()); ++ AddressOps.lte(addr, getSP().addOffsetTo(20 * VM.getVM().getAddressSize())); ++ addr = addr.addOffsetTo(VM.getVM().getAddressSize())) { ++ System.out.println(addr + ": " + addr.getAddressAt(0)); ++ } ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64JavaCallWrapper.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64JavaCallWrapper.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64JavaCallWrapper.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64JavaCallWrapper.java 2024-02-20 10:42:37.678862251 +0800 +@@ -0,0 +1,59 @@ ++/* ++ * Copyright (c) 2001, 2002, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
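The BCI computation in getInterpreterFrameBCI prefers a "live" BCP (still held in register S0 for the top-most interpreter frame) only when it actually points into the method's bytecodes, and otherwise falls back to the value flushed into the BCX slot. A self-contained sketch of that decision, with made-up addresses and a deliberately simplified BCP-to-BCI conversion:

final class BcpToBciSketch {
    static long chooseBcp(long savedBcp, Long liveBcp, long codeBase, int codeSize) {
        if (liveBcp != null && liveBcp >= codeBase && liveBcp < codeBase + codeSize) {
            return liveBcp;   // register value points into the bytecodes: prefer it
        }
        return savedBcp;      // otherwise use the value flushed into the frame slot
    }

    static int toBci(long bcp, long codeBase) {
        return (int) (bcp - codeBase); // BCI as the offset from the bytecode base
    }

    public static void main(String[] args) {
        long codeBase = 0x5000;
        int codeSize  = 64;
        long savedBcp = codeBase + 10;
        Long liveBcp  = 0x9999L;  // scratch value, outside the method's bytecodes
        long bcp = chooseBcp(savedBcp, liveBcp, codeBase, codeSize);
        System.out.println("bci = " + toBci(bcp, codeBase));
    }
}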
++ * ++ */ ++ ++package sun.jvm.hotspot.runtime.loongarch64; ++ ++import java.util.*; ++import sun.jvm.hotspot.debugger.*; ++import sun.jvm.hotspot.types.*; ++import sun.jvm.hotspot.runtime.*; ++import sun.jvm.hotspot.utilities.Observable; ++import sun.jvm.hotspot.utilities.Observer; ++ ++public class LOONGARCH64JavaCallWrapper extends JavaCallWrapper { ++ private static AddressField lastJavaFPField; ++ ++ static { ++ VM.registerVMInitializedObserver(new Observer() { ++ public void update(Observable o, Object data) { ++ initialize(VM.getVM().getTypeDataBase()); ++ } ++ }); ++ } ++ ++ private static synchronized void initialize(TypeDataBase db) { ++ Type type = db.lookupType("JavaFrameAnchor"); ++ ++ lastJavaFPField = type.getAddressField("_last_Java_fp"); ++ } ++ ++ public LOONGARCH64JavaCallWrapper(Address addr) { ++ super(addr); ++ } ++ ++ public Address getLastJavaFP() { ++ return lastJavaFPField.getValue(addr.addOffsetTo(anchorField.getOffset())); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64RegisterMap.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64RegisterMap.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64RegisterMap.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/loongarch64/LOONGARCH64RegisterMap.java 2024-02-20 10:42:37.678862251 +0800 +@@ -0,0 +1,52 @@ ++/* ++ * Copyright (c) 2001, 2012, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2018, 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
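getLastJavaFP() works by resolving the _last_Java_fp field offset inside the JavaFrameAnchor VM type and then reading memory at the anchor address plus that offset. The sketch below imitates that lookup with a hand-written type/offset table and a fake memory map standing in for TypeDataBase and the Address API; the offsets and contents are invented for the example.

import java.util.Map;

final class FieldLookupSketch {
    // type name -> (field name -> offset in bytes), a stand-in for the type database.
    static final Map<String, Map<String, Integer>> TYPES =
        Map.of("JavaFrameAnchor", Map.of("_last_Java_sp", 0, "_last_Java_fp", 16));

    // Fake flat memory keyed by absolute address.
    static final Map<Long, Long> memory = Map.of(0x1000L + 16, 0x2a2a_0000L);

    static long readField(String type, String field, long baseAddr) {
        int offset = TYPES.get(type).get(field);
        return memory.getOrDefault(baseAddr + offset, 0L);
    }

    public static void main(String[] args) {
        long anchorAddr = 0x1000L; // pretend address of the embedded JavaFrameAnchor
        System.out.printf("last_Java_fp = 0x%x%n",
                          readField("JavaFrameAnchor", "_last_Java_fp", anchorAddr));
    }
}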
++ * ++ */ ++ ++package sun.jvm.hotspot.runtime.loongarch64; ++ ++import sun.jvm.hotspot.debugger.*; ++import sun.jvm.hotspot.runtime.*; ++ ++public class LOONGARCH64RegisterMap extends RegisterMap { ++ ++ /** This is the only public constructor */ ++ public LOONGARCH64RegisterMap(JavaThread thread, boolean updateMap) { ++ super(thread, updateMap); ++ } ++ ++ protected LOONGARCH64RegisterMap(RegisterMap map) { ++ super(map); ++ } ++ ++ public Object clone() { ++ LOONGARCH64RegisterMap retval = new LOONGARCH64RegisterMap(this); ++ return retval; ++ } ++ ++ // no PD state to clear or copy: ++ protected void clearPD() {} ++ protected void initializePD() {} ++ protected void initializeFromPD(RegisterMap map) {} ++ protected Address getLocationPD(VMReg reg) { return null; } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/runtime/Threads.java 2024-02-20 10:42:37.678862251 +0800 +@@ -22,6 +22,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2019, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package sun.jvm.hotspot.runtime; + + import java.util.*; +@@ -36,6 +42,7 @@ + import sun.jvm.hotspot.runtime.linux_aarch64.LinuxAARCH64JavaThreadPDAccess; + import sun.jvm.hotspot.runtime.linux_riscv64.LinuxRISCV64JavaThreadPDAccess; + import sun.jvm.hotspot.runtime.linux_ppc64.LinuxPPC64JavaThreadPDAccess; ++import sun.jvm.hotspot.runtime.linux_loongarch64.LinuxLOONGARCH64JavaThreadPDAccess; + import sun.jvm.hotspot.runtime.bsd_x86.BsdX86JavaThreadPDAccess; + import sun.jvm.hotspot.runtime.bsd_amd64.BsdAMD64JavaThreadPDAccess; + import sun.jvm.hotspot.runtime.bsd_aarch64.BsdAARCH64JavaThreadPDAccess; +@@ -116,6 +123,8 @@ + access = new LinuxAARCH64JavaThreadPDAccess(); + } else if (cpu.equals("riscv64")) { + access = new LinuxRISCV64JavaThreadPDAccess(); ++ } else if (cpu.equals("loongarch64")) { ++ access = new LinuxLOONGARCH64JavaThreadPDAccess(); + } else { + try { + access = (JavaThreadPDAccess) +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PlatformInfo.java b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PlatformInfo.java +--- a/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PlatformInfo.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/utilities/PlatformInfo.java 2024-02-20 10:42:37.688862243 +0800 +@@ -22,6 +22,13 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2018, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ * ++ */ ++ + package sun.jvm.hotspot.utilities; + + /** Provides canonicalized OS and CPU information for the rest of the +@@ -50,7 +57,7 @@ + + public static boolean knownCPU(String cpu) { + final String[] KNOWN = +- new String[] {"i386", "x86", "x86_64", "amd64", "ppc64", "ppc64le", "aarch64", "riscv64"}; ++ new String[] {"i386", "x86", "x86_64", "amd64", "ppc64", "ppc64le", "aarch64", "riscv64", "loongarch64"}; + + for(String s : KNOWN) { + if(s.equals(cpu)) +@@ -83,6 +90,9 @@ + if (cpu.equals("ppc64le")) + return "ppc64"; + ++ if (cpu.equals("loongarch64")) ++ return "loongarch64"; ++ + return cpu; + + } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotJVMCIBackendFactory.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotJVMCIBackendFactory.java +--- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotJVMCIBackendFactory.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotJVMCIBackendFactory.java 2024-02-20 10:42:37.782195505 +0800 +@@ -0,0 +1,142 @@ ++/* ++ * Copyright (c) 2015, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ */ ++package jdk.vm.ci.hotspot.loongarch64; ++ ++import static java.util.Collections.emptyMap; ++import static jdk.vm.ci.common.InitTimer.timer; ++ ++import java.util.EnumSet; ++import java.util.Map; ++ ++import jdk.vm.ci.loongarch64.LoongArch64; ++import jdk.vm.ci.loongarch64.LoongArch64.CPUFeature; ++import jdk.vm.ci.code.Architecture; ++import jdk.vm.ci.code.RegisterConfig; ++import jdk.vm.ci.code.TargetDescription; ++import jdk.vm.ci.code.stack.StackIntrospection; ++import jdk.vm.ci.common.InitTimer; ++import jdk.vm.ci.hotspot.HotSpotCodeCacheProvider; ++import jdk.vm.ci.hotspot.HotSpotConstantReflectionProvider; ++import jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory; ++import jdk.vm.ci.hotspot.HotSpotJVMCIRuntime; ++import jdk.vm.ci.hotspot.HotSpotMetaAccessProvider; ++import jdk.vm.ci.hotspot.HotSpotStackIntrospection; ++import jdk.vm.ci.meta.ConstantReflectionProvider; ++import jdk.vm.ci.runtime.JVMCIBackend; ++ ++public class LoongArch64HotSpotJVMCIBackendFactory implements HotSpotJVMCIBackendFactory { ++ ++ private static EnumSet computeFeatures(LoongArch64HotSpotVMConfig config) { ++ // Configure the feature set using the HotSpot flag settings. ++ Map constants = config.getStore().getConstants(); ++ return HotSpotJVMCIBackendFactory.convertFeatures(CPUFeature.class, constants, config.vmVersionFeatures, emptyMap()); ++ } ++ ++ private static EnumSet computeFlags(LoongArch64HotSpotVMConfig config) { ++ EnumSet flags = EnumSet.noneOf(LoongArch64.Flag.class); ++ ++ if (config.useLSX) { ++ flags.add(LoongArch64.Flag.useLSX); ++ } ++ if (config.useLASX) { ++ flags.add(LoongArch64.Flag.useLASX); ++ } ++ ++ return flags; ++ } ++ ++ private static TargetDescription createTarget(LoongArch64HotSpotVMConfig config) { ++ final int stackFrameAlignment = 16; ++ final int implicitNullCheckLimit = 4096; ++ final boolean inlineObjects = true; ++ Architecture arch = new LoongArch64(computeFeatures(config), computeFlags(config)); ++ return new TargetDescription(arch, true, stackFrameAlignment, implicitNullCheckLimit, inlineObjects); ++ } ++ ++ protected HotSpotConstantReflectionProvider createConstantReflection(HotSpotJVMCIRuntime runtime) { ++ return new HotSpotConstantReflectionProvider(runtime); ++ } ++ ++ private static RegisterConfig createRegisterConfig(LoongArch64HotSpotVMConfig config, TargetDescription target) { ++ return new LoongArch64HotSpotRegisterConfig(target, config.useCompressedOops); ++ } ++ ++ protected HotSpotCodeCacheProvider createCodeCache(HotSpotJVMCIRuntime runtime, TargetDescription target, RegisterConfig regConfig) { ++ return new HotSpotCodeCacheProvider(runtime, target, regConfig); ++ } ++ ++ protected HotSpotMetaAccessProvider createMetaAccess(HotSpotJVMCIRuntime runtime) { ++ return new HotSpotMetaAccessProvider(runtime); ++ } ++ ++ @Override ++ public String getArchitecture() { ++ return "loongarch64"; ++ } ++ ++ @Override ++ public String toString() { ++ return "JVMCIBackend:" + getArchitecture(); ++ } ++ ++ @Override ++ @SuppressWarnings("try") ++ public JVMCIBackend createJVMCIBackend(HotSpotJVMCIRuntime runtime, JVMCIBackend host) { ++ ++ assert host == null; ++ LoongArch64HotSpotVMConfig config = new LoongArch64HotSpotVMConfig(runtime.getConfigStore()); ++ TargetDescription target = createTarget(config); ++ ++ RegisterConfig regConfig; ++ HotSpotCodeCacheProvider codeCache; ++ ConstantReflectionProvider constantReflection; ++ HotSpotMetaAccessProvider metaAccess; ++ StackIntrospection stackIntrospection; ++ try (InitTimer t = timer("create providers")) { ++ try 
(InitTimer rt = timer("create MetaAccess provider")) { ++ metaAccess = createMetaAccess(runtime); ++ } ++ try (InitTimer rt = timer("create RegisterConfig")) { ++ regConfig = createRegisterConfig(config, target); ++ } ++ try (InitTimer rt = timer("create CodeCache provider")) { ++ codeCache = createCodeCache(runtime, target, regConfig); ++ } ++ try (InitTimer rt = timer("create ConstantReflection provider")) { ++ constantReflection = createConstantReflection(runtime); ++ } ++ try (InitTimer rt = timer("create StackIntrospection provider")) { ++ stackIntrospection = new HotSpotStackIntrospection(runtime); ++ } ++ } ++ try (InitTimer rt = timer("instantiate backend")) { ++ return createBackend(metaAccess, codeCache, constantReflection, stackIntrospection); ++ } ++ } ++ ++ protected JVMCIBackend createBackend(HotSpotMetaAccessProvider metaAccess, HotSpotCodeCacheProvider codeCache, ConstantReflectionProvider constantReflection, ++ StackIntrospection stackIntrospection) { ++ return new JVMCIBackend(metaAccess, codeCache, constantReflection, stackIntrospection); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotRegisterConfig.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotRegisterConfig.java +--- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotRegisterConfig.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotRegisterConfig.java 2024-02-20 10:42:37.782195505 +0800 +@@ -0,0 +1,297 @@ ++/* ++ * Copyright (c) 2015, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
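createJVMCIBackend wraps each provider construction in a timing scope using try-with-resources. The same pattern with a trivial AutoCloseable timer; InitTimer itself is a JVMCI-internal class, so this sketch uses a stand-in rather than the real type:

final class TimedInitSketch {
    static final class Scope implements AutoCloseable {
        private final String name;
        private final long start = System.nanoTime();
        Scope(String name) { this.name = name; }
        @Override public void close() {
            System.out.printf("%s took %d us%n", name, (System.nanoTime() - start) / 1_000);
        }
    }

    public static void main(String[] args) throws Exception {
        try (Scope all = new Scope("create providers")) {
            try (Scope s = new Scope("create MetaAccess provider")) {
                Thread.sleep(5); // stand-in for building a provider
            }
            try (Scope s = new Scope("create CodeCache provider")) {
                Thread.sleep(5);
            }
        }
    }
}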
++ */ ++package jdk.vm.ci.hotspot.loongarch64; ++ ++import static jdk.vm.ci.loongarch64.LoongArch64.ra; ++import static jdk.vm.ci.loongarch64.LoongArch64.a0; ++import static jdk.vm.ci.loongarch64.LoongArch64.a1; ++import static jdk.vm.ci.loongarch64.LoongArch64.a2; ++import static jdk.vm.ci.loongarch64.LoongArch64.a3; ++import static jdk.vm.ci.loongarch64.LoongArch64.a4; ++import static jdk.vm.ci.loongarch64.LoongArch64.a5; ++import static jdk.vm.ci.loongarch64.LoongArch64.a6; ++import static jdk.vm.ci.loongarch64.LoongArch64.a7; ++import static jdk.vm.ci.loongarch64.LoongArch64.SCR1; ++import static jdk.vm.ci.loongarch64.LoongArch64.SCR2; ++import static jdk.vm.ci.loongarch64.LoongArch64.t0; ++import static jdk.vm.ci.loongarch64.LoongArch64.v0; ++import static jdk.vm.ci.loongarch64.LoongArch64.s5; ++import static jdk.vm.ci.loongarch64.LoongArch64.s6; ++import static jdk.vm.ci.loongarch64.LoongArch64.sp; ++import static jdk.vm.ci.loongarch64.LoongArch64.fp; ++import static jdk.vm.ci.loongarch64.LoongArch64.tp; ++import static jdk.vm.ci.loongarch64.LoongArch64.rx; ++import static jdk.vm.ci.loongarch64.LoongArch64.f0; ++import static jdk.vm.ci.loongarch64.LoongArch64.f1; ++import static jdk.vm.ci.loongarch64.LoongArch64.f2; ++import static jdk.vm.ci.loongarch64.LoongArch64.f3; ++import static jdk.vm.ci.loongarch64.LoongArch64.f4; ++import static jdk.vm.ci.loongarch64.LoongArch64.f5; ++import static jdk.vm.ci.loongarch64.LoongArch64.f6; ++import static jdk.vm.ci.loongarch64.LoongArch64.f7; ++import static jdk.vm.ci.loongarch64.LoongArch64.fv0; ++import static jdk.vm.ci.loongarch64.LoongArch64.zero; ++ ++import java.util.ArrayList; ++import java.util.HashSet; ++import java.util.List; ++import java.util.Set; ++ ++import jdk.vm.ci.loongarch64.LoongArch64; ++import jdk.vm.ci.code.Architecture; ++import jdk.vm.ci.code.CallingConvention; ++import jdk.vm.ci.code.CallingConvention.Type; ++import jdk.vm.ci.code.Register; ++import jdk.vm.ci.code.RegisterArray; ++import jdk.vm.ci.code.RegisterAttributes; ++import jdk.vm.ci.code.RegisterConfig; ++import jdk.vm.ci.code.StackSlot; ++import jdk.vm.ci.code.TargetDescription; ++import jdk.vm.ci.code.ValueKindFactory; ++import jdk.vm.ci.common.JVMCIError; ++import jdk.vm.ci.hotspot.HotSpotCallingConventionType; ++import jdk.vm.ci.meta.AllocatableValue; ++import jdk.vm.ci.meta.JavaKind; ++import jdk.vm.ci.meta.JavaType; ++import jdk.vm.ci.meta.PlatformKind; ++import jdk.vm.ci.meta.Value; ++import jdk.vm.ci.meta.ValueKind; ++ ++public class LoongArch64HotSpotRegisterConfig implements RegisterConfig { ++ ++ private final TargetDescription target; ++ ++ private final RegisterArray allocatable; ++ ++ /** ++ * The caller saved registers always include all parameter registers. 
++ */ ++ private final RegisterArray callerSaved; ++ ++ private final boolean allAllocatableAreCallerSaved; ++ ++ private final RegisterAttributes[] attributesMap; ++ ++ @Override ++ public RegisterArray getAllocatableRegisters() { ++ return allocatable; ++ } ++ ++ @Override ++ public RegisterArray filterAllocatableRegisters(PlatformKind kind, RegisterArray registers) { ++ ArrayList list = new ArrayList<>(); ++ for (Register reg : registers) { ++ if (target.arch.canStoreValue(reg.getRegisterCategory(), kind)) { ++ list.add(reg); ++ } ++ } ++ ++ return new RegisterArray(list); ++ } ++ ++ @Override ++ public RegisterAttributes[] getAttributesMap() { ++ return attributesMap.clone(); ++ } ++ ++ private final RegisterArray javaGeneralParameterRegisters = new RegisterArray(t0, a0, a1, a2, a3, a4, a5, a6, a7); ++ private final RegisterArray nativeGeneralParameterRegisters = new RegisterArray(a0, a1, a2, a3, a4, a5, a6, a7); ++ private final RegisterArray floatParameterRegisters = new RegisterArray(f0, f1, f2, f3, f4, f5, f6, f7); ++ ++ public static final Register heapBaseRegister = s5; ++ public static final Register TREG = s6; ++ ++ private static final RegisterArray reservedRegisters = new RegisterArray(fp, ra, zero, sp, tp, rx, SCR1, SCR2, TREG); ++ ++ private static RegisterArray initAllocatable(Architecture arch, boolean reserveForHeapBase) { ++ RegisterArray allRegisters = arch.getAvailableValueRegisters(); ++ Register[] registers = new Register[allRegisters.size() - reservedRegisters.size() - (reserveForHeapBase ? 1 : 0)]; ++ List reservedRegistersList = reservedRegisters.asList(); ++ ++ int idx = 0; ++ for (Register reg : allRegisters) { ++ if (reservedRegistersList.contains(reg)) { ++ // skip reserved registers ++ continue; ++ } ++ if (reserveForHeapBase && reg.equals(heapBaseRegister)) { ++ // skip heap base register ++ continue; ++ } ++ ++ registers[idx++] = reg; ++ } ++ ++ assert idx == registers.length; ++ return new RegisterArray(registers); ++ } ++ ++ public LoongArch64HotSpotRegisterConfig(TargetDescription target, boolean useCompressedOops) { ++ this(target, initAllocatable(target.arch, useCompressedOops)); ++ assert callerSaved.size() >= allocatable.size(); ++ } ++ ++ public LoongArch64HotSpotRegisterConfig(TargetDescription target, RegisterArray allocatable) { ++ this.target = target; ++ ++ this.allocatable = allocatable; ++ Set callerSaveSet = new HashSet<>(); ++ allocatable.addTo(callerSaveSet); ++ floatParameterRegisters.addTo(callerSaveSet); ++ javaGeneralParameterRegisters.addTo(callerSaveSet); ++ nativeGeneralParameterRegisters.addTo(callerSaveSet); ++ callerSaved = new RegisterArray(callerSaveSet); ++ ++ allAllocatableAreCallerSaved = true; ++ attributesMap = RegisterAttributes.createMap(this, LoongArch64.allRegisters); ++ } ++ ++ @Override ++ public RegisterArray getCallerSaveRegisters() { ++ return callerSaved; ++ } ++ ++ @Override ++ public RegisterArray getCalleeSaveRegisters() { ++ return null; ++ } ++ ++ @Override ++ public boolean areAllAllocatableRegistersCallerSaved() { ++ return allAllocatableAreCallerSaved; ++ } ++ ++ @Override ++ public CallingConvention getCallingConvention(Type type, JavaType returnType, JavaType[] parameterTypes, ValueKindFactory valueKindFactory) { ++ HotSpotCallingConventionType hotspotType = (HotSpotCallingConventionType) type; ++ if (type == HotSpotCallingConventionType.NativeCall) { ++ return callingConvention(nativeGeneralParameterRegisters, returnType, parameterTypes, hotspotType, valueKindFactory); ++ } ++ // On x64, parameter 
locations are the same whether viewed ++ // from the caller or callee perspective ++ return callingConvention(javaGeneralParameterRegisters, returnType, parameterTypes, hotspotType, valueKindFactory); ++ } ++ ++ @Override ++ public RegisterArray getCallingConventionRegisters(Type type, JavaKind kind) { ++ HotSpotCallingConventionType hotspotType = (HotSpotCallingConventionType) type; ++ switch (kind) { ++ case Boolean: ++ case Byte: ++ case Short: ++ case Char: ++ case Int: ++ case Long: ++ case Object: ++ return hotspotType == HotSpotCallingConventionType.NativeCall ? nativeGeneralParameterRegisters : javaGeneralParameterRegisters; ++ case Float: ++ case Double: ++ return floatParameterRegisters; ++ default: ++ throw JVMCIError.shouldNotReachHere(); ++ } ++ } ++ ++ private CallingConvention callingConvention(RegisterArray generalParameterRegisters, JavaType returnType, JavaType[] parameterTypes, HotSpotCallingConventionType type, ++ ValueKindFactory valueKindFactory) { ++ AllocatableValue[] locations = new AllocatableValue[parameterTypes.length]; ++ ++ int currentGeneral = 0; ++ int currentFloat = 0; ++ int currentStackOffset = 0; ++ ++ for (int i = 0; i < parameterTypes.length; i++) { ++ final JavaKind kind = parameterTypes[i].getJavaKind().getStackKind(); ++ ++ switch (kind) { ++ case Byte: ++ case Boolean: ++ case Short: ++ case Char: ++ case Int: ++ case Long: ++ case Object: ++ if (currentGeneral < generalParameterRegisters.size()) { ++ Register register = generalParameterRegisters.get(currentGeneral++); ++ locations[i] = register.asValue(valueKindFactory.getValueKind(kind)); ++ } ++ break; ++ case Float: ++ case Double: ++ if (currentFloat < floatParameterRegisters.size()) { ++ Register register = floatParameterRegisters.get(currentFloat++); ++ locations[i] = register.asValue(valueKindFactory.getValueKind(kind)); ++ } else if (currentGeneral < generalParameterRegisters.size()) { ++ Register register = generalParameterRegisters.get(currentGeneral++); ++ locations[i] = register.asValue(valueKindFactory.getValueKind(kind)); ++ } ++ break; ++ default: ++ throw JVMCIError.shouldNotReachHere(); ++ } ++ ++ if (locations[i] == null) { ++ ValueKind valueKind = valueKindFactory.getValueKind(kind); ++ locations[i] = StackSlot.get(valueKind, currentStackOffset, !type.out); ++ currentStackOffset += Math.max(valueKind.getPlatformKind().getSizeInBytes(), target.wordSize); ++ } ++ } ++ ++ JavaKind returnKind = returnType == null ? JavaKind.Void : returnType.getJavaKind(); ++ AllocatableValue returnLocation = returnKind == JavaKind.Void ? 
Value.ILLEGAL : getReturnRegister(returnKind).asValue(valueKindFactory.getValueKind(returnKind.getStackKind())); ++ return new CallingConvention(currentStackOffset, returnLocation, locations); ++ } ++ ++ @Override ++ public Register getReturnRegister(JavaKind kind) { ++ switch (kind) { ++ case Boolean: ++ case Byte: ++ case Char: ++ case Short: ++ case Int: ++ case Long: ++ case Object: ++ return v0; ++ case Float: ++ case Double: ++ return fv0; ++ case Void: ++ case Illegal: ++ return null; ++ default: ++ throw new UnsupportedOperationException("no return register for type " + kind); ++ } ++ } ++ ++ @Override ++ public Register getFrameRegister() { ++ return sp; ++ } ++ ++ @Override ++ public String toString() { ++ return String.format("Allocatable: " + getAllocatableRegisters() + "%n" + "CallerSave: " + getCallerSaveRegisters() + "%n"); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotVMConfig.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotVMConfig.java +--- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotVMConfig.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/LoongArch64HotSpotVMConfig.java 2024-02-20 10:42:37.782195505 +0800 +@@ -0,0 +1,79 @@ ++/* ++ * Copyright (c) 2016, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++package jdk.vm.ci.hotspot.loongarch64; ++ ++import jdk.vm.ci.hotspot.HotSpotVMConfigAccess; ++import jdk.vm.ci.hotspot.HotSpotVMConfigStore; ++import jdk.vm.ci.services.Services; ++ ++/** ++ * Used to access native configuration details. ++ * ++ * All non-static, public fields in this class are so that they can be compiled as constants. ++ */ ++class LoongArch64HotSpotVMConfig extends HotSpotVMConfigAccess { ++ ++ LoongArch64HotSpotVMConfig(HotSpotVMConfigStore config) { ++ super(config); ++ } ++ ++ final boolean useCompressedOops = getFlag("UseCompressedOops", Boolean.class); ++ ++ // CPU Capabilities ++ ++ /* ++ * These flags are set based on the corresponding command line flags. 
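The parameter-assignment loop in callingConvention above walks the argument kinds once, handing out general-purpose registers to integer and object arguments, floating-point registers to float/double arguments (spilling into the remaining general registers first), and word-aligned stack slots once registers run out. Below is a simplified, runnable model of that loop; the register names and sizes are illustrative, not the exact LoongArch64 tables.

import java.util.ArrayList;
import java.util.List;

final class CallingConventionSketch {
    static final String[] GP = {"a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7"};
    static final String[] FP = {"f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7"};
    static final int WORD = 8;

    static List<String> assign(String[] kinds) {
        List<String> locations = new ArrayList<>();
        int gp = 0, fp = 0, stackOffset = 0;
        for (String kind : kinds) {
            boolean isFloat = kind.equals("float") || kind.equals("double");
            if (isFloat && fp < FP.length) {
                locations.add(FP[fp++]);
            } else if (gp < GP.length) {               // ints, objects, spilled floats
                locations.add(GP[gp++]);
            } else {
                locations.add("stack+" + stackOffset); // out of registers
                stackOffset += WORD;                   // each slot is word-aligned
            }
        }
        return locations;
    }

    public static void main(String[] args) {
        System.out.println(assign(new String[] {
            "int", "object", "double", "long", "float", "int", "int",
            "int", "int", "int", "int"                 // last argument spills to stack
        }));
    }
}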
++ */ ++ final boolean useLSX = getFlag("UseLSX", Boolean.class); ++ final boolean useLASX = getFlag("UseLASX", Boolean.class); ++ ++ final long vmVersionFeatures = getFieldValue("Abstract_VM_Version::_features", Long.class, "uint64_t"); ++ ++ /* ++ * These flags are set if the corresponding support is in the hardware. ++ */ ++ // Checkstyle: stop ++ final long loongarch64LA32 = getConstant("VM_Version::CPU_LA32", Long.class); ++ final long loongarch64LA64 = getConstant("VM_Version::CPU_LA64", Long.class); ++ final long loongarch64LLEXC = getConstant("VM_Version::CPU_LLEXC", Long.class); ++ final long loongarch64SCDLY = getConstant("VM_Version::CPU_SCDLY", Long.class); ++ final long loongarch64LLDBAR = getConstant("VM_Version::CPU_LLDBAR", Long.class); ++ final long loongarch64LBT_X86 = getConstant("VM_Version::CPU_LBT_X86", Long.class); ++ final long loongarch64LBT_ARM = getConstant("VM_Version::CPU_LBT_ARM", Long.class); ++ final long loongarch64LBT_MIPS = getConstant("VM_Version::CPU_LBT_MIPS", Long.class); ++ final long loongarch64CCDMA = getConstant("VM_Version::CPU_CCDMA", Long.class); ++ final long loongarch64COMPLEX = getConstant("VM_Version::CPU_COMPLEX", Long.class); ++ final long loongarch64FP = getConstant("VM_Version::CPU_FP", Long.class); ++ final long loongarch64CRYPTO = getConstant("VM_Version::CPU_CRYPTO", Long.class); ++ final long loongarch64LSX = getConstant("VM_Version::CPU_LSX", Long.class); ++ final long loongarch64LASX = getConstant("VM_Version::CPU_LASX", Long.class); ++ final long loongarch64LAM = getConstant("VM_Version::CPU_LAM", Long.class); ++ final long loongarch64LLSYNC = getConstant("VM_Version::CPU_LLSYNC", Long.class); ++ final long loongarch64TGTSYNC = getConstant("VM_Version::CPU_TGTSYNC", Long.class); ++ final long loongarch64ULSYNC = getConstant("VM_Version::CPU_ULSYNC", Long.class); ++ final long loongarch64LAM_BH = getConstant("VM_Version::CPU_LAM_BH", Long.class); ++ final long loongarch64LAMCAS = getConstant("VM_Version::CPU_LAMCAS", Long.class); ++ final long loongarch64UAL = getConstant("VM_Version::CPU_UAL", Long.class); ++ // Checkstyle: resume ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/package-info.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/package-info.java +--- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/package-info.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/hotspot/loongarch64/package-info.java 2024-02-20 10:42:37.782195505 +0800 +@@ -0,0 +1,28 @@ ++/* ++ * Copyright (c) 2018, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). 
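The vmVersionFeatures field is a bit mask, and each VM_Version::CPU_* constant read above supplies the bit for one feature; the backend factory's computeFeatures turns that pair into an EnumSet of CPUFeature values. A conceptual version of that decoding, with invented bit assignments in place of the values from the HotSpot config store:

import java.util.EnumSet;
import java.util.Map;

final class FeatureMaskSketch {
    enum CPUFeature { FP, LSX, LASX, CRYPTO }

    // Hypothetical single-bit constants, one per feature.
    static final Map<CPUFeature, Long> BITS = Map.of(
        CPUFeature.FP,     1L << 0,
        CPUFeature.LSX,    1L << 1,
        CPUFeature.LASX,   1L << 2,
        CPUFeature.CRYPTO, 1L << 3);

    static EnumSet<CPUFeature> decode(long mask) {
        EnumSet<CPUFeature> features = EnumSet.noneOf(CPUFeature.class);
        for (CPUFeature f : CPUFeature.values()) {
            if ((mask & BITS.get(f)) != 0) {
                features.add(f);
            }
        }
        return features;
    }

    public static void main(String[] args) {
        long vmVersionFeatures = (1L << 0) | (1L << 1); // FP and LSX present
        System.out.println(decode(vmVersionFeatures));  // prints [FP, LSX]
    }
}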
++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++/** ++ * The LoongArch64 HotSpot specific portions of the JVMCI API. ++ */ ++package jdk.vm.ci.hotspot.loongarch64; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/LoongArch64.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/LoongArch64.java +--- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/LoongArch64.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/LoongArch64.java 2024-02-20 10:42:37.782195505 +0800 +@@ -0,0 +1,251 @@ ++/* ++ * Copyright (c) 2015, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++package jdk.vm.ci.loongarch64; ++ ++import java.nio.ByteOrder; ++import java.util.EnumSet; ++ ++import jdk.vm.ci.code.Architecture; ++import jdk.vm.ci.code.CPUFeatureName; ++import jdk.vm.ci.code.Register; ++import jdk.vm.ci.code.Register.RegisterCategory; ++import jdk.vm.ci.code.RegisterArray; ++import jdk.vm.ci.meta.JavaKind; ++import jdk.vm.ci.meta.PlatformKind; ++ ++/** ++ * Represents the LoongArch64 architecture. 
++ */ ++public class LoongArch64 extends Architecture { ++ ++ public static final RegisterCategory CPU = new RegisterCategory("CPU"); ++ ++ // General purpose CPU registers ++ public static final Register zero = new Register(0, 0, "r0", CPU); ++ public static final Register ra = new Register(1, 1, "r1", CPU); ++ public static final Register tp = new Register(2, 2, "r2", CPU); ++ public static final Register sp = new Register(3, 3, "r3", CPU); ++ public static final Register a0 = new Register(4, 4, "r4", CPU); ++ public static final Register a1 = new Register(5, 5, "r5", CPU); ++ public static final Register a2 = new Register(6, 6, "r6", CPU); ++ public static final Register a3 = new Register(7, 7, "r7", CPU); ++ public static final Register a4 = new Register(8, 8, "r8", CPU); ++ public static final Register a5 = new Register(9, 9, "r9", CPU); ++ public static final Register a6 = new Register(10, 10, "r10", CPU); ++ public static final Register a7 = new Register(11, 11, "r11", CPU); ++ public static final Register t0 = new Register(12, 12, "r12", CPU); ++ public static final Register t1 = new Register(13, 13, "r13", CPU); ++ public static final Register t2 = new Register(14, 14, "r14", CPU); ++ public static final Register t3 = new Register(15, 15, "r15", CPU); ++ public static final Register t4 = new Register(16, 16, "r16", CPU); ++ public static final Register t5 = new Register(17, 17, "r17", CPU); ++ public static final Register t6 = new Register(18, 18, "r18", CPU); ++ public static final Register t7 = new Register(19, 19, "r19", CPU); ++ public static final Register t8 = new Register(20, 20, "r20", CPU); ++ public static final Register rx = new Register(21, 21, "r21", CPU); ++ public static final Register fp = new Register(22, 22, "r22", CPU); ++ public static final Register s0 = new Register(23, 23, "r23", CPU); ++ public static final Register s1 = new Register(24, 24, "r24", CPU); ++ public static final Register s2 = new Register(25, 25, "r25", CPU); ++ public static final Register s3 = new Register(26, 26, "r26", CPU); ++ public static final Register s4 = new Register(27, 27, "r27", CPU); ++ public static final Register s5 = new Register(28, 28, "r28", CPU); ++ public static final Register s6 = new Register(29, 29, "r29", CPU); ++ public static final Register s7 = new Register(30, 30, "r30", CPU); ++ public static final Register s8 = new Register(31, 31, "r31", CPU); ++ ++ public static final Register SCR1 = t7; ++ public static final Register SCR2 = t4; ++ public static final Register v0 = a0; ++ ++ // @formatter:off ++ public static final RegisterArray cpuRegisters = new RegisterArray( ++ zero, ra, tp, sp, a0, a1, a2, a3, ++ a4, a5, a6, a7, t0, t1, t2, t3, ++ t4, t5, t6, t7, t8, rx, fp, s0, ++ s1, s2, s3, s4, s5, s6, s7, s8 ++ ); ++ // @formatter:on ++ ++ public static final RegisterCategory SIMD = new RegisterCategory("SIMD"); ++ ++ // Simd registers ++ public static final Register f0 = new Register(32, 0, "f0", SIMD); ++ public static final Register f1 = new Register(33, 1, "f1", SIMD); ++ public static final Register f2 = new Register(34, 2, "f2", SIMD); ++ public static final Register f3 = new Register(35, 3, "f3", SIMD); ++ public static final Register f4 = new Register(36, 4, "f4", SIMD); ++ public static final Register f5 = new Register(37, 5, "f5", SIMD); ++ public static final Register f6 = new Register(38, 6, "f6", SIMD); ++ public static final Register f7 = new Register(39, 7, "f7", SIMD); ++ public static final Register f8 = new Register(40, 8, "f8", SIMD); ++ public 
static final Register f9 = new Register(41, 9, "f9", SIMD); ++ public static final Register f10 = new Register(42, 10, "f10", SIMD); ++ public static final Register f11 = new Register(43, 11, "f11", SIMD); ++ public static final Register f12 = new Register(44, 12, "f12", SIMD); ++ public static final Register f13 = new Register(45, 13, "f13", SIMD); ++ public static final Register f14 = new Register(46, 14, "f14", SIMD); ++ public static final Register f15 = new Register(47, 15, "f15", SIMD); ++ public static final Register f16 = new Register(48, 16, "f16", SIMD); ++ public static final Register f17 = new Register(49, 17, "f17", SIMD); ++ public static final Register f18 = new Register(50, 18, "f18", SIMD); ++ public static final Register f19 = new Register(51, 19, "f19", SIMD); ++ public static final Register f20 = new Register(52, 20, "f20", SIMD); ++ public static final Register f21 = new Register(53, 21, "f21", SIMD); ++ public static final Register f22 = new Register(54, 22, "f22", SIMD); ++ public static final Register f23 = new Register(55, 23, "f23", SIMD); ++ public static final Register f24 = new Register(56, 24, "f24", SIMD); ++ public static final Register f25 = new Register(57, 25, "f25", SIMD); ++ public static final Register f26 = new Register(58, 26, "f26", SIMD); ++ public static final Register f27 = new Register(59, 27, "f27", SIMD); ++ public static final Register f28 = new Register(60, 28, "f28", SIMD); ++ public static final Register f29 = new Register(61, 29, "f29", SIMD); ++ public static final Register f30 = new Register(62, 30, "f30", SIMD); ++ public static final Register f31 = new Register(63, 31, "f31", SIMD); ++ ++ public static final Register fv0 = f0; ++ ++ // @formatter:off ++ public static final RegisterArray simdRegisters = new RegisterArray( ++ f0, f1, f2, f3, f4, f5, f6, f7, ++ f8, f9, f10, f11, f12, f13, f14, f15, ++ f16, f17, f18, f19, f20, f21, f22, f23, ++ f24, f25, f26, f27, f28, f29, f30, f31 ++ ); ++ // @formatter:on ++ ++ // @formatter:off ++ public static final RegisterArray allRegisters = new RegisterArray( ++ zero, ra, tp, sp, a0, a1, a2, a3, ++ a4, a5, a6, a7, t0, t1, t2, t3, ++ t4, t5, t6, t7, t8, rx, fp, s0, ++ s1, s2, s3, s4, s5, s6, s7, s8, ++ ++ f0, f1, f2, f3, f4, f5, f6, f7, ++ f8, f9, f10, f11, f12, f13, f14, f15, ++ f16, f17, f18, f19, f20, f21, f22, f23, ++ f24, f25, f26, f27, f28, f29, f30, f31 ++ ); ++ // @formatter:on ++ ++ /** ++ * Basic set of CPU features mirroring what is returned from the cpuid instruction. See: ++ * {@code VM_Version::cpuFeatureFlags}. ++ */ ++ public enum CPUFeature implements CPUFeatureName { ++ LA32, ++ LA64, ++ LLEXC, ++ SCDLY, ++ LLDBAR, ++ LBT_X86, ++ LBT_ARM, ++ LBT_MIPS, ++ CCDMA, ++ COMPLEX, ++ FP, ++ CRYPTO, ++ LSX, ++ LASX, ++ LAM, ++ LLSYNC, ++ TGTSYNC, ++ ULSYNC, ++ LAM_BH, ++ LAMCAS, ++ UAL ++ } ++ ++ private final EnumSet features; ++ ++ /** ++ * Set of flags to control code emission. 
++ */ ++ public enum Flag { ++ useLSX, ++ useLASX ++ } ++ ++ private final EnumSet flags; ++ ++ public LoongArch64(EnumSet features, EnumSet flags) { ++ super("loongarch64", LoongArch64Kind.QWORD, ByteOrder.LITTLE_ENDIAN, true, allRegisters, 0, 0, 0); ++ this.features = features; ++ this.flags = flags; ++ } ++ ++ @Override ++ public EnumSet getFeatures() { ++ return features; ++ } ++ ++ public EnumSet getFlags() { ++ return flags; ++ } ++ ++ @Override ++ public PlatformKind getPlatformKind(JavaKind javaKind) { ++ switch (javaKind) { ++ case Boolean: ++ case Byte: ++ return LoongArch64Kind.BYTE; ++ case Short: ++ case Char: ++ return LoongArch64Kind.WORD; ++ case Int: ++ return LoongArch64Kind.DWORD; ++ case Long: ++ case Object: ++ return LoongArch64Kind.QWORD; ++ case Float: ++ return LoongArch64Kind.SINGLE; ++ case Double: ++ return LoongArch64Kind.DOUBLE; ++ default: ++ return null; ++ } ++ } ++ ++ @Override ++ public boolean canStoreValue(RegisterCategory category, PlatformKind platformKind) { ++ LoongArch64Kind kind = (LoongArch64Kind) platformKind; ++ if (kind.isInteger()) { ++ return category.equals(CPU); ++ } else if (kind.isSIMD()) { ++ return category.equals(SIMD); ++ } ++ return false; ++ } ++ ++ @Override ++ public LoongArch64Kind getLargestStorableKind(RegisterCategory category) { ++ if (category.equals(CPU)) { ++ return LoongArch64Kind.QWORD; ++ } else if (category.equals(SIMD)) { ++ return LoongArch64Kind.V256_QWORD; ++ } else { ++ return null; ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/LoongArch64Kind.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/LoongArch64Kind.java +--- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/LoongArch64Kind.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/LoongArch64Kind.java 2024-02-20 10:42:37.782195505 +0800 +@@ -0,0 +1,163 @@ ++/* ++ * Copyright (c) 2015, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ */ ++package jdk.vm.ci.loongarch64; ++ ++import jdk.vm.ci.meta.PlatformKind; ++ ++public enum LoongArch64Kind implements PlatformKind { ++ ++ // scalar ++ BYTE(1), ++ WORD(2), ++ DWORD(4), ++ QWORD(8), ++ UBYTE(1), ++ UWORD(2), ++ UDWORD(4), ++ SINGLE(4), ++ DOUBLE(8), ++ ++ // SIMD ++ V128_BYTE(16, BYTE), ++ V128_WORD(16, WORD), ++ V128_DWORD(16, DWORD), ++ V128_QWORD(16, QWORD), ++ V128_SINGLE(16, SINGLE), ++ V128_DOUBLE(16, DOUBLE), ++ V256_BYTE(32, BYTE), ++ V256_WORD(32, WORD), ++ V256_DWORD(32, DWORD), ++ V256_QWORD(32, QWORD), ++ V256_SINGLE(32, SINGLE), ++ V256_DOUBLE(32, DOUBLE); ++ ++ private final int size; ++ private final int vectorLength; ++ ++ private final LoongArch64Kind scalar; ++ private final EnumKey key = new EnumKey<>(this); ++ ++ LoongArch64Kind(int size) { ++ this.size = size; ++ this.scalar = this; ++ this.vectorLength = 1; ++ } ++ ++ LoongArch64Kind(int size, LoongArch64Kind scalar) { ++ this.size = size; ++ this.scalar = scalar; ++ ++ assert size % scalar.size == 0; ++ this.vectorLength = size / scalar.size; ++ } ++ ++ public LoongArch64Kind getScalar() { ++ return scalar; ++ } ++ ++ @Override ++ public int getSizeInBytes() { ++ return size; ++ } ++ ++ @Override ++ public int getVectorLength() { ++ return vectorLength; ++ } ++ ++ @Override ++ public Key getKey() { ++ return key; ++ } ++ ++ public boolean isInteger() { ++ switch (this) { ++ case BYTE: ++ case WORD: ++ case DWORD: ++ case QWORD: ++ case UBYTE: ++ case UWORD: ++ case UDWORD: ++ return true; ++ default: ++ return false; ++ } ++ } ++ ++ public boolean isSIMD() { ++ switch (this) { ++ case SINGLE: ++ case DOUBLE: ++ case V128_BYTE: ++ case V128_WORD: ++ case V128_DWORD: ++ case V128_QWORD: ++ case V128_SINGLE: ++ case V128_DOUBLE: ++ case V256_BYTE: ++ case V256_WORD: ++ case V256_DWORD: ++ case V256_QWORD: ++ case V256_SINGLE: ++ case V256_DOUBLE: ++ return true; ++ default: ++ return false; ++ } ++ } ++ ++ @Override ++ public char getTypeChar() { ++ switch (this) { ++ case BYTE: ++ return 'b'; ++ case WORD: ++ return 'w'; ++ case DWORD: ++ return 'd'; ++ case QWORD: ++ return 'q'; ++ case SINGLE: ++ return 'S'; ++ case DOUBLE: ++ return 'D'; ++ case V128_BYTE: ++ case V128_WORD: ++ case V128_DWORD: ++ case V128_QWORD: ++ case V128_SINGLE: ++ case V128_DOUBLE: ++ case V256_BYTE: ++ case V256_WORD: ++ case V256_DWORD: ++ case V256_QWORD: ++ case V256_SINGLE: ++ case V256_DOUBLE: ++ return 'v'; ++ default: ++ return '-'; ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/package-info.java b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/package-info.java +--- a/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/package-info.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/src/jdk.internal.vm.ci/share/classes/jdk/vm/ci/loongarch64/package-info.java 2024-02-20 10:42:37.782195505 +0800 +@@ -0,0 +1,28 @@ ++/* ++ * Copyright (c) 2018, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. 
++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++/** ++ * The LoongArch64 platform independent portions of the JVMCI API. ++ */ ++package jdk.vm.ci.loongarch64; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/jdk.internal.vm.ci/share/classes/module-info.java b/src/jdk.internal.vm.ci/share/classes/module-info.java +--- a/src/jdk.internal.vm.ci/share/classes/module-info.java 2024-01-17 09:43:21.000000000 +0800 ++++ b/src/jdk.internal.vm.ci/share/classes/module-info.java 2024-02-20 10:42:37.785528835 +0800 +@@ -23,6 +23,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + module jdk.internal.vm.ci { + exports jdk.vm.ci.services to + jdk.internal.vm.compiler, +@@ -40,5 +46,6 @@ + provides jdk.vm.ci.hotspot.HotSpotJVMCIBackendFactory with + jdk.vm.ci.hotspot.aarch64.AArch64HotSpotJVMCIBackendFactory, + jdk.vm.ci.hotspot.amd64.AMD64HotSpotJVMCIBackendFactory, ++ jdk.vm.ci.hotspot.loongarch64.LoongArch64HotSpotJVMCIBackendFactory, + jdk.vm.ci.hotspot.riscv64.RISCV64HotSpotJVMCIBackendFactory; + } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/src/utils/hsdis/binutils/hsdis-binutils.c b/src/utils/hsdis/binutils/hsdis-binutils.c +--- a/src/utils/hsdis/binutils/hsdis-binutils.c 2024-01-17 09:43:22.000000000 +0800 ++++ b/src/utils/hsdis/binutils/hsdis-binutils.c 2024-02-20 10:42:37.985528676 +0800 +@@ -44,6 +44,12 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /* hsdis.c -- dump a range of addresses as native instructions + This implements the plugin protocol required by the + HotSpot PrintAssembly option. 
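The module-info addition registers the LoongArch64 factory as a HotSpotJVMCIBackendFactory service provider, so the JVMCI runtime can discover it alongside the other architecture factories and select one by architecture name. A schematic of that selection step, using a hypothetical BackendFactory interface in place of the real JVMCI types and a plain list in place of ServiceLoader:

import java.util.List;

final class BackendSelectionSketch {
    interface BackendFactory { String architecture(); }

    public static void main(String[] args) {
        // Stand-in for iterating the providers found by ServiceLoader.
        List<BackendFactory> factories = List.of(
            () -> "amd64",
            () -> "aarch64",
            () -> "loongarch64");

        String hostArch = "loongarch64"; // would come from the running VM
        BackendFactory selected = factories.stream()
            .filter(f -> f.architecture().equals(hostArch))
            .findFirst()
            .orElseThrow(() -> new IllegalStateException("no backend for " + hostArch));
        System.out.println("selected backend: " + selected.architecture());
    }
}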
+@@ -501,6 +507,9 @@ + #ifdef LIBARCH_riscv64 + res = "riscv:rv64"; + #endif ++#ifdef LIBARCH_loongarch64 ++ res = "loongarch64"; ++#endif + if (res == NULL) + res = "architecture not set in Makefile!"; + return res; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/.src-rev b/.src-rev +--- a/.src-rev 2024-01-17 09:43:33.000000000 +0800 ++++ b/.src-rev 1970-01-01 08:00:00.000000000 +0800 +@@ -1 +0,0 @@ +-.:git:375769c69868 +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java b/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java +--- a/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/arguments/TestCodeEntryAlignment.java 2024-02-20 10:42:38.005528660 +0800 +@@ -23,11 +23,17 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @library /test/lib / + * @bug 8281467 + * @requires vm.flagless +- * @requires os.arch=="amd64" | os.arch=="x86_64" ++ * @requires os.arch=="amd64" | os.arch=="x86_64" | os.arch=="loongarch64" + * + * @summary Test large CodeEntryAlignments are accepted + * @run driver compiler.arguments.TestCodeEntryAlignment +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/arraycopy/stress/TestStressArrayCopy.java b/test/hotspot/jtreg/compiler/arraycopy/stress/TestStressArrayCopy.java +--- a/test/hotspot/jtreg/compiler/arraycopy/stress/TestStressArrayCopy.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/arraycopy/stress/TestStressArrayCopy.java 2024-02-20 10:42:38.008861992 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2022 Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.arraycopy.stress; + + import java.util.ArrayList; +@@ -149,6 +155,14 @@ + // Alternate configs with other flags + configs = alternate(configs, "UseCompressedOops"); + configs = alternate(configs, "UseSIMDForMemoryOps"); ++ } else if (Platform.isLoongArch64()) { ++ // LoongArch64 ++ configs.add(new ArrayList()); ++ ++ // Alternate configs with other flags ++ configs = alternate(configs, "UseLASX"); ++ configs = alternate(configs, "UseLSX"); ++ configs = alternate(configs, "UseCompressedOops"); + } else { + // Generic config. + configs.add(new ArrayList()); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/c2/irTests/CmpUWithZero.java b/test/hotspot/jtreg/compiler/c2/irTests/CmpUWithZero.java +--- a/test/hotspot/jtreg/compiler/c2/irTests/CmpUWithZero.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/c2/irTests/CmpUWithZero.java 2024-02-20 10:42:38.025528645 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
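The TestStressArrayCopy change builds the LoongArch64 configuration matrix by "alternating" UseLASX, UseLSX, and UseCompressedOops. The alternate helper is not shown in this hunk; assuming it duplicates every existing configuration with the flag switched on and off, the three flags expand one empty configuration into eight. A standalone sketch of such an expansion:

import java.util.ArrayList;
import java.util.List;

final class AlternateConfigsSketch {
    static List<List<String>> alternate(List<List<String>> configs, String flag) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> config : configs) {
            List<String> on = new ArrayList<>(config);
            on.add("-XX:+" + flag);
            List<String> off = new ArrayList<>(config);
            off.add("-XX:-" + flag);
            result.add(on);
            result.add(off);
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> configs = new ArrayList<>();
        configs.add(new ArrayList<>());                 // start from one empty config
        configs = alternate(configs, "UseLASX");
        configs = alternate(configs, "UseLSX");
        configs = alternate(configs, "UseCompressedOops");
        configs.forEach(System.out::println);           // prints 8 combinations
    }
}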
++ */ ++ + package compiler.c2.irTests; + + import compiler.lib.ir_framework.*; +@@ -29,7 +35,7 @@ + * @test + * bug 8290529 + * @summary verify that x 3)) | os.arch=="aarch64" ++ * @requires ((os.arch=="amd64" | os.arch=="x86_64") & (vm.opt.UseSSE == "null" | vm.opt.UseSSE > 3)) | os.arch=="aarch64" | os.arch=="loongarch64" + * @library /test/lib / + * @run driver compiler.c2.irTests.TestVectorizeURShiftSubword + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/c2/TestBit.java b/test/hotspot/jtreg/compiler/c2/TestBit.java +--- a/test/hotspot/jtreg/compiler/c2/TestBit.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/c2/TestBit.java 2024-02-20 10:42:38.018861984 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.c2; + + import jdk.test.lib.process.OutputAnalyzer; +@@ -33,7 +39,7 @@ + * @library /test/lib / + * + * @requires vm.flagless +- * @requires os.arch=="aarch64" | os.arch=="amd64" | os.arch == "ppc64le" | os.arch == "riscv64" ++ * @requires os.arch=="aarch64" | os.arch=="amd64" | os.arch == "ppc64le" | os.arch == "riscv64" | os.arch=="loongarch64" + * @requires vm.debug == true & vm.compiler2.enabled + * + * @run driver compiler.c2.TestBit +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java b/test/hotspot/jtreg/compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java +--- a/test/hotspot/jtreg/compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/cpuflags/TestAESIntrinsicsOnSupportedConfig.java 2024-02-20 10:42:38.045528629 +0800 +@@ -22,11 +22,17 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2021, These ++ * modifications are Copyright (c) 2021, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @library /test/lib / + * @modules java.base/jdk.internal.misc + * java.management +- * @requires vm.cpu.features ~= ".*aes.*" & !vm.graal.enabled ++ * @requires (vm.cpu.features ~= ".*aes.*" | os.arch == "loongarch64") & !vm.graal.enabled + * @build jdk.test.whitebox.WhiteBox + * @run driver jdk.test.lib.helpers.ClassFileInstaller jdk.test.whitebox.WhiteBox + * @run main/othervm/timeout=600 -Xbootclasspath/a:. +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/cpuflags/TestAESIntrinsicsOnUnsupportedConfig.java b/test/hotspot/jtreg/compiler/cpuflags/TestAESIntrinsicsOnUnsupportedConfig.java +--- a/test/hotspot/jtreg/compiler/cpuflags/TestAESIntrinsicsOnUnsupportedConfig.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/cpuflags/TestAESIntrinsicsOnUnsupportedConfig.java 2024-02-20 10:42:38.045528629 +0800 +@@ -22,13 +22,19 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2021, These ++ * modifications are Copyright (c) 2021, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ ++/* + * @test + * @library /test/lib / + * @modules java.base/jdk.internal.misc + * java.management + * + * @build jdk.test.whitebox.WhiteBox +- * @requires !(vm.cpu.features ~= ".*aes.*") ++ * @requires !(vm.cpu.features ~= ".*aes.*" | os.arch == "loongarch64") + * @requires vm.compiler1.enabled | !vm.graal.enabled + * @run driver jdk.test.lib.helpers.ClassFileInstaller jdk.test.whitebox.WhiteBox + * @run main/othervm -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/chacha/TestChaCha20.java b/test/hotspot/jtreg/compiler/intrinsics/chacha/TestChaCha20.java +--- a/test/hotspot/jtreg/compiler/intrinsics/chacha/TestChaCha20.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/chacha/TestChaCha20.java 2024-02-20 10:42:38.055528622 +0800 +@@ -22,6 +22,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.intrinsics.chacha; + + import java.util.ArrayList; +@@ -97,6 +103,12 @@ + System.out.println("Setting up ASIMD worker"); + configs.add(new ArrayList()); + } ++ } else if (Platform.isLoongArch64()) { ++ // LoongArch64 intrinsics require the lsx instructions ++ if (containsFuzzy(cpuFeatures, "lsx")) { ++ System.out.println("Setting up LSX worker"); ++ configs.add(new ArrayList()); ++ } + } else { + // We only have ChaCha20 intrinsics on x64 and aarch64 + // currently. If the platform is neither of these then +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/float16/Binary16Conversion.java b/test/hotspot/jtreg/compiler/intrinsics/float16/Binary16Conversion.java +--- a/test/hotspot/jtreg/compiler/intrinsics/float16/Binary16Conversion.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/float16/Binary16Conversion.java 2024-02-20 10:42:38.055528622 +0800 +@@ -22,10 +22,16 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @bug 8289551 8302976 + * @summary Verify conversion between float and the binary16 format +- * @requires (vm.cpu.features ~= ".*avx512vl.*" | vm.cpu.features ~= ".*f16c.*") | os.arch=="aarch64" ++ * @requires (vm.cpu.features ~= ".*avx512vl.*" | vm.cpu.features ~= ".*f16c.*") | os.arch=="aarch64" | os.arch=="loongarch64" + * @requires vm.compiler1.enabled & vm.compiler2.enabled + * @requires vm.compMode != "Xcomp" + * @comment default run +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/float16/Binary16ConversionNaN.java b/test/hotspot/jtreg/compiler/intrinsics/float16/Binary16ConversionNaN.java +--- a/test/hotspot/jtreg/compiler/intrinsics/float16/Binary16ConversionNaN.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/float16/Binary16ConversionNaN.java 2024-02-20 10:42:38.055528622 +0800 +@@ -22,10 +22,16 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ ++/* + * @test + * @bug 8289551 8302976 + * @summary Verify NaN sign and significand bits are preserved across conversions +- * @requires (vm.cpu.features ~= ".*avx512vl.*" | vm.cpu.features ~= ".*f16c.*") | os.arch=="aarch64" ++ * @requires (vm.cpu.features ~= ".*avx512vl.*" | vm.cpu.features ~= ".*f16c.*") | os.arch=="aarch64" | os.arch=="loongarch64" + * @requires vm.compiler1.enabled & vm.compiler2.enabled + * @requires vm.compMode != "Xcomp" + * @library /test/lib / +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/float16/TestAllFloat16ToFloat.java b/test/hotspot/jtreg/compiler/intrinsics/float16/TestAllFloat16ToFloat.java +--- a/test/hotspot/jtreg/compiler/intrinsics/float16/TestAllFloat16ToFloat.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/float16/TestAllFloat16ToFloat.java 2024-02-20 10:42:38.055528622 +0800 +@@ -22,10 +22,16 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @bug 8302976 + * @summary Verify conversion between float and the binary16 format +- * @requires (vm.cpu.features ~= ".*avx512vl.*" | vm.cpu.features ~= ".*f16c.*") | os.arch == "aarch64" ++ * @requires (vm.cpu.features ~= ".*avx512vl.*" | vm.cpu.features ~= ".*f16c.*") | os.arch == "aarch64" | os.arch == "loongarch64" + * @requires vm.compiler1.enabled & vm.compiler2.enabled + * @requires vm.compMode != "Xcomp" + * @comment default run: +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/float16/TestConstFloat16ToFloat.java b/test/hotspot/jtreg/compiler/intrinsics/float16/TestConstFloat16ToFloat.java +--- a/test/hotspot/jtreg/compiler/intrinsics/float16/TestConstFloat16ToFloat.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/float16/TestConstFloat16ToFloat.java 2024-02-20 10:42:38.055528622 +0800 +@@ -22,10 +22,16 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @bug 8302976 + * @summary Verify conversion cons between float and the binary16 format +- * @requires (vm.cpu.features ~= ".*avx512vl.*" | vm.cpu.features ~= ".*f16c.*") | os.arch=="aarch64" ++ * @requires (vm.cpu.features ~= ".*avx512vl.*" | vm.cpu.features ~= ".*f16c.*") | os.arch=="aarch64" | os.arch == "loongarch64" + * @requires vm.compiler1.enabled & vm.compiler2.enabled + * @requires vm.compMode != "Xcomp" + * @comment default run: +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java b/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java +--- a/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/sha/cli/testcases/GenericTestCaseForOtherCPU.java 2024-02-20 10:42:38.058861951 +0800 +@@ -21,6 +21,12 @@ + * questions. 
+ */ + ++/* ++ * This file has been modified by Loongson Technology in 2021, These ++ * modifications are Copyright (c) 2021, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.intrinsics.sha.cli.testcases; + + import compiler.intrinsics.sha.cli.DigestOptionsBase; +@@ -32,7 +38,7 @@ + + /** + * Generic test case for SHA-related options targeted to any CPU except +- * AArch64, RISCV64, PPC, S390x, and X86. ++ * AArch64, RISCV64, PPC, S390x, LoongArch64, and X86. + */ + public class GenericTestCaseForOtherCPU extends + DigestOptionsBase.TestCase { +@@ -44,14 +50,15 @@ + } + + public GenericTestCaseForOtherCPU(String optionName, boolean checkUseSHA) { +- // Execute the test case on any CPU except AArch64, RISCV64, PPC, S390x, and X86. ++ // Execute the test case on any CPU except AArch64, RISCV64, PPC, S390x, LoongArch64, and X86. + super(optionName, new NotPredicate( + new OrPredicate(Platform::isAArch64, + new OrPredicate(Platform::isRISCV64, + new OrPredicate(Platform::isS390x, + new OrPredicate(Platform::isPPC, ++ new OrPredicate(Platform::isLoongArch64, + new OrPredicate(Platform::isX64, +- Platform::isX86))))))); ++ Platform::isX86)))))))); + + this.checkUseSHA = checkUseSHA; + } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/string/TestCountPositives.java b/test/hotspot/jtreg/compiler/intrinsics/string/TestCountPositives.java +--- a/test/hotspot/jtreg/compiler/intrinsics/string/TestCountPositives.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/string/TestCountPositives.java 2024-02-20 10:42:38.058861951 +0800 +@@ -21,13 +21,21 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2022 Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.intrinsics.string; + ++import jdk.test.lib.Platform; ++ + /* + * @test + * @bug 8999999 + * @summary Validates StringCoding.countPositives intrinsic with a small range of tests. +- * @library /compiler/patches ++ * @library /compiler/patches /test/lib + * + * @build java.base/java.lang.Helper + * @run main compiler.intrinsics.string.TestCountPositives +@@ -91,7 +99,7 @@ + int calculated = Helper.StringCodingCountPositives(tBa, off, len); + int expected = countPositives(tBa, off, len); + if (calculated != expected) { +- if (expected != len && calculated >= 0 && calculated < expected) { ++ if (!Platform.isLoongArch64() && expected != len && calculated >= 0 && calculated < expected) { + // allow intrinsics to return early with a lower value, + // but only if we're not expecting the full length (no + // negative bytes) +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/string/TestStringCompareToDifferentLength.java b/test/hotspot/jtreg/compiler/intrinsics/string/TestStringCompareToDifferentLength.java +--- a/test/hotspot/jtreg/compiler/intrinsics/string/TestStringCompareToDifferentLength.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/string/TestStringCompareToDifferentLength.java 2024-02-20 10:42:38.058861951 +0800 +@@ -22,9 +22,15 @@ + * questions. + */ + ++ /* ++ * This file has been modified by Loongson Technology in 2023. 
These ++ * modifications are Copyright (c) 2023 Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /* + * @test +- * @requires os.arch=="aarch64" | os.arch=="riscv64" ++ * @requires os.arch=="aarch64" | os.arch=="riscv64" | os.arch=="loongarch64" + * @summary String::compareTo implementation uses different algorithms for + * different string length. This test creates string with specified + * size and longer string, which is same at beginning. +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/TestBitShuffleOpers.java b/test/hotspot/jtreg/compiler/intrinsics/TestBitShuffleOpers.java +--- a/test/hotspot/jtreg/compiler/intrinsics/TestBitShuffleOpers.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/TestBitShuffleOpers.java 2024-02-20 10:42:38.052195290 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @bug 8283894 +@@ -31,7 +37,8 @@ + * (vm.cpu.features ~= ".*bmi2.*" & vm.cpu.features ~= ".*bmi1.*" & + * vm.cpu.features ~= ".*sse2.*")) | + * ((vm.opt.UseSVE == "null" | vm.opt.UseSVE > 1) & +- * os.arch=="aarch64" & vm.cpu.features ~= ".*svebitperm.*")) ++ * os.arch=="aarch64" & vm.cpu.features ~= ".*svebitperm.*") | ++ * os.arch=="loongarch64") + * @library /test/lib / + * @run driver compiler.intrinsics.TestBitShuffleOpers + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/TestCompareUnsigned.java b/test/hotspot/jtreg/compiler/intrinsics/TestCompareUnsigned.java +--- a/test/hotspot/jtreg/compiler/intrinsics/TestCompareUnsigned.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/TestCompareUnsigned.java 2024-02-20 10:42:38.052195290 +0800 +@@ -20,6 +20,13 @@ + * or visit www.oracle.com if you need additional information or have any + * questions. + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.intrinsics; + + import compiler.lib.ir_framework.*; +@@ -30,7 +37,7 @@ + * @test + * @key randomness + * @bug 8283726 8287925 +- * @requires os.arch=="amd64" | os.arch=="x86_64" | os.arch=="aarch64" ++ * @requires os.arch=="amd64" | os.arch=="x86_64" | os.arch=="aarch64" | os.arch=="loongarch64" + * @summary Test the intrinsics implementation of Integer/Long::compareUnsigned + * @library /test/lib / + * @run driver compiler.intrinsics.TestCompareUnsigned +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/TestDoubleIsFinite.java b/test/hotspot/jtreg/compiler/intrinsics/TestDoubleIsFinite.java +--- a/test/hotspot/jtreg/compiler/intrinsics/TestDoubleIsFinite.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/TestDoubleIsFinite.java 2024-02-20 10:42:38.052195290 +0800 +@@ -22,10 +22,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + /** + * @test + * @summary Test intrinsic for Double.isFinite. +-* @requires os.arch == "riscv64" ++* @requires os.arch == "riscv64" | os.arch == "loongarch64" + * @library /test/lib / + * @run driver compiler.intrinsics.TestDoubleIsFinite + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/TestDoubleIsInfinite.java b/test/hotspot/jtreg/compiler/intrinsics/TestDoubleIsInfinite.java +--- a/test/hotspot/jtreg/compiler/intrinsics/TestDoubleIsInfinite.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/TestDoubleIsInfinite.java 2024-02-20 10:42:38.052195290 +0800 +@@ -22,10 +22,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @summary Test intrinsic for Double.isInfinite. +-* @requires vm.cpu.features ~= ".*avx512dq.*" | os.arch == "riscv64" ++* @requires vm.cpu.features ~= ".*avx512dq.*" | os.arch == "riscv64" | os.arch == "loongarch64" + * @library /test/lib / + * @run driver compiler.intrinsics.TestDoubleIsInfinite + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/TestFloatIsFinite.java b/test/hotspot/jtreg/compiler/intrinsics/TestFloatIsFinite.java +--- a/test/hotspot/jtreg/compiler/intrinsics/TestFloatIsFinite.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/TestFloatIsFinite.java 2024-02-20 10:42:38.052195290 +0800 +@@ -22,10 +22,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @summary Test intrinsics for Float.isFinite. +-* @requires os.arch == "riscv64" ++* @requires os.arch == "riscv64" | os.arch == "loongarch64" + * @library /test/lib / + * @run driver compiler.intrinsics.TestFloatIsFinite + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/TestFloatIsInfinite.java b/test/hotspot/jtreg/compiler/intrinsics/TestFloatIsInfinite.java +--- a/test/hotspot/jtreg/compiler/intrinsics/TestFloatIsInfinite.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/TestFloatIsInfinite.java 2024-02-20 10:42:38.052195290 +0800 +@@ -22,10 +22,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @summary Test intrinsics for Float.isInfinite. 
+-* @requires vm.cpu.features ~= ".*avx512dq.*" | os.arch == "riscv64" ++* @requires vm.cpu.features ~= ".*avx512dq.*" | os.arch == "riscv64" | os.arch == "loongarch64" + * @library /test/lib / + * @run driver compiler.intrinsics.TestFloatIsInfinite + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/TestIntegerUnsignedDivMod.java b/test/hotspot/jtreg/compiler/intrinsics/TestIntegerUnsignedDivMod.java +--- a/test/hotspot/jtreg/compiler/intrinsics/TestIntegerUnsignedDivMod.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/TestIntegerUnsignedDivMod.java 2024-02-20 10:42:38.052195290 +0800 +@@ -21,10 +21,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @summary Test x86_64 intrinsic for divideUnsigned() and remainderUnsigned() methods for Integer +-* @requires os.arch=="amd64" | os.arch=="x86_64" ++* @requires os.arch=="amd64" | os.arch=="x86_64" | os.arch=="loongarch64" + * @library /test/lib / + * @run driver compiler.intrinsics.TestIntegerUnsignedDivMod + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/intrinsics/TestLongUnsignedDivMod.java b/test/hotspot/jtreg/compiler/intrinsics/TestLongUnsignedDivMod.java +--- a/test/hotspot/jtreg/compiler/intrinsics/TestLongUnsignedDivMod.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/intrinsics/TestLongUnsignedDivMod.java 2024-02-20 10:42:38.052195290 +0800 +@@ -21,10 +21,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @summary Test x86_64 intrinsic for divideUnsigned() and remainderUnsigned() methods for Long +-* @requires os.arch=="amd64" | os.arch=="x86_64" ++* @requires os.arch=="amd64" | os.arch=="x86_64" | os.arch=="loongarch64" + * @library /test/lib / + * @run driver compiler.intrinsics.TestLongUnsignedDivMod + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/CodeInstallationTest.java b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/CodeInstallationTest.java +--- a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/CodeInstallationTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/CodeInstallationTest.java 2024-02-20 10:42:38.068861944 +0800 +@@ -20,10 +20,18 @@ + * or visit www.oracle.com if you need additional information or have any + * questions. + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2022, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + package jdk.vm.ci.code.test; + + import jdk.vm.ci.aarch64.AArch64; + import jdk.vm.ci.amd64.AMD64; ++import jdk.vm.ci.loongarch64.LoongArch64; + import jdk.vm.ci.riscv64.RISCV64; + import jdk.vm.ci.code.Architecture; + import jdk.vm.ci.code.CodeCacheProvider; +@@ -31,6 +39,7 @@ + import jdk.vm.ci.code.TargetDescription; + import jdk.vm.ci.code.test.aarch64.AArch64TestAssembler; + import jdk.vm.ci.code.test.amd64.AMD64TestAssembler; ++import jdk.vm.ci.code.test.loongarch64.LoongArch64TestAssembler; + import jdk.vm.ci.code.test.riscv64.RISCV64TestAssembler; + import jdk.vm.ci.hotspot.HotSpotCodeCacheProvider; + import jdk.vm.ci.hotspot.HotSpotCompiledCode; +@@ -80,6 +89,8 @@ + return new AArch64TestAssembler(codeCache, config); + } else if (arch instanceof RISCV64) { + return new RISCV64TestAssembler(codeCache, config); ++ } else if (arch instanceof LoongArch64) { ++ return new LoongArch64TestAssembler(codeCache, config); + } else { + Assert.fail("unsupported architecture"); + return null; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/DataPatchTest.java b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/DataPatchTest.java +--- a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/DataPatchTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/DataPatchTest.java 2024-02-20 10:42:38.068861944 +0800 +@@ -21,10 +21,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + /** + * @test + * @requires vm.jvmci +- * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" ++ * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" | vm.simpleArch == "loongarch64" + * @library / + * @modules jdk.internal.vm.ci/jdk.vm.ci.hotspot + * jdk.internal.vm.ci/jdk.vm.ci.meta +@@ -32,9 +38,10 @@ + * jdk.internal.vm.ci/jdk.vm.ci.code.site + * jdk.internal.vm.ci/jdk.vm.ci.runtime + * jdk.internal.vm.ci/jdk.vm.ci.aarch64 ++ * jdk.internal.vm.ci/jdk.vm.ci.loongarch64 + * jdk.internal.vm.ci/jdk.vm.ci.amd64 + * jdk.internal.vm.ci/jdk.vm.ci.riscv64 +- * @compile CodeInstallationTest.java DebugInfoTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java ++ * @compile CodeInstallationTest.java DebugInfoTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java loongarch64/LoongArch64TestAssembler.java + * @run junit/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:-UseJVMCICompiler jdk.vm.ci.code.test.DataPatchTest + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/InterpreterFrameSizeTest.java b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/InterpreterFrameSizeTest.java +--- a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/InterpreterFrameSizeTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/InterpreterFrameSizeTest.java 2024-02-20 10:42:38.068861944 +0800 +@@ -21,10 +21,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + /** + * @test + * @requires vm.jvmci +- * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" ++ * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" | vm.simpleArch == "loongarch64" + * @modules jdk.internal.vm.ci/jdk.vm.ci.hotspot + * jdk.internal.vm.ci/jdk.vm.ci.code + * jdk.internal.vm.ci/jdk.vm.ci.code.site +@@ -32,9 +38,10 @@ + * jdk.internal.vm.ci/jdk.vm.ci.runtime + * jdk.internal.vm.ci/jdk.vm.ci.common + * jdk.internal.vm.ci/jdk.vm.ci.aarch64 ++ * jdk.internal.vm.ci/jdk.vm.ci.loongarch64 + * jdk.internal.vm.ci/jdk.vm.ci.amd64 + * jdk.internal.vm.ci/jdk.vm.ci.riscv64 +- * @compile CodeInstallationTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java ++ * @compile CodeInstallationTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java loongarch64/LoongArch64TestAssembler.java + * @run junit/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:-UseJVMCICompiler jdk.vm.ci.code.test.InterpreterFrameSizeTest + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/loongarch64/LoongArch64TestAssembler.java b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/loongarch64/LoongArch64TestAssembler.java +--- a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/loongarch64/LoongArch64TestAssembler.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/loongarch64/LoongArch64TestAssembler.java 2024-02-20 10:42:38.068861944 +0800 +@@ -0,0 +1,568 @@ ++/* ++ * Copyright (c) 2020, 2022, Oracle and/or its affiliates. All rights reserved. ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ */ ++ ++package jdk.vm.ci.code.test.loongarch64; ++ ++import jdk.vm.ci.loongarch64.LoongArch64; ++import jdk.vm.ci.loongarch64.LoongArch64Kind; ++import jdk.vm.ci.code.CallingConvention; ++import jdk.vm.ci.code.CodeCacheProvider; ++import jdk.vm.ci.code.DebugInfo; ++import jdk.vm.ci.code.Register; ++import jdk.vm.ci.code.RegisterArray; ++import jdk.vm.ci.code.RegisterValue; ++import jdk.vm.ci.code.StackSlot; ++import jdk.vm.ci.code.site.ConstantReference; ++import jdk.vm.ci.code.site.DataSectionReference; ++import jdk.vm.ci.code.test.TestAssembler; ++import jdk.vm.ci.code.test.TestHotSpotVMConfig; ++import jdk.vm.ci.hotspot.HotSpotCallingConventionType; ++import jdk.vm.ci.hotspot.HotSpotConstant; ++import jdk.vm.ci.hotspot.HotSpotForeignCallTarget; ++import jdk.vm.ci.meta.AllocatableValue; ++import jdk.vm.ci.meta.JavaKind; ++import jdk.vm.ci.meta.VMConstant; ++ ++public class LoongArch64TestAssembler extends TestAssembler { ++ ++ private static final Register scratchRegister = LoongArch64.SCR1; ++ private static final Register doubleScratch = LoongArch64.f23; ++ private static final RegisterArray nativeGeneralParameterRegisters = new RegisterArray(LoongArch64.a0, ++ LoongArch64.a1, LoongArch64.a2, ++ LoongArch64.a3, LoongArch64.a4, ++ LoongArch64.a5, LoongArch64.a6, ++ LoongArch64.a7); ++ private static final RegisterArray floatParameterRegisters = new RegisterArray(LoongArch64.f0, ++ LoongArch64.f1, LoongArch64.f2, ++ LoongArch64.f3, LoongArch64.f4, ++ LoongArch64.f5, LoongArch64.f6, ++ LoongArch64.f7); ++ private static int currentGeneral = 0; ++ private static int currentFloat = 0; ++ public LoongArch64TestAssembler(CodeCacheProvider codeCache, TestHotSpotVMConfig config) { ++ super(codeCache, config, ++ 16 /* initialFrameSize */, 16 /* stackAlignment */, ++ LoongArch64Kind.UDWORD /* narrowOopKind */, ++ /* registers */ ++ LoongArch64.a0, LoongArch64.a1, LoongArch64.a2, LoongArch64.a3, ++ LoongArch64.a4, LoongArch64.a5, LoongArch64.a6, LoongArch64.a7); ++ } ++ ++ private static int low(int x, int l) { ++ assert l < 32; ++ return (x >> 0) & ((1 << l)-1); ++ } ++ ++ private static int low16(int x) { ++ return low(x, 16); ++ } ++ ++ private void emitNop() { ++ code.emitInt(0x3400000); ++ } ++ ++ private void emitPcaddu12i(Register rj, int si20) { ++ // pcaddu12i ++ code.emitInt((0b0001110 << 25) ++ | (low(si20, 20) << 5) ++ | rj.encoding); ++ } ++ ++ private void emitAdd(Register rd, Register rj, Register rk) { ++ // add_d ++ code.emitInt((0b00000000000100001 << 15) ++ | (rk.encoding << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ private void emitAdd(Register rd, Register rj, int si12) { ++ // addi_d ++ code.emitInt((0b0000001011 << 22) ++ | (low(si12, 12) << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ private void emitSub(Register rd, Register rj, Register rk) { ++ // sub_d ++ code.emitInt((0b00000000000100011 << 15) ++ | (rk.encoding << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ private void emitShiftLeft(Register rd, Register rj, int shift) { ++ // slli_d ++ code.emitInt((0b00000000010000 << 18) ++ | (low(( (0b01 << 6) | shift ), 8) << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ private void emitLu12i_w(Register rj, int imm20) { ++ // lu12i_w ++ code.emitInt((0b0001010 << 25) ++ | (low(imm20, 20)<<5) ++ | rj.encoding); ++ } ++ ++ private void emitOri(Register rd, Register rj, int ui12) { ++ // ori ++ code.emitInt((0b0000001110 << 22) ++ | (low(ui12, 12) << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ private void 
emitLu32i_d(Register rj, int imm20) { ++ // lu32i_d ++ code.emitInt((0b0001011 << 25) ++ | (low(imm20, 20)<<5) ++ | rj.encoding); ++ } ++ ++ private void emitLu52i_d(Register rd, Register rj, int imm12) { ++ // lu52i_d ++ code.emitInt((0b0000001100 << 22) ++ | (low(imm12, 12) << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ private void emitLoadImmediate(Register rd, int imm32) { ++ emitLu12i_w(rd, (imm32 >> 12) & 0xfffff); ++ emitOri(rd, rd, imm32 & 0xfff); ++ } ++ ++ private void emitLi52(Register rj, long imm) { ++ emitLu12i_w(rj, (int) ((imm >> 12) & 0xfffff)); ++ emitOri(rj, rj, (int) (imm & 0xfff)); ++ emitLu32i_d(rj, (int) ((imm >> 32) & 0xfffff)); ++ } ++ ++ private void emitLi64(Register rj, long imm) { ++ emitLu12i_w(rj, (int) ((imm >> 12) & 0xfffff)); ++ emitOri(rj, rj, (int) (imm & 0xfff)); ++ emitLu32i_d(rj, (int) ((imm >> 32) & 0xfffff)); ++ emitLu52i_d(rj, rj, (int) ((imm >> 52) & 0xfff)); ++ } ++ ++ private void emitOr(Register rd, Register rj, Register rk) { ++ // orr ++ code.emitInt((0b00000000000101010 << 15) ++ | (rk.encoding << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ private void emitMove(Register rd, Register rs) { ++ // move ++ emitOr(rd, rs, LoongArch64.zero); ++ } ++ ++ private void emitMovfr2gr(Register rd, LoongArch64Kind kind, Register rj) { ++ // movfr2gr_s/movfr2gr_d ++ int opc = 0; ++ switch (kind) { ++ case SINGLE: opc = 0b0000000100010100101101; break; ++ case DOUBLE: opc = 0b0000000100010100101110; break; ++ default: throw new IllegalArgumentException(); ++ } ++ code.emitInt((opc << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ private void emitLoadRegister(Register rd, LoongArch64Kind kind, Register rj, int offset) { ++ // load ++ assert offset >= 0; ++ int opc = 0; ++ switch (kind) { ++ case BYTE: opc = 0b0010100000; break; ++ case WORD: opc = 0b0010100001; break; ++ case DWORD: opc = 0b0010100010; break; ++ case QWORD: opc = 0b0010100011; break; ++ case UDWORD: opc = 0b0010101010; break; ++ case SINGLE: opc = 0b0010101100; break; ++ case DOUBLE: opc = 0b0010101110; break; ++ default: throw new IllegalArgumentException(); ++ } ++ code.emitInt((opc << 22) ++ | (low(offset, 12) << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ private void emitStoreRegister(Register rd, LoongArch64Kind kind, Register rj, int offset) { ++ // store ++ assert offset >= 0; ++ int opc = 0; ++ switch (kind) { ++ case BYTE: opc = 0b0010100100; break; ++ case WORD: opc = 0b0010100101; break; ++ case DWORD: opc = 0b0010100110; break; ++ case QWORD: opc = 0b0010100111; break; ++ case SINGLE: opc = 0b0010101101; break; ++ case DOUBLE: opc = 0b0010101111; break; ++ default: throw new IllegalArgumentException(); ++ } ++ code.emitInt((opc << 22) ++ | (low(offset, 12) << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ private void emitJirl(Register rd, Register rj, int offs) { ++ // jirl ++ code.emitInt((0b010011 << 26) ++ | (low16(offs >> 2) << 10) ++ | (rj.encoding << 5) ++ | rd.encoding); ++ } ++ ++ @Override ++ public void emitGrowStack(int size) { ++ assert size % 16 == 0; ++ if (size > -4096 && size < 0) { ++ emitAdd(LoongArch64.sp, LoongArch64.sp, -size); ++ } else if (size == 0) { ++ // No-op ++ } else if (size < 4096) { ++ emitAdd(LoongArch64.sp, LoongArch64.sp, -size); ++ } else if (size < 65535) { ++ emitLoadImmediate(scratchRegister, size); ++ emitSub(LoongArch64.sp, LoongArch64.sp, scratchRegister); ++ } else { ++ throw new IllegalArgumentException(); ++ } ++ } ++ ++ @Override ++ public void emitPrologue() { ++ 
// Must be patchable by NativeJump::patch_verified_entry ++ emitNop(); ++ emitGrowStack(32); ++ emitStoreRegister(LoongArch64.ra, LoongArch64Kind.QWORD, LoongArch64.sp, 24); ++ emitStoreRegister(LoongArch64.fp, LoongArch64Kind.QWORD, LoongArch64.sp, 16); ++ emitGrowStack(-16); ++ emitMove(LoongArch64.fp, LoongArch64.sp); ++ setDeoptRescueSlot(newStackSlot(LoongArch64Kind.QWORD)); ++ } ++ ++ @Override ++ public void emitEpilogue() { ++ recordMark(config.MARKID_DEOPT_HANDLER_ENTRY); ++ recordCall(new HotSpotForeignCallTarget(config.handleDeoptStub), 4*4, true, null); ++ emitCall(0xdeaddeaddeadL); ++ } ++ ++ @Override ++ public void emitCallPrologue(CallingConvention cc, Object... prim) { ++ emitGrowStack(cc.getStackSize()); ++ frameSize += cc.getStackSize(); ++ AllocatableValue[] args = cc.getArguments(); ++ for (int i = 0; i < args.length; i++) { ++ emitLoad(args[i], prim[i]); ++ } ++ currentGeneral = 0; ++ currentFloat = 0; ++ } ++ ++ @Override ++ public void emitCallEpilogue(CallingConvention cc) { ++ emitGrowStack(-cc.getStackSize()); ++ frameSize -= cc.getStackSize(); ++ } ++ ++ @Override ++ public void emitCall(long addr) { ++ // long call (absolute) ++ // lu12i_w(T4, split_low20(value >> 12)); ++ // lu32i_d(T4, split_low20(value >> 32)); ++ // jirl(RA, T4, split_low12(value)); ++ emitLu12i_w(LoongArch64.t4, (int) ((addr >> 12) & 0xfffff)); ++ emitLu32i_d(LoongArch64.t4, (int) ((addr >> 32) & 0xfffff)); ++ emitJirl(LoongArch64.ra, LoongArch64.t4, (int) (addr & 0xfff)); ++ } ++ ++ @Override ++ public void emitLoad(AllocatableValue av, Object prim) { ++ if (av instanceof RegisterValue) { ++ Register reg = ((RegisterValue) av).getRegister(); ++ if (prim instanceof Float) { ++ if (currentFloat < floatParameterRegisters.size()) { ++ currentFloat++; ++ emitLoadFloat(reg, (Float) prim); ++ } else if (currentGeneral < nativeGeneralParameterRegisters.size()) { ++ currentGeneral++; ++ emitLoadFloat(doubleScratch, (Float) prim); ++ emitMovfr2gr(reg, LoongArch64Kind.SINGLE, doubleScratch); ++ } ++ } else if (prim instanceof Double) { ++ if (currentFloat < floatParameterRegisters.size()) { ++ currentFloat++; ++ emitLoadDouble(reg, (Double) prim); ++ } else if (currentGeneral < nativeGeneralParameterRegisters.size()) { ++ currentGeneral++; ++ emitLoadDouble(doubleScratch, (Double) prim); ++ emitMovfr2gr(reg, LoongArch64Kind.DOUBLE, doubleScratch); ++ } ++ } else if (prim instanceof Integer) { ++ emitLoadInt(reg, (Integer) prim); ++ } else if (prim instanceof Long) { ++ emitLoadLong(reg, (Long) prim); ++ } ++ } else if (av instanceof StackSlot) { ++ StackSlot slot = (StackSlot) av; ++ if (prim instanceof Float) { ++ emitFloatToStack(slot, emitLoadFloat(doubleScratch, (Float) prim)); ++ } else if (prim instanceof Double) { ++ emitDoubleToStack(slot, emitLoadDouble(doubleScratch, (Double) prim)); ++ } else if (prim instanceof Integer) { ++ emitIntToStack(slot, emitLoadInt(scratchRegister, (Integer) prim)); ++ } else if (prim instanceof Long) { ++ emitLongToStack(slot, emitLoadLong(scratchRegister, (Long) prim)); ++ } else { ++ assert false : "Unimplemented"; ++ } ++ } else { ++ throw new IllegalArgumentException("Unknown value " + av); ++ } ++ } ++ ++ @Override ++ public Register emitLoadPointer(HotSpotConstant c) { ++ recordDataPatchInCode(new ConstantReference((VMConstant) c)); ++ ++ Register ret = newRegister(); ++ // need to match patchable_li52 instruction sequence ++ // lu12i_ori_lu32i ++ emitLi52(ret, 0xdeaddead); ++ return ret; ++ } ++ ++ @Override ++ public Register emitLoadPointer(Register b, 
int offset) { ++ Register ret = newRegister(); ++ emitLoadRegister(ret, LoongArch64Kind.QWORD, b, offset); ++ return ret; ++ } ++ ++ @Override ++ public Register emitLoadNarrowPointer(DataSectionReference ref) { ++ recordDataPatchInCode(ref); ++ ++ Register ret = newRegister(); ++ emitPcaddu12i(ret, 0xdead >> 12); ++ emitAdd(ret, ret, 0xdead & 0xfff); ++ emitLoadRegister(ret, LoongArch64Kind.UDWORD, ret, 0); ++ return ret; ++ } ++ ++ @Override ++ public Register emitLoadPointer(DataSectionReference ref) { ++ recordDataPatchInCode(ref); ++ ++ Register ret = newRegister(); ++ emitPcaddu12i(ret, 0xdead >> 12); ++ emitAdd(ret, ret, 0xdead & 0xfff); ++ emitLoadRegister(ret, LoongArch64Kind.QWORD, ret, 0); ++ return ret; ++ } ++ ++ private Register emitLoadDouble(Register reg, double c) { ++ DataSectionReference ref = new DataSectionReference(); ++ ref.setOffset(data.position()); ++ data.emitDouble(c); ++ ++ recordDataPatchInCode(ref); ++ emitPcaddu12i(scratchRegister, 0xdead >> 12); ++ emitAdd(scratchRegister, scratchRegister, 0xdead & 0xfff); ++ emitLoadRegister(reg, LoongArch64Kind.DOUBLE, scratchRegister, 0); ++ return reg; ++ } ++ ++ private Register emitLoadFloat(Register reg, float c) { ++ DataSectionReference ref = new DataSectionReference(); ++ ref.setOffset(data.position()); ++ data.emitFloat(c); ++ ++ recordDataPatchInCode(ref); ++ emitPcaddu12i(scratchRegister, 0xdead >> 12); ++ emitAdd(scratchRegister, scratchRegister, 0xdead & 0xfff); ++ emitLoadRegister(reg, LoongArch64Kind.SINGLE, scratchRegister, 0); ++ return reg; ++ } ++ ++ @Override ++ public Register emitLoadFloat(float c) { ++ Register ret = LoongArch64.fv0; ++ return emitLoadFloat(ret, c); ++ } ++ ++ private Register emitLoadLong(Register reg, long c) { ++ emitLi64(reg, c); ++ return reg; ++ } ++ ++ @Override ++ public Register emitLoadLong(long c) { ++ Register ret = newRegister(); ++ return emitLoadLong(ret, c); ++ } ++ ++ private Register emitLoadInt(Register reg, int c) { ++ emitLoadImmediate(reg, c); ++ return reg; ++ } ++ ++ @Override ++ public Register emitLoadInt(int c) { ++ Register ret = newRegister(); ++ return emitLoadInt(ret, c); ++ } ++ ++ @Override ++ public Register emitIntArg0() { ++ return codeCache.getRegisterConfig() ++ .getCallingConventionRegisters(HotSpotCallingConventionType.JavaCall, JavaKind.Int) ++ .get(0); ++ } ++ ++ @Override ++ public Register emitIntArg1() { ++ return codeCache.getRegisterConfig() ++ .getCallingConventionRegisters(HotSpotCallingConventionType.JavaCall, JavaKind.Int) ++ .get(1); ++ } ++ ++ @Override ++ public Register emitIntAdd(Register a, Register b) { ++ emitAdd(a, a, b); ++ return a; ++ } ++ ++ @Override ++ public void emitTrap(DebugInfo info) { ++ // Dereference null pointer ++ emitMove(scratchRegister, LoongArch64.zero); ++ recordImplicitException(info); ++ emitLoadRegister(LoongArch64.zero, LoongArch64Kind.QWORD, scratchRegister, 0); ++ } ++ ++ @Override ++ public void emitIntRet(Register a) { ++ emitMove(LoongArch64.v0, a); ++ emitMove(LoongArch64.sp, LoongArch64.fp); ++ emitLoadRegister(LoongArch64.ra, LoongArch64Kind.QWORD, LoongArch64.sp, 8); ++ emitLoadRegister(LoongArch64.fp, LoongArch64Kind.QWORD, LoongArch64.sp, 0); ++ emitGrowStack(-16); ++ emitJirl(LoongArch64.zero, LoongArch64.ra, 0); ++ } ++ ++ @Override ++ public void emitFloatRet(Register a) { ++ assert a == LoongArch64.fv0 : "Unimplemented move " + a; ++ emitMove(LoongArch64.sp, LoongArch64.fp); ++ emitLoadRegister(LoongArch64.ra, LoongArch64Kind.QWORD, LoongArch64.sp, 8); ++ 
emitLoadRegister(LoongArch64.fp, LoongArch64Kind.QWORD, LoongArch64.sp, 0); ++ emitGrowStack(-16); ++ emitJirl(LoongArch64.zero, LoongArch64.ra, 0); ++ } ++ ++ @Override ++ public void emitPointerRet(Register a) { ++ emitIntRet(a); ++ } ++ ++ @Override ++ public StackSlot emitPointerToStack(Register a) { ++ return emitLongToStack(a); ++ } ++ ++ @Override ++ public StackSlot emitNarrowPointerToStack(Register a) { ++ return emitIntToStack(a); ++ } ++ ++ @Override ++ public Register emitUncompressPointer(Register compressed, long base, int shift) { ++ if (shift > 0) { ++ emitShiftLeft(compressed, compressed, shift); ++ } ++ ++ if (base != 0) { ++ emitLoadLong(scratchRegister, base); ++ emitAdd(compressed, compressed, scratchRegister); ++ } ++ ++ return compressed; ++ } ++ ++ private StackSlot emitDoubleToStack(StackSlot slot, Register a) { ++ emitStoreRegister(a, LoongArch64Kind.DOUBLE, LoongArch64.sp, slot.getOffset(frameSize)); ++ return slot; ++ } ++ ++ @Override ++ public StackSlot emitDoubleToStack(Register a) { ++ StackSlot ret = newStackSlot(LoongArch64Kind.DOUBLE); ++ return emitDoubleToStack(ret, a); ++ } ++ ++ private StackSlot emitFloatToStack(StackSlot slot, Register a) { ++ emitStoreRegister(a, LoongArch64Kind.SINGLE, LoongArch64.sp, slot.getOffset(frameSize)); ++ return slot; ++ } ++ ++ @Override ++ public StackSlot emitFloatToStack(Register a) { ++ StackSlot ret = newStackSlot(LoongArch64Kind.SINGLE); ++ return emitFloatToStack(ret, a); ++ } ++ ++ private StackSlot emitIntToStack(StackSlot slot, Register a) { ++ emitStoreRegister(a, LoongArch64Kind.DWORD, LoongArch64.sp, slot.getOffset(frameSize)); ++ return slot; ++ } ++ ++ @Override ++ public StackSlot emitIntToStack(Register a) { ++ StackSlot ret = newStackSlot(LoongArch64Kind.DWORD); ++ return emitIntToStack(ret, a); ++ } ++ ++ private StackSlot emitLongToStack(StackSlot slot, Register a) { ++ emitStoreRegister(a, LoongArch64Kind.QWORD, LoongArch64.sp, slot.getOffset(frameSize)); ++ return slot; ++ } ++ ++ @Override ++ public StackSlot emitLongToStack(Register a) { ++ StackSlot ret = newStackSlot(LoongArch64Kind.QWORD); ++ return emitLongToStack(ret, a); ++ } ++ ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/MaxOopMapStackOffsetTest.java b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/MaxOopMapStackOffsetTest.java +--- a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/MaxOopMapStackOffsetTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/MaxOopMapStackOffsetTest.java 2024-02-20 10:42:38.068861944 +0800 +@@ -21,10 +21,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + /** + * @test + * @requires vm.jvmci +- * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" ++ * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" | vm.simpleArch == "loongarch64" + * @library / + * @modules jdk.internal.vm.ci/jdk.vm.ci.hotspot + * jdk.internal.vm.ci/jdk.vm.ci.meta +@@ -33,9 +39,10 @@ + * jdk.internal.vm.ci/jdk.vm.ci.common + * jdk.internal.vm.ci/jdk.vm.ci.runtime + * jdk.internal.vm.ci/jdk.vm.ci.aarch64 ++ * jdk.internal.vm.ci/jdk.vm.ci.loongarch64 + * jdk.internal.vm.ci/jdk.vm.ci.amd64 + * jdk.internal.vm.ci/jdk.vm.ci.riscv64 +- * @compile CodeInstallationTest.java DebugInfoTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java ++ * @compile CodeInstallationTest.java DebugInfoTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java loongarch64/LoongArch64TestAssembler.java + * @run junit/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:-UseJVMCICompiler jdk.vm.ci.code.test.MaxOopMapStackOffsetTest + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java +--- a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/NativeCallTest.java 2024-02-20 10:42:38.068861944 +0800 +@@ -21,10 +21,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @requires vm.jvmci +- * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" ++ * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" | vm.simpleArch == "loongarch64" + * @library /test/lib / + * @modules jdk.internal.vm.ci/jdk.vm.ci.hotspot + * jdk.internal.vm.ci/jdk.vm.ci.code +@@ -33,9 +39,10 @@ + * jdk.internal.vm.ci/jdk.vm.ci.runtime + * jdk.internal.vm.ci/jdk.vm.ci.common + * jdk.internal.vm.ci/jdk.vm.ci.aarch64 ++ * jdk.internal.vm.ci/jdk.vm.ci.loongarch64 + * jdk.internal.vm.ci/jdk.vm.ci.amd64 + * jdk.internal.vm.ci/jdk.vm.ci.riscv64 +- * @compile CodeInstallationTest.java TestHotSpotVMConfig.java NativeCallTest.java TestAssembler.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java ++ * @compile CodeInstallationTest.java TestHotSpotVMConfig.java NativeCallTest.java TestAssembler.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java loongarch64/LoongArch64TestAssembler.java + * @run junit/othervm/native -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -Xbootclasspath/a:. 
jdk.vm.ci.code.test.NativeCallTest + */ + package jdk.vm.ci.code.test; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/SimpleCodeInstallationTest.java b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/SimpleCodeInstallationTest.java +--- a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/SimpleCodeInstallationTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/SimpleCodeInstallationTest.java 2024-02-20 10:42:38.068861944 +0800 +@@ -21,10 +21,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @requires vm.jvmci +- * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" ++ * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" | vm.simpleArch == "loongarch64" + * @library / + * @modules jdk.internal.vm.ci/jdk.vm.ci.hotspot + * jdk.internal.vm.ci/jdk.vm.ci.meta +@@ -32,9 +38,10 @@ + * jdk.internal.vm.ci/jdk.vm.ci.code.site + * jdk.internal.vm.ci/jdk.vm.ci.runtime + * jdk.internal.vm.ci/jdk.vm.ci.aarch64 ++ * jdk.internal.vm.ci/jdk.vm.ci.loongarch64 + * jdk.internal.vm.ci/jdk.vm.ci.amd64 + * jdk.internal.vm.ci/jdk.vm.ci.riscv64 +- * @compile CodeInstallationTest.java DebugInfoTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java ++ * @compile CodeInstallationTest.java DebugInfoTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java loongarch64/LoongArch64TestAssembler.java + * @run junit/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:-UseJVMCICompiler jdk.vm.ci.code.test.SimpleCodeInstallationTest + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/SimpleDebugInfoTest.java b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/SimpleDebugInfoTest.java +--- a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/SimpleDebugInfoTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/SimpleDebugInfoTest.java 2024-02-20 10:42:38.068861944 +0800 +@@ -21,10 +21,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + /** + * @test + * @requires vm.jvmci +- * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" ++ * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" | vm.simpleArch == "loongarch64" + * @library / + * @modules jdk.internal.vm.ci/jdk.vm.ci.hotspot + * jdk.internal.vm.ci/jdk.vm.ci.meta +@@ -32,9 +38,10 @@ + * jdk.internal.vm.ci/jdk.vm.ci.code.site + * jdk.internal.vm.ci/jdk.vm.ci.runtime + * jdk.internal.vm.ci/jdk.vm.ci.aarch64 ++ * jdk.internal.vm.ci/jdk.vm.ci.loongarch64 + * jdk.internal.vm.ci/jdk.vm.ci.amd64 + * jdk.internal.vm.ci/jdk.vm.ci.riscv64 +- * @compile CodeInstallationTest.java DebugInfoTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java ++ * @compile CodeInstallationTest.java DebugInfoTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java loongarch64/LoongArch64TestAssembler.java + * @run junit/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:-UseJVMCICompiler jdk.vm.ci.code.test.SimpleDebugInfoTest + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/VirtualObjectDebugInfoTest.java b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/VirtualObjectDebugInfoTest.java +--- a/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/VirtualObjectDebugInfoTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/jvmci/jdk.vm.ci.code.test/src/jdk/vm/ci/code/test/VirtualObjectDebugInfoTest.java 2024-02-20 10:42:38.068861944 +0800 +@@ -21,10 +21,16 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + /** + * @test + * @requires vm.jvmci +- * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" ++ * @requires vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" | vm.simpleArch == "loongarch64" + * @library / + * @modules jdk.internal.vm.ci/jdk.vm.ci.hotspot + * jdk.internal.vm.ci/jdk.vm.ci.meta +@@ -32,9 +38,10 @@ + * jdk.internal.vm.ci/jdk.vm.ci.code.site + * jdk.internal.vm.ci/jdk.vm.ci.runtime + * jdk.internal.vm.ci/jdk.vm.ci.aarch64 ++ * jdk.internal.vm.ci/jdk.vm.ci.loongarch64 + * jdk.internal.vm.ci/jdk.vm.ci.amd64 + * jdk.internal.vm.ci/jdk.vm.ci.riscv64 +- * @compile CodeInstallationTest.java DebugInfoTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java ++ * @compile CodeInstallationTest.java DebugInfoTest.java TestAssembler.java TestHotSpotVMConfig.java amd64/AMD64TestAssembler.java aarch64/AArch64TestAssembler.java riscv64/RISCV64TestAssembler.java loongarch64/LoongArch64TestAssembler.java + * @run junit/othervm -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:-UseJVMCICompiler jdk.vm.ci.code.test.VirtualObjectDebugInfoTest + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/lib/ir_framework/IRNode.java b/test/hotspot/jtreg/compiler/lib/ir_framework/IRNode.java +--- a/test/hotspot/jtreg/compiler/lib/ir_framework/IRNode.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/lib/ir_framework/IRNode.java 2024-02-20 10:42:38.075528605 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.lib.ir_framework; + + import compiler.lib.ir_framework.driver.irmatching.mapping.*; +@@ -277,13 +283,13 @@ + + public static final String CHECKCAST_ARRAY = PREFIX + "CHECKCAST_ARRAY" + POSTFIX; + static { +- String regex = "(((?i:cmp|CLFI|CLR).*precise \\[.*:|.*(?i:mov|mv|or).*precise \\[.*:.*\\R.*(cmp|CMP|CLR))" + END; ++ String regex = "(((?i:cmp|CLFI|CLR).*precise \\[.*:|.*(?i:mov|mv|or|li).*precise \\[.*:.*\\R.*(cmp|CMP|CLR))" + END; + optoOnly(CHECKCAST_ARRAY, regex); + } + + public static final String CHECKCAST_ARRAY_OF = COMPOSITE_PREFIX + "CHECKCAST_ARRAY_OF" + POSTFIX; + static { +- String regex = "(((?i:cmp|CLFI|CLR).*precise \\[.*" + IS_REPLACED + ":|.*(?i:mov|mv|or).*precise \\[.*" + IS_REPLACED + ":.*\\R.*(cmp|CMP|CLR))" + END; ++ String regex = "(((?i:cmp|CLFI|CLR).*precise \\[.*" + IS_REPLACED + ":|.*(?i:mov|mv|or|li).*precise \\[.*" + IS_REPLACED + ":.*\\R.*(cmp|CMP|CLR))" + END; + optoOnly(CHECKCAST_ARRAY_OF, regex); + } + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java b/test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java +--- a/test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/loopopts/superword/ReductionPerf.java 2024-02-20 10:42:38.088861927 +0800 +@@ -22,11 +22,17 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ ++/* + * @test + * @bug 8074981 8302652 + * @summary Test SuperWord Reduction Perf. + * @requires vm.compiler2.enabled +- * @requires vm.simpleArch == "x86" | vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" ++ * @requires vm.simpleArch == "x86" | vm.simpleArch == "x64" | vm.simpleArch == "aarch64" | vm.simpleArch == "riscv64" | vm.simpleArch == "loongarch64" + * @library /test/lib / + * @run main/othervm -Xbatch -XX:LoopUnrollLimit=250 + * -XX:CompileCommand=exclude,compiler.loopopts.superword.ReductionPerf::main +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/rangechecks/TestRangeCheckHoistingScaledIV.java b/test/hotspot/jtreg/compiler/rangechecks/TestRangeCheckHoistingScaledIV.java +--- a/test/hotspot/jtreg/compiler/rangechecks/TestRangeCheckHoistingScaledIV.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/rangechecks/TestRangeCheckHoistingScaledIV.java 2024-02-20 10:42:38.095528589 +0800 +@@ -22,11 +22,17 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @bug 8289996 + * @summary Test range check hoisting for some scaled iv at array index + * @library /test/lib / +- * @requires vm.debug & vm.compiler2.enabled & (os.simpleArch == "x64" | os.arch == "aarch64") ++ * @requires vm.debug & vm.compiler2.enabled & (os.simpleArch == "x64" | os.arch == "aarch64" | os.arch == "loongarch64") + * @modules jdk.incubator.vector + * @compile --enable-preview -source ${jdk.version} TestRangeCheckHoistingScaledIV.java + * @run main/othervm --enable-preview compiler.rangechecks.TestRangeCheckHoistingScaledIV +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/runtime/TestConstantsInError.java b/test/hotspot/jtreg/compiler/runtime/TestConstantsInError.java +--- a/test/hotspot/jtreg/compiler/runtime/TestConstantsInError.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/runtime/TestConstantsInError.java 2024-02-20 10:42:38.098861920 +0800 +@@ -22,6 +22,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2022. These ++ * modifications are Copyright (c) 2022 Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ ++/* + * @test + * @bug 8279822 + * @requires vm.flagless +@@ -130,7 +136,7 @@ + results.shouldMatch("Test_C1/.*::test \\(3 bytes\\)$") + .shouldMatch("Test_C2/.*::test \\(3 bytes\\)$"); + +- if (isC1 && (Platform.isAArch64() || Platform.isRISCV64())) { // no code patching ++ if (isC1 && (Platform.isAArch64() || Platform.isRISCV64() || Platform.isLoongArch64())) { // no code patching + results.shouldMatch("Test_C1/.*::test \\(3 bytes\\) made not entrant") + .shouldMatch("Test_C2/.*::test \\(3 bytes\\) made not entrant"); + } else { +@@ -168,7 +174,7 @@ + .shouldMatch("Test_MH3/.*::test \\(3 bytes\\)$") + .shouldMatch("Test_MH4/.*::test \\(3 bytes\\)$"); + +- if (isC1 && (Platform.isAArch64() || Platform.isRISCV64())) { // no code patching ++ if (isC1 && (Platform.isAArch64() || Platform.isRISCV64() || Platform.isLoongArch64())) { // no code patching + results.shouldMatch("Test_MH1/.*::test \\(3 bytes\\) made not entrant") + .shouldMatch("Test_MH2/.*::test \\(3 bytes\\) made not entrant") + .shouldMatch("Test_MH3/.*::test \\(3 bytes\\) made not entrant") +@@ -191,7 +197,7 @@ + results.shouldMatch("Test_MT1/.*::test \\(3 bytes\\)$") + .shouldMatch("Test_MT2/.*::test \\(3 bytes\\)$"); + +- if (isC1 && (Platform.isAArch64() || Platform.isRISCV64())) { // no code patching ++ if (isC1 && (Platform.isAArch64() || Platform.isRISCV64() || Platform.isLoongArch64())) { // no code patching + results.shouldMatch("Test_MT1/.*::test \\(3 bytes\\) made not entrant") + .shouldMatch("Test_MT2/.*::test \\(3 bytes\\) made not entrant"); + } else { +@@ -235,7 +241,7 @@ + .shouldMatch("Test_CD3.*::test \\(3 bytes\\)$") + .shouldMatch("Test_CD4.*::test \\(3 bytes\\)$"); + +- if (isC1 && (Platform.isAArch64() || Platform.isRISCV64())) { // no code patching ++ if (isC1 && (Platform.isAArch64() || Platform.isRISCV64() || Platform.isLoongArch64())) { // no code patching + results.shouldMatch("Test_CD1.*::test \\(3 bytes\\) made not entrant") + .shouldMatch("Test_CD2.*::test \\(3 bytes\\) made not entrant") + .shouldMatch("Test_CD3.*::test \\(3 bytes\\) made not entrant") +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/sharedstubs/SharedStubToInterpTest.java b/test/hotspot/jtreg/compiler/sharedstubs/SharedStubToInterpTest.java +--- a/test/hotspot/jtreg/compiler/sharedstubs/SharedStubToInterpTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/sharedstubs/SharedStubToInterpTest.java 2024-02-20 10:42:38.098861920 +0800 +@@ -22,13 +22,19 @@ + * + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test SharedStubToInterpTest + * @summary Checks that stubs to the interpreter can be shared for static or final method. 
+ * @bug 8280481 + * @library /test/lib + * +- * @requires os.arch=="amd64" | os.arch=="x86_64" | os.arch=="i386" | os.arch=="x86" | os.arch=="aarch64" | os.arch=="riscv64" ++ * @requires os.arch=="amd64" | os.arch=="x86_64" | os.arch=="i386" | os.arch=="x86" | os.arch=="aarch64" | os.arch=="riscv64" | os.arch=="loongarch64" + * @requires vm.debug + * + * @run driver compiler.sharedstubs.SharedStubToInterpTest +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/testlibrary/sha/predicate/IntrinsicPredicates.java b/test/hotspot/jtreg/compiler/testlibrary/sha/predicate/IntrinsicPredicates.java +--- a/test/hotspot/jtreg/compiler/testlibrary/sha/predicate/IntrinsicPredicates.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/testlibrary/sha/predicate/IntrinsicPredicates.java 2024-02-20 10:42:38.102195252 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2021, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.testlibrary.sha.predicate; + + import jdk.test.lib.Platform; +@@ -61,19 +67,22 @@ + + public static final BooleanSupplier MD5_INSTRUCTION_AVAILABLE + = new OrPredicate(new CPUSpecificPredicate("aarch64.*", null, null), ++ new OrPredicate(new CPUSpecificPredicate("loongarch64.*", null, null), + // x86 variants + new OrPredicate(new CPUSpecificPredicate("amd64.*", null, null), + new OrPredicate(new CPUSpecificPredicate("i386.*", null, null), +- new CPUSpecificPredicate("x86.*", null, null)))); ++ new CPUSpecificPredicate("x86.*", null, null))))); + + public static final BooleanSupplier SHA1_INSTRUCTION_AVAILABLE + = new OrPredicate(new CPUSpecificPredicate("aarch64.*", new String[] { "sha1" }, null), + new OrPredicate(new CPUSpecificPredicate("riscv64.*", new String[] { "sha1" }, null), + new OrPredicate(new CPUSpecificPredicate("s390.*", new String[] { "sha1" }, null), ++ // Basic instructions are used to implement SHA1 Intrinsics on LA, so "sha1" feature is not needed. ++ new OrPredicate(new CPUSpecificPredicate("loongarch64.*", null, null), + // x86 variants + new OrPredicate(new CPUSpecificPredicate("amd64.*", new String[] { "sha" }, null), + new OrPredicate(new CPUSpecificPredicate("i386.*", new String[] { "sha" }, null), +- new CPUSpecificPredicate("x86.*", new String[] { "sha" }, null)))))); ++ new CPUSpecificPredicate("x86.*", new String[] { "sha" }, null))))))); + + public static final BooleanSupplier SHA256_INSTRUCTION_AVAILABLE + = new OrPredicate(new CPUSpecificPredicate("aarch64.*", new String[] { "sha256" }, null), +@@ -81,12 +90,14 @@ + new OrPredicate(new CPUSpecificPredicate("s390.*", new String[] { "sha256" }, null), + new OrPredicate(new CPUSpecificPredicate("ppc64.*", new String[] { "sha" }, null), + new OrPredicate(new CPUSpecificPredicate("ppc64le.*", new String[] { "sha" }, null), ++ // Basic instructions are used to implement SHA256 Intrinsics on LA, so "sha256" feature is not needed. 
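An aside on the IntrinsicPredicates changes above: the nested OrPredicate constructors are hard to follow in diff form. As a rough sketch only, assuming OrPredicate(a, b) simply ORs its two BooleanSuppliers and substituting a plain os.arch check for the real CPUSpecificPredicate, the patched MD5 predicate amounts to one extra loongarch64 alternative (the class and helper names below are illustrative, not part of the patch):

    import java.util.function.BooleanSupplier;

    public class Md5PredicateSketch {
        // Hypothetical stand-in for CPUSpecificPredicate(archPattern, features, vmFlags):
        // it matches only on os.arch and ignores CPU features and VM flags.
        static BooleanSupplier arch(String pattern) {
            return () -> System.getProperty("os.arch", "").matches(pattern);
        }

        public static void main(String[] args) {
            BooleanSupplier md5Available = () ->
                   arch("aarch64.*").getAsBoolean()
                || arch("loongarch64.*").getAsBoolean()   // alternative added by this patch
                || arch("amd64.*").getAsBoolean()
                || arch("i386.*").getAsBoolean()
                || arch("x86.*").getAsBoolean();
            System.out.println("MD5 intrinsic assumed available: " + md5Available.getAsBoolean());
        }
    }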
++ new OrPredicate(new CPUSpecificPredicate("loongarch64.*", null, null), + // x86 variants + new OrPredicate(new CPUSpecificPredicate("amd64.*", new String[] { "sha" }, null), + new OrPredicate(new CPUSpecificPredicate("i386.*", new String[] { "sha" }, null), + new OrPredicate(new CPUSpecificPredicate("x86.*", new String[] { "sha" }, null), + new OrPredicate(new CPUSpecificPredicate("amd64.*", new String[] { "avx2", "bmi2" }, null), +- new CPUSpecificPredicate("x86_64", new String[] { "avx2", "bmi2" }, null)))))))))); ++ new CPUSpecificPredicate("x86_64", new String[] { "avx2", "bmi2" }, null))))))))))); + + public static final BooleanSupplier SHA512_INSTRUCTION_AVAILABLE + = new OrPredicate(new CPUSpecificPredicate("aarch64.*", new String[] { "sha512" }, null), +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorapi/TestVectorTest.java b/test/hotspot/jtreg/compiler/vectorapi/TestVectorTest.java +--- a/test/hotspot/jtreg/compiler/vectorapi/TestVectorTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorapi/TestVectorTest.java 2024-02-20 10:42:38.108861913 +0800 +@@ -20,6 +20,13 @@ + * or visit www.oracle.com if you need additional information or have any + * questions. + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.vectorapi; + + import compiler.lib.ir_framework.*; +@@ -34,7 +41,7 @@ + * @modules jdk.incubator.vector + * @library /test/lib / + * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*sse4.*" & (vm.opt.UseSSE == "null" | vm.opt.UseSSE > 3)) +- * | os.arch == "aarch64" ++ * | os.arch == "aarch64" | os.arch == "loongarch64" + * @run driver compiler.vectorapi.TestVectorTest + */ + public class TestVectorTest { +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorapi/VectorLogicalOpIdentityTest.java b/test/hotspot/jtreg/compiler/vectorapi/VectorLogicalOpIdentityTest.java +--- a/test/hotspot/jtreg/compiler/vectorapi/VectorLogicalOpIdentityTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorapi/VectorLogicalOpIdentityTest.java 2024-02-20 10:42:38.108861913 +0800 +@@ -22,6 +22,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + package compiler.vectorapi; + + import compiler.lib.ir_framework.*; +@@ -45,7 +51,7 @@ + * @key randomness + * @library /test/lib / + * @summary Add identity transformations for vector logic operations +- * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx.*") | os.arch=="aarch64" ++ * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx.*") | os.arch=="aarch64" | os.arch=="loongarch64" + * @modules jdk.incubator.vector + * + * @run driver compiler.vectorapi.VectorLogicalOpIdentityTest +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorapi/VectorReverseBytesTest.java b/test/hotspot/jtreg/compiler/vectorapi/VectorReverseBytesTest.java +--- a/test/hotspot/jtreg/compiler/vectorapi/VectorReverseBytesTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorapi/VectorReverseBytesTest.java 2024-02-20 10:42:38.108861913 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.vectorapi; + + import compiler.lib.ir_framework.*; +@@ -42,7 +48,7 @@ + * @library /test/lib / + * @summary [vectorapi] REVERSE_BYTES for byte type should not emit any instructions + * @requires vm.compiler2.enabled +- * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx2.*") | os.arch == "aarch64" ++ * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx2.*") | os.arch == "aarch64" | os.arch == "loongarch64" + * @modules jdk.incubator.vector + * + * @run driver compiler.vectorapi.VectorReverseBytesTest +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/runner/ArrayIndexFillTest.java b/test/hotspot/jtreg/compiler/vectorization/runner/ArrayIndexFillTest.java +--- a/test/hotspot/jtreg/compiler/vectorization/runner/ArrayIndexFillTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/runner/ArrayIndexFillTest.java 2024-02-20 10:42:38.112195242 +0800 +@@ -22,6 +22,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ ++/* + * @test + * @summary Vectorization test on array index fill + * @library /test/lib / +@@ -35,7 +41,7 @@ + * -XX:+WhiteBoxAPI + * compiler.vectorization.runner.ArrayIndexFillTest + * +- * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") ++ * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") | (os.simpleArch == "loongarch64") + * @requires vm.compiler2.enabled & vm.flagless + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/runner/ArrayInvariantFillTest.java b/test/hotspot/jtreg/compiler/vectorization/runner/ArrayInvariantFillTest.java +--- a/test/hotspot/jtreg/compiler/vectorization/runner/ArrayInvariantFillTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/runner/ArrayInvariantFillTest.java 2024-02-20 10:42:38.112195242 +0800 +@@ -22,6 +22,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @summary Vectorization test on array invariant fill + * @library /test/lib / +@@ -36,7 +42,7 @@ + * -XX:-OptimizeFill + * compiler.vectorization.runner.ArrayInvariantFillTest + * +- * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") ++ * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") | (os.simpleArch == "loongarch64") + * @requires vm.compiler2.enabled & vm.flagless + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/runner/ArrayShiftOpTest.java b/test/hotspot/jtreg/compiler/vectorization/runner/ArrayShiftOpTest.java +--- a/test/hotspot/jtreg/compiler/vectorization/runner/ArrayShiftOpTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/runner/ArrayShiftOpTest.java 2024-02-20 10:42:38.112195242 +0800 +@@ -22,6 +22,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @summary Vectorization test on bug-prone shift operation + * @library /test/lib / +@@ -35,7 +41,7 @@ + * -XX:+WhiteBoxAPI + * compiler.vectorization.runner.ArrayShiftOpTest + * +- * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") ++ * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") | (os.simpleArch == "loongarch64") + * @requires vm.compiler2.enabled & vm.flagless + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/runner/BasicDoubleOpTest.java b/test/hotspot/jtreg/compiler/vectorization/runner/BasicDoubleOpTest.java +--- a/test/hotspot/jtreg/compiler/vectorization/runner/BasicDoubleOpTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/runner/BasicDoubleOpTest.java 2024-02-20 10:42:38.112195242 +0800 +@@ -22,6 +22,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ ++/* + * @test + * @summary Vectorization test on basic double operations + * @library /test/lib / +@@ -35,7 +41,7 @@ + * -XX:+WhiteBoxAPI + * compiler.vectorization.runner.BasicDoubleOpTest + * +- * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") ++ * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") | (os.simpleArch == "loongarch64") + * @requires vm.compiler2.enabled & vm.flagless + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/runner/BasicFloatOpTest.java b/test/hotspot/jtreg/compiler/vectorization/runner/BasicFloatOpTest.java +--- a/test/hotspot/jtreg/compiler/vectorization/runner/BasicFloatOpTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/runner/BasicFloatOpTest.java 2024-02-20 10:42:38.112195242 +0800 +@@ -22,6 +22,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @summary Vectorization test on basic float operations + * @library /test/lib / +@@ -35,7 +41,7 @@ + * -XX:+WhiteBoxAPI + * compiler.vectorization.runner.BasicFloatOpTest + * +- * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") ++ * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") | (os.simpleArch == "loongarch64") + * @requires vm.compiler2.enabled & vm.flagless + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/runner/BasicLongOpTest.java b/test/hotspot/jtreg/compiler/vectorization/runner/BasicLongOpTest.java +--- a/test/hotspot/jtreg/compiler/vectorization/runner/BasicLongOpTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/runner/BasicLongOpTest.java 2024-02-20 10:42:38.112195242 +0800 +@@ -23,6 +23,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @summary Vectorization test on basic long operations + * @library /test/lib / +@@ -36,7 +42,7 @@ + * -XX:+WhiteBoxAPI + * compiler.vectorization.runner.BasicLongOpTest + * +- * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") ++ * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") | (os.simpleArch == "loongarch64") + * @requires vm.compiler2.enabled & vm.flagless + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/runner/LoopArrayIndexComputeTest.java b/test/hotspot/jtreg/compiler/vectorization/runner/LoopArrayIndexComputeTest.java +--- a/test/hotspot/jtreg/compiler/vectorization/runner/LoopArrayIndexComputeTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/runner/LoopArrayIndexComputeTest.java 2024-02-20 10:42:38.112195242 +0800 +@@ -23,6 +23,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ ++/* + * @test + * @summary Vectorization test on loop array index computation + * @library /test/lib / +@@ -36,7 +42,7 @@ + * -XX:+WhiteBoxAPI + * compiler.vectorization.runner.LoopArrayIndexComputeTest + * +- * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") ++ * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") | (os.simpleArch == "loongarch64") + * @requires vm.compiler2.enabled & vm.flagless + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/runner/LoopReductionOpTest.java b/test/hotspot/jtreg/compiler/vectorization/runner/LoopReductionOpTest.java +--- a/test/hotspot/jtreg/compiler/vectorization/runner/LoopReductionOpTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/runner/LoopReductionOpTest.java 2024-02-20 10:42:38.112195242 +0800 +@@ -22,6 +22,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @summary Vectorization test on reduction operations + * @library /test/lib / +@@ -35,7 +41,7 @@ + * -XX:+WhiteBoxAPI + * compiler.vectorization.runner.LoopReductionOpTest + * +- * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") ++ * @requires (os.simpleArch == "x64") | (os.simpleArch == "aarch64") | (os.simpleArch == "loongarch64") + * @requires vm.compiler2.enabled & vm.flagless + * + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/TestAutoVecIntMinMax.java b/test/hotspot/jtreg/compiler/vectorization/TestAutoVecIntMinMax.java +--- a/test/hotspot/jtreg/compiler/vectorization/TestAutoVecIntMinMax.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/TestAutoVecIntMinMax.java 2024-02-20 10:42:38.112195242 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package compiler.c2.irTests; + + import compiler.lib.ir_framework.*; +@@ -34,7 +40,7 @@ + * @library /test/lib / + * @requires vm.compiler2.enabled + * @requires (os.simpleArch == "x64" & (vm.opt.UseSSE == "null" | vm.opt.UseSSE > 3)) +- * | os.arch == "aarch64" | (os.arch == "riscv64" & vm.opt.UseRVV == true) ++ * | os.arch == "aarch64" | (os.arch == "riscv64" & vm.opt.UseRVV == true) | os.arch=="loongarch64" + * @run driver compiler.c2.irTests.TestAutoVecIntMinMax + */ + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/TestBufferVectorization.java b/test/hotspot/jtreg/compiler/vectorization/TestBufferVectorization.java +--- a/test/hotspot/jtreg/compiler/vectorization/TestBufferVectorization.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/TestBufferVectorization.java 2024-02-20 10:42:38.112195242 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + /** + * @test + * @bug 8257531 +@@ -29,7 +35,7 @@ + * + * @requires vm.flagless + * @requires vm.compiler2.enabled & vm.debug == true +- * @requires os.arch=="x86" | os.arch=="i386" | os.arch=="amd64" | os.arch=="x86_64" | os.arch=="aarch64" ++ * @requires os.arch=="x86" | os.arch=="i386" | os.arch=="amd64" | os.arch=="x86_64" | os.arch=="aarch64" | os.arch=="loongarch64" + * + * @run driver compiler.vectorization.TestBufferVectorization array + * @run driver compiler.vectorization.TestBufferVectorization arrayOffset +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java b/test/hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java +--- a/test/hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/TestFloatConversionsVector.java 2024-02-20 10:42:38.112195242 +0800 +@@ -21,12 +21,18 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @bug 8294588 + * @summary Auto-vectorize Float.floatToFloat16, Float.float16ToFloat APIs + * @requires vm.compiler2.enabled +- * @requires (os.simpleArch == "x64" & (vm.cpu.features ~= ".*avx512f.*" | vm.cpu.features ~= ".*f16c.*")) | os.arch == "aarch64" ++ * @requires (os.simpleArch == "x64" & (vm.cpu.features ~= ".*avx512f.*" | vm.cpu.features ~= ".*f16c.*")) | os.arch == "aarch64" | os.arch == "loongarch64" + * @library /test/lib / + * @run driver compiler.vectorization.TestFloatConversionsVector + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/TestNumberOfContinuousZeros.java b/test/hotspot/jtreg/compiler/vectorization/TestNumberOfContinuousZeros.java +--- a/test/hotspot/jtreg/compiler/vectorization/TestNumberOfContinuousZeros.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/TestNumberOfContinuousZeros.java 2024-02-20 10:42:38.112195242 +0800 +@@ -21,13 +21,20 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + /** + * @test + * @key randomness + * @summary Test vectorization of numberOfTrailingZeros/numberOfLeadingZeros for Long + * @requires vm.compiler2.enabled + * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx2.*") | +-* (os.simpleArch == "aarch64" & vm.cpu.features ~= ".*sve.*" & (vm.opt.UseSVE == "null" | vm.opt.UseSVE > 0)) ++* (os.simpleArch == "aarch64" & vm.cpu.features ~= ".*sve.*" & (vm.opt.UseSVE == "null" | vm.opt.UseSVE > 0)) | ++* (os.simpleArch == "loongarch64") + * @library /test/lib / + * @run driver compiler.vectorization.TestNumberOfContinuousZeros + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/TestPopulateIndex.java b/test/hotspot/jtreg/compiler/vectorization/TestPopulateIndex.java +--- a/test/hotspot/jtreg/compiler/vectorization/TestPopulateIndex.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/TestPopulateIndex.java 2024-02-20 10:42:38.112195242 +0800 +@@ -21,13 +21,20 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @bug 8286972 + * @summary Test vectorization of loop induction variable usage in the loop + * @requires vm.compiler2.enabled + * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx2.*") | +-* (os.simpleArch == "aarch64" & vm.cpu.features ~= ".*sve.*" & (vm.opt.UseSVE == "null" | vm.opt.UseSVE > 0)) ++* (os.simpleArch == "aarch64" & vm.cpu.features ~= ".*sve.*" & (vm.opt.UseSVE == "null" | vm.opt.UseSVE > 0)) | ++* (os.simpleArch == "loongarch64") + * @library /test/lib / + * @run driver compiler.vectorization.TestPopulateIndex + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/TestReverseBitsVector.java b/test/hotspot/jtreg/compiler/vectorization/TestReverseBitsVector.java +--- a/test/hotspot/jtreg/compiler/vectorization/TestReverseBitsVector.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/TestReverseBitsVector.java 2024-02-20 10:42:38.112195242 +0800 +@@ -20,12 +20,19 @@ + * or visit www.oracle.com if you need additional information or have any + * questions. + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @bug 8290034 + * @summary Auto-vectorization of Reverse bit operation. 
+ * @requires vm.compiler2.enabled +- * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx2.*") | os.arch == "aarch64" ++ * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx2.*") | os.arch == "aarch64" | os.arch=="loongarch64" + * @library /test/lib / + * @run driver compiler.vectorization.TestReverseBitsVector + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/TestReverseBytes.java b/test/hotspot/jtreg/compiler/vectorization/TestReverseBytes.java +--- a/test/hotspot/jtreg/compiler/vectorization/TestReverseBytes.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/TestReverseBytes.java 2024-02-20 10:42:38.112195242 +0800 +@@ -20,12 +20,19 @@ + * or visit www.oracle.com if you need additional information or have any + * questions. + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @bug 8288112 + * @summary Auto-vectorization of ReverseBytes operations. + * @requires vm.compiler2.enabled +- * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx2.*") | os.simpleArch == "AArch64" ++ * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx2.*") | os.simpleArch == "AArch64" | os.simpleArch == "loongarch64" + * @library /test/lib / + * @run driver compiler.vectorization.TestReverseBytes + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/compiler/vectorization/TestSignumVector.java b/test/hotspot/jtreg/compiler/vectorization/TestSignumVector.java +--- a/test/hotspot/jtreg/compiler/vectorization/TestSignumVector.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/compiler/vectorization/TestSignumVector.java 2024-02-20 10:42:38.112195242 +0800 +@@ -21,12 +21,18 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + /** + * @test + * @bug 8282711 8290249 + * @summary Accelerate Math.signum function for AVX, AVX512 and aarch64 (Neon and SVE) + * @requires vm.compiler2.enabled +- * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx.*") | os.arch == "aarch64" ++ * @requires (os.simpleArch == "x64" & vm.cpu.features ~= ".*avx.*") | os.arch == "aarch64" | os.arch == "loongarch64" + * @library /test/lib / + * @run driver compiler.vectorization.TestSignumVector + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/gc/shenandoah/compiler/TestLinkToNativeRBP.java b/test/hotspot/jtreg/gc/shenandoah/compiler/TestLinkToNativeRBP.java +--- a/test/hotspot/jtreg/gc/shenandoah/compiler/TestLinkToNativeRBP.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/gc/shenandoah/compiler/TestLinkToNativeRBP.java 2024-02-20 10:42:38.132195228 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + /** + * @test + * @enablePreview +@@ -28,7 +34,7 @@ + * @summary guarantee(loc != NULL) failed: missing saved register with native invoke + * + * @requires vm.flavor == "server" +- * @requires ((os.arch == "amd64" | os.arch == "x86_64") & sun.arch.data.model == "64") | os.arch == "aarch64" | os.arch == "ppc64le" ++ * @requires ((os.arch == "amd64" | os.arch == "x86_64") & sun.arch.data.model == "64") | os.arch == "aarch64" | os.arch == "ppc64le" | os.arch=="loongarch64" + * @requires vm.gc.Shenandoah + * + * @run main/othervm --enable-native-access=ALL-UNNAMED -XX:+UnlockDiagnosticVMOptions +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/25064/NUMAHelper.java b/test/hotspot/jtreg/loongson/25064/NUMAHelper.java +--- a/test/hotspot/jtreg/loongson/25064/NUMAHelper.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/25064/NUMAHelper.java 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,99 @@ ++/* ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++import java.io.File; ++import java.io.BufferedReader; ++import java.io.IOException; ++import java.io.InputStreamReader; ++import jdk.test.lib.process.OutputAnalyzer; ++import jdk.test.lib.process.ProcessTools; ++ ++public class NUMAHelper { ++ ++ private static final String INITIAL_HEAP_SIZE_PATTERN ++ = "\\bInitialHeapSize\\b.*?=.*?([0-9]+)"; ++ ++ private static final String MIN_HEAP_SIZE_PER_NODE_PATTERN ++ = "\\bNUMAMinHeapSizePerNode\\b.*?=.*?([0-9]+)"; ++ ++ private static final String NEW_SIZE_PATTERN ++ = "\\bNewSize\\b.*?=.*?([0-9]+)"; ++ ++ static long getInitialHeapSize(OutputAnalyzer output) { ++ String matched = output.firstMatch(INITIAL_HEAP_SIZE_PATTERN, 1); ++ return Long.parseLong(matched); ++ } ++ ++ static long getMinHeapSizePerNode(OutputAnalyzer output) { ++ String matched = output.firstMatch(MIN_HEAP_SIZE_PER_NODE_PATTERN, 1); ++ return Long.parseLong(matched); ++ } ++ ++ static long getNewSize(OutputAnalyzer output) { ++ String matched = output.firstMatch(NEW_SIZE_PATTERN, 1); ++ return Long.parseLong(matched); ++ } ++ ++ static OutputAnalyzer invokeJvm(String... 
args) throws Exception { ++ return new OutputAnalyzer(ProcessTools.createTestJvm(args).start()); ++ } ++ ++ static int getNUMANodes() throws Exception { ++ String command = "ls /sys/devices/system/node | grep '^node' | wc -l"; ++ ProcessBuilder processBuilder = new ProcessBuilder("sh", "-c", command); ++ ++ Process process = processBuilder.start(); ++ BufferedReader reader = new BufferedReader( ++ new InputStreamReader(process.getInputStream())); ++ String line = reader.readLine(); ++ process.destroy(); ++ ++ int nodes = Integer.parseInt(line); ++ System.out.println("Number of NUMA nodes: " + nodes); ++ return nodes; ++ } ++ ++ static void judge(OutputAnalyzer o, int nodes, ++ Long manualInitialHeapSize, ++ boolean manualNewSize) { ++ long initialHeapSize; ++ if (manualInitialHeapSize != null) { ++ initialHeapSize = (long) manualInitialHeapSize; ++ } else { // InitialHeapSize may be aligned up via GC ++ initialHeapSize = NUMAHelper.getInitialHeapSize(o); ++ } ++ long minHeapSizePerNode = NUMAHelper.getMinHeapSizePerNode(o); ++ long newSize = NUMAHelper.getNewSize(o); ++ ++ if (nodes <= 1) { // not supported numa or only one numa node ++ o.shouldMatch("bool UseNUMA[ ]+= false"); ++ } else if (initialHeapSize < minHeapSizePerNode * nodes) { ++ o.shouldMatch("bool UseNUMA[ ]+= false"); ++ } else if (manualNewSize && newSize < 1363144 * nodes) { ++ o.shouldMatch("bool UseNUMA[ ]+= false"); ++ } else { ++ o.shouldMatch("bool UseNUMA[ ]+= true"); ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/25064/TestUseNUMADefault.java b/test/hotspot/jtreg/loongson/25064/TestUseNUMADefault.java +--- a/test/hotspot/jtreg/loongson/25064/TestUseNUMADefault.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/25064/TestUseNUMADefault.java 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,152 @@ ++/* ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++/** ++ * @test TestUseNUMADefault ++ * @summary ++ * Tests that UseNUMA should be enabled by default for all collectors ++ * on machines with multiple NUMA nodes, unless NUMA init fails or ++ * the InitialHeapSize is too small. 
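To make the thresholds in NUMAHelper.judge() above concrete: UseNUMA is expected to stay enabled only when there is more than one node, the initial heap is at least NUMAMinHeapSizePerNode per node, and any hand-set NewSize is at least 1363144 bytes per node. A minimal standalone sketch of the same decision follows; the class name and sample sizes are illustrative, not taken from the patch:

    public class NumaJudgeSketch {
        // Mirrors the branch structure of NUMAHelper.judge(); all names here are illustrative.
        static boolean expectUseNuma(int nodes, long initialHeapSize, long minHeapSizePerNode,
                                     boolean manualNewSize, long newSize) {
            if (nodes <= 1)                                   return false; // NUMA absent or single node
            if (initialHeapSize < minHeapSizePerNode * nodes) return false; // heap too small to split per node
            if (manualNewSize && newSize < 1363144L * nodes)  return false; // hand-set NewSize too small
            return true;
        }

        public static void main(String[] args) {
            long m = 1024L * 1024L;
            // With 4 nodes and a 128m-per-node floor:
            System.out.println(expectUseNuma(4, 512 * m, 128 * m, false, 0)); // true  (128m * 4 == 512m)
            System.out.println(expectUseNuma(4, 508 * m, 128 * m, false, 0)); // false (mirrors the 127m-per-node case)
        }
    }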
++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.G1 ++ * @run main/othervm TestUseNUMADefault -XX:+UseG1GC ++ */ ++ ++/** ++ * @test TestUseNUMADefault ++ * @summary ++ * Tests that UseNUMA should be enabled by default for all collectors ++ * on machines with multiple NUMA nodes, unless NUMA init fails or ++ * the InitialHeapSize is too small. ++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.Parallel ++ * @run main/othervm TestUseNUMADefault -XX:+UseParallelGC ++ */ ++ ++/** ++ * @test TestUseNUMADefault ++ * @summary ++ * Tests that UseNUMA should be enabled by default for all collectors ++ * on machines with multiple NUMA nodes, unless NUMA init fails or ++ * the InitialHeapSize is too small. ++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.Z ++ * @run main/othervm TestUseNUMADefault -XX:+UseZGC ++ */ ++ ++/** ++ * @test TestUseNUMADefault ++ * @summary ++ * Tests that UseNUMA should be enabled by default for all collectors ++ * on machines with multiple NUMA nodes, unless NUMA init fails or ++ * the InitialHeapSize is too small. ++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.Shenandoah ++ * @run main/othervm TestUseNUMADefault -XX:+UseShenandoahGC ++ */ ++ ++public class TestUseNUMADefault { ++ ++ public static void main(String[] args) throws Exception { ++ String gcFlag = args[0]; ++ int nodes = NUMAHelper.getNUMANodes(); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NUMAMinHeapSizePerNode=127m", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NUMAMinHeapSizePerNode=128m", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NUMAMinHeapSizePerNode=129m", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NUMAMinHeapSizePerNode=128m", ++ "-XX:InitialHeapSize=" + 127 * nodes + "m", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, 133169152L * nodes, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NUMAMinHeapSizePerNode=128m", ++ "-XX:InitialHeapSize=" + 128 * nodes + "m", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, 134217728L * nodes, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NewSize=" + 1 * nodes + "m", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, true); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NewSize=" + 2 * nodes + "m", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, true); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NewSize=" + 2 * nodes + "m", ++ "-XX:NUMAMinHeapSizePerNode=128m", ++ "-XX:InitialHeapSize=" + 127 * nodes + "m", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, 133169152L * nodes, true); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NewSize=" + 2 * nodes + "m", ++ "-XX:NUMAMinHeapSizePerNode=128m", ++ "-XX:InitialHeapSize=" + 128 * nodes + "m", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, 134217728L * nodes, 
true); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/25064/TestUseNUMADisabled.java b/test/hotspot/jtreg/loongson/25064/TestUseNUMADisabled.java +--- a/test/hotspot/jtreg/loongson/25064/TestUseNUMADisabled.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/25064/TestUseNUMADisabled.java 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,94 @@ ++/* ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++/** ++ * @test TestUseNUMADisabled ++ * @summary ++ * If -XX:-UseNUMA is specified at startup, then UseNUMA should be ++ * disabled for all collectors on machines with any number of NUMA ++ * nodes ergonomically. ++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.G1 ++ * @run main/othervm TestUseNUMADisabled -XX:+UseG1GC ++ */ ++ ++/** ++ * @test TestUseNUMADisabled ++ * @summary ++ * If -XX:-UseNUMA is specified at startup, then UseNUMA should be ++ * disabled for all collectors on machines with any number of NUMA ++ * nodes ergonomically. ++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.Parallel ++ * @run main/othervm TestUseNUMADisabled -XX:+UseParallelGC ++ */ ++ ++/** ++ * @test TestUseNUMADisabled ++ * @summary ++ * If -XX:-UseNUMA is specified at startup, then UseNUMA should be ++ * disabled for all collectors on machines with any number of NUMA ++ * nodes ergonomically. ++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.Z ++ * @run main/othervm TestUseNUMADisabled -XX:+UseZGC ++ */ ++ ++/** ++ * @test TestUseNUMADisabled ++ * @summary ++ * If -XX:-UseNUMA is specified at startup, then UseNUMA should be ++ * disabled for all collectors on machines with any number of NUMA ++ * nodes ergonomically. 
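For orientation, the shouldMatch("bool UseNUMA[ ]+= false") assertions used throughout these NUMA tests are regular-expression checks against -XX:+PrintFlagsFinal output. A tiny sketch of the same match against a hand-written sample line (the exact spacing and trailing attributes of real PrintFlagsFinal output vary, so the sample below is only an approximation):

    import java.util.regex.Pattern;

    public class UseNumaFlagMatchSketch {
        public static void main(String[] args) {
            // Approximate shape of a -XX:+PrintFlagsFinal line; not copied from any real run.
            String sampleLine = "     bool UseNUMA                    = false          {product} {default}";
            boolean reportedDisabled =
                Pattern.compile("bool UseNUMA[ ]+= false").matcher(sampleLine).find();
            System.out.println("UseNUMA reported disabled: " + reportedDisabled);
        }
    }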
++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.Shenandoah ++ * @run main/othervm TestUseNUMADisabled -XX:+UseShenandoahGC ++ */ ++ ++import jdk.test.lib.process.OutputAnalyzer; ++ ++public class TestUseNUMADisabled { ++ public static void main(String[] args) throws Exception { ++ String gcFlag = args[0]; ++ OutputAnalyzer o = NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:-UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"); ++ ++ o.shouldMatch("bool UseNUMA[ ]+= false"); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/25064/TestUseNUMAEnabled.java b/test/hotspot/jtreg/loongson/25064/TestUseNUMAEnabled.java +--- a/test/hotspot/jtreg/loongson/25064/TestUseNUMAEnabled.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/25064/TestUseNUMAEnabled.java 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,165 @@ ++/* ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++/** ++ * @test TestUseNUMAEnabled ++ * @summary ++ * Handcrafted -XX:+UseNUMA will be set to false in the following cases: ++ * 1. not supported NUMA or only one node ++ * 2. InitialHeapSize is too small ++ * 3. manually specified NewSize is too small ++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.G1 ++ * @run main/othervm TestUseNUMAEnabled -XX:+UseG1GC ++ */ ++ ++/** ++ * @test TestUseNUMAEnabled ++ * @summary ++ * Handcrafted -XX:+UseNUMA will be set to false in the following cases: ++ * 1. not supported NUMA or only one node ++ * 2. InitialHeapSize is too small ++ * 3. manually specified NewSize is too small ++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.Parallel ++ * @run main/othervm TestUseNUMAEnabled -XX:+UseParallelGC ++ */ ++ ++/** ++ * @test TestUseNUMAEnabled ++ * @summary ++ * Handcrafted -XX:+UseNUMA will be set to false in the following cases: ++ * 1. not supported NUMA or only one node ++ * 2. InitialHeapSize is too small ++ * 3. 
manually specified NewSize is too small ++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.Z ++ * @run main/othervm TestUseNUMAEnabled -XX:+UseZGC ++ */ ++ ++/** ++ * @test TestUseNUMAEnabled ++ * @summary ++ * Handcrafted -XX:+UseNUMA will be set to false in the following cases: ++ * 1. not supported NUMA or only one node ++ * 2. InitialHeapSize is too small ++ * 3. manually specified NewSize is too small ++ * @library /test/lib ++ * @library / ++ * @requires os.family == "linux" ++ * @requires os.arch == "loongarch64" ++ * @requires vm.gc.Shenandoah ++ * @run main/othervm TestUseNUMAEnabled -XX:+UseShenandoahGC ++ */ ++ ++public class TestUseNUMAEnabled { ++ public static void main(String[] args) throws Exception { ++ String gcFlag = args[0]; ++ int nodes = NUMAHelper.getNUMANodes(); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:+UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NUMAMinHeapSizePerNode=64m", ++ "-XX:+UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NUMAMinHeapSizePerNode=128m", ++ "-XX:+UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NUMAMinHeapSizePerNode=256m", ++ "-XX:+UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NUMAMinHeapSizePerNode=128m", ++ "-XX:InitialHeapSize=" + 127 * nodes + "m", ++ "-XX:+UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, 133169152L * nodes, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NUMAMinHeapSizePerNode=128m", ++ "-XX:InitialHeapSize=" + 128 * nodes + "m", ++ "-XX:+UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, 134217728L * nodes, false); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NewSize=" + 1 * nodes + "m", ++ "-XX:+UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, true); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NewSize=" + 2 * nodes + "m", ++ "-XX:+UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, null, true); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NewSize=" + 2 * nodes + "m", ++ "-XX:NUMAMinHeapSizePerNode=128m", ++ "-XX:InitialHeapSize=" + 127 * nodes + "m", ++ "-XX:+UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, 133169152L * nodes, true); ++ ++ NUMAHelper.judge(NUMAHelper.invokeJvm( ++ gcFlag, ++ "-XX:NewSize=" + 2 * nodes + "m", ++ "-XX:NUMAMinHeapSizePerNode=128m", ++ "-XX:InitialHeapSize=" + 128 * nodes + "m", ++ "-XX:+UseNUMA", ++ "-XX:+PrintFlagsFinal", ++ "-version"), nodes, 134217728L * nodes, true); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/25443/Test25443.java b/test/hotspot/jtreg/loongson/25443/Test25443.java +--- a/test/hotspot/jtreg/loongson/25443/Test25443.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/25443/Test25443.java 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,58 @@ ++/* ++ * Copyright (c) 2015, 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++/** ++ * @test ++ * @summary test c2 or2s ++ * ++ * @run main/othervm -Xcomp -XX:-TieredCompilation Test25443 ++ */ ++public class Test25443 { ++ static short test_ori2s(int v1) { ++ short t = (short)(v1 | 0x14); ++ return t; ++ } ++ ++ static short test_or2s(int v1, int v2) { ++ short t = (short)(v1 | v2); ++ return t; ++ } ++ ++ static short ret; ++ public static void main(String[] args) { ++ for (int i = 0; i < 12000; i++) { //warmup ++ test_ori2s(0x333300); ++ test_or2s(0x333300, 0x14); ++ } ++ ++ if ( (test_ori2s(0x333300) == 0x3314) ++ && (test_or2s(0x333300, 0x14) == 0x3314) ++ && (test_or2s(0x333300, 0x1000) == 0x3300) ++ && (test_or2s(0x333300, 0x8000) == 0xffffb300)) { ++ System.out.println("TEST PASSED"); ++ } else { ++ throw new AssertionError("Not be expected results"); ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/26733/Test26733.java b/test/hotspot/jtreg/loongson/26733/Test26733.java +--- a/test/hotspot/jtreg/loongson/26733/Test26733.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/26733/Test26733.java 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,570 @@ ++/* ++ * Copyright (c) 2022, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
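The expected constants in Test25443 above follow from ordinary Java narrowing and widening rules, independent of C2: OR-ing 0x333300 with 0x8000 gives 0x33b300, the cast to short keeps the low 16 bits 0xb300, and comparing that short against an int literal sign-extends it to 0xffffb300. A small standalone sketch of the same arithmetic (class name is illustrative):

    public class Or2sArithmeticSketch {
        public static void main(String[] args) {
            int or = 0x333300 | 0x8000;            // 0x0033b300
            short s = (short) or;                  // low 16 bits: 0xb300, i.e. -19712 as a short
            System.out.printf("or = 0x%x, short = %d, widened = 0x%x%n", or, s, s & 0xffffffffL);
            System.out.println(s == 0xffffb300);   // true: the short sign-extends back to 0xffffb300
        }
    }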
++ * ++ */ ++ ++/** ++ * @test ++ * @summary test c2 load float immediate by vldi ++ * ++ * @run main/othervm -Xcomp -XX:-TieredCompilation Test26733 ++ */ ++ ++public class Test26733 { ++ private boolean floatVerify(float fval, int rval) { ++ return Float.floatToRawIntBits(fval) == rval; ++ } ++ ++ private boolean doubleVerify(double fval, long rval) { ++ return Double.doubleToRawLongBits(fval) == rval; ++ } ++ ++ private boolean doTest() { ++ boolean res = true; ++ ++ res &= floatVerify(2.0000000000f, 0x40000000); ++ res &= floatVerify(2.1250000000f, 0x40080000); ++ res &= floatVerify(2.2500000000f, 0x40100000); ++ res &= floatVerify(2.3750000000f, 0x40180000); ++ res &= floatVerify(2.5000000000f, 0x40200000); ++ res &= floatVerify(2.6250000000f, 0x40280000); ++ res &= floatVerify(2.7500000000f, 0x40300000); ++ res &= floatVerify(2.8750000000f, 0x40380000); ++ res &= floatVerify(3.0000000000f, 0x40400000); ++ res &= floatVerify(3.1250000000f, 0x40480000); ++ res &= floatVerify(3.2500000000f, 0x40500000); ++ res &= floatVerify(3.3750000000f, 0x40580000); ++ res &= floatVerify(3.5000000000f, 0x40600000); ++ res &= floatVerify(3.6250000000f, 0x40680000); ++ res &= floatVerify(3.7500000000f, 0x40700000); ++ res &= floatVerify(3.8750000000f, 0x40780000); ++ res &= floatVerify(4.0000000000f, 0x40800000); ++ res &= floatVerify(4.2500000000f, 0x40880000); ++ res &= floatVerify(4.5000000000f, 0x40900000); ++ res &= floatVerify(4.7500000000f, 0x40980000); ++ res &= floatVerify(5.0000000000f, 0x40a00000); ++ res &= floatVerify(5.2500000000f, 0x40a80000); ++ res &= floatVerify(5.5000000000f, 0x40b00000); ++ res &= floatVerify(5.7500000000f, 0x40b80000); ++ res &= floatVerify(6.0000000000f, 0x40c00000); ++ res &= floatVerify(6.2500000000f, 0x40c80000); ++ res &= floatVerify(6.5000000000f, 0x40d00000); ++ res &= floatVerify(6.7500000000f, 0x40d80000); ++ res &= floatVerify(7.0000000000f, 0x40e00000); ++ res &= floatVerify(7.2500000000f, 0x40e80000); ++ res &= floatVerify(7.5000000000f, 0x40f00000); ++ res &= floatVerify(7.7500000000f, 0x40f80000); ++ res &= floatVerify(8.0000000000f, 0x41000000); ++ res &= floatVerify(8.5000000000f, 0x41080000); ++ res &= floatVerify(9.0000000000f, 0x41100000); ++ res &= floatVerify(9.5000000000f, 0x41180000); ++ res &= floatVerify(10.0000000000f, 0x41200000); ++ res &= floatVerify(10.5000000000f, 0x41280000); ++ res &= floatVerify(11.0000000000f, 0x41300000); ++ res &= floatVerify(11.5000000000f, 0x41380000); ++ res &= floatVerify(12.0000000000f, 0x41400000); ++ res &= floatVerify(12.5000000000f, 0x41480000); ++ res &= floatVerify(13.0000000000f, 0x41500000); ++ res &= floatVerify(13.5000000000f, 0x41580000); ++ res &= floatVerify(14.0000000000f, 0x41600000); ++ res &= floatVerify(14.5000000000f, 0x41680000); ++ res &= floatVerify(15.0000000000f, 0x41700000); ++ res &= floatVerify(15.5000000000f, 0x41780000); ++ res &= floatVerify(16.0000000000f, 0x41800000); ++ res &= floatVerify(17.0000000000f, 0x41880000); ++ res &= floatVerify(18.0000000000f, 0x41900000); ++ res &= floatVerify(19.0000000000f, 0x41980000); ++ res &= floatVerify(20.0000000000f, 0x41a00000); ++ res &= floatVerify(21.0000000000f, 0x41a80000); ++ res &= floatVerify(22.0000000000f, 0x41b00000); ++ res &= floatVerify(23.0000000000f, 0x41b80000); ++ res &= floatVerify(24.0000000000f, 0x41c00000); ++ res &= floatVerify(25.0000000000f, 0x41c80000); ++ res &= floatVerify(26.0000000000f, 0x41d00000); ++ res &= floatVerify(27.0000000000f, 0x41d80000); ++ res &= floatVerify(28.0000000000f, 0x41e00000); ++ res 
&= floatVerify(29.0000000000f, 0x41e80000); ++ res &= floatVerify(30.0000000000f, 0x41f00000); ++ res &= floatVerify(31.0000000000f, 0x41f80000); ++ res &= floatVerify(0.1250000000f, 0x3e000000); ++ res &= floatVerify(0.1328125000f, 0x3e080000); ++ res &= floatVerify(0.1406250000f, 0x3e100000); ++ res &= floatVerify(0.1484375000f, 0x3e180000); ++ res &= floatVerify(0.1562500000f, 0x3e200000); ++ res &= floatVerify(0.1640625000f, 0x3e280000); ++ res &= floatVerify(0.1718750000f, 0x3e300000); ++ res &= floatVerify(0.1796875000f, 0x3e380000); ++ res &= floatVerify(0.1875000000f, 0x3e400000); ++ res &= floatVerify(0.1953125000f, 0x3e480000); ++ res &= floatVerify(0.2031250000f, 0x3e500000); ++ res &= floatVerify(0.2109375000f, 0x3e580000); ++ res &= floatVerify(0.2187500000f, 0x3e600000); ++ res &= floatVerify(0.2265625000f, 0x3e680000); ++ res &= floatVerify(0.2343750000f, 0x3e700000); ++ res &= floatVerify(0.2421875000f, 0x3e780000); ++ res &= floatVerify(0.2500000000f, 0x3e800000); ++ res &= floatVerify(0.2656250000f, 0x3e880000); ++ res &= floatVerify(0.2812500000f, 0x3e900000); ++ res &= floatVerify(0.2968750000f, 0x3e980000); ++ res &= floatVerify(0.3125000000f, 0x3ea00000); ++ res &= floatVerify(0.3281250000f, 0x3ea80000); ++ res &= floatVerify(0.3437500000f, 0x3eb00000); ++ res &= floatVerify(0.3593750000f, 0x3eb80000); ++ res &= floatVerify(0.3750000000f, 0x3ec00000); ++ res &= floatVerify(0.3906250000f, 0x3ec80000); ++ res &= floatVerify(0.4062500000f, 0x3ed00000); ++ res &= floatVerify(0.4218750000f, 0x3ed80000); ++ res &= floatVerify(0.4375000000f, 0x3ee00000); ++ res &= floatVerify(0.4531250000f, 0x3ee80000); ++ res &= floatVerify(0.4687500000f, 0x3ef00000); ++ res &= floatVerify(0.4843750000f, 0x3ef80000); ++ res &= floatVerify(0.5000000000f, 0x3f000000); ++ res &= floatVerify(0.5312500000f, 0x3f080000); ++ res &= floatVerify(0.5625000000f, 0x3f100000); ++ res &= floatVerify(0.5937500000f, 0x3f180000); ++ res &= floatVerify(0.6250000000f, 0x3f200000); ++ res &= floatVerify(0.6562500000f, 0x3f280000); ++ res &= floatVerify(0.6875000000f, 0x3f300000); ++ res &= floatVerify(0.7187500000f, 0x3f380000); ++ res &= floatVerify(0.7500000000f, 0x3f400000); ++ res &= floatVerify(0.7812500000f, 0x3f480000); ++ res &= floatVerify(0.8125000000f, 0x3f500000); ++ res &= floatVerify(0.8437500000f, 0x3f580000); ++ res &= floatVerify(0.8750000000f, 0x3f600000); ++ res &= floatVerify(0.9062500000f, 0x3f680000); ++ res &= floatVerify(0.9375000000f, 0x3f700000); ++ res &= floatVerify(0.9687500000f, 0x3f780000); ++ res &= floatVerify(1.0000000000f, 0x3f800000); ++ res &= floatVerify(1.0625000000f, 0x3f880000); ++ res &= floatVerify(1.1250000000f, 0x3f900000); ++ res &= floatVerify(1.1875000000f, 0x3f980000); ++ res &= floatVerify(1.2500000000f, 0x3fa00000); ++ res &= floatVerify(1.3125000000f, 0x3fa80000); ++ res &= floatVerify(1.3750000000f, 0x3fb00000); ++ res &= floatVerify(1.4375000000f, 0x3fb80000); ++ res &= floatVerify(1.5000000000f, 0x3fc00000); ++ res &= floatVerify(1.5625000000f, 0x3fc80000); ++ res &= floatVerify(1.6250000000f, 0x3fd00000); ++ res &= floatVerify(1.6875000000f, 0x3fd80000); ++ res &= floatVerify(1.7500000000f, 0x3fe00000); ++ res &= floatVerify(1.8125000000f, 0x3fe80000); ++ res &= floatVerify(1.8750000000f, 0x3ff00000); ++ res &= floatVerify(1.9375000000f, 0x3ff80000); ++ res &= floatVerify(-2.0000000000f, 0xc0000000); ++ res &= floatVerify(-2.1250000000f, 0xc0080000); ++ res &= floatVerify(-2.2500000000f, 0xc0100000); ++ res &= floatVerify(-2.3750000000f, 0xc0180000); ++ 
res &= floatVerify(-2.5000000000f, 0xc0200000); ++ res &= floatVerify(-2.6250000000f, 0xc0280000); ++ res &= floatVerify(-2.7500000000f, 0xc0300000); ++ res &= floatVerify(-2.8750000000f, 0xc0380000); ++ res &= floatVerify(-3.0000000000f, 0xc0400000); ++ res &= floatVerify(-3.1250000000f, 0xc0480000); ++ res &= floatVerify(-3.2500000000f, 0xc0500000); ++ res &= floatVerify(-3.3750000000f, 0xc0580000); ++ res &= floatVerify(-3.5000000000f, 0xc0600000); ++ res &= floatVerify(-3.6250000000f, 0xc0680000); ++ res &= floatVerify(-3.7500000000f, 0xc0700000); ++ res &= floatVerify(-3.8750000000f, 0xc0780000); ++ res &= floatVerify(-4.0000000000f, 0xc0800000); ++ res &= floatVerify(-4.2500000000f, 0xc0880000); ++ res &= floatVerify(-4.5000000000f, 0xc0900000); ++ res &= floatVerify(-4.7500000000f, 0xc0980000); ++ res &= floatVerify(-5.0000000000f, 0xc0a00000); ++ res &= floatVerify(-5.2500000000f, 0xc0a80000); ++ res &= floatVerify(-5.5000000000f, 0xc0b00000); ++ res &= floatVerify(-5.7500000000f, 0xc0b80000); ++ res &= floatVerify(-6.0000000000f, 0xc0c00000); ++ res &= floatVerify(-6.2500000000f, 0xc0c80000); ++ res &= floatVerify(-6.5000000000f, 0xc0d00000); ++ res &= floatVerify(-6.7500000000f, 0xc0d80000); ++ res &= floatVerify(-7.0000000000f, 0xc0e00000); ++ res &= floatVerify(-7.2500000000f, 0xc0e80000); ++ res &= floatVerify(-7.5000000000f, 0xc0f00000); ++ res &= floatVerify(-7.7500000000f, 0xc0f80000); ++ res &= floatVerify(-8.0000000000f, 0xc1000000); ++ res &= floatVerify(-8.5000000000f, 0xc1080000); ++ res &= floatVerify(-9.0000000000f, 0xc1100000); ++ res &= floatVerify(-9.5000000000f, 0xc1180000); ++ res &= floatVerify(-10.0000000000f, 0xc1200000); ++ res &= floatVerify(-10.5000000000f, 0xc1280000); ++ res &= floatVerify(-11.0000000000f, 0xc1300000); ++ res &= floatVerify(-11.5000000000f, 0xc1380000); ++ res &= floatVerify(-12.0000000000f, 0xc1400000); ++ res &= floatVerify(-12.5000000000f, 0xc1480000); ++ res &= floatVerify(-13.0000000000f, 0xc1500000); ++ res &= floatVerify(-13.5000000000f, 0xc1580000); ++ res &= floatVerify(-14.0000000000f, 0xc1600000); ++ res &= floatVerify(-14.5000000000f, 0xc1680000); ++ res &= floatVerify(-15.0000000000f, 0xc1700000); ++ res &= floatVerify(-15.5000000000f, 0xc1780000); ++ res &= floatVerify(-16.0000000000f, 0xc1800000); ++ res &= floatVerify(-17.0000000000f, 0xc1880000); ++ res &= floatVerify(-18.0000000000f, 0xc1900000); ++ res &= floatVerify(-19.0000000000f, 0xc1980000); ++ res &= floatVerify(-20.0000000000f, 0xc1a00000); ++ res &= floatVerify(-21.0000000000f, 0xc1a80000); ++ res &= floatVerify(-22.0000000000f, 0xc1b00000); ++ res &= floatVerify(-23.0000000000f, 0xc1b80000); ++ res &= floatVerify(-24.0000000000f, 0xc1c00000); ++ res &= floatVerify(-25.0000000000f, 0xc1c80000); ++ res &= floatVerify(-26.0000000000f, 0xc1d00000); ++ res &= floatVerify(-27.0000000000f, 0xc1d80000); ++ res &= floatVerify(-28.0000000000f, 0xc1e00000); ++ res &= floatVerify(-29.0000000000f, 0xc1e80000); ++ res &= floatVerify(-30.0000000000f, 0xc1f00000); ++ res &= floatVerify(-31.0000000000f, 0xc1f80000); ++ res &= floatVerify(-0.1250000000f, 0xbe000000); ++ res &= floatVerify(-0.1328125000f, 0xbe080000); ++ res &= floatVerify(-0.1406250000f, 0xbe100000); ++ res &= floatVerify(-0.1484375000f, 0xbe180000); ++ res &= floatVerify(-0.1562500000f, 0xbe200000); ++ res &= floatVerify(-0.1640625000f, 0xbe280000); ++ res &= floatVerify(-0.1718750000f, 0xbe300000); ++ res &= floatVerify(-0.1796875000f, 0xbe380000); ++ res &= floatVerify(-0.1875000000f, 0xbe400000); ++ res &= 
floatVerify(-0.1953125000f, 0xbe480000); ++ res &= floatVerify(-0.2031250000f, 0xbe500000); ++ res &= floatVerify(-0.2109375000f, 0xbe580000); ++ res &= floatVerify(-0.2187500000f, 0xbe600000); ++ res &= floatVerify(-0.2265625000f, 0xbe680000); ++ res &= floatVerify(-0.2343750000f, 0xbe700000); ++ res &= floatVerify(-0.2421875000f, 0xbe780000); ++ res &= floatVerify(-0.2500000000f, 0xbe800000); ++ res &= floatVerify(-0.2656250000f, 0xbe880000); ++ res &= floatVerify(-0.2812500000f, 0xbe900000); ++ res &= floatVerify(-0.2968750000f, 0xbe980000); ++ res &= floatVerify(-0.3125000000f, 0xbea00000); ++ res &= floatVerify(-0.3281250000f, 0xbea80000); ++ res &= floatVerify(-0.3437500000f, 0xbeb00000); ++ res &= floatVerify(-0.3593750000f, 0xbeb80000); ++ res &= floatVerify(-0.3750000000f, 0xbec00000); ++ res &= floatVerify(-0.3906250000f, 0xbec80000); ++ res &= floatVerify(-0.4062500000f, 0xbed00000); ++ res &= floatVerify(-0.4218750000f, 0xbed80000); ++ res &= floatVerify(-0.4375000000f, 0xbee00000); ++ res &= floatVerify(-0.4531250000f, 0xbee80000); ++ res &= floatVerify(-0.4687500000f, 0xbef00000); ++ res &= floatVerify(-0.4843750000f, 0xbef80000); ++ res &= floatVerify(-0.5000000000f, 0xbf000000); ++ res &= floatVerify(-0.5312500000f, 0xbf080000); ++ res &= floatVerify(-0.5625000000f, 0xbf100000); ++ res &= floatVerify(-0.5937500000f, 0xbf180000); ++ res &= floatVerify(-0.6250000000f, 0xbf200000); ++ res &= floatVerify(-0.6562500000f, 0xbf280000); ++ res &= floatVerify(-0.6875000000f, 0xbf300000); ++ res &= floatVerify(-0.7187500000f, 0xbf380000); ++ res &= floatVerify(-0.7500000000f, 0xbf400000); ++ res &= floatVerify(-0.7812500000f, 0xbf480000); ++ res &= floatVerify(-0.8125000000f, 0xbf500000); ++ res &= floatVerify(-0.8437500000f, 0xbf580000); ++ res &= floatVerify(-0.8750000000f, 0xbf600000); ++ res &= floatVerify(-0.9062500000f, 0xbf680000); ++ res &= floatVerify(-0.9375000000f, 0xbf700000); ++ res &= floatVerify(-0.9687500000f, 0xbf780000); ++ res &= floatVerify(-1.0000000000f, 0xbf800000); ++ res &= floatVerify(-1.0625000000f, 0xbf880000); ++ res &= floatVerify(-1.1250000000f, 0xbf900000); ++ res &= floatVerify(-1.1875000000f, 0xbf980000); ++ res &= floatVerify(-1.2500000000f, 0xbfa00000); ++ res &= floatVerify(-1.3125000000f, 0xbfa80000); ++ res &= floatVerify(-1.3750000000f, 0xbfb00000); ++ res &= floatVerify(-1.4375000000f, 0xbfb80000); ++ res &= floatVerify(-1.5000000000f, 0xbfc00000); ++ res &= floatVerify(-1.5625000000f, 0xbfc80000); ++ res &= floatVerify(-1.6250000000f, 0xbfd00000); ++ res &= floatVerify(-1.6875000000f, 0xbfd80000); ++ res &= floatVerify(-1.7500000000f, 0xbfe00000); ++ res &= floatVerify(-1.8125000000f, 0xbfe80000); ++ res &= floatVerify(-1.8750000000f, 0xbff00000); ++ res &= floatVerify(-1.9375000000f, 0xbff80000); ++ ++ res &= doubleVerify(2.0000000000, 0x4000000000000000L); ++ res &= doubleVerify(2.1250000000, 0x4001000000000000L); ++ res &= doubleVerify(2.2500000000, 0x4002000000000000L); ++ res &= doubleVerify(2.3750000000, 0x4003000000000000L); ++ res &= doubleVerify(2.5000000000, 0x4004000000000000L); ++ res &= doubleVerify(2.6250000000, 0x4005000000000000L); ++ res &= doubleVerify(2.7500000000, 0x4006000000000000L); ++ res &= doubleVerify(2.8750000000, 0x4007000000000000L); ++ res &= doubleVerify(3.0000000000, 0x4008000000000000L); ++ res &= doubleVerify(3.1250000000, 0x4009000000000000L); ++ res &= doubleVerify(3.2500000000, 0x400a000000000000L); ++ res &= doubleVerify(3.3750000000, 0x400b000000000000L); ++ res &= doubleVerify(3.5000000000, 
0x400c000000000000L); ++ res &= doubleVerify(3.6250000000, 0x400d000000000000L); ++ res &= doubleVerify(3.7500000000, 0x400e000000000000L); ++ res &= doubleVerify(3.8750000000, 0x400f000000000000L); ++ res &= doubleVerify(4.0000000000, 0x4010000000000000L); ++ res &= doubleVerify(4.2500000000, 0x4011000000000000L); ++ res &= doubleVerify(4.5000000000, 0x4012000000000000L); ++ res &= doubleVerify(4.7500000000, 0x4013000000000000L); ++ res &= doubleVerify(5.0000000000, 0x4014000000000000L); ++ res &= doubleVerify(5.2500000000, 0x4015000000000000L); ++ res &= doubleVerify(5.5000000000, 0x4016000000000000L); ++ res &= doubleVerify(5.7500000000, 0x4017000000000000L); ++ res &= doubleVerify(6.0000000000, 0x4018000000000000L); ++ res &= doubleVerify(6.2500000000, 0x4019000000000000L); ++ res &= doubleVerify(6.5000000000, 0x401a000000000000L); ++ res &= doubleVerify(6.7500000000, 0x401b000000000000L); ++ res &= doubleVerify(7.0000000000, 0x401c000000000000L); ++ res &= doubleVerify(7.2500000000, 0x401d000000000000L); ++ res &= doubleVerify(7.5000000000, 0x401e000000000000L); ++ res &= doubleVerify(7.7500000000, 0x401f000000000000L); ++ res &= doubleVerify(8.0000000000, 0x4020000000000000L); ++ res &= doubleVerify(8.5000000000, 0x4021000000000000L); ++ res &= doubleVerify(9.0000000000, 0x4022000000000000L); ++ res &= doubleVerify(9.5000000000, 0x4023000000000000L); ++ res &= doubleVerify(10.0000000000, 0x4024000000000000L); ++ res &= doubleVerify(10.5000000000, 0x4025000000000000L); ++ res &= doubleVerify(11.0000000000, 0x4026000000000000L); ++ res &= doubleVerify(11.5000000000, 0x4027000000000000L); ++ res &= doubleVerify(12.0000000000, 0x4028000000000000L); ++ res &= doubleVerify(12.5000000000, 0x4029000000000000L); ++ res &= doubleVerify(13.0000000000, 0x402a000000000000L); ++ res &= doubleVerify(13.5000000000, 0x402b000000000000L); ++ res &= doubleVerify(14.0000000000, 0x402c000000000000L); ++ res &= doubleVerify(14.5000000000, 0x402d000000000000L); ++ res &= doubleVerify(15.0000000000, 0x402e000000000000L); ++ res &= doubleVerify(15.5000000000, 0x402f000000000000L); ++ res &= doubleVerify(16.0000000000, 0x4030000000000000L); ++ res &= doubleVerify(17.0000000000, 0x4031000000000000L); ++ res &= doubleVerify(18.0000000000, 0x4032000000000000L); ++ res &= doubleVerify(19.0000000000, 0x4033000000000000L); ++ res &= doubleVerify(20.0000000000, 0x4034000000000000L); ++ res &= doubleVerify(21.0000000000, 0x4035000000000000L); ++ res &= doubleVerify(22.0000000000, 0x4036000000000000L); ++ res &= doubleVerify(23.0000000000, 0x4037000000000000L); ++ res &= doubleVerify(24.0000000000, 0x4038000000000000L); ++ res &= doubleVerify(25.0000000000, 0x4039000000000000L); ++ res &= doubleVerify(26.0000000000, 0x403a000000000000L); ++ res &= doubleVerify(27.0000000000, 0x403b000000000000L); ++ res &= doubleVerify(28.0000000000, 0x403c000000000000L); ++ res &= doubleVerify(29.0000000000, 0x403d000000000000L); ++ res &= doubleVerify(30.0000000000, 0x403e000000000000L); ++ res &= doubleVerify(31.0000000000, 0x403f000000000000L); ++ res &= doubleVerify(0.1250000000, 0x3fc0000000000000L); ++ res &= doubleVerify(0.1328125000, 0x3fc1000000000000L); ++ res &= doubleVerify(0.1406250000, 0x3fc2000000000000L); ++ res &= doubleVerify(0.1484375000, 0x3fc3000000000000L); ++ res &= doubleVerify(0.1562500000, 0x3fc4000000000000L); ++ res &= doubleVerify(0.1640625000, 0x3fc5000000000000L); ++ res &= doubleVerify(0.1718750000, 0x3fc6000000000000L); ++ res &= doubleVerify(0.1796875000, 0x3fc7000000000000L); ++ res &= 
doubleVerify(0.1875000000, 0x3fc8000000000000L); ++ res &= doubleVerify(0.1953125000, 0x3fc9000000000000L); ++ res &= doubleVerify(0.2031250000, 0x3fca000000000000L); ++ res &= doubleVerify(0.2109375000, 0x3fcb000000000000L); ++ res &= doubleVerify(0.2187500000, 0x3fcc000000000000L); ++ res &= doubleVerify(0.2265625000, 0x3fcd000000000000L); ++ res &= doubleVerify(0.2343750000, 0x3fce000000000000L); ++ res &= doubleVerify(0.2421875000, 0x3fcf000000000000L); ++ res &= doubleVerify(0.2500000000, 0x3fd0000000000000L); ++ res &= doubleVerify(0.2656250000, 0x3fd1000000000000L); ++ res &= doubleVerify(0.2812500000, 0x3fd2000000000000L); ++ res &= doubleVerify(0.2968750000, 0x3fd3000000000000L); ++ res &= doubleVerify(0.3125000000, 0x3fd4000000000000L); ++ res &= doubleVerify(0.3281250000, 0x3fd5000000000000L); ++ res &= doubleVerify(0.3437500000, 0x3fd6000000000000L); ++ res &= doubleVerify(0.3593750000, 0x3fd7000000000000L); ++ res &= doubleVerify(0.3750000000, 0x3fd8000000000000L); ++ res &= doubleVerify(0.3906250000, 0x3fd9000000000000L); ++ res &= doubleVerify(0.4062500000, 0x3fda000000000000L); ++ res &= doubleVerify(0.4218750000, 0x3fdb000000000000L); ++ res &= doubleVerify(0.4375000000, 0x3fdc000000000000L); ++ res &= doubleVerify(0.4531250000, 0x3fdd000000000000L); ++ res &= doubleVerify(0.4687500000, 0x3fde000000000000L); ++ res &= doubleVerify(0.4843750000, 0x3fdf000000000000L); ++ res &= doubleVerify(0.5000000000, 0x3fe0000000000000L); ++ res &= doubleVerify(0.5312500000, 0x3fe1000000000000L); ++ res &= doubleVerify(0.5625000000, 0x3fe2000000000000L); ++ res &= doubleVerify(0.5937500000, 0x3fe3000000000000L); ++ res &= doubleVerify(0.6250000000, 0x3fe4000000000000L); ++ res &= doubleVerify(0.6562500000, 0x3fe5000000000000L); ++ res &= doubleVerify(0.6875000000, 0x3fe6000000000000L); ++ res &= doubleVerify(0.7187500000, 0x3fe7000000000000L); ++ res &= doubleVerify(0.7500000000, 0x3fe8000000000000L); ++ res &= doubleVerify(0.7812500000, 0x3fe9000000000000L); ++ res &= doubleVerify(0.8125000000, 0x3fea000000000000L); ++ res &= doubleVerify(0.8437500000, 0x3feb000000000000L); ++ res &= doubleVerify(0.8750000000, 0x3fec000000000000L); ++ res &= doubleVerify(0.9062500000, 0x3fed000000000000L); ++ res &= doubleVerify(0.9375000000, 0x3fee000000000000L); ++ res &= doubleVerify(0.9687500000, 0x3fef000000000000L); ++ res &= doubleVerify(1.0000000000, 0x3ff0000000000000L); ++ res &= doubleVerify(1.0625000000, 0x3ff1000000000000L); ++ res &= doubleVerify(1.1250000000, 0x3ff2000000000000L); ++ res &= doubleVerify(1.1875000000, 0x3ff3000000000000L); ++ res &= doubleVerify(1.2500000000, 0x3ff4000000000000L); ++ res &= doubleVerify(1.3125000000, 0x3ff5000000000000L); ++ res &= doubleVerify(1.3750000000, 0x3ff6000000000000L); ++ res &= doubleVerify(1.4375000000, 0x3ff7000000000000L); ++ res &= doubleVerify(1.5000000000, 0x3ff8000000000000L); ++ res &= doubleVerify(1.5625000000, 0x3ff9000000000000L); ++ res &= doubleVerify(1.6250000000, 0x3ffa000000000000L); ++ res &= doubleVerify(1.6875000000, 0x3ffb000000000000L); ++ res &= doubleVerify(1.7500000000, 0x3ffc000000000000L); ++ res &= doubleVerify(1.8125000000, 0x3ffd000000000000L); ++ res &= doubleVerify(1.8750000000, 0x3ffe000000000000L); ++ res &= doubleVerify(1.9375000000, 0x3fff000000000000L); ++ res &= doubleVerify(-2.0000000000, 0xc000000000000000L); ++ res &= doubleVerify(-2.1250000000, 0xc001000000000000L); ++ res &= doubleVerify(-2.2500000000, 0xc002000000000000L); ++ res &= doubleVerify(-2.3750000000 , 0xc003000000000000L); ++ res &= 
doubleVerify(-2.5000000000, 0xc004000000000000L); ++ res &= doubleVerify(-2.6250000000, 0xc005000000000000L); ++ res &= doubleVerify(-2.7500000000, 0xc006000000000000L); ++ res &= doubleVerify(-2.8750000000, 0xc007000000000000L); ++ res &= doubleVerify(-3.0000000000, 0xc008000000000000L); ++ res &= doubleVerify(-3.1250000000, 0xc009000000000000L); ++ res &= doubleVerify(-3.2500000000, 0xc00a000000000000L); ++ res &= doubleVerify(-3.3750000000, 0xc00b000000000000L); ++ res &= doubleVerify(-3.5000000000, 0xc00c000000000000L); ++ res &= doubleVerify(-3.6250000000, 0xc00d000000000000L); ++ res &= doubleVerify(-3.7500000000, 0xc00e000000000000L); ++ res &= doubleVerify(-3.8750000000, 0xc00f000000000000L); ++ res &= doubleVerify(-4.0000000000, 0xc010000000000000L); ++ res &= doubleVerify(-4.2500000000, 0xc011000000000000L); ++ res &= doubleVerify(-4.5000000000, 0xc012000000000000L); ++ res &= doubleVerify(-4.7500000000, 0xc013000000000000L); ++ res &= doubleVerify(-5.0000000000, 0xc014000000000000L); ++ res &= doubleVerify(-5.2500000000, 0xc015000000000000L); ++ res &= doubleVerify(-5.5000000000, 0xc016000000000000L); ++ res &= doubleVerify(-5.7500000000, 0xc017000000000000L); ++ res &= doubleVerify(-6.0000000000, 0xc018000000000000L); ++ res &= doubleVerify(-6.2500000000, 0xc019000000000000L); ++ res &= doubleVerify(-6.5000000000, 0xc01a000000000000L); ++ res &= doubleVerify(-6.7500000000, 0xc01b000000000000L); ++ res &= doubleVerify(-7.0000000000, 0xc01c000000000000L); ++ res &= doubleVerify(-7.2500000000, 0xc01d000000000000L); ++ res &= doubleVerify(-7.5000000000, 0xc01e000000000000L); ++ res &= doubleVerify(-7.7500000000, 0xc01f000000000000L); ++ res &= doubleVerify(-8.0000000000, 0xc020000000000000L); ++ res &= doubleVerify(-8.5000000000, 0xc021000000000000L); ++ res &= doubleVerify(-9.0000000000, 0xc022000000000000L); ++ res &= doubleVerify(-9.5000000000, 0xc023000000000000L); ++ res &= doubleVerify(-10.0000000000, 0xc024000000000000L); ++ res &= doubleVerify(-10.5000000000, 0xc025000000000000L); ++ res &= doubleVerify(-11.0000000000, 0xc026000000000000L); ++ res &= doubleVerify(-11.5000000000, 0xc027000000000000L); ++ res &= doubleVerify(-12.0000000000, 0xc028000000000000L); ++ res &= doubleVerify(-12.5000000000, 0xc029000000000000L); ++ res &= doubleVerify(-13.0000000000, 0xc02a000000000000L); ++ res &= doubleVerify(-13.5000000000, 0xc02b000000000000L); ++ res &= doubleVerify(-14.0000000000, 0xc02c000000000000L); ++ res &= doubleVerify(-14.5000000000, 0xc02d000000000000L); ++ res &= doubleVerify(-15.0000000000, 0xc02e000000000000L); ++ res &= doubleVerify(-15.5000000000, 0xc02f000000000000L); ++ res &= doubleVerify(-16.0000000000, 0xc030000000000000L); ++ res &= doubleVerify(-17.0000000000, 0xc031000000000000L); ++ res &= doubleVerify(-18.0000000000, 0xc032000000000000L); ++ res &= doubleVerify(-19.0000000000, 0xc033000000000000L); ++ res &= doubleVerify(-20.0000000000, 0xc034000000000000L); ++ res &= doubleVerify(-21.0000000000, 0xc035000000000000L); ++ res &= doubleVerify(-22.0000000000, 0xc036000000000000L); ++ res &= doubleVerify(-23.0000000000, 0xc037000000000000L); ++ res &= doubleVerify(-24.0000000000, 0xc038000000000000L); ++ res &= doubleVerify(-25.0000000000, 0xc039000000000000L); ++ res &= doubleVerify(-26.0000000000, 0xc03a000000000000L); ++ res &= doubleVerify(-27.0000000000, 0xc03b000000000000L); ++ res &= doubleVerify(-28.0000000000, 0xc03c000000000000L); ++ res &= doubleVerify(-29.0000000000, 0xc03d000000000000L); ++ res &= doubleVerify(-30.0000000000, 
0xc03e000000000000L); ++ res &= doubleVerify(-31.0000000000, 0xc03f000000000000L); ++ res &= doubleVerify(-0.1250000000, 0xbfc0000000000000L); ++ res &= doubleVerify(-0.1328125000, 0xbfc1000000000000L); ++ res &= doubleVerify(-0.1406250000, 0xbfc2000000000000L); ++ res &= doubleVerify(-0.1484375000, 0xbfc3000000000000L); ++ res &= doubleVerify(-0.1562500000, 0xbfc4000000000000L); ++ res &= doubleVerify(-0.1640625000, 0xbfc5000000000000L); ++ res &= doubleVerify(-0.1718750000, 0xbfc6000000000000L); ++ res &= doubleVerify(-0.1796875000, 0xbfc7000000000000L); ++ res &= doubleVerify(-0.1875000000, 0xbfc8000000000000L); ++ res &= doubleVerify(-0.1953125000, 0xbfc9000000000000L); ++ res &= doubleVerify(-0.2031250000, 0xbfca000000000000L); ++ res &= doubleVerify(-0.2109375000, 0xbfcb000000000000L); ++ res &= doubleVerify(-0.2187500000, 0xbfcc000000000000L); ++ res &= doubleVerify(-0.2265625000, 0xbfcd000000000000L); ++ res &= doubleVerify(-0.2343750000, 0xbfce000000000000L); ++ res &= doubleVerify(-0.2421875000, 0xbfcf000000000000L); ++ res &= doubleVerify(-0.2500000000, 0xbfd0000000000000L); ++ res &= doubleVerify(-0.2656250000, 0xbfd1000000000000L); ++ res &= doubleVerify(-0.2812500000, 0xbfd2000000000000L); ++ res &= doubleVerify(-0.2968750000, 0xbfd3000000000000L); ++ res &= doubleVerify(-0.3125000000, 0xbfd4000000000000L); ++ res &= doubleVerify(-0.3281250000, 0xbfd5000000000000L); ++ res &= doubleVerify(-0.3437500000, 0xbfd6000000000000L); ++ res &= doubleVerify(-0.3593750000, 0xbfd7000000000000L); ++ res &= doubleVerify(-0.3750000000, 0xbfd8000000000000L); ++ res &= doubleVerify(-0.3906250000, 0xbfd9000000000000L); ++ res &= doubleVerify(-0.4062500000, 0xbfda000000000000L); ++ res &= doubleVerify(-0.4218750000, 0xbfdb000000000000L); ++ res &= doubleVerify(-0.4375000000, 0xbfdc000000000000L); ++ res &= doubleVerify(-0.4531250000, 0xbfdd000000000000L); ++ res &= doubleVerify(-0.4687500000, 0xbfde000000000000L); ++ res &= doubleVerify(-0.4843750000, 0xbfdf000000000000L); ++ res &= doubleVerify(-0.5000000000, 0xbfe0000000000000L); ++ res &= doubleVerify(-0.5312500000, 0xbfe1000000000000L); ++ res &= doubleVerify(-0.5625000000, 0xbfe2000000000000L); ++ res &= doubleVerify(-0.5937500000, 0xbfe3000000000000L); ++ res &= doubleVerify(-0.6250000000, 0xbfe4000000000000L); ++ res &= doubleVerify(-0.6562500000, 0xbfe5000000000000L); ++ res &= doubleVerify(-0.6875000000, 0xbfe6000000000000L); ++ res &= doubleVerify(-0.7187500000, 0xbfe7000000000000L); ++ res &= doubleVerify(-0.7500000000, 0xbfe8000000000000L); ++ res &= doubleVerify(-0.7812500000, 0xbfe9000000000000L); ++ res &= doubleVerify(-0.8125000000, 0xbfea000000000000L); ++ res &= doubleVerify(-0.8437500000, 0xbfeb000000000000L); ++ res &= doubleVerify(-0.8750000000, 0xbfec000000000000L); ++ res &= doubleVerify(-0.9062500000, 0xbfed000000000000L); ++ res &= doubleVerify(-0.9375000000, 0xbfee000000000000L); ++ res &= doubleVerify(-0.9687500000, 0xbfef000000000000L); ++ res &= doubleVerify(-1.0000000000, 0xbff0000000000000L); ++ res &= doubleVerify(-1.0625000000, 0xbff1000000000000L); ++ res &= doubleVerify(-1.1250000000, 0xbff2000000000000L); ++ res &= doubleVerify(-1.1875000000, 0xbff3000000000000L); ++ res &= doubleVerify(-1.2500000000, 0xbff4000000000000L); ++ res &= doubleVerify(-1.3125000000, 0xbff5000000000000L); ++ res &= doubleVerify(-1.3750000000, 0xbff6000000000000L); ++ res &= doubleVerify(-1.4375000000, 0xbff7000000000000L); ++ res &= doubleVerify(-1.5000000000, 0xbff8000000000000L); ++ res &= doubleVerify(-1.5625000000, 
0xbff9000000000000L); ++ res &= doubleVerify(-1.6250000000, 0xbffa000000000000L); ++ res &= doubleVerify(-1.6875000000, 0xbffb000000000000L); ++ res &= doubleVerify(-1.7500000000, 0xbffc000000000000L); ++ res &= doubleVerify(-1.8125000000, 0xbffd000000000000L); ++ res &= doubleVerify(-1.8750000000, 0xbffe000000000000L); ++ res &= doubleVerify(-1.9375000000, 0xbfff000000000000L); ++ ++ return res; ++ } ++ ++ public static void main(String[] args) { ++ Test26733 t = new Test26733(); ++ ++ if (t.doTest()) { ++ System.out.println("TEST PASSED"); ++ } else { ++ throw new AssertionError("TEST FAILED"); ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/30358/MEMBARType.java b/test/hotspot/jtreg/loongson/30358/MEMBARType.java +--- a/test/hotspot/jtreg/loongson/30358/MEMBARType.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/30358/MEMBARType.java 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,38 @@ ++/* ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++public final class MEMBARType { ++ public static final String DBARINSCODE = "00 7238"; // dbar hint ++ public static final String DBARSTR = "dbar 0x"; ++ ++ public static final String LoadLoad = "15"; ++ public static final String LoadStore = "16"; ++ public static final String StoreLoad = "19"; ++ public static final String StoreStore = "1a"; ++ public static final String AnyAny = "10"; ++ ++ public static final String Acquire = "14"; // LoadStore & LoadLoad ++ public static final String Release = "12"; // LoadStore & StoreStore ++ public static final String Volatile = StoreLoad; ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/30358/TestLoadLoad.java b/test/hotspot/jtreg/loongson/30358/TestLoadLoad.java +--- a/test/hotspot/jtreg/loongson/30358/TestLoadLoad.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/30358/TestLoadLoad.java 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,129 @@ ++/* ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. 
++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++/** ++ * @test TestLoadLoad ++ * @summary Checks LoadLoad membar ++ * ++ * @library /test/lib ++ * ++ * @requires os.arch=="loongarch64" ++ * ++ * @run driver TestLoadLoad ++ */ ++ ++import java.util.ArrayList; ++import java.util.Iterator; ++import java.util.ListIterator; ++import jdk.test.lib.process.OutputAnalyzer; ++import jdk.test.lib.process.ProcessTools; ++ ++public class TestLoadLoad { ++ ++ public static void main(String[] args) throws Exception { ++ ArrayList command = new ArrayList(); ++ command.add("-XX:+UnlockDiagnosticVMOptions"); ++ command.add("-XX:+PrintInterpreter"); ++ command.add("-version"); ++ ++ ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(command); ++ OutputAnalyzer analyzer = new OutputAnalyzer(pb.start()); ++ ++ analyzer.shouldHaveExitValue(0); ++ System.out.println(analyzer.getOutput()); ++ checkMembarLoadLoad(analyzer); ++ } ++ ++ private static void addInstrs(String line, ArrayList instrs) { ++ for (String instr : line.split("\\|")) { ++ instrs.add(instr.trim()); ++ } ++ } ++ ++ // The output with hsdis library is: ++ // --------------------------------------------------------------------- ++ // fast_agetfield 203 fast_agetfield [0x000000ffe8436f00, 0x000000ffe8436f68] 104 bytes ++ // ++ // -------------------------------------------------------------------------------- ++ // 0x000000ffe8436f00: ld.d $a0,$sp,0 ;;@FILE: /home/sunguoyun/jdk-ls/src/hotspot/share/interpreter/templateInterpreterGenerator.cpp ++ // ;; 357: case atos: vep = __ pc(); __ pop(atos); aep = __ pc(); generate_and_dispatch(t); break; ++ // 0x000000ffe8436f04: addi.d $sp,$sp,8(0x8) ++ // 0x000000ffe8436f08: ld.hu $t2,$s0,1(0x1) ;; 357: case atos: vep = __ pc(); __ pop(atos); aep = __ pc(); generate_and_dispatch(t); break; ++ // ;; 378: __ verify_FPU(1, t->tos_in()); ++ // ;; 391: __ dispatch_prolog(tos_out, step); ++ // 0x000000ffe8436f0c: ld.d $t3,$fp,-72(0xfb8) ++ // 0x000000ffe8436f10: slli.d $t2,$t2,0x2 ++ // 0x000000ffe8436f14: dbar 0x15 ++ // ++ // The output no hsdis library is: ++ // 0x000000ffe7b58e80: 6400 c028 | 6320 c002 | ee06 402a | cfe2 fe28 | ce09 4100 | 1500 7238 | d33d 2d00 | 6e22 c128 ++ // 0x000000ffe7b58ea0: 7342 c128 | 1440 0014 | 94ce 1400 | 800a 0058 | 1000 7238 | 9300 8028 | 84b8 1000 | 8400 802a ++ ++ private static void checkMembarLoadLoad(OutputAnalyzer output) { ++ Iterator iter = output.asLines().listIterator(); ++ ++ String match = skipTo(iter, "fast_agetfield 203 fast_agetfield"); ++ if (match == null) { ++ throw new RuntimeException("Missing interpreter output"); ++ } ++ ++ ArrayList instrs = new ArrayList(); ++ String line = null; ++ while (iter.hasNext()) { ++ line = iter.next(); ++ if (line.contains("fast_bgetfield")) { ++ break; ++ } ++ if (line.contains("0x")) { ++ addInstrs(line, instrs); ++ } ++ } ++ 
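++        // Scan the collected instruction words from the end: without the hsdis library the
++        // LoadLoad barrier appears as a raw encoding ending in "1500 7238" (dbar with hint
++        // 0x15, see MEMBARType.LoadLoad); with hsdis it appears as the text "dbar 0x15".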
++        ListIterator instrReverseIter = instrs.listIterator(instrs.size());
++        boolean foundMembarInst = false;
++
++        while (instrReverseIter.hasPrevious()) {
++            String inst = instrReverseIter.previous();
++            if (inst.endsWith(MEMBARType.LoadLoad + MEMBARType.DBARINSCODE) || inst.contains(MEMBARType.DBARSTR + MEMBARType.LoadLoad)) {
++                foundMembarInst = true;
++                break;
++            }
++        }
++
++        if (foundMembarInst == false) {
++            throw new RuntimeException("MembarLoadLoad instruction (0x" + MEMBARType.LoadLoad + ") not found!\n");
++        }
++    }
++
++    private static String skipTo(Iterator iter, String substring) {
++        while (iter.hasNext()) {
++            String nextLine = iter.next();
++            if (nextLine.contains(substring)) {
++                return nextLine;
++            }
++        }
++        return null;
++    }
++
++}
+diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/30358/TestNewObjectWithFinal.java b/test/hotspot/jtreg/loongson/30358/TestNewObjectWithFinal.java
+--- a/test/hotspot/jtreg/loongson/30358/TestNewObjectWithFinal.java 1970-01-01 08:00:00.000000000 +0800
++++ b/test/hotspot/jtreg/loongson/30358/TestNewObjectWithFinal.java 2024-02-20 10:42:38.138861889 +0800
+@@ -0,0 +1,247 @@
++/*
++ * Copyright (c) 2023, Loongson Technology. All rights reserved.
++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
++ *
++ * This code is free software; you can redistribute it and/or modify it
++ * under the terms of the GNU General Public License version 2 only, as
++ * published by the Free Software Foundation.
++ *
++ * This code is distributed in the hope that it will be useful, but WITHOUT
++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
++ * version 2 for more details (a copy is included in the LICENSE file that
++ * accompanied this code).
++ *
++ * You should have received a copy of the GNU General Public License version
++ * 2 along with this work; if not, write to the Free Software Foundation,
++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
++ *
++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
++ * or visit www.oracle.com if you need additional information or have any
++ * questions.
++ * ++ */ ++ ++/** ++ * @test TestNewObjectWithFinal ++ * @summary Checks membars for a object with final val create ++ * ++ * for c1 new object with final, two StoreStore membar will be insert ++ * store final val ++ * membar_storestore ++ * store obj ++ * membar_storestore ++ * ++ * for c2 new object with final, one Release membar will be insert ++ * store final val ++ * store obj ++ * membar_release ++ * ++ * @library /test/lib ++ * ++ * @requires os.arch=="loongarch64" ++ * ++ * @run driver TestNewObjectWithFinal c1 ++ * @run driver TestNewObjectWithFinal c2 ++ */ ++ ++import java.util.ArrayList; ++import java.util.Iterator; ++import java.util.ListIterator; ++import jdk.test.lib.process.OutputAnalyzer; ++import jdk.test.lib.process.ProcessTools; ++ ++public class TestNewObjectWithFinal { ++ ++ public static void main(String[] args) throws Exception { ++ String compiler = args[0]; ++ ArrayList command = new ArrayList(); ++ command.add("-XX:-BackgroundCompilation"); ++ command.add("-XX:+UnlockDiagnosticVMOptions"); ++ command.add("-XX:+PrintAssembly"); ++ ++ if (compiler.equals("c2")) { ++ command.add("-XX:-TieredCompilation"); ++ } else if (compiler.equals("c1")) { ++ command.add("-XX:TieredStopAtLevel=1"); ++ } else { ++ throw new RuntimeException("Unknown compiler: " + compiler); ++ } ++ command.add("-XX:CompileCommand=compileonly," + Launcher.class.getName() + "::" + "test"); ++ command.add(Launcher.class.getName()); ++ ++ ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(command); ++ OutputAnalyzer analyzer = new OutputAnalyzer(pb.start()); ++ ++ analyzer.shouldHaveExitValue(0); ++ ++ System.out.println(analyzer.getOutput()); ++ ++ if (compiler.equals("c1")) { ++ checkMembarStoreStore(analyzer); ++ } else if (compiler.equals("c2")) { ++ checkMembarRelease(analyzer); ++ } ++ } ++ ++ private static void addInstrs(String line, ArrayList instrs) { ++ for (String instr : line.split("\\|")) { ++ instrs.add(instr.trim()); ++ } ++ } ++ ++ // ----------------------------------- Assembly ----------------------------------- ++ // ++ // Compiled method (c2) 950 24 TestNewObjectWithFinal$Launcher::test (8 bytes) ++ // ++ // [Constant Pool (empty)] ++ // ++ // [MachCode] ++ // [Verified Entry Point] ++ // # {method} {0x000000ffd06033f0} 'test' '()LTestNewObjectWithFinal$Launcher;' in 'TestNewObjectWithFinal$Launcher' ++ // 0x000000ffed0be59c: 0c24 8003 ++ // ++ // 0x000000ffed0be5a0: ;*invokespecial {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestNewObjectWithFinal$Launcher::test@4 (line 187) ++ // 0x000000ffed0be5a0: 8c30 8029 ++ // ++ // 0x000000ffed0be5a4: ;*synchronization entry ++ // ; - TestNewObjectWithFinal$Launcher::@-1 (line 176) ++ // ; - TestNewObjectWithFinal$Launcher::test@4 (line 187) ++ // 0x000000ffed0be5a4: 1200 7238 | 7640 c028 | 6160 c028 | 6380 c002 ++ // ++ // 0x000000ffed0be5b4: ; {poll_return} ++ // 0x000000ffed0be5b4: b323 cf28 | 630a 006c | 0040 0050 | 2000 004c ++ // ... 
++ // [Stub Code] ++ // ++ // ++ // The output with hsdis library is: ++ // ++ // 0x000000ffed0be5a4: dbar 0x12 ;*synchronization entry ++ // ; - TestNewObjectWithFinal$Launcher::@-1 (line 227) ++ // ++ private static void checkMembarRelease(OutputAnalyzer output) { ++ Iterator iter = output.asLines().listIterator(); ++ ++ String match = skipTo(iter, "'test' '()LTestNewObjectWithFinal$Launcher"); ++ if (match == null) { ++ throw new RuntimeException("Missing compiler c2 output"); ++ } ++ ++ ArrayList instrs = new ArrayList(); ++ String line = null; ++ while (iter.hasNext()) { ++ line = iter.next(); ++ if (line.contains("[Stub Code]")) { ++ break; ++ } ++ if (line.contains("0x")/* && !line.contains(";")*/) { ++ addInstrs(line, instrs); ++ } ++ } ++ ++ ListIterator instrReverseIter = instrs.listIterator(instrs.size()); ++ boolean foundMembarInst = false; ++ ++ while (instrReverseIter.hasPrevious()) { ++ String inst = instrReverseIter.previous(); ++ if (inst.endsWith(MEMBARType.Release + MEMBARType.DBARINSCODE) || inst.contains(MEMBARType.DBARSTR + MEMBARType.Release)) { ++ foundMembarInst = true; ++ break; ++ } ++ } ++ ++ if (foundMembarInst == false) { ++ throw new RuntimeException("No founed MembarRelease instruction (0x" + MEMBARType.Release + ")!\n"); ++ } ++ } ++ ++ // ============================= C1-compiled nmethod ============================== ++ // ----------------------------------- Assembly ----------------------------------- ++ // ++ // Compiled method (c1) 948 24 1 TestNewObjectWithFinal$Launcher::test (8 bytes) ++ // 0x000000ffe903feb8: ;*new {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestNewObjectWithFinal$Launcher::test@0 (line 190) ++ // 0x000000ffe903feb8: 1a00 7238 | 0524 8003 ++ // ++ // 0x000000ffe903fec0: ;*putfield val_i {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestNewObjectWithFinal$Launcher::@7 (line 180) ++ // ; - TestNewObjectWithFinal$Launcher::test@4 (line 190) ++ // 0x000000ffe903fec0: 8530 8029 | 1a00 7238 | 7600 c128 | 6120 c128 | 6340 c102 ++ // ++ // 0x000000ffe903fed4: ; {poll_return} ++ // [Exception Handler] ++ // ++ // ++ // The output with hsdis library is: ++ // ++ // 0x000000ffed03feb8: dbar 0x1a ;*new {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestNewObjectWithFinal$Launcher::test@0 (line 225) ++ // 0x000000ffed03febc: ori $a1,$zero,0x9 ++ // 0x000000ffed03fec0: st.w $a1,$a0,12(0xc) ;*putfield val_i {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestNewObjectWithFinal$Launcher::@7 (line 215) ++ // ; - TestNewObjectWithFinal$Launcher::test@4 (line 225) ++ // 0x000000ffed03fec4: dbar 0x1a ++ // ++ private static void checkMembarStoreStore(OutputAnalyzer output) { ++ Iterator iter = output.asLines().listIterator(); ++ ++ String match = skipTo(iter, "TestNewObjectWithFinal$Launcher::test (8 bytes)"); ++ if (match == null) { ++ throw new RuntimeException("Missing compiler c1 output"); ++ } ++ ++ ArrayList instrs = new ArrayList(); ++ String line = null; ++ boolean hasHexInstInOutput = false; ++ while (iter.hasNext()) { ++ line = iter.next(); ++ if (line.contains("[Exception Handler]")) { ++ break; ++ } ++ if (line.contains("0x")/* && !line.contains(";")*/) { ++ addInstrs(line, instrs); ++ } ++ } ++ ++ ListIterator instrReverseIter = instrs.listIterator(instrs.size()); ++ int foundMembarInst = 0; ++ ++ while (instrReverseIter.hasPrevious()) { ++ String inst = instrReverseIter.previous(); ++ if (inst.endsWith(MEMBARType.StoreStore + MEMBARType.DBARINSCODE) || inst.contains(MEMBARType.DBARSTR + MEMBARType.StoreStore)) { ++ ++foundMembarInst; 
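++                // Keep counting instead of stopping at the first match: as described in the
++                // test summary, c1 is expected to emit a dbar 0x1a (StoreStore) after the
++                // final-field store and again after the object store, so the check below
++                // requires at least two.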
++ } ++ } ++ ++ if (foundMembarInst < 2) { ++ throw new RuntimeException("No founed MembarStoreStore instruction (" + MEMBARType.StoreStore + ")! foundMembarInst=" + foundMembarInst + "\n"); ++ } ++ } ++ ++ private static String skipTo(Iterator iter, String substring) { ++ while (iter.hasNext()) { ++ String nextLine = iter.next(); ++ if (nextLine.contains(substring)) { ++ return nextLine; ++ } ++ } ++ return null; ++ } ++ ++ static class Launcher { ++ final int val_i = 0x9; ++ static Launcher l; ++ public static void main(final String[] args) throws Exception { ++ int end = 20_000; ++ ++ for (int i=0; i < end; i++) { ++ l = test(); ++ } ++ } ++ static Launcher test() { ++ return new Launcher(); ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/30358/TEST.properties b/test/hotspot/jtreg/loongson/30358/TEST.properties +--- a/test/hotspot/jtreg/loongson/30358/TEST.properties 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/30358/TEST.properties 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,25 @@ ++/* ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++maxOutputSize = 2500000 +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/30358/TestVolatile.java b/test/hotspot/jtreg/loongson/30358/TestVolatile.java +--- a/test/hotspot/jtreg/loongson/30358/TestVolatile.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/30358/TestVolatile.java 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,358 @@ ++/* ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 
++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ * ++ */ ++ ++/** ++ * @test TestVolatile ++ * @summary Checks membars for a volatile field load and store ++ * ++ * for a volatile wirte ideal: ++ * MemBarRelease ++ * store ++ * MemBarVolatile (NOTE: c1 used AnyAny instead MembarVolatile) ++ * ++ * for a volatile read ideal: ++ * load ++ * MemBarAcquire ++ * ++ * @library /test/lib ++ * ++ * @requires os.arch=="loongarch64" ++ * ++ * @run driver TestVolatile c1 ++ * @run driver TestVolatile c2 ++ */ ++import java.util.ArrayList; ++import java.util.Iterator; ++import java.util.ListIterator; ++import jdk.test.lib.process.OutputAnalyzer; ++import jdk.test.lib.process.ProcessTools; ++ ++public class TestVolatile { ++ ++ public static void main(String[] args) throws Exception { ++ String compiler = args[0]; ++ ArrayList command = new ArrayList(); ++ command.add("-XX:-BackgroundCompilation"); ++ command.add("-XX:+UnlockDiagnosticVMOptions"); ++ command.add("-XX:+PrintAssembly"); ++ command.add("-Xcomp"); ++ ++ if (compiler.equals("c2")) { ++ command.add("-XX:-TieredCompilation"); ++ command.add("-XX:+UseBarriersForVolatile"); ++ } else if (compiler.equals("c1")) { ++ command.add("-XX:TieredStopAtLevel=1"); ++ } else if (compiler.equals("int")) { ++ command.add("-Xint"); ++ command.add("-XX:+PrintInterpreter"); ++ } else { ++ throw new RuntimeException("Unknown compiler: " + compiler); ++ } ++ ++ command.add("-XX:CompileCommand=compileonly," + Launcher.class.getName() + "::" + "*"); ++ command.add("-XX:CompileCommand=dontinline," + Launcher.class.getName() + "::" + "*"); ++ command.add(Launcher.class.getName()); ++ ++ ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(command); ++ ++ OutputAnalyzer analyzer = new OutputAnalyzer(pb.start()); ++ ++ analyzer.shouldHaveExitValue(0); ++ ++ System.out.println(analyzer.getOutput()); ++ ++ if (compiler.equals("c1")) { ++ checkC1VolatileRead(analyzer); ++ checkC1VolatileWrite(analyzer); ++ } else if (compiler.equals("c2")) { ++ checkC2VolatileRead(analyzer); ++ checkC2VolatileWrite(analyzer); ++ } ++ } ++ ++ private static void addInstrs(String line, ArrayList instrs) { ++ for (String instr : line.split("\\|")) { ++ instrs.add(instr.trim()); ++ } ++ } ++ ++ // ----------------------------------- Assembly ----------------------------------- ++ // ++ // Compiled method (c1) 976 24 1 TestVolatile$Launcher::main (13 bytes) ++ // # {method} {0x000000ffc8700308} 'main' '([Ljava/lang/String;)V' in 'TestVolatile$Launcher' ++ // 0x000000fff0147e28: ;*getfield flags {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestVolatile$Launcher::main@6 (line 296) ++ // 0x000000fff0147e28: 1400 7238 ++ // ++ // The output with hsdis library is: ++ // ++ // 0x000000ffe903fe28: dbar 0x14 ;*getfield flags {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestVolatile$Launcher::main@6 (line 322) ++ // 0x000000ffe903fe2c: st.w $a1,$a0,116(0x74) ;*putstatic val {reexecute=0 rethrow=0 return_oop=0} ++ // ++ private static void checkC1VolatileRead(OutputAnalyzer output) { ++ Iterator iter = output.asLines().listIterator(); ++ ++ String match = skipTo(iter, "'main' '([Ljava/lang/String;)V' in 'TestVolatile$Launcher'"); ++ if (match == null) { ++ throw new RuntimeException("Missing compiler c1 output"); ++ } ++ ++ /* match = skipTo(iter, "*getfield flags"); ++ if (match == null) { ++ throw new RuntimeException("Missing read volatile flags"); ++ }*/ ++ 
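++        // Collect the instruction words emitted for Launcher::main and look for the acquire
++        // barrier that must follow the volatile read of 'flags': dbar 0x14 (MEMBARType.Acquire,
++        // i.e. LoadLoad and LoadStore), either as disassembled text or as a word ending in
++        // "1400 7238".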
ArrayList instrs = new ArrayList(); ++ String line = null; ++ while (iter.hasNext()) { ++ line = iter.next(); ++ if (line.contains("0x")/* && !line.contains(";")*/) { ++ addInstrs(line, instrs); ++ if (line.contains("[Stub Code]")) { ++ break; ++ } ++ } ++ } ++ ++ ListIterator instrReverseIter = instrs.listIterator(instrs.size()); ++ boolean foundMembarInst = false; ++ ++ while (instrReverseIter.hasPrevious()) { ++ String inst = instrReverseIter.previous(); ++ if (inst.endsWith(MEMBARType.Acquire + MEMBARType.DBARINSCODE) || inst.contains(MEMBARType.DBARSTR + MEMBARType.Acquire)) { ++ foundMembarInst = true; ++ break; ++ } ++ } ++ ++ if (foundMembarInst == false) { ++ throw new RuntimeException("No founed valid acquire instruction (0x" + MEMBARType.Acquire + ")!\n"); ++ } ++ } ++ ++ // ----------------------------------- Assembly ----------------------------------- ++ // ++ // Compiled method (c1) 988 26 1 TestVolatile$Launcher:: (11 bytes) ++ // # {method} {0x000000ffc8700248} '' '()V' in 'TestVolatile$Launcher' ++ // 0x000000fff0147640: 0424 8003 | 1200 7238 | 8431 8029 ++ // ;; membar ++ // 0x000000fff014764c: ;*putfield flags {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestVolatile$Launcher::@7 (line 286) ++ // 0x000000fff014764c: 1000 7238 | 76c0 c028 | 61e0 c028 | 6300 c102 ++ // 0x000000fff014765c: ; {poll_return} ++ // ++ // The output with hsdis library is: ++ // ++ // 0x000000ffe903f644: dbar 0x12 ++ // 0x000000ffe903f648: st.w $a0,$t0,12(0xc) ++ // ;; membar ++ // 0x000000ffe903f64c: dbar 0x10 ;*putfield flags {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestVolatile$Launcher::@7 (line 315) ++ // ++ private static void checkC1VolatileWrite(OutputAnalyzer output) { ++ Iterator iter = output.asLines().listIterator(); ++ ++ String match = skipTo(iter, "'' '()V' in 'TestVolatile$Launcher'"); ++ if (match == null) { ++ throw new RuntimeException("Missing compiler c1 output"); ++ } ++ ++ ArrayList instrs = new ArrayList(); ++ String line = null; ++ while (iter.hasNext()) { ++ line = iter.next(); ++ if (line.contains("{poll_return}")) { ++ break; ++ } ++ if (line.contains("0x")/* && !line.contains(";")*/) { ++ addInstrs(line, instrs); ++ } ++ } ++ ++ ListIterator instrReverseIter = instrs.listIterator(instrs.size()); ++ boolean foundMembarRelease = false; ++ boolean foundMembarAnyAny = false; ++ ++ while (instrReverseIter.hasPrevious()) { ++ String inst = instrReverseIter.previous(); ++ if (inst.endsWith(MEMBARType.Release + MEMBARType.DBARINSCODE) || inst.contains(MEMBARType.DBARSTR + MEMBARType.Release)) { ++ foundMembarRelease = true; ++ } else if (inst.endsWith(MEMBARType.AnyAny + MEMBARType.DBARINSCODE) || inst.contains(MEMBARType.DBARSTR + MEMBARType.AnyAny)) { ++ foundMembarAnyAny = true; ++ } ++ if (foundMembarRelease && foundMembarAnyAny) ++ break; ++ } ++ ++ if (foundMembarRelease == false) { ++ throw new RuntimeException("No founed valid release instruction (0x" + MEMBARType.Release + ")!\n"); ++ } ++ if (foundMembarAnyAny == false) { ++ throw new RuntimeException("No founed valid volatile instruction (0x" + MEMBARType.AnyAny + ")!\n"); ++ } ++ } ++ ++ // ----------------------------------- Assembly ----------------------------------- ++ // ++ // Compiled method (c2) 1038 26 TestVolatile$Launcher:: (11 bytes) ++ // # {method} {0x000000ffcc603248} '' '()V' in 'TestVolatile$Launcher' ++ // 0x000000ffed0bfd20: 1200 7238 | 0d24 8003 | cd32 8029 ++ // ++ // 0x000000ffed0bfd2c: ;*putfield flags {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestVolatile$Launcher::@7 (line 
309) ++ // 0x000000ffed0bfd2c: 1900 7238 | 7640 c028 | 6160 c028 | 6380 c002 ++ // ++ // 0x000000ffed0bfd3c: ; {poll_return} ++ // ++ // The output with hsdis library is: ++ // ++ // 0x000000ffed0bfca0: dbar 0x12 ++ // 0x000000ffed0bfca4: ori $t1,$zero,0x9 ++ // 0x000000ffed0bfca8: st.w $t1,$fp,12(0xc) ++ // 0x000000ffed0bfcac: dbar 0x19 ;*putfield flags {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestVolatile$Launcher::@7 (line 333) ++ // ++ private static void checkC2VolatileWrite(OutputAnalyzer output) { ++ Iterator iter = output.asLines().listIterator(); ++ ++ String match = skipTo(iter, "'' '()V' in 'TestVolatile$Launcher'"); ++ if (match == null) { ++ throw new RuntimeException("Missing compiler c2 output"); ++ } ++ ++ ArrayList instrs = new ArrayList(); ++ String line = null; ++ while (iter.hasNext()) { ++ line = iter.next(); ++ if (line.contains("{poll_return}")) { ++ break; ++ } ++ if (line.contains("0x")/* && !line.contains(";")*/) { ++ addInstrs(line, instrs); ++ } ++ } ++ ++ ListIterator instrReverseIter = instrs.listIterator(instrs.size()); ++ boolean foundMembarRelease = false; ++ boolean foundMembarVolatile = false; ++ ++ while (instrReverseIter.hasPrevious()) { ++ String inst = instrReverseIter.previous(); ++ if (inst.endsWith(MEMBARType.Release + MEMBARType.DBARINSCODE) || inst.contains(MEMBARType.DBARSTR + MEMBARType.Release)) { ++ foundMembarRelease = true; ++ } else if (inst.endsWith(MEMBARType.Volatile + MEMBARType.DBARINSCODE) || inst.contains(MEMBARType.DBARSTR + MEMBARType.Volatile)) { ++ foundMembarVolatile = true; ++ } ++ } ++ ++ if (foundMembarRelease == false) { ++ throw new RuntimeException("No founed valid release instruction (0x" + MEMBARType.Release + ")!\n"); ++ } ++ if (foundMembarVolatile == false) { ++ throw new RuntimeException("No founed valid volatile instruction (0x" + MEMBARType.Volatile + ")!\n"); ++ } ++ } ++ ++ //----------------------------------- Assembly ----------------------------------- ++ // ++ //Compiled method (c2) 846 24 TestVolatile$Launcher::main (13 bytes) ++ //[Verified Entry Point] ++ // # {method} {0x000000ffcc603308} 'main' '([Ljava/lang/String;)V' in 'TestVolatile$Launcher' ++ // 0x000000fff0ff6394: ;*getfield flags {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestVolatile$Launcher::main@6 (line 127) ++ // 0x000000fff0ff6394: 1400 7238 ++ // ++ // 0x000000fff0ff6398: ;*synchronization entry ++ // ; - TestVolatile$Launcher::main@-1 (line 123) ++ // 0x000000fff0ff6398: 8ed1 8129 | 7640 c028 | 6160 c028 | 6380 c002 ++ // ++ // 0x000000fff0ff63a8: ; {poll_return} ++ // ++ // The output with hsdis library is: ++ // 0x000000ffed0be514: dbar 0x14 ;*getfield flags {reexecute=0 rethrow=0 return_oop=0} ++ // ; - TestVolatile$Launcher::main@6 (line 340) ++ // ++ private static void checkC2VolatileRead(OutputAnalyzer output) { ++ Iterator iter = output.asLines().listIterator(); ++ ++ String match = skipTo(iter, "'main' '([Ljava/lang/String;)V' in 'TestVolatile$Launcher'"); ++ if (match == null) { ++ throw new RuntimeException("Missing compiler c2 output"); ++ } ++ ++ ArrayList instrs = new ArrayList(); ++ String line = null; ++ while (iter.hasNext()) { ++ line = iter.next(); ++ if (line.contains("[Stub Code]")) { ++ break; ++ } ++ if (line.contains("0x")/* && !line.contains(";")*/) { ++ addInstrs(line, instrs); ++ } ++ } ++ ++ ListIterator instrReverseIter = instrs.listIterator(instrs.size()); ++ boolean foundMembarInst = false; ++ ++ while (instrReverseIter.hasPrevious()) { ++ String inst = instrReverseIter.previous(); ++ if 
(inst.endsWith(MEMBARType.Acquire + MEMBARType.DBARINSCODE) || inst.contains(MEMBARType.DBARSTR + MEMBARType.Acquire)) { ++ foundMembarInst = true; ++ break; ++ } ++ } ++ ++ if (foundMembarInst == false) { ++ throw new RuntimeException("No founed valid acquire instruction (0x" + MEMBARType.Acquire + ")!\n"); ++ } ++ } ++ ++ private static String skipTo(Iterator iter, String substring) { ++ while (iter.hasNext()) { ++ String nextLine = iter.next(); ++ if (nextLine.contains(substring)) { ++ return nextLine; ++ } ++ } ++ return null; ++ } ++ ++ static class Launcher { ++ ++ public volatile int flags = 0x9; ++ ++ static Launcher l; ++ static int val; ++ ++ public static void main(final String[] args) throws Exception { ++ test(); ++ val = l.flags; ++ } ++ ++ static void test() { ++ l = new Launcher(); ++ } ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/loongson/7432/Test7423.java b/test/hotspot/jtreg/loongson/7432/Test7423.java +--- a/test/hotspot/jtreg/loongson/7432/Test7423.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/hotspot/jtreg/loongson/7432/Test7423.java 2024-02-20 10:42:38.138861889 +0800 +@@ -0,0 +1,61 @@ ++/* ++ * Copyright (c) 2015, 2018, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++/** ++ * @test ++ * @summary Divide by zero ++ * ++ * @run main/othervm -Xint Test7423 ++ * @run main/othervm -Xcomp Test7423 ++ */ ++public class Test7423 { ++ ++ private static int divInt(int n) { ++ int a = 1 / n; ++ return a; ++ } ++ ++ private static long divLong(long n) { ++ long a = (long)1 / n; ++ return a; ++ } ++ ++ public static void main(String[] args) throws Exception { ++ ++ try { ++ for (int i = 0; i < 20000; i++) { ++ if (i == 18000) { ++ divInt(0); ++ divLong((long)0); ++ } else { ++ divInt(1); ++ divLong((long)1); ++ } ++ } ++ } catch (java.lang.ArithmeticException exc) { ++ System.out.println("expected-exception " + exc); ++ } ++ } ++ ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/runtime/os/TestTracePageSizes.java b/test/hotspot/jtreg/runtime/os/TestTracePageSizes.java +--- a/test/hotspot/jtreg/runtime/os/TestTracePageSizes.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/runtime/os/TestTracePageSizes.java 2024-02-20 10:42:38.242195141 +0800 +@@ -22,6 +22,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test id=no-options + * @summary Run test with no arguments apart from the ones required by + * the test. +@@ -38,7 +44,7 @@ + * @library /test/lib + * @build jdk.test.lib.Platform + * @requires os.family == "linux" +- * @requires os.arch=="amd64" | os.arch=="x86_64" ++ * @requires os.arch=="amd64" | os.arch=="x86_64" | os.arch=="loongarch64" + * @requires vm.gc != "Z" + * @run main/othervm -XX:+AlwaysPreTouch -Xmx128m -Xlog:pagesize:ps-%p.log -XX:+UseLargePages -XX:LargePageSizeInBytes=2m TestTracePageSizes + * @run main/othervm -XX:+AlwaysPreTouch -Xmx2g -Xlog:pagesize:ps-%p.log -XX:+UseLargePages -XX:LargePageSizeInBytes=1g TestTracePageSizes +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/runtime/ReservedStack/ReservedStackTest.java b/test/hotspot/jtreg/runtime/ReservedStack/ReservedStackTest.java +--- a/test/hotspot/jtreg/runtime/ReservedStack/ReservedStackTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/runtime/ReservedStack/ReservedStackTest.java 2024-02-20 10:42:38.162195204 +0800 +@@ -22,6 +22,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2021, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ ++/* + * @test ReservedStackTest + * + * @requires vm.opt.DeoptimizeALot != true +@@ -240,7 +246,7 @@ + return Platform.isAix() || + (Platform.isLinux() && + (Platform.isPPC() || Platform.isS390x() || Platform.isX64() || +- Platform.isX86() || Platform.isAArch64() || Platform.isRISCV64())) || ++ Platform.isX86() || Platform.isAArch64() || Platform.isRISCV64() || Platform.isLoongArch64())) || + Platform.isOSX(); + } + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/serviceability/AsyncGetCallTrace/MyPackage/ASGCTBaseTest.java b/test/hotspot/jtreg/serviceability/AsyncGetCallTrace/MyPackage/ASGCTBaseTest.java +--- a/test/hotspot/jtreg/serviceability/AsyncGetCallTrace/MyPackage/ASGCTBaseTest.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/serviceability/AsyncGetCallTrace/MyPackage/ASGCTBaseTest.java 2024-02-20 10:42:38.248861802 +0800 +@@ -22,6 +22,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package MyPackage; + + /** +@@ -29,7 +35,7 @@ + * @summary Verifies that AsyncGetCallTrace is call-able and provides sane information. + * @compile ASGCTBaseTest.java + * @requires os.family == "linux" +- * @requires os.arch=="x86" | os.arch=="i386" | os.arch=="amd64" | os.arch=="x86_64" | os.arch=="arm" | os.arch=="aarch64" | os.arch=="ppc64" | os.arch=="s390" | os.arch=="riscv64" ++ * @requires os.arch=="x86" | os.arch=="i386" | os.arch=="amd64" | os.arch=="x86_64" | os.arch=="arm" | os.arch=="aarch64" | os.arch=="ppc64" | os.arch=="s390" | os.arch=="riscv64" | os.arch=="loongarch64" + * @requires vm.jvmti + * @run main/othervm/native -agentlib:AsyncGetCallTraceTest MyPackage.ASGCTBaseTest + */ +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackLineNumbers.java b/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackLineNumbers.java +--- a/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackLineNumbers.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/serviceability/sa/TestJhsdbJstackLineNumbers.java 2024-02-20 10:42:38.612194850 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + import java.io.OutputStream; + import java.util.regex.Matcher; + import java.util.regex.Pattern; +@@ -36,7 +42,7 @@ + /** + * @test + * @requires vm.hasSA +- * @requires os.arch=="amd64" | os.arch=="x86_64" ++ * @requires os.arch=="amd64" | os.arch=="x86_64" | os.arch=="loongarch64" + * @requires os.family=="windows" | os.family == "linux" | os.family == "mac" + * @requires vm.flagless + * @library /test/lib +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/testlibrary_tests/ir_framework/tests/TestIRMatching.java b/test/hotspot/jtreg/testlibrary_tests/ir_framework/tests/TestIRMatching.java +--- a/test/hotspot/jtreg/testlibrary_tests/ir_framework/tests/TestIRMatching.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/testlibrary_tests/ir_framework/tests/TestIRMatching.java 2024-02-20 10:42:38.638861497 +0800 +@@ -21,6 +21,12 @@ + * questions. 
+ */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package ir_framework.tests; + + import compiler.lib.ir_framework.*; +@@ -215,7 +221,7 @@ + runCheck(BadFailOnConstraint.create(Membar.class, "membar()", 1, "MemBar")); + + String cmp; +- if (Platform.isPPC() || Platform.isX86()) { ++ if (Platform.isPPC() || Platform.isX86() || Platform.isLoongArch64()) { + cmp = "CMP"; + } else if (Platform.isS390x()){ + cmp = "CLFI"; +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/hotspot/jtreg/vmTestbase/nsk/share/jdi/ArgumentHandler.java b/test/hotspot/jtreg/vmTestbase/nsk/share/jdi/ArgumentHandler.java +--- a/test/hotspot/jtreg/vmTestbase/nsk/share/jdi/ArgumentHandler.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/hotspot/jtreg/vmTestbase/nsk/share/jdi/ArgumentHandler.java 2024-02-20 10:42:39.085527812 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package nsk.share.jdi; + + import nsk.share.*; +@@ -520,21 +526,22 @@ + * available only on the Microsoft Windows platform. + * " + */ +- {"linux-i586", "com.sun.jdi.SharedMemoryAttach"}, +- {"linux-ia64", "com.sun.jdi.SharedMemoryAttach"}, +- {"linux-amd64", "com.sun.jdi.SharedMemoryAttach"}, +- {"linux-x64", "com.sun.jdi.SharedMemoryAttach"}, +- {"linux-aarch64", "com.sun.jdi.SharedMemoryAttach"}, +- {"linux-arm", "com.sun.jdi.SharedMemoryAttach"}, +- {"linux-ppc64", "com.sun.jdi.SharedMemoryAttach"}, +- {"linux-ppc64le", "com.sun.jdi.SharedMemoryAttach"}, +- {"linux-s390x", "com.sun.jdi.SharedMemoryAttach"}, +- {"linux-riscv64", "com.sun.jdi.SharedMemoryAttach"}, +- {"macosx-amd64", "com.sun.jdi.SharedMemoryAttach"}, +- {"mac-x64", "com.sun.jdi.SharedMemoryAttach"}, +- {"macosx-aarch64", "com.sun.jdi.SharedMemoryAttach"}, +- {"mac-aarch64", "com.sun.jdi.SharedMemoryAttach"}, +- {"aix-ppc64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-i586", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-ia64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-amd64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-x64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-aarch64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-arm", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-ppc64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-ppc64le", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-s390x", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-riscv64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"linux-loongarch64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"macosx-amd64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"mac-x64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"macosx-aarch64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"mac-aarch64", "com.sun.jdi.SharedMemoryAttach"}, ++ {"aix-ppc64", "com.sun.jdi.SharedMemoryAttach"}, + + // listening connectors + /* +@@ -546,21 +553,22 @@ + * It is available only on the Microsoft Windows platform. 
+ * " + */ +- {"linux-i586", "com.sun.jdi.SharedMemoryListen"}, +- {"linux-ia64", "com.sun.jdi.SharedMemoryListen"}, +- {"linux-amd64", "com.sun.jdi.SharedMemoryListen"}, +- {"linux-x64", "com.sun.jdi.SharedMemoryListen"}, +- {"linux-aarch64", "com.sun.jdi.SharedMemoryListen"}, +- {"linux-arm", "com.sun.jdi.SharedMemoryListen"}, +- {"linux-ppc64", "com.sun.jdi.SharedMemoryListen"}, +- {"linux-ppc64le", "com.sun.jdi.SharedMemoryListen"}, +- {"linux-s390x", "com.sun.jdi.SharedMemoryListen"}, +- {"linux-riscv64", "com.sun.jdi.SharedMemoryListen"}, +- {"macosx-amd64", "com.sun.jdi.SharedMemoryListen"}, +- {"mac-x64", "com.sun.jdi.SharedMemoryListen"}, +- {"macosx-aarch64", "com.sun.jdi.SharedMemoryListen"}, +- {"mac-aarch64", "com.sun.jdi.SharedMemoryListen"}, +- {"aix-ppc64", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-i586", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-ia64", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-amd64", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-x64", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-aarch64", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-arm", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-ppc64", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-ppc64le", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-s390x", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-riscv64", "com.sun.jdi.SharedMemoryListen"}, ++ {"linux-loongarch64", "com.sun.jdi.SharedMemoryListen"}, ++ {"macosx-amd64", "com.sun.jdi.SharedMemoryListen"}, ++ {"mac-x64", "com.sun.jdi.SharedMemoryListen"}, ++ {"macosx-aarch64", "com.sun.jdi.SharedMemoryListen"}, ++ {"mac-aarch64", "com.sun.jdi.SharedMemoryListen"}, ++ {"aix-ppc64", "com.sun.jdi.SharedMemoryListen"}, + + // launching connectors + /* +@@ -575,78 +583,82 @@ + * Windows, the shared memory transport is used. On Linux the socket transport is used. 
+ * " + */ +- {"linux-i586", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"linux-i586", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"linux-i586", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-i586", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ ++ {"linux-ia64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-ia64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"linux-ia64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"linux-ia64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"linux-amd64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-amd64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"linux-amd64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"linux-amd64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"linux-x64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-x64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"linux-x64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"linux-x64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"linux-aarch64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-aarch64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"linux-aarch64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"linux-aarch64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"linux-arm", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-arm", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"linux-arm", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"linux-arm", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"linux-ppc64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-ppc64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"linux-ppc64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"linux-ppc64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"linux-ppc64le", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-ppc64le", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"linux-ppc64le", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"linux-ppc64le", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"linux-s390x", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-s390x", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"linux-s390x", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"linux-s390x", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"linux-riscv64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-riscv64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"linux-riscv64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"linux-riscv64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"linux-loongarch64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"linux-loongarch64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"windows-i586", "com.sun.jdi.CommandLineLaunch", "dt_socket"}, +- {"windows-i586", "com.sun.jdi.RawCommandLineLaunch", "dt_socket"}, ++ {"windows-i586", "com.sun.jdi.CommandLineLaunch", "dt_socket"}, ++ {"windows-i586", "com.sun.jdi.RawCommandLineLaunch", "dt_socket"}, + +- {"windows-ia64", "com.sun.jdi.CommandLineLaunch", "dt_socket"}, +- {"windows-ia64", "com.sun.jdi.RawCommandLineLaunch", "dt_socket"}, ++ {"windows-ia64", "com.sun.jdi.CommandLineLaunch", "dt_socket"}, ++ {"windows-ia64", "com.sun.jdi.RawCommandLineLaunch", "dt_socket"}, + +- {"windows-amd64", "com.sun.jdi.CommandLineLaunch", "dt_socket"}, +- {"windows-amd64", "com.sun.jdi.RawCommandLineLaunch", "dt_socket"}, ++ {"windows-amd64", 
"com.sun.jdi.CommandLineLaunch", "dt_socket"}, ++ {"windows-amd64", "com.sun.jdi.RawCommandLineLaunch", "dt_socket"}, + +- {"windows-x64", "com.sun.jdi.CommandLineLaunch", "dt_socket"}, +- {"windows-x64", "com.sun.jdi.RawCommandLineLaunch", "dt_socket"}, ++ {"windows-x64", "com.sun.jdi.CommandLineLaunch", "dt_socket"}, ++ {"windows-x64", "com.sun.jdi.RawCommandLineLaunch", "dt_socket"}, + +- {"macosx-amd64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"macosx-amd64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"macosx-amd64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"macosx-amd64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"mac-x64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"mac-x64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"mac-x64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"mac-x64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"macosx-aarch64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"macosx-aarch64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"macosx-aarch64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"macosx-aarch64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"mac-aarch64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"mac-aarch64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"mac-aarch64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"mac-aarch64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + +- {"aix-ppc64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, +- {"aix-ppc64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, ++ {"aix-ppc64", "com.sun.jdi.CommandLineLaunch", "dt_shmem"}, ++ {"aix-ppc64", "com.sun.jdi.RawCommandLineLaunch", "dt_shmem"}, + + // shared memory transport is implemented only on windows platform +- {"linux-i586", "dt_shmem"}, +- {"linux-ia64", "dt_shmem"}, +- {"linux-amd64", "dt_shmem"}, +- {"linux-x64", "dt_shmem"}, +- {"linux-aarch64", "dt_shmem"}, +- {"linux-arm", "dt_shmem"}, +- {"linux-ppc64", "dt_shmem"}, +- {"linux-ppc64le", "dt_shmem"}, +- {"linux-s390x", "dt_shmem"}, +- {"linux-riscv64", "dt_shmem"}, +- {"macosx-amd64", "dt_shmem"}, +- {"mac-x64", "dt_shmem"}, +- {"macosx-aarch64", "dt_shmem"}, +- {"mac-aarch64", "dt_shmem"}, +- {"aix-ppc64", "dt_shmem"}, ++ {"linux-i586", "dt_shmem"}, ++ {"linux-ia64", "dt_shmem"}, ++ {"linux-amd64", "dt_shmem"}, ++ {"linux-x64", "dt_shmem"}, ++ {"linux-aarch64", "dt_shmem"}, ++ {"linux-arm", "dt_shmem"}, ++ {"linux-ppc64", "dt_shmem"}, ++ {"linux-ppc64le", "dt_shmem"}, ++ {"linux-s390x", "dt_shmem"}, ++ {"linux-riscv64", "dt_shmem"}, ++ {"linux-loongarch64", "dt_shmem"}, ++ {"macosx-amd64", "dt_shmem"}, ++ {"mac-x64", "dt_shmem"}, ++ {"macosx-aarch64", "dt_shmem"}, ++ {"mac-aarch64", "dt_shmem"}, ++ {"aix-ppc64", "dt_shmem"}, + }; + } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/jdk/java/foreign/callarranger/platform/PlatformLayouts.java b/test/jdk/java/foreign/callarranger/platform/PlatformLayouts.java +--- a/test/jdk/java/foreign/callarranger/platform/PlatformLayouts.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/jdk/java/foreign/callarranger/platform/PlatformLayouts.java 2024-02-20 10:42:39.452194188 +0800 +@@ -23,6 +23,13 @@ + * questions. + * + */ ++ ++/* ++ * This file has been modified by Loongson Technology in 2023, These ++ * modifications are Copyright (c) 2022, 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + package platform; + + import jdk.internal.foreign.abi.SharedUtils; +@@ -261,6 +268,58 @@ + + /** + * The {@code bool} native type. ++ */ ++ public static final ValueLayout.OfBoolean C_BOOL = ValueLayout.JAVA_BOOLEAN; ++ ++ /** ++ * The {@code char} native type. ++ */ ++ public static final ValueLayout.OfByte C_CHAR = ValueLayout.JAVA_BYTE; ++ ++ /** ++ * The {@code short} native type. ++ */ ++ public static final ValueLayout.OfShort C_SHORT = ValueLayout.JAVA_SHORT; ++ ++ /** ++ * The {@code int} native type. ++ */ ++ public static final ValueLayout.OfInt C_INT = ValueLayout.JAVA_INT; ++ ++ /** ++ * The {@code long} native type. ++ */ ++ public static final ValueLayout.OfLong C_LONG = ValueLayout.JAVA_LONG; ++ ++ /** ++ * The {@code long long} native type. ++ */ ++ public static final ValueLayout.OfLong C_LONG_LONG = ValueLayout.JAVA_LONG; ++ ++ /** ++ * The {@code float} native type. ++ */ ++ public static final ValueLayout.OfFloat C_FLOAT = ValueLayout.JAVA_FLOAT; ++ ++ /** ++ * The {@code double} native type. ++ */ ++ public static final ValueLayout.OfDouble C_DOUBLE = ValueLayout.JAVA_DOUBLE; ++ ++ /** ++ * The {@code T*} native type. ++ */ ++ public static final AddressLayout C_POINTER = SharedUtils.C_POINTER; ++ ++ } ++ ++ public static final class LoongArch64 { ++ ++ // Suppresses default constructor, ensuring non-instantiability. ++ private LoongArch64() {} ++ ++ /** ++ * The {@code bool} native type. + */ + public static final ValueLayout.OfBoolean C_BOOL = ValueLayout.JAVA_BOOLEAN; + +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/jdk/java/foreign/callarranger/TestLoongArch64CallArranger.java b/test/jdk/java/foreign/callarranger/TestLoongArch64CallArranger.java +--- a/test/jdk/java/foreign/callarranger/TestLoongArch64CallArranger.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/jdk/java/foreign/callarranger/TestLoongArch64CallArranger.java 2024-02-20 10:42:39.452194188 +0800 +@@ -0,0 +1,521 @@ ++/* ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. 
++ * ++ */ ++ ++/* ++ * @test ++ * @enablePreview ++ * @requires sun.arch.data.model == "64" ++ * @compile platform/PlatformLayouts.java ++ * @modules java.base/jdk.internal.foreign ++ * java.base/jdk.internal.foreign.abi ++ * java.base/jdk.internal.foreign.abi.loongarch64 ++ * java.base/jdk.internal.foreign.abi.loongarch64.linux ++ * @build CallArrangerTestBase ++ * @run testng TestLoongArch64CallArranger ++ */ ++ ++import java.lang.foreign.FunctionDescriptor; ++import java.lang.foreign.MemoryLayout; ++import java.lang.foreign.MemorySegment; ++import jdk.internal.foreign.abi.Binding; ++import jdk.internal.foreign.abi.CallingSequence; ++import jdk.internal.foreign.abi.LinkerOptions; ++import jdk.internal.foreign.abi.loongarch64.linux.LinuxLoongArch64CallArranger; ++import jdk.internal.foreign.abi.StubLocations; ++import jdk.internal.foreign.abi.VMStorage; ++import org.testng.annotations.DataProvider; ++import org.testng.annotations.Test; ++ ++import java.lang.foreign.ValueLayout; ++import java.lang.invoke.MethodType; ++ ++import static java.lang.foreign.Linker.Option.firstVariadicArg; ++import static java.lang.foreign.ValueLayout.ADDRESS; ++import static jdk.internal.foreign.abi.Binding.*; ++import static jdk.internal.foreign.abi.loongarch64.LoongArch64Architecture.*; ++import static jdk.internal.foreign.abi.loongarch64.LoongArch64Architecture.Regs.*; ++import static platform.PlatformLayouts.LoongArch64.*; ++ ++import static org.testng.Assert.assertEquals; ++import static org.testng.Assert.assertFalse; ++import static org.testng.Assert.assertTrue; ++ ++public class TestLoongArch64CallArranger extends CallArrangerTestBase { ++ ++ private static final short STACK_SLOT_SIZE = 8; ++ private static final VMStorage TARGET_ADDRESS_STORAGE = StubLocations.TARGET_ADDRESS.storage(StorageType.PLACEHOLDER); ++ private static final VMStorage RETURN_BUFFER_STORAGE = StubLocations.RETURN_BUFFER.storage(StorageType.PLACEHOLDER); ++ ++ @Test ++ public void testEmpty() { ++ MethodType mt = MethodType.methodType(void.class); ++ FunctionDescriptor fd = FunctionDescriptor.ofVoid(); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fd.insertArgumentLayouts(0, ADDRESS)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{}); ++ } ++ ++ @Test ++ public void testInteger() { ++ MethodType mt = MethodType.methodType(void.class, ++ byte.class, short.class, int.class, int.class, ++ int.class, int.class, long.class, int.class, ++ int.class, byte.class); ++ FunctionDescriptor fd = FunctionDescriptor.ofVoid( ++ C_CHAR, C_SHORT, C_INT, C_INT, ++ C_INT, C_INT, C_LONG, C_INT, ++ C_INT, C_CHAR); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fd.insertArgumentLayouts(0, ADDRESS)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { 
unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ { cast(byte.class, int.class), vmStore(a0, int.class) }, ++ { cast(short.class, int.class), vmStore(a1, int.class) }, ++ { vmStore(a2, int.class) }, ++ { vmStore(a3, int.class) }, ++ { vmStore(a4, int.class) }, ++ { vmStore(a5, int.class) }, ++ { vmStore(a6, long.class) }, ++ { vmStore(a7, int.class) }, ++ { vmStore(stackStorage(STACK_SLOT_SIZE, 0), int.class) }, ++ { cast(byte.class, int.class), vmStore(stackStorage(STACK_SLOT_SIZE, 8), int.class) } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{}); ++ } ++ ++ @Test ++ public void testTwoIntTwoFloat() { ++ MethodType mt = MethodType.methodType(void.class, int.class, int.class, float.class, float.class); ++ FunctionDescriptor fd = FunctionDescriptor.ofVoid(C_INT, C_INT, C_FLOAT, C_FLOAT); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fd.insertArgumentLayouts(0, ADDRESS)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ { vmStore(a0, int.class) }, ++ { vmStore(a1, int.class) }, ++ { vmStore(f0, float.class) }, ++ { vmStore(f1, float.class) } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{}); ++ } ++ ++ @Test(dataProvider = "structs") ++ public void testStruct(MemoryLayout struct, Binding[] expectedBindings) { ++ MethodType mt = MethodType.methodType(void.class, MemorySegment.class); ++ FunctionDescriptor fd = FunctionDescriptor.ofVoid(struct); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fd.insertArgumentLayouts(0, ADDRESS)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ expectedBindings ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{}); ++ } ++ ++ @DataProvider ++ public static Object[][] structs() { ++ MemoryLayout struct1 = MemoryLayout.structLayout(C_INT, C_INT, C_DOUBLE, C_INT); ++ return new Object[][]{ ++ // struct s { void* a; double c; }; ++ { ++ MemoryLayout.structLayout(C_POINTER, C_DOUBLE), ++ new Binding[]{ ++ dup(), ++ bufferLoad(0, long.class), vmStore(a0, long.class), ++ bufferLoad(8, long.class), vmStore(a1, long.class) ++ } ++ }, ++ // struct s { int32_t a, b; double c; }; ++ { MemoryLayout.structLayout(C_INT, C_INT, C_DOUBLE), ++ new Binding[]{ ++ dup(), ++ // s.a & s.b ++ bufferLoad(0, long.class), vmStore(a0, long.class), ++ // s.c ++ bufferLoad(8, long.class), vmStore(a1, long.class) ++ } ++ }, ++ // struct s { int32_t a, b; double c; int32_t d; }; ++ { struct1, ++ new Binding[]{ ++ copy(struct1), ++ unboxAddress(), ++ vmStore(a0, long.class) ++ } ++ }, ++ // struct s { int32_t a[1]; float b[1]; }; ++ { MemoryLayout.structLayout(MemoryLayout.sequenceLayout(1, C_INT), ++ MemoryLayout.sequenceLayout(1, C_FLOAT)), ++ new Binding[]{ ++ dup(), ++ // s.a[0] ++ bufferLoad(0, int.class), vmStore(a0, 
int.class), ++ // s.b[0] ++ bufferLoad(4, float.class), vmStore(f0, float.class) ++ } ++ }, ++ // struct s { float a; /* padding */ double b }; ++ { MemoryLayout.structLayout(C_FLOAT, MemoryLayout.paddingLayout(4), C_DOUBLE), ++ new Binding[]{ ++ dup(), ++ // s.a ++ bufferLoad(0, float.class), vmStore(f0, float.class), ++ // s.b ++ bufferLoad(8, double.class), vmStore(f1, double.class), ++ } ++ } ++ }; ++ } ++ ++ @Test ++ public void testStructFA1() { ++ MemoryLayout fa = MemoryLayout.structLayout(C_FLOAT, C_FLOAT); ++ ++ MethodType mt = MethodType.methodType(MemorySegment.class, float.class, int.class, MemorySegment.class); ++ FunctionDescriptor fd = FunctionDescriptor.of(fa, C_FLOAT, C_INT, fa); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fd.insertArgumentLayouts(0, ADDRESS, ADDRESS)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(RETURN_BUFFER_STORAGE, long.class) }, ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ { vmStore(f0, float.class) }, ++ { vmStore(a0, int.class) }, ++ { ++ dup(), ++ bufferLoad(0, float.class), ++ vmStore(f1, float.class), ++ bufferLoad(4, float.class), ++ vmStore(f2, float.class) ++ } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{ ++ allocate(fa), ++ dup(), ++ vmLoad(f0, float.class), ++ bufferStore(0, float.class), ++ dup(), ++ vmLoad(f1, float.class), ++ bufferStore(4, float.class) ++ }); ++ } ++ ++ @Test ++ public void testStructFA2() { ++ MemoryLayout fa = MemoryLayout.structLayout(C_FLOAT, MemoryLayout.paddingLayout(4), C_DOUBLE); ++ ++ MethodType mt = MethodType.methodType(MemorySegment.class, float.class, int.class, MemorySegment.class); ++ FunctionDescriptor fd = FunctionDescriptor.of(fa, C_FLOAT, C_INT, fa); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fd.insertArgumentLayouts(0, ADDRESS, ADDRESS)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(RETURN_BUFFER_STORAGE, long.class) }, ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ { vmStore(f0, float.class) }, ++ { vmStore(a0, int.class) }, ++ { ++ dup(), ++ bufferLoad(0, float.class), ++ vmStore(f1, float.class), ++ bufferLoad(8, double.class), ++ vmStore(f2, double.class) ++ } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{ ++ allocate(fa), ++ dup(), ++ vmLoad(f0, float.class), ++ bufferStore(0, float.class), ++ dup(), ++ vmLoad(f1, double.class), ++ bufferStore(8, double.class) ++ }); ++ } ++ ++ @Test ++ void spillFloatingPointStruct() { ++ MemoryLayout struct = MemoryLayout.structLayout(C_FLOAT, C_FLOAT); ++ // void f(float, float, float, float, float, float, float, struct) ++ MethodType mt = MethodType.methodType(void.class, float.class, float.class, ++ float.class, float.class, float.class, ++ float.class, float.class, MemorySegment.class); ++ 
FunctionDescriptor fd = FunctionDescriptor.ofVoid(C_FLOAT, C_FLOAT, C_FLOAT, C_FLOAT, ++ C_FLOAT, C_FLOAT, C_FLOAT, struct); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fd.insertArgumentLayouts(0, ADDRESS)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ { vmStore(f0, float.class) }, ++ { vmStore(f1, float.class) }, ++ { vmStore(f2, float.class) }, ++ { vmStore(f3, float.class) }, ++ { vmStore(f4, float.class) }, ++ { vmStore(f5, float.class) }, ++ { vmStore(f6, float.class) }, ++ { ++ bufferLoad(0, long.class), ++ vmStore(a0, long.class) ++ } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{}); ++ } ++ ++ @Test ++ public void testStructBoth() { ++ MemoryLayout struct = MemoryLayout.structLayout(C_INT, C_FLOAT); ++ ++ MethodType mt = MethodType.methodType(void.class, MemorySegment.class, MemorySegment.class, MemorySegment.class); ++ FunctionDescriptor fd = FunctionDescriptor.ofVoid(struct, struct, struct); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fd.insertArgumentLayouts(0, ADDRESS)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ { ++ dup(), ++ bufferLoad(0, int.class), ++ vmStore(a0, int.class), ++ bufferLoad(4, float.class), ++ vmStore(f0, float.class) ++ }, ++ { ++ dup(), ++ bufferLoad(0, int.class), ++ vmStore(a1, int.class), ++ bufferLoad(4, float.class), ++ vmStore(f1, float.class) ++ }, ++ { ++ dup(), ++ bufferLoad(0, int.class), ++ vmStore(a2, int.class), ++ bufferLoad(4, float.class), ++ vmStore(f2, float.class) ++ } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{}); ++ } ++ ++ @Test ++ public void testStructStackSpill() { ++ // A large (> 16 byte) struct argument that is spilled to the ++ // stack should be passed as a pointer to a copy and occupy one ++ // stack slot. 
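Illustrative note, not part of the quoted patch: the layout built just below (C_INT, C_INT, C_DOUBLE, C_INT) is the "large (> 16 byte)" struct that the preceding comment describes. Its fields alone take 4 + 4 + 8 + 4 = 20 bytes, more than the 16 bytes (two 8-byte argument registers) that the LoongArch64 calling convention passes by value, so the arranger copies the struct and passes its address. A minimal standalone sketch of that size check, assuming a JDK where java.lang.foreign is usable and using a hypothetical class name StructSizeSketch:

    import java.lang.foreign.MemoryLayout;
    import static java.lang.foreign.ValueLayout.JAVA_DOUBLE;
    import static java.lang.foreign.ValueLayout.JAVA_INT;

    public class StructSizeSketch {
        public static void main(String[] args) {
            // Same field shape as the test layout below (C_INT, C_INT, C_DOUBLE, C_INT).
            MemoryLayout big = MemoryLayout.structLayout(
                    JAVA_INT, JAVA_INT, JAVA_DOUBLE, JAVA_INT);
            // Any argument larger than 16 bytes is spilled: the caller makes a copy
            // and passes a pointer to it in one register or one stack slot.
            System.out.println("byteSize = " + big.byteSize());
        }
    }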
++ ++ MemoryLayout struct = MemoryLayout.structLayout(C_INT, C_INT, C_DOUBLE, C_INT); ++ ++ MethodType mt = MethodType.methodType( ++ void.class, MemorySegment.class, MemorySegment.class, int.class, int.class, ++ int.class, int.class, int.class, int.class, MemorySegment.class, int.class); ++ FunctionDescriptor fd = FunctionDescriptor.ofVoid( ++ struct, struct, C_INT, C_INT, C_INT, C_INT, C_INT, C_INT, struct, C_INT); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fd.insertArgumentLayouts(0, ADDRESS)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ { copy(struct), unboxAddress(), vmStore(a0, long.class) }, ++ { copy(struct), unboxAddress(), vmStore(a1, long.class) }, ++ { vmStore(a2, int.class) }, ++ { vmStore(a3, int.class) }, ++ { vmStore(a4, int.class) }, ++ { vmStore(a5, int.class) }, ++ { vmStore(a6, int.class) }, ++ { vmStore(a7, int.class) }, ++ { copy(struct), unboxAddress(), vmStore(stackStorage(STACK_SLOT_SIZE, 0), long.class) }, ++ { vmStore(stackStorage(STACK_SLOT_SIZE, 8), int.class) } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{}); ++ } ++ ++ @Test ++ public void testVarArgsInRegs() { ++ MethodType mt = MethodType.methodType(void.class, int.class, int.class, float.class); ++ FunctionDescriptor fd = FunctionDescriptor.ofVoid(C_INT, C_INT, C_FLOAT); ++ FunctionDescriptor fdExpected = FunctionDescriptor.ofVoid(ADDRESS, C_INT, C_INT, C_FLOAT); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false, LinkerOptions.forDowncall(fd, firstVariadicArg(1))); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fdExpected); ++ ++ // This is identical to the non-variadic calling sequence ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ { vmStore(a0, int.class) }, ++ { vmStore(a1, int.class) }, ++ { vmStore(a2, float.class) } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{}); ++ } ++ ++ @Test ++ public void testVarArgsLong() { ++ MethodType mt = MethodType.methodType(void.class, int.class, int.class, int.class, double.class, ++ double.class, long.class, long.class, int.class, ++ double.class, double.class, long.class); ++ FunctionDescriptor fd = FunctionDescriptor.ofVoid(C_INT, C_INT, C_INT, C_DOUBLE, C_DOUBLE, ++ C_LONG, C_LONG, C_INT, C_DOUBLE, ++ C_DOUBLE, C_LONG); ++ FunctionDescriptor fdExpected = FunctionDescriptor.ofVoid(ADDRESS, C_INT, C_INT, C_INT, C_DOUBLE, ++ C_DOUBLE, C_LONG, C_LONG, C_INT, ++ C_DOUBLE, C_DOUBLE, C_LONG); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false, LinkerOptions.forDowncall(fd, firstVariadicArg(1))); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class)); ++ 
assertEquals(callingSequence.functionDesc(), fdExpected); ++ ++ // This is identical to the non-variadic calling sequence ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ { vmStore(a0, int.class) }, ++ { vmStore(a1, int.class) }, ++ { vmStore(a2, int.class) }, ++ { vmStore(a3, double.class) }, ++ { vmStore(a4, double.class) }, ++ { vmStore(a5, long.class) }, ++ { vmStore(a6, long.class) }, ++ { vmStore(a7, int.class) }, ++ { vmStore(stackStorage(STACK_SLOT_SIZE, 0), double.class) }, ++ { vmStore(stackStorage(STACK_SLOT_SIZE, 8), double.class) }, ++ { vmStore(stackStorage(STACK_SLOT_SIZE, 16), long.class) } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{}); ++ } ++ ++ @Test ++ public void testReturnStruct1() { ++ MemoryLayout struct = MemoryLayout.structLayout(C_LONG, C_LONG, C_FLOAT); ++ ++ MethodType mt = MethodType.methodType(MemorySegment.class, int.class, int.class, float.class); ++ FunctionDescriptor fd = FunctionDescriptor.of(struct, C_INT, C_INT, C_FLOAT); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertTrue(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), ++ MethodType.methodType(void.class, MemorySegment.class, MemorySegment.class, ++ int.class, int.class, float.class)); ++ assertEquals(callingSequence.functionDesc(), ++ FunctionDescriptor.ofVoid(ADDRESS, C_POINTER, C_INT, C_INT, C_FLOAT)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) }, ++ { unboxAddress(), vmStore(a0, long.class) }, ++ { vmStore(a1, int.class) }, ++ { vmStore(a2, int.class) }, ++ { vmStore(f0, float.class) } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{}); ++ } ++ ++ @Test ++ public void testReturnStruct2() { ++ MemoryLayout struct = MemoryLayout.structLayout(C_LONG, C_LONG); ++ ++ MethodType mt = MethodType.methodType(MemorySegment.class); ++ FunctionDescriptor fd = FunctionDescriptor.of(struct); ++ LinuxLoongArch64CallArranger.Bindings bindings = LinuxLoongArch64CallArranger.getBindings(mt, fd, false); ++ ++ assertFalse(bindings.isInMemoryReturn()); ++ CallingSequence callingSequence = bindings.callingSequence(); ++ assertEquals(callingSequence.callerMethodType(), mt.insertParameterTypes(0, MemorySegment.class, MemorySegment.class)); ++ assertEquals(callingSequence.functionDesc(), fd.insertArgumentLayouts(0, ADDRESS, ADDRESS)); ++ ++ checkArgumentBindings(callingSequence, new Binding[][]{ ++ { unboxAddress(), vmStore(RETURN_BUFFER_STORAGE, long.class) }, ++ { unboxAddress(), vmStore(TARGET_ADDRESS_STORAGE, long.class) } ++ }); ++ ++ checkReturnBindings(callingSequence, new Binding[]{ ++ allocate(struct), ++ dup(), ++ vmLoad(a0, long.class), ++ bufferStore(0, long.class), ++ dup(), ++ vmLoad(a1, long.class), ++ bufferStore(8, long.class) ++ }); ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/jdk/java/util/concurrent/ConcurrentHashMap/MapLoops.java b/test/jdk/java/util/concurrent/ConcurrentHashMap/MapLoops.java +--- a/test/jdk/java/util/concurrent/ConcurrentHashMap/MapLoops.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/jdk/java/util/concurrent/ConcurrentHashMap/MapLoops.java 2024-02-20 10:42:39.822193897 +0800 +@@ -32,6 +32,12 @@ + */ + + /* ++ * This file has been modified by Loongson Technology in 
2022, These ++ * modifications are Copyright (c) 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ ++/* + * @test + * @bug 4486658 + * @summary Exercise multithreaded maps, by default ConcurrentHashMap. +@@ -48,7 +54,7 @@ + /* + * @test + * @summary Exercise multithreaded maps, using only heavy monitors. +- * @requires os.arch=="x86" | os.arch=="i386" | os.arch=="amd64" | os.arch=="x86_64" | os.arch=="aarch64" | os.arch == "ppc64" | os.arch == "ppc64le" | os.arch == "riscv64" | os.arch == "s390x" ++ * @requires os.arch=="x86" | os.arch=="i386" | os.arch=="amd64" | os.arch=="x86_64" | os.arch=="aarch64" | os.arch == "ppc64" | os.arch == "ppc64le" | os.arch == "riscv64" | os.arch == "s390x" | os.arch== "loongarch64" + * @requires vm.debug + * @library /test/lib + * @run main/othervm/timeout=1600 -XX:+UnlockExperimentalVMOptions -XX:LockingMode=0 -XX:+VerifyHeavyMonitors MapLoops +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/jdk/jdk/jfr/event/os/TestCPUInformation.java b/test/jdk/jdk/jfr/event/os/TestCPUInformation.java +--- a/test/jdk/jdk/jfr/event/os/TestCPUInformation.java 2024-01-17 09:43:22.000000000 +0800 ++++ b/test/jdk/jdk/jfr/event/os/TestCPUInformation.java 2024-02-20 10:42:40.288860195 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2021, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package jdk.jfr.event.os; + + import java.util.List; +@@ -52,8 +58,8 @@ + Events.assertField(event, "hwThreads").atLeast(1); + Events.assertField(event, "cores").atLeast(1); + Events.assertField(event, "sockets").atLeast(1); +- Events.assertField(event, "cpu").containsAny("Intel", "AMD", "Unknown x86", "ARM", "PPC", "PowerPC", "AArch64", "RISCV64", "s390"); +- Events.assertField(event, "description").containsAny("Intel", "AMD", "Unknown x86", "ARM", "PPC", "PowerPC", "AArch64", "RISCV64", "s390"); ++ Events.assertField(event, "cpu").containsAny("Intel", "AMD", "Unknown x86", "ARM", "PPC", "PowerPC", "AArch64", "RISCV64", "s390", "LoongArch"); ++ Events.assertField(event, "description").containsAny("Intel", "AMD", "Unknown x86", "ARM", "PPC", "PowerPC", "AArch64", "RISCV64", "s390", "LoongArch"); + } + } + } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/jdk/sun/net/InetAddress/nameservice/simple/DefaultCaching.java b/test/jdk/sun/net/InetAddress/nameservice/simple/DefaultCaching.java +--- a/test/jdk/sun/net/InetAddress/nameservice/simple/DefaultCaching.java 2024-01-17 09:43:23.000000000 +0800 ++++ b/test/jdk/sun/net/InetAddress/nameservice/simple/DefaultCaching.java 2024-02-20 10:42:40.358860140 +0800 +@@ -21,12 +21,19 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2023. These ++ * modifications are Copyright (c) 2023, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ * ++ */ ++ + /* @test + * @bug 6442088 + * @summary Change default DNS caching behavior for code not running under + * security manager. 
+ * @run main/othervm/timeout=200 -Djdk.net.hosts.file=DefaultCachingHosts +- * -Dsun.net.inetaddr.ttl=20 DefaultCaching ++ * -Dsun.net.inetaddr.ttl=24 DefaultCaching + */ + import java.net.InetAddress; + import java.net.UnknownHostException; +@@ -63,7 +70,7 @@ + test("foo", "10.5.18.22", true, 5); + + // now delay to see if theclub has expired +- sleep(5); ++ sleep(9); + + test("foo", "10.5.18.22", true, 5); + test("theclub", "129.156.220.1", true, 6); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/jdk/sun/security/pkcs11/PKCS11Test.java b/test/jdk/sun/security/pkcs11/PKCS11Test.java +--- a/test/jdk/sun/security/pkcs11/PKCS11Test.java 2024-01-17 09:43:23.000000000 +0800 ++++ b/test/jdk/sun/security/pkcs11/PKCS11Test.java 2024-02-20 10:42:40.482193375 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2021, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + // common infrastructure for SunPKCS11 tests + + import java.io.ByteArrayOutputStream; +@@ -620,6 +626,8 @@ + "/usr/lib/powerpc64le-linux-gnu/", + "/usr/lib/powerpc64le-linux-gnu/nss/", + "/usr/lib64/"}); ++ osMap.put("Linux-loongarch64-64", new String[]{"/usr/lib/loongarch64-linux-gnu/", ++ "/usr/lib64/"}); + osMap.put("Linux-s390x-64", new String[]{"/usr/lib64/"}); + osMap.put("Windows-x86-32", new String[]{}); + osMap.put("Windows-amd64-64", new String[]{}); +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/lib/jdk/test/lib/Platform.java b/test/lib/jdk/test/lib/Platform.java +--- a/test/lib/jdk/test/lib/Platform.java 2024-01-17 09:43:23.000000000 +0800 ++++ b/test/lib/jdk/test/lib/Platform.java 2024-02-20 10:42:41.242192777 +0800 +@@ -21,6 +21,12 @@ + * questions. + */ + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2019, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. ++ */ ++ + package jdk.test.lib; + + import java.io.BufferedReader; +@@ -233,6 +239,10 @@ + return isArch("(i386)|(x86(?!_64))"); + } + ++ public static boolean isLoongArch64() { ++ return isArch("loongarch64"); ++ } ++ + public static String getOsArch() { + return osArch; + } +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/lib-test/jdk/test/lib/TestMutuallyExclusivePlatformPredicates.java b/test/lib-test/jdk/test/lib/TestMutuallyExclusivePlatformPredicates.java +--- a/test/lib-test/jdk/test/lib/TestMutuallyExclusivePlatformPredicates.java 2024-01-17 09:43:23.000000000 +0800 ++++ b/test/lib-test/jdk/test/lib/TestMutuallyExclusivePlatformPredicates.java 2024-02-20 10:42:41.238859445 +0800 +@@ -33,6 +33,12 @@ + import java.util.List; + import java.util.Set; + ++/* ++ * This file has been modified by Loongson Technology in 2022, These ++ * modifications are Copyright (c) 2021, 2022, Loongson Technology, and are made ++ * available on the same license terms set forth above. 
++ */ ++ + /** + * @test + * @summary Verify that for each group of mutually exclusive predicates defined +@@ -45,7 +51,7 @@ + */ + public class TestMutuallyExclusivePlatformPredicates { + private static enum MethodGroup { +- ARCH("isAArch64", "isARM", "isRISCV64", "isPPC", "isS390x", "isX64", "isX86"), ++ ARCH("isAArch64", "isARM", "isRISCV64", "isPPC", "isS390x", "isX64", "isX86", "isLoongArch64"), + BITNESS("is32bit", "is64bit"), + OS("isAix", "isLinux", "isOSX", "isWindows"), + VM_TYPE("isClient", "isServer", "isMinimal", "isZero", "isEmbedded"), +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/micro/org/openjdk/bench/loongarch/C2Memory.java b/test/micro/org/openjdk/bench/loongarch/C2Memory.java +--- a/test/micro/org/openjdk/bench/loongarch/C2Memory.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/micro/org/openjdk/bench/loongarch/C2Memory.java 2024-02-20 10:42:41.275526082 +0800 +@@ -0,0 +1,67 @@ ++/* ++ * Copyright (c) 2021, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++package org.openjdk.bench.loongarch; ++ ++import org.openjdk.jmh.annotations.Benchmark; ++ ++public class C2Memory { ++ public static int sum; ++ public static int array1[] = new int[0x8000]; ++ public static int array2[] = new int[0x8000]; ++ ++ @Benchmark ++ public void testMethod() { ++ for (int i = 0; i<10000;i++) { ++ sum = array1[0x7fff] + array2[0x1f0]; ++ array1[0x7fff] += array2[0x1f0]; ++ } ++ } ++ ++ @Benchmark ++ public void testBasePosIndexOffset() { ++ int xstart = 30000; ++ long carry = 63; ++ ++ for (int j=xstart; j >= 0; j--) { ++ array2[j] = array1[xstart]; ++ } ++ ++ array2[xstart] = (int)carry; ++ } ++ ++ public static byte b_array1[] = new byte[0x8000]; ++ public static byte b_array2[] = new byte[0x8000]; ++ ++ @Benchmark ++ public void testBaseIndexOffset() { ++ int xstart = 10000; ++ byte carry = 63; ++ ++ for (int j=xstart; j >= 0; j--) { ++ b_array2[j] = b_array1[xstart]; ++ } ++ ++ b_array2[xstart] = carry; ++ } ++} +diff -Naur -x .git -x .github -x .gitattributes -x .gitignore -x .jcheck a/test/micro/org/openjdk/bench/loongarch/MisAlignVector.java b/test/micro/org/openjdk/bench/loongarch/MisAlignVector.java +--- a/test/micro/org/openjdk/bench/loongarch/MisAlignVector.java 1970-01-01 08:00:00.000000000 +0800 ++++ b/test/micro/org/openjdk/bench/loongarch/MisAlignVector.java 2024-02-20 10:42:41.275526082 +0800 +@@ -0,0 +1,63 @@ ++/* ++ * Copyright (c) 2023, Loongson Technology. All rights reserved. ++ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
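Illustrative note, not part of the quoted patch: the org.openjdk.bench.loongarch.C2Memory class added above is a plain JMH benchmark, so besides the usual jtreg/micro harness it can also be driven programmatically with the stock JMH runner when the JMH jars are on the classpath. A minimal sketch, with the wrapper class name RunC2Memory being hypothetical:

    import org.openjdk.jmh.runner.Runner;
    import org.openjdk.jmh.runner.options.Options;
    import org.openjdk.jmh.runner.options.OptionsBuilder;

    public class RunC2Memory {
        public static void main(String[] args) throws Exception {
            // Select the C2Memory benchmark methods by regex and run a single fork.
            Options opt = new OptionsBuilder()
                    .include("org.openjdk.bench.loongarch.C2Memory")
                    .forks(1)
                    .build();
            new Runner(opt).run();
        }
    }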
++ * ++ * This code is free software; you can redistribute it and/or modify it ++ * under the terms of the GNU General Public License version 2 only, as ++ * published by the Free Software Foundation. ++ * ++ * This code is distributed in the hope that it will be useful, but WITHOUT ++ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ++ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ++ * version 2 for more details (a copy is included in the LICENSE file that ++ * accompanied this code). ++ * ++ * You should have received a copy of the GNU General Public License version ++ * 2 along with this work; if not, write to the Free Software Foundation, ++ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. ++ * ++ * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA ++ * or visit www.oracle.com if you need additional information or have any ++ * questions. ++ */ ++ ++package org.openjdk.bench.loongarch; ++ ++import org.openjdk.jmh.annotations.*; ++import java.util.concurrent.TimeUnit; ++ ++@BenchmarkMode(Mode.Throughput) ++@OutputTimeUnit(TimeUnit.MILLISECONDS) ++ ++@State(Scope.Thread) ++public class MisAlignVector { ++ ++ public static final double fval = 2.00; ++ public static double[] D; ++ public static int[] I; ++ ++ @Param({"100", "1000", "10000", "30000"}) ++ public static int size; ++ ++ @Setup(Level.Iteration) ++ public void up() throws Throwable { ++ D = new double[size]; ++ I = new int[size]; ++ } ++ ++ @TearDown(Level.Iteration) ++ public void down() throws Throwable { ++ D = null; ++ I = null; ++ } ++ ++ @Benchmark ++ public void testFPUAndALU(){ ++ for (int i=0; i. +*/ + +import java.lang.reflect.Field; +import java.lang.reflect.Method; +import java.lang.reflect.InvocationTargetException; + +import java.security.Permission; +import java.security.PermissionCollection; + +public class TestCryptoLevel +{ + public static void main(String[] args) + throws NoSuchFieldException, ClassNotFoundException, + IllegalAccessException, InvocationTargetException + { + Class cls = null; + Method def = null, exempt = null; + + try + { + cls = Class.forName("javax.crypto.JceSecurity"); + } + catch (ClassNotFoundException ex) + { + System.err.println("Running a non-Sun JDK."); + System.exit(0); + } + try + { + def = cls.getDeclaredMethod("getDefaultPolicy"); + exempt = cls.getDeclaredMethod("getExemptPolicy"); + } + catch (NoSuchMethodException ex) + { + System.err.println("Running IcedTea with the original crypto patch."); + System.exit(0); + } + def.setAccessible(true); + exempt.setAccessible(true); + PermissionCollection defPerms = (PermissionCollection) def.invoke(null); + PermissionCollection exemptPerms = (PermissionCollection) exempt.invoke(null); + Class apCls = Class.forName("javax.crypto.CryptoAllPermission"); + Field apField = apCls.getDeclaredField("INSTANCE"); + apField.setAccessible(true); + Permission allPerms = (Permission) apField.get(null); + if (defPerms.implies(allPerms) && (exemptPerms == null || exemptPerms.implies(allPerms))) + { + System.err.println("Running with the unlimited policy."); + System.exit(0); + } + else + { + System.err.println("WARNING: Running with a restricted crypto policy."); + System.exit(-1); + } + } +} diff --git a/TestECDSA.java b/TestECDSA.java new file mode 100644 index 0000000000000000000000000000000000000000..6eb9cb211ff59b7ff60167a2a2171e6a8b0e760d --- /dev/null +++ b/TestECDSA.java @@ -0,0 +1,49 @@ +/* TestECDSA -- Ensure ECDSA signatures are working. 
+ Copyright (C) 2016 Red Hat, Inc. + +This program is free software: you can redistribute it and/or modify +it under the terms of the GNU Affero General Public License as +published by the Free Software Foundation, either version 3 of the +License, or (at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU Affero General Public License for more details. + +You should have received a copy of the GNU Affero General Public License +along with this program. If not, see . +*/ + +import java.math.BigInteger; +import java.security.KeyPair; +import java.security.KeyPairGenerator; +import java.security.Signature; + +/** + * @test + */ +public class TestECDSA { + + public static void main(String[] args) throws Exception { + KeyPairGenerator keyGen = KeyPairGenerator.getInstance("EC"); + KeyPair key = keyGen.generateKeyPair(); + + byte[] data = "This is a string to sign".getBytes("UTF-8"); + + Signature dsa = Signature.getInstance("NONEwithECDSA"); + dsa.initSign(key.getPrivate()); + dsa.update(data); + byte[] sig = dsa.sign(); + System.out.println("Signature: " + new BigInteger(1, sig).toString(16)); + + Signature dsaCheck = Signature.getInstance("NONEwithECDSA"); + dsaCheck.initVerify(key.getPublic()); + dsaCheck.update(data); + boolean success = dsaCheck.verify(sig); + if (!success) { + throw new RuntimeException("Test failed. Signature verification error"); + } + System.out.println("Test passed."); + } +} diff --git a/add-downgrade-the-glibc-symbol-of-fcntl.patch b/add-downgrade-the-glibc-symbol-of-fcntl.patch new file mode 100644 index 0000000000000000000000000000000000000000..285d9ac34d7294476c30aca160ebdcb00d55c692 --- /dev/null +++ b/add-downgrade-the-glibc-symbol-of-fcntl.patch @@ -0,0 +1,27 @@ +Subject: add downgrade the glibc symbol of fcntl + +--- + src/hotspot/os/linux/os_linux.cpp | 7 +++++++ + 1 file changed, 7 insertions(+) + +diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp +index aa8be1d89..0aa9b57d8 100644 +--- a/src/hotspot/os/linux/os_linux.cpp ++++ b/src/hotspot/os/linux/os_linux.cpp +@@ -127,6 +127,13 @@ + #include + #endif + ++#if defined(AARCH64) ++ __asm__(".symver fcntl64,fcntl@GLIBC_2.17"); ++#elif defined(AMD64) ++ __asm__(".symver fcntl64,fcntl@GLIBC_2.2.5"); ++#endif ++ ++ + // if RUSAGE_THREAD for getrusage() has not been defined, do it here. The code calling + // getrusage() is prepared to handle the associated failure. 
+ #ifndef RUSAGE_THREAD +-- +2.19.1 + diff --git a/add-downgrade-the-glibc-symver-of-log2f-posix_spawn.patch b/add-downgrade-the-glibc-symver-of-log2f-posix_spawn.patch new file mode 100644 index 0000000000000000000000000000000000000000..d1d1405b9ddd2d62e7785731592c19197ca8689f --- /dev/null +++ b/add-downgrade-the-glibc-symver-of-log2f-posix_spawn.patch @@ -0,0 +1,45 @@ +From 9b51dcde590d8e93dfcd92ee6e37d19f72ce9138 Mon Sep 17 00:00:00 2001 +Subject: add downgrade-the-glibc-symver-of-log2f-posix_spawn + +--- + src/hotspot/share/opto/parse2.cpp | 8 ++++++++ + src/java.base/unix/native/libjava/ProcessImpl_md.c | 4 ++++ + 2 files changed, 12 insertions(+) + +diff --git a/src/hotspot/share/opto/parse2.cpp b/src/hotspot/share/opto/parse2.cpp +index bb21f48f6..072e07706 100644 +--- a/src/hotspot/share/opto/parse2.cpp ++++ b/src/hotspot/share/opto/parse2.cpp +@@ -45,6 +45,14 @@ + #include "runtime/deoptimization.hpp" + #include "runtime/sharedRuntime.hpp" + ++#ifdef AARCH64 ++ __asm__(".symver log2f,log2f@GLIBC_2.17"); ++#endif ++ ++#ifdef AMD64 ++ __asm__(".symver log2f,log2f@GLIBC_2.2.5"); ++#endif ++ + #ifndef PRODUCT + extern int explicit_null_checks_inserted, + explicit_null_checks_elided; +diff --git a/src/java.base/unix/native/libjava/ProcessImpl_md.c b/src/java.base/unix/native/libjava/ProcessImpl_md.c +index 9ed0ed309..64671d975 100644 +--- a/src/java.base/unix/native/libjava/ProcessImpl_md.c ++++ b/src/java.base/unix/native/libjava/ProcessImpl_md.c +@@ -48,6 +48,10 @@ + + #include "childproc.h" + ++#if defined(amd64) ++ __asm__(".symver posix_spawn,posix_spawn@GLIBC_2.2.5"); ++#endif ++ + /* + * + * When starting a child on Unix, we need to do three things: +-- +2.19.1 + diff --git a/add-downgrade-the-glibc-symver-of-memcpy.patch b/add-downgrade-the-glibc-symver-of-memcpy.patch new file mode 100644 index 0000000000000000000000000000000000000000..423a6b351545180bac81f74e20056e595df9b5d8 --- /dev/null +++ b/add-downgrade-the-glibc-symver-of-memcpy.patch @@ -0,0 +1,93 @@ +From f277a3770d7f0785365bb6ab1c592e46c5100732 Mon Sep 17 00:00:00 2001 +Subject: add downgrade-the-glibc-symver-of-memcpy + +--- + make/common/NativeCompilation.gmk | 9 +++++++++ + make/hotspot/lib/CompileJvm.gmk | 8 ++++++++ + src/hotspot/share/runtime/memcpy.cpp | 20 +++++++++++++++++++ + .../linux/native/applauncher/LinuxPackage.c | 3 +++ + 4 files changed, 40 insertions(+) + create mode 100644 src/hotspot/share/runtime/memcpy.cpp + +diff --git a/make/common/NativeCompilation.gmk b/make/common/NativeCompilation.gmk +index 0d7ab6a7e..6a8ec3f0b 100644 +--- a/make/common/NativeCompilation.gmk ++++ b/make/common/NativeCompilation.gmk +@@ -1194,6 +1194,15 @@ define SetupNativeCompilationBody + endif + endif + ++ # if ldflags contain --wrap=memcpy, add memcpy.o to OBJS ++ ifneq ($$(findstring wrap=memcpy, $$($1_LDFLAGS)$$($1_EXTRA_LDFLAGS)),) ++ ifeq ($$(findstring memcpy$(OBJ_SUFFIX), $$($1_ALL_OBJS)),) ++ $$($1_BUILD_INFO): ++ $(ECHO) 'Adding $(SUPPORT_OUTPUTDIR)/memcpy/memcpy$(OBJ_SUFFIX) to $1_ALL_OBJS' ++ $1_ALL_OBJS += $(SUPPORT_OUTPUTDIR)/memcpy/memcpy$(OBJ_SUFFIX) ++ endif ++ endif ++ + $1_VARDEPS := $$($1_LD) $$($1_SYSROOT_LDFLAGS) $$($1_LDFLAGS) $$($1_EXTRA_LDFLAGS) \ + $$($1_LIBS) $$($1_EXTRA_LIBS) $$($1_MT) \ + $$($1_CREATE_DEBUGINFO_CMDS) $$($1_MANIFEST_VERSION) \ +diff --git a/make/hotspot/lib/CompileJvm.gmk b/make/hotspot/lib/CompileJvm.gmk +index adb964d05..3736ea201 100644 +--- a/make/hotspot/lib/CompileJvm.gmk ++++ b/make/hotspot/lib/CompileJvm.gmk +@@ -192,6 +192,14 @@ $(eval $(call SetupJdkLibrary, 
BUILD_LIBJVM, \ + PRECOMPILED_HEADER_EXCLUDE := $(JVM_PRECOMPILED_HEADER_EXCLUDE), \ + )) + ++MEMCPY_OBJECT_FILE := $(JVM_OUTPUTDIR)/objs/memcpy$(OBJ_SUFFIX) ++ ++$(eval $(call SetupCopyFiles, COPY_MEMCPY_OBJECT_FILE, \ ++ DEST := $(SUPPORT_OUTPUTDIR)/memcpy, \ ++ FILES :=$(MEMCPY_OBJECT_FILE), \ ++)) ++TARGETS += $(COPY_MEMCPY_OBJECT_FILE) ++ + # Always recompile abstract_vm_version.cpp if libjvm needs to be relinked. This ensures + # that the internal vm version is updated as it relies on __DATE__ and __TIME__ + # macros. +diff --git a/src/hotspot/share/runtime/memcpy.cpp b/src/hotspot/share/runtime/memcpy.cpp +new file mode 100644 +index 000000000..6ab4ddb64 +--- /dev/null ++++ b/src/hotspot/share/runtime/memcpy.cpp +@@ -0,0 +1,20 @@ ++/* ++ * Copyright (c) Huawei Technologies Co., Ltd. 2018-2024. All rights reserved. ++ */ ++ ++#if defined( __GNUC__ ) && \ ++(__GNUC__ >= 5 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7)) ++#include ++ ++#if (defined AMD64) || (defined amd64) ++/* some systems do not have newest memcpy@@GLIBC_2.14 - stay with old good one */ ++asm (".symver memcpy, memcpy@GLIBC_2.2.5"); ++ ++extern "C"{ ++ void *__wrap_memcpy(void *dest, const void *src, size_t n) ++ { ++ return memcpy(dest, src, n); ++ } ++} ++#endif ++#endif +diff --git a/src/jdk.jpackage/linux/native/applauncher/LinuxPackage.c b/src/jdk.jpackage/linux/native/applauncher/LinuxPackage.c +index 26d65f806..b7b114ac3 100644 +--- a/src/jdk.jpackage/linux/native/applauncher/LinuxPackage.c ++++ b/src/jdk.jpackage/linux/native/applauncher/LinuxPackage.c +@@ -34,6 +34,9 @@ + #include "JvmLauncher.h" + #include "LinuxPackage.h" + ++#if (defined AMD64) || (defined amd64) ++__asm__(".symver memcpy, memcpy@GLIBC_2.2.5"); ++#endif + + static char* getModulePath(void) { + char modulePath[PATH_MAX] = { 0 }; +-- +2.19.1 + diff --git a/generate_source_tarball.sh b/generate_source_tarball.sh new file mode 100644 index 0000000000000000000000000000000000000000..93a8397d6433385491694dc926afae20f7b6a62c --- /dev/null +++ b/generate_source_tarball.sh @@ -0,0 +1,156 @@ +#!/bin/bash +# Generates the 'source tarball' for JDK projects. +# +# Example: +# When used from local repo set REPO_ROOT pointing to file:// with your repo +# If your local repo follows upstream forests conventions, it may be enough to set OPENJDK_URL +# If you want to use a local copy of patch PR3788, set the path to it in the PR3788 variable +# +# In any case you have to set PROJECT_NAME REPO_NAME and VERSION. eg: +# PROJECT_NAME=jdk +# REPO_NAME=jdk +# VERSION=tip +# or to eg prepare systemtap: +# icedtea7's jstack and other tapsets +# VERSION=6327cf1cea9e +# REPO_NAME=icedtea7-2.6 +# PROJECT_NAME=release +# OPENJDK_URL=http://icedtea.classpath.org/hg/ +# TO_COMPRESS="*/tapset" +# +# They are used to create correct name and are used in construction of sources url (unless REPO_ROOT is set) + +# This script creates a single source tarball out of the repository +# based on the given tag and removes code not allowed. For +# consistency, the source tarball will always contain 'openjdk' as the top +# level folder, name is created, based on parameter +# + +if [ ! "x$PR3788" = "x" ] ; then + if [ ! -f "$PR3788" ] ; then + echo "You have specified PR3788 as $PR3788 but it does not exist. 
Exiting" + exit 1 + fi +fi + +set -e + +OPENJDK_URL_DEFAULT=http://hg.openjdk.java.net +COMPRESSION_DEFAULT=xz + +if [ "x$1" = "xhelp" ] ; then + echo -e "Behaviour may be specified by setting the following variables:\n" + echo "VERSION - the version of the specified OpenJDK project" + echo "PROJECT_NAME -- the name of the OpenJDK project being archived (optional; only needed by defaults)" + echo "REPO_NAME - the name of the OpenJDK repository (optional; only needed by defaults)" + echo "OPENJDK_URL - the URL to retrieve code from (optional; defaults to ${OPENJDK_URL_DEFAULT})" + echo "COMPRESSION - the compression type to use (optional; defaults to ${COMPRESSION_DEFAULT})" + echo "FILE_NAME_ROOT - name of the archive, minus extensions (optional; defaults to PROJECT_NAME-REPO_NAME-VERSION)" + echo "TO_COMPRESS - what part of clone to pack (default is openjdk)" + echo "PR3788 - the path to the PR3788 patch to apply (optional; downloaded if unavailable)" + exit 1; +fi + + +if [ "x$VERSION" = "x" ] ; then + echo "No VERSION specified" + exit -2 +fi +echo "Version: ${VERSION}" + +# REPO_NAME is only needed when we default on REPO_ROOT and FILE_NAME_ROOT +if [ "x$FILE_NAME_ROOT" = "x" -o "x$REPO_ROOT" = "x" ] ; then + if [ "x$PROJECT_NAME" = "x" ] ; then + echo "No PROJECT_NAME specified" + exit -1 + fi + echo "Project name: ${PROJECT_NAME}" + if [ "x$REPO_NAME" = "x" ] ; then + echo "No REPO_NAME specified" + exit -3 + fi + echo "Repository name: ${REPO_NAME}" +fi + +if [ "x$OPENJDK_URL" = "x" ] ; then + OPENJDK_URL=${OPENJDK_URL_DEFAULT} + echo "No OpenJDK URL specified; defaulting to ${OPENJDK_URL}" +else + echo "OpenJDK URL: ${OPENJDK_URL}" +fi + +if [ "x$COMPRESSION" = "x" ] ; then + # rhel 5 needs tar.gz + COMPRESSION=${COMPRESSION_DEFAULT} +fi +echo "Creating a tar.${COMPRESSION} archive" + +if [ "x$FILE_NAME_ROOT" = "x" ] ; then + FILE_NAME_ROOT=${PROJECT_NAME}-${REPO_NAME}-${VERSION} + echo "No file name root specified; default to ${FILE_NAME_ROOT}" +fi +if [ "x$REPO_ROOT" = "x" ] ; then + REPO_ROOT="${OPENJDK_URL}/${PROJECT_NAME}/${REPO_NAME}" + echo "No repository root specified; default to ${REPO_ROOT}" +fi; + +if [ "x$TO_COMPRESS" = "x" ] ; then + TO_COMPRESS="openjdk" + echo "No to be compressed targets specified, ; default to ${TO_COMPRESS}" +fi; + +if [ -d ${FILE_NAME_ROOT} ] ; then + echo "exists exists exists exists exists exists exists " + echo "reusing reusing reusing reusing reusing reusing " + echo ${FILE_NAME_ROOT} +else + mkdir "${FILE_NAME_ROOT}" + pushd "${FILE_NAME_ROOT}" + echo "Cloning ${VERSION} root repository from ${REPO_ROOT}" + hg clone ${REPO_ROOT} openjdk -r ${VERSION} + popd +fi +pushd "${FILE_NAME_ROOT}" + if [ -d openjdk/src ]; then + pushd openjdk + echo "Removing EC source code we don't build" + CRYPTO_PATH=src/jdk.crypto.ec/share/native/libsunec/impl + rm -vf ${CRYPTO_PATH}/ec2.h + rm -vf ${CRYPTO_PATH}/ec2_163.c + rm -vf ${CRYPTO_PATH}/ec2_193.c + rm -vf ${CRYPTO_PATH}/ec2_233.c + rm -vf ${CRYPTO_PATH}/ec2_aff.c + rm -vf ${CRYPTO_PATH}/ec2_mont.c + rm -vf ${CRYPTO_PATH}/ecp_192.c + rm -vf ${CRYPTO_PATH}/ecp_224.c + + echo "Syncing EC list with NSS" + if [ "x$PR3788" = "x" ] ; then + # originally for 8: + # get PR3788.patch (from http://icedtea.classpath.org/hg/icedtea14) from most correct tag + # Do not push it or publish it (see https://icedtea.classpath.org/bugzilla/show_bug.cgi?id=3788) + echo "PR3788 not found. Downloading..." 
+ wget http://icedtea.classpath.org/hg/icedtea14/raw-file/fabce78297b7/patches/pr3788.patch + echo "Applying ${PWD}/pr3788.patch" + patch -Np1 < pr3788.patch + rm pr3788.patch + else + echo "Applying ${PR3788}" + patch -Np1 < $PR3788 + fi; + find . -name '*.orig' -exec rm -vf '{}' ';' + popd + fi + + echo "Compressing remaining forest" + if [ "X$COMPRESSION" = "Xxz" ] ; then + SWITCH=cJf + else + SWITCH=czf + fi + tar --exclude-vcs -$SWITCH ${FILE_NAME_ROOT}.tar.${COMPRESSION} $TO_COMPRESS + mv ${FILE_NAME_ROOT}.tar.${COMPRESSION} .. +popd +echo "Done. You may want to remove the uncompressed version - $FILE_NAME_ROOT." + + diff --git a/jconsole.desktop.in b/jconsole.desktop.in new file mode 100644 index 0000000000000000000000000000000000000000..a8917c1fe6316cf8e207ddf4fc1264e1905caec5 --- /dev/null +++ b/jconsole.desktop.in @@ -0,0 +1,10 @@ +[Desktop Entry] +Name=OpenJDK @JAVA_MAJOR_VERSION@ Monitoring & Management Console @ARCH@ +Comment=Monitor and manage OpenJDK @JAVA_MAJOR_VERSION@ applications for @ARCH@ +Exec=@JAVA_HOME@/jconsole +Icon=java-@JAVA_MAJOR_VERSION@-@JAVA_VENDOR@ +Terminal=false +Type=Application +StartupWMClass=sun-tools-jconsole-JConsole +Categories=Development;Profiling;Java; +Version=1.0 diff --git a/jdk-updates-jdk21u-jdk-21.0.2+12.tar.gz b/jdk-updates-jdk21u-jdk-21.0.2+12.tar.gz new file mode 100644 index 0000000000000000000000000000000000000000..80b9f788639b699baffa4e3d00e53497cbe9587f Binary files /dev/null and b/jdk-updates-jdk21u-jdk-21.0.2+12.tar.gz differ diff --git a/jdk-updates-jdk21u-jdk-21.0.3+9.tar.gz b/jdk-updates-jdk21u-jdk-21.0.3+9.tar.gz new file mode 100644 index 0000000000000000000000000000000000000000..a2b9d8e94ec2831200584704e6e07f0a98717a6c Binary files /dev/null and b/jdk-updates-jdk21u-jdk-21.0.3+9.tar.gz differ diff --git a/nss.cfg.in b/nss.cfg.in new file mode 100644 index 0000000000000000000000000000000000000000..377a39c2e414f108f73c7a92cd512cec908394c0 --- /dev/null +++ b/nss.cfg.in @@ -0,0 +1,5 @@ +name = NSS +nssLibraryDirectory = @NSS_LIBDIR@ +nssDbMode = noDb +attributes = compatibility +handleStartupErrors = ignoreMultipleInitialisation diff --git a/openjdk-21.spec b/openjdk-21.spec new file mode 100644 index 0000000000000000000000000000000000000000..6ba4201599bf8a1d88cd4fa4db195f99a75395c5 --- /dev/null +++ b/openjdk-21.spec @@ -0,0 +1,1814 @@ +# RPM conditionals so as to be able to dynamically produce + +# slowdebug/release builds. See: +# http://rpm.org/user_doc/conditional_builds.html +# +# Examples: +# +# Produce release *and* slowdebug builds on x86_64 (default): +# $ rpmbuild -ba java-1.8.0-openjdk.spec +# +# Produce only release builds (no slowdebug builds) on x86_64: +# $ rpmbuild -ba java-1.8.0-openjdk.spec --without slowdebug +# +# Only produce a release build on x86_64: +# $ fedpkg mockbuild --without slowdebug +# +# Only produce a debug build on x86_64: +# $ fedpkg local --without release +# +# Enable slowdebug builds by default on relevant arches. +%bcond_without slowdebug +# Enable release builds by default on relevant arches. +%bcond_without release + +# The -g flag says to use strip -g instead of full strip on DSOs or EXEs. +# This fixes detailed NMT and other tools which need minimal debug info. +%global _find_debuginfo_opts -g + +# note: parametrized macros are order-sensitive (unlike not-parametrized) even with normal macros +# also necessary when passing it as parameter to other macros. 
If not macro, then it is considered a switch +# see the difference between global and define: +# See https://github.com/rpm-software-management/rpm/issues/127 to comments at "pmatilai commented on Aug 18, 2017" +%global debug_suffix_unquoted -slowdebug +# quoted one for shell operations +%global debug_suffix "%{debug_suffix_unquoted}" +%global normal_suffix "" + +# if you want only debug build but providing java build only normal build but set normalbuild_parameter +%global debug_warning This package has full debug on. Install only in need and remove asap. +%global debug_on with full debug on +%global for_debug for packages with debug on + +%if %{with release} +%global include_normal_build 1 +%else +%global include_normal_build 0 +%endif + +%if %{include_normal_build} +%global build_loop1 %{normal_suffix} +%else +%global build_loop1 %{nil} +%endif + +# We have hardcoded list of files, which is appearing in alternatives, and in files +# in alternatives those are slaves and master, very often triplicated by man pages +# in files all masters and slaves are ghosted +# the ghosts are here to allow installation via query like `dnf install /usr/bin/java` +# you can list those files, with appropriate sections: cat *.spec | grep -e --install -e --slave -e post_ +# TODO - fix those hardcoded lists via single list +# those files ,must *NOT* be ghosted for *slowdebug* packages +# FIXME - if you are moving jshell or jlink or simialr, always modify all three sections +# you can check via headless and devels: +# rpm -ql --noghost java-11-openjdk-headless-11.0.1.13-8.fc29.x86_64.rpm | grep bin +# == rpm -ql java-11-openjdk-headless-slowdebug-11.0.1.13-8.fc29.x86_64.rpm | grep bin +# != rpm -ql java-11-openjdk-headless-11.0.1.13-8.fc29.x86_64.rpm | grep bin +# similarly for other %%{_jvmdir}/{jre,java} and %%{_javadocdir}/{java,java-zip} +%define is_release_build() %( if [ "%{?1}" == "%{debug_suffix_unquoted}" ]; then echo "0" ; else echo "1"; fi ) + +# while JDK is a techpreview(is_system_jdk=0), some provides are turned off. Once jdk stops to be an techpreview, move it to 1 +# as sytem JDK, we mean any JDK which can run whole system java stack without issues (like bytecode issues, module issues, dependencies...) 
+%global is_system_jdk 0 + +%global aarch64 aarch64 arm64 armv8 +%global jit_arches x86_64 %{aarch64} loongarch64 riscv64 +%global aot_arches x86_64 %{aarch64} + +# Set of architectures for which java has short vector math library (libsvml.so) +%global svml_arches x86_64 + +# By default, we build a debug build during main build on JIT architectures +%if %{with slowdebug} +%ifarch %{jit_arches} +%global include_debug_build 1 +%else +%global include_debug_build 0 +%endif +%endif + +%if %{include_debug_build} +%global build_loop2 %{debug_suffix} +%else +%global build_loop2 %{nil} +%endif + +# if you disable both builds, then the build fails +%global build_loop %{build_loop1} %{build_loop2} +# note: that order: normal_suffix debug_suffix, in case of both enabled +# is expected in one single case at the end of the build +%global rev_build_loop %{build_loop2} %{build_loop1} + +%ifarch %{jit_arches} +%global bootstrap_build 0 +%else +%global bootstrap_build 0 +%endif + +%if %{bootstrap_build} +%global release_targets bootcycle-images docs-zip +%else +%global release_targets images docs-zip +%endif +# No docs nor bootcycle for debug builds +%global debug_targets images + + +# Filter out flags from the optflags macro that cause problems with the OpenJDK build +# We filter out -O flags so that the optimization of HotSpot is not lowered from O3 to O2 +# We filter out -Wall which will otherwise cause HotSpot to produce hundreds of thousands of warnings (100+mb logs) +# We replace it with -Wformat (required by -Werror=format-security) and -Wno-cpp to avoid FORTIFY_SOURCE warnings +# We filter out -fexceptions as the HotSpot build explicitly does -fno-exceptions and it's otherwise the default for C++ +%global ourflags %(echo %optflags | sed -e 's|-Wall|-Wformat -Wno-cpp|' | sed -r -e 's|-O[0-9]*||') +%global ourcppflags %(echo %ourflags | sed -e 's|-fexceptions||') +%global ourldflags %{__global_ldflags} + +# With disabled nss is NSS deactivated, so NSS_LIBDIR can contain the wrong path +# the initialization must be here. Later the pkg-config have buggy behavior +# looks like openjdk RPM specific bug +# Always set this so the nss.cfg file is not broken +%global NSS_LIBDIR %(pkg-config --variable=libdir nss) + +# In some cases, the arch used by the JDK does +# not match _arch. +# Also, in some cases, the machine name used by SystemTap +# does not match that given by _build_cpu +%ifarch x86_64 +%global archinstall amd64 +%endif +%ifarch %{aarch64} +%global archinstall aarch64 +%endif +%ifarch loongarch64 +%global archinstall loongarch64 +%endif +%ifnarch %{jit_arches} +%global archinstall %{_arch} +%endif + +%ifarch %{jit_arches} +%global with_systemtap 1 +%else +%global with_systemtap 0 +%endif + +# New Version-String scheme-style defines +# If you bump majorver, you must bump also vendor_version_string +%global majorver 21 +# Used via new version scheme. JDK 19 was +# GA'ed in March 2022 => 22.3 +%global vendor_version_string BiSheng +%global securityver 3 +# buildjdkver is usually same as %%{majorver}, +# but in time of bootstrap of next jdk, it is majorver-1, +# and this it is better to change it here, on single place +%global buildjdkver 21 +# We don't add any LTS designator for STS packages (Fedora and EPEL). +# We need to explicitly exclude EPEL as it would have the %%{rhel} macro defined. 
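Before the LTS-designator conditional below, a brief illustration of the %{ourflags}/%{ourcppflags} filtering defined earlier in this block; the input is a made-up sample, not any distribution's real %{optflags}:

  echo '-O2 -g -Wall -fexceptions' \
    | sed -e 's|-Wall|-Wformat -Wno-cpp|' \
    | sed -r -e 's|-O[0-9]*||' \
    | sed -e 's|-fexceptions||'
  # prints " -g -Wformat -Wno-cpp ": -Wall becomes -Wformat -Wno-cpp, the -O
  # level is dropped so HotSpot keeps its own -O3, and -fexceptions is removed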
+%if 0%{?rhel} && !0%{?epel} + %global lts_designator "LTS" + %global lts_designator_zip -%{lts_designator} +%else + %global lts_designator "" + %global lts_designator_zip "" +%endif + +# Standard JPackage naming and versioning defines +%global origin openjdk +%global origin_nice OpenJDK +%global top_level_dir_name %{origin} +%global minorver 0 +%global buildver 9 +%global rpmrelease 1 +# priority must be 8 digits in total; up to openjdk 1.8, we were using 18..... so when we moved to 11, we had to add another digit +%if %is_system_jdk +%global priority %( printf '%02d%02d%02d%02d' %{majorver} %{minorver} %{securityver} %{buildver} ) +%else +# for techpreview, using 1, so slowdebugs can have 0 +%global priority %( printf '%08d' 1 ) +%endif +%global newjavaver %{majorver}.%{minorver}.%{securityver} + +# Strip up to 6 trailing zeros in newjavaver, as the JDK does, to get the correct version used in filenames +%global filever %(svn=%{newjavaver}; for i in 1 2 3 4 5 6 ; do svn=${svn%%.0} ; done; echo ${svn}) + +%global javaver %{majorver} + +# Define milestone (EA for pre-releases, GA for releases) +# Release will be (where N is usually a number starting at 1): +# - 0.N%%{?extraver}%%{?dist} for EA releases, +# - N%%{?extraver}{?dist} for GA releases +%global is_ga 1 +%if %{is_ga} +%global build_type GA +%global expected_ea_designator "" +%global ea_designator_zip "" +%global extraver %{nil} +%global eaprefix %{nil} +%else +%global build_type EA +%global expected_ea_designator ea +%global ea_designator_zip -%{expected_ea_designator} +%global extraver .%{expected_ea_designator} +%global eaprefix 0. +%endif + +# Define what url should JVM offer in case of a crash report +%global bug_url https://gitee.com/src-openeuler/openjdk-21/issues + +# parametrized macros are order-sensitive +%global compatiblename java-%{majorver}-%{origin} +%global fullversion %{compatiblename}-%{version}-%{release} +# images stub +%global jdkimage jdk +# output dir stub +%define buildoutputdir() %{expand:openjdk/build%{?1}} +# we can copy the javadoc to not arched dir, or make it not noarch +%define uniquejavadocdir() %{expand:%{fullversion}.%{_arch}%{?1}} +# main id and dir of this jdk +%define uniquesuffix() %{expand:%{fullversion}.%{_arch}%{?1}} + +%global _privatelibs libsplashscreen[.]so.*|libawt_xawt[.]so.*|libjli[.]so.*|libattach[.]so.*|libawt[.]so.*|libextnet[.]so.*|libawt_headless[.]so.*|libdt_socket[.]so.*|libfontmanager[.]so.*|libinstrument[.]so.*|libj2gss[.]so.*|libj2pcsc[.]so.*|libj2pkcs11[.]so.*|libjaas[.]so.*|libjavajpeg[.]so.*|libjdwp[.]so.*|libjimage[.]so.*|libjsound[.]so.*|liblcms[.]so.*|libmanagement[.]so.*|libmanagement_agent[.]so.*|libmanagement_ext[.]so.*|libmlib_image[.]so.*|libnet[.]so.*|libnio[.]so.*|libprefs[.]so.*|librmi[.]so.*|libsaproc[.]so.*|libsctp[.]so.*|libzip[.]so.* +%global _publiclibs libjawt[.]so.*|libjava[.]so.*|libjvm[.]so.*|libverify[.]so.*|libjsig[.]so.* +%if %is_system_jdk +%global __provides_exclude ^(%{_privatelibs})$ +%global __requires_exclude ^(%{_privatelibs})$ +%global __provides_exclude_from ^.*/%{uniquesuffix -- %{debug_suffix_unquoted}}/.*$ +%else +# Don't generate provides/requires for JDK provided shared libraries at all. +%global __provides_exclude ^(%{_privatelibs}|%{_publiclibs})$ +%global __requires_exclude ^(%{_privatelibs}|%{_publiclibs})$ +%endif + + +%global etcjavasubdir %{_sysconfdir}/java/java-%{javaver}-%{origin} +%define etcjavadir() %{expand:%{etcjavasubdir}/%{uniquesuffix -- %{?1}}} +# Standard JPackage directories and symbolic links. 
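The directory macros that follow feed the alternatives calls in the %post scriptlets further below. Once the built packages are installed, the registration can be double-checked with commands along these lines; the package names assume this spec's default %{name}:

  alternatives --display java | head          # master link, family and priority
  alternatives --list | grep java-21-openjdk
  rpm -ql --noghost java-21-openjdk-headless | grep '/bin/'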
+%define sdkdir() %{expand:%{uniquesuffix -- %{?1}}} +%define jrelnk() %{expand:jre-%{javaver}-%{origin}-%{version}-%{release}.%{_arch}%{?1}} + +%define sdkbindir() %{expand:%{_jvmdir}/%{sdkdir -- %{?1}}/bin} +%define jrebindir() %{expand:%{_jvmdir}/%{sdkdir -- %{?1}}/bin} + +%global rpm_state_dir %{_localstatedir}/lib/rpm-state/ + +%if %{with_systemtap} +# Where to install systemtap tapset (links) +# We would like these to be in a package specific sub-dir, +# but currently systemtap doesn't support that, so we have to +# use the root tapset dir for now. To distinguish between 64 +# and 32 bit architectures we place the tapsets under the arch +# specific dir (note that systemtap will only pickup the tapset +# for the primary arch for now). Systemtap uses the machine name +# aka build_cpu as architecture specific directory name. +%global tapsetroot /usr/share/systemtap +%global tapsetdirttapset %{tapsetroot}/tapset/ +%global tapsetdir %{tapsetdirttapset}/%{_build_cpu} +%endif + +# not-duplicated scriptlets for normal/debug packages +%global update_desktop_icons /usr/bin/gtk-update-icon-cache %{_datadir}/icons/hicolor &>/dev/null || : + + +%define post_script() %{expand: +update-desktop-database %{_datadir}/applications &> /dev/null || : +/bin/touch --no-create %{_datadir}/icons/hicolor &>/dev/null || : +exit 0 +} + + +%define post_headless() %{expand: +%ifarch %{jit_arches} +# MetaspaceShared::generate_vtable_methods not implemented for PPC JIT +%ifnarch %{ppc64le} +%{jrebindir -- %{?1}}/java -Xshare:dump >/dev/null 2>/dev/null +%endif +%endif + +PRIORITY=%{priority} +if [ "%{?1}" == %{debug_suffix} ]; then + let PRIORITY=PRIORITY-1 +fi + +ext=.gz +alternatives \\ + --install %{_bindir}/java java %{jrebindir -- %{?1}}/java $PRIORITY --family %{name}.%{_arch} \\ + --slave %{_jvmdir}/jre jre %{_jvmdir}/%{sdkdir -- %{?1}} \\ + --slave %{_bindir}/keytool keytool %{jrebindir -- %{?1}}/keytool \\ + --slave %{_bindir}/rmiregistry rmiregistry %{jrebindir -- %{?1}}/rmiregistry \\ + --slave %{_mandir}/man1/java.1$ext java.1$ext \\ + %{_mandir}/man1/java-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/keytool.1$ext keytool.1$ext \\ + %{_mandir}/man1/keytool-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/rmiregistry.1$ext rmiregistry.1$ext \\ + %{_mandir}/man1/rmiregistry-%{uniquesuffix -- %{?1}}.1$ext + +for X in %{origin} %{javaver} ; do + alternatives --install %{_jvmdir}/jre-"$X" jre_"$X" %{_jvmdir}/%{sdkdir -- %{?1}} $PRIORITY --family %{name}.%{_arch} +done + +update-alternatives --install %{_jvmdir}/jre-%{javaver}-%{origin} jre_%{javaver}_%{origin} %{_jvmdir}/%{jrelnk -- %{?1}} $PRIORITY --family %{name}.%{_arch} + + +update-desktop-database %{_datadir}/applications &> /dev/null || : +/bin/touch --no-create %{_datadir}/icons/hicolor &>/dev/null || : + +# see pretrans where this file is declared +# also see that pretrans is only for non-debug +if [ ! 
"%{?1}" == %{debug_suffix} ]; then + if [ -f %{_libexecdir}/copy_jdk_configs_fixFiles.sh ] ; then + sh %{_libexecdir}/copy_jdk_configs_fixFiles.sh %{rpm_state_dir}/%{name}.%{_arch} %{_jvmdir}/%{sdkdir -- %{?1}} + fi +fi + +exit 0 +} + +%define postun_script() %{expand: +update-desktop-database %{_datadir}/applications &> /dev/null || : +if [ $1 -eq 0 ] ; then + /bin/touch --no-create %{_datadir}/icons/hicolor &>/dev/null + %{update_desktop_icons} +fi +exit 0 +} + + +%define postun_headless() %{expand: + alternatives --remove java %{jrebindir -- %{?1}}/java + alternatives --remove jre_%{origin} %{_jvmdir}/%{sdkdir -- %{?1}} + alternatives --remove jre_%{javaver} %{_jvmdir}/%{sdkdir -- %{?1}} + alternatives --remove jre_%{javaver}_%{origin} %{_jvmdir}/%{jrelnk -- %{?1}} +} + +%define posttrans_script() %{expand: +%{update_desktop_icons} +} + +%define post_devel() %{expand: + +PRIORITY=%{priority} +if [ "%{?1}" == %{debug_suffix} ]; then + let PRIORITY=PRIORITY-1 +fi + +ext=.gz +alternatives \\ + --install %{_bindir}/javac javac %{sdkbindir -- %{?1}}/javac $PRIORITY --family %{name}.%{_arch} \\ + --slave %{_jvmdir}/java java_sdk %{_jvmdir}/%{sdkdir -- %{?1}} \\ + --slave %{_bindir}/jlink jlink %{sdkbindir -- %{?1}}/jlink \\ + --slave %{_bindir}/jmod jmod %{sdkbindir -- %{?1}}/jmod \\ +%ifarch %{jit_arches} +%ifnarch s390x + --slave %{_bindir}/jhsdb jhsdb %{sdkbindir -- %{?1}}/jhsdb \\ +%endif +%endif + --slave %{_bindir}/jar jar %{sdkbindir -- %{?1}}/jar \\ + --slave %{_bindir}/jarsigner jarsigner %{sdkbindir -- %{?1}}/jarsigner \\ + --slave %{_bindir}/javadoc javadoc %{sdkbindir -- %{?1}}/javadoc \\ + --slave %{_bindir}/javap javap %{sdkbindir -- %{?1}}/javap \\ + --slave %{_bindir}/jcmd jcmd %{sdkbindir -- %{?1}}/jcmd \\ + --slave %{_bindir}/jconsole jconsole %{sdkbindir -- %{?1}}/jconsole \\ + --slave %{_bindir}/jdb jdb %{sdkbindir -- %{?1}}/jdb \\ + --slave %{_bindir}/jdeps jdeps %{sdkbindir -- %{?1}}/jdeps \\ + --slave %{_bindir}/jdeprscan jdeprscan %{sdkbindir -- %{?1}}/jdeprscan \\ + --slave %{_bindir}/jfr jfr %{sdkbindir -- %{?1}}/jfr \\ + --slave %{_bindir}/jimage jimage %{sdkbindir -- %{?1}}/jimage \\ + --slave %{_bindir}/jinfo jinfo %{sdkbindir -- %{?1}}/jinfo \\ + --slave %{_bindir}/jmap jmap %{sdkbindir -- %{?1}}/jmap \\ + --slave %{_bindir}/jps jps %{sdkbindir -- %{?1}}/jps \\ + --slave %{_bindir}/jpackage jpackage %{sdkbindir -- %{?1}}/jpackage \\ + --slave %{_bindir}/jrunscript jrunscript %{sdkbindir -- %{?1}}/jrunscript \\ + --slave %{_bindir}/jshell jshell %{sdkbindir -- %{?1}}/jshell \\ + --slave %{_bindir}/jstack jstack %{sdkbindir -- %{?1}}/jstack \\ + --slave %{_bindir}/jstat jstat %{sdkbindir -- %{?1}}/jstat \\ + --slave %{_bindir}/jstatd jstatd %{sdkbindir -- %{?1}}/jstatd \\ + --slave %{_bindir}/jwebserver jwebserver %{sdkbindir -- %{?1}}/jwebserver \\ + --slave %{_bindir}/serialver serialver %{sdkbindir -- %{?1}}/serialver \\ + --slave %{_mandir}/man1/jar.1$ext jar.1$ext \\ + %{_mandir}/man1/jar-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jarsigner.1$ext jarsigner.1$ext \\ + %{_mandir}/man1/jarsigner-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/javac.1$ext javac.1$ext \\ + %{_mandir}/man1/javac-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/javadoc.1$ext javadoc.1$ext \\ + %{_mandir}/man1/javadoc-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/javap.1$ext javap.1$ext \\ + %{_mandir}/man1/javap-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jcmd.1$ext jcmd.1$ext \\ + 
%{_mandir}/man1/jcmd-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jconsole.1$ext jconsole.1$ext \\ + %{_mandir}/man1/jconsole-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jdb.1$ext jdb.1$ext \\ + %{_mandir}/man1/jdb-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jdeps.1$ext jdeps.1$ext \\ + %{_mandir}/man1/jdeps-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jinfo.1$ext jinfo.1$ext \\ + %{_mandir}/man1/jinfo-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jmap.1$ext jmap.1$ext \\ + %{_mandir}/man1/jmap-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jps.1$ext jps.1$ext \\ + %{_mandir}/man1/jps-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jpackage.1$ext jpackage.1$ext \\ + %{_mandir}/man1/jpackage-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jrunscript.1$ext jrunscript.1$ext \\ + %{_mandir}/man1/jrunscript-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jstack.1$ext jstack.1$ext \\ + %{_mandir}/man1/jstack-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jstat.1$ext jstat.1$ext \\ + %{_mandir}/man1/jstat-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jstatd.1$ext jstatd.1$ext \\ + %{_mandir}/man1/jstatd-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/jwebserver.1$ext jwebserver.1$ext \\ + %{_mandir}/man1/jwebserver-%{uniquesuffix -- %{?1}}.1$ext \\ + --slave %{_mandir}/man1/serialver.1$ext serialver.1$ext \\ + %{_mandir}/man1/serialver-%{uniquesuffix -- %{?1}}.1$ext + +for X in %{origin} %{javaver} ; do + alternatives \\ + --install %{_jvmdir}/java-"$X" java_sdk_"$X" %{_jvmdir}/%{sdkdir -- %{?1}} $PRIORITY --family %{name}.%{_arch} +done + +update-alternatives --install %{_jvmdir}/java-%{javaver}-%{origin} java_sdk_%{javaver}_%{origin} %{_jvmdir}/%{sdkdir -- %{?1}} $PRIORITY --family %{name}.%{_arch} + +update-desktop-database %{_datadir}/applications &> /dev/null || : +/bin/touch --no-create %{_datadir}/icons/hicolor &>/dev/null || : + +exit 0 +} + +%define postun_devel() %{expand: + alternatives --remove javac %{sdkbindir -- %{?1}}/javac + alternatives --remove java_sdk_%{origin} %{_jvmdir}/%{sdkdir -- %{?1}} + alternatives --remove java_sdk_%{javaver} %{_jvmdir}/%{sdkdir -- %{?1}} + alternatives --remove java_sdk_%{javaver}_%{origin} %{_jvmdir}/%{sdkdir -- %{?1}} + +update-desktop-database %{_datadir}/applications &> /dev/null || : + +if [ $1 -eq 0 ] ; then + /bin/touch --no-create %{_datadir}/icons/hicolor &>/dev/null + %{update_desktop_icons} +fi +exit 0 +} + +%define posttrans_devel() %{expand: +%{update_desktop_icons} +} + +%define post_javadoc() %{expand: + +PRIORITY=%{priority} +if [ "%{?1}" == %{debug_suffix} ]; then + let PRIORITY=PRIORITY-1 +fi + +alternatives \\ + --install %{_javadocdir}/java javadocdir %{_javadocdir}/%{uniquejavadocdir -- %{?1}}/api \\ + $PRIORITY --family %{name} +exit 0 +} + +%define postun_javadoc() %{expand: + alternatives --remove javadocdir %{_javadocdir}/%{uniquejavadocdir -- %{?1}}/api +exit 0 +} + +%define post_javadoc_zip() %{expand: + +PRIORITY=%{priority} +if [ "%{?1}" == %{debug_suffix} ]; then + let PRIORITY=PRIORITY-1 +fi + +alternatives \\ + --install %{_javadocdir}/java-zip javadoczip %{_javadocdir}/%{uniquejavadocdir -- %{?1}}.zip \\ + $PRIORITY --family %{name} +exit 0 +} + +%define postun_javadoc_zip() %{expand: + alternatives --remove javadoczip %{_javadocdir}/%{uniquejavadocdir -- %{?1}}.zip +exit 0 +} + +%define files_jre() %{expand: 
+%{_datadir}/icons/hicolor/*x*/apps/java-%{javaver}-%{origin}.png +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libsplashscreen.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libawt_xawt.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libjawt.so +} + + +%define files_jre_headless() %{expand: +%license %{_jvmdir}/%{sdkdir -- %{?1}}/legal +%dir %{_sysconfdir}/.java/.systemPrefs +%dir %{_sysconfdir}/.java +%dir %{_jvmdir}/%{sdkdir -- %{?1}} +%{_jvmdir}/%{sdkdir -- %{?1}}/release +%{_jvmdir}/%{jrelnk -- %{?1}} +%dir %{_jvmdir}/%{sdkdir -- %{?1}}/bin +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/java +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/keytool +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/rmiregistry +%dir %{_jvmdir}/%{sdkdir -- %{?1}}/lib +%ifarch %{jit_arches} +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/classlist +%endif +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/jexec +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/jspawnhelper +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/jrt-fs.jar +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/modules +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/psfont.properties.ja +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/psfontj2d.properties +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/tzdb.dat +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libjli.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/jvm.cfg +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libattach.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libawt.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libextnet.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libjsig.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libawt_headless.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libdt_socket.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libfontmanager.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libinstrument.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libj2gss.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libj2pcsc.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libj2pkcs11.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libjaas.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libjava.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libjavajpeg.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libjdwp.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libjimage.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libjsound.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/liblcms.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/lible.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libmanagement.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libmanagement_agent.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libmanagement_ext.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libmlib_image.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libnet.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libnio.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libprefs.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/librmi.so +%ifarch %{jit_arches} +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libsaproc.so +%endif +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libsctp.so +%ifarch %{svml_arches} +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libjsvml.so +%endif +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libsyslookup.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libverify.so +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/libzip.so +%dir %{_jvmdir}/%{sdkdir -- %{?1}}/lib/jfr +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/jfr/default.jfc +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/jfr/profile.jfc +%{_mandir}/man1/java-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/keytool-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/rmiregistry-%{uniquesuffix -- %{?1}}.1* +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/server/ +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/client/ +%attr(444, root, root) %ghost %{_jvmdir}/%{sdkdir -- %{?1}}/lib/server/classes.jsa +%attr(444, root, root) %ghost %{_jvmdir}/%{sdkdir -- %{?1}}/lib/client/classes.jsa +%dir %{etcjavasubdir} +%dir %{etcjavadir -- %{?1}} +%dir %{etcjavadir -- %{?1}}/lib +%dir %{etcjavadir -- %{?1}}/lib/security +%{etcjavadir 
-- %{?1}}/lib/security/cacerts +%dir %{etcjavadir -- %{?1}}/conf +%dir %{etcjavadir -- %{?1}}/conf/sdp +%dir %{etcjavadir -- %{?1}}/conf/management +%dir %{etcjavadir -- %{?1}}/conf/security +%dir %{etcjavadir -- %{?1}}/conf/security/policy +%dir %{etcjavadir -- %{?1}}/conf/security/policy/limited +%dir %{etcjavadir -- %{?1}}/conf/security/policy/unlimited +%config(noreplace) %{etcjavadir -- %{?1}}/lib/security/default.policy +%config(noreplace) %{etcjavadir -- %{?1}}/lib/security/blocked.certs +%config(noreplace) %{etcjavadir -- %{?1}}/lib/security/public_suffix_list.dat +%config(noreplace) %{etcjavadir -- %{?1}}/conf/security/policy/limited/exempt_local.policy +%config(noreplace) %{etcjavadir -- %{?1}}/conf/security/policy/limited/default_local.policy +%config(noreplace) %{etcjavadir -- %{?1}}/conf/security/policy/limited/default_US_export.policy +%config(noreplace) %{etcjavadir -- %{?1}}/conf/security/policy/unlimited/default_local.policy +%config(noreplace) %{etcjavadir -- %{?1}}/conf/security/policy/unlimited/default_US_export.policy + %{etcjavadir -- %{?1}}/conf/security/policy/README.txt +%config(noreplace) %{etcjavadir -- %{?1}}/conf/security/java.policy +%config(noreplace) %{etcjavadir -- %{?1}}/conf/security/java.security +%config(noreplace) %{etcjavadir -- %{?1}}/conf/logging.properties +%config(noreplace) %{etcjavadir -- %{?1}}/conf/security/nss.cfg +%config(noreplace) %{etcjavadir -- %{?1}}/conf/management/jmxremote.access +# these are config templates, thus not config-noreplace +%config %{etcjavadir -- %{?1}}/conf/management/jmxremote.password.template +%config %{etcjavadir -- %{?1}}/conf/sdp/sdp.conf.template +%config(noreplace) %{etcjavadir -- %{?1}}/conf/management/management.properties +%config(noreplace) %{etcjavadir -- %{?1}}/conf/net.properties +%config(noreplace) %{etcjavadir -- %{?1}}/conf/sound.properties +%config(noreplace) %{etcjavadir -- %{?1}}/conf/jaxp.properties +%{_jvmdir}/%{sdkdir -- %{?1}}/conf +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/security +%if %is_system_jdk +%if %{is_release_build -- %{?1}} +%ghost %{_bindir}/java +%ghost %{_jvmdir}/jre +%ghost %{_bindir}/keytool +%ghost %{_bindir}/pack200 +%ghost %{_bindir}/rmid +%ghost %{_bindir}/rmiregistry +%ghost %{_bindir}/unpack200 +%ghost %{_jvmdir}/jre-%{origin} +%ghost %{_jvmdir}/jre-%{javaver} +%ghost %{_jvmdir}/jre-%{javaver}-%{origin} +%endif +%endif +} + +%define files_devel() %{expand: +%dir %{_jvmdir}/%{sdkdir -- %{?1}}/bin +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jar +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jarsigner +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/javac +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/javadoc +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/javap +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jconsole +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jcmd +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jfr +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jdb +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jdeps +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jdeprscan +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jimage +%ifarch %{jit_arches} +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jhsdb +%{_mandir}/man1/jhsdb-%{uniquesuffix -- %{?1}}.1.gz +%endif +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jinfo +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jlink +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jmap +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jmod +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jps +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jpackage +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jrunscript +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jshell +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jstack +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jstat +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/jstatd +%{_jvmdir}/%{sdkdir -- 
%{?1}}/bin/jwebserver +%{_jvmdir}/%{sdkdir -- %{?1}}/bin/serialver +%{_jvmdir}/%{sdkdir -- %{?1}}/include +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/ct.sym +%if %{with_systemtap} +%{_jvmdir}/%{sdkdir -- %{?1}}/tapset +%endif +%{_datadir}/applications/*jconsole%{?1}.desktop +%{_mandir}/man1/jar-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jarsigner-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/javac-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/javadoc-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/javap-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jconsole-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jcmd-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jdb-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jdeps-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jinfo-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jmap-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jps-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jpackage-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jrunscript-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jstack-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jstat-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jstatd-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jwebserver-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/serialver-%{uniquesuffix -- %{?1}}.1* +%{_mandir}/man1/jdeprscan-%{uniquesuffix -- %{?1}}.1.gz +%{_mandir}/man1/jlink-%{uniquesuffix -- %{?1}}.1.gz +%{_mandir}/man1/jmod-%{uniquesuffix -- %{?1}}.1.gz +%{_mandir}/man1/jshell-%{uniquesuffix -- %{?1}}.1.gz +%{_mandir}/man1/jfr-%{uniquesuffix -- %{?1}}.1.gz + +%if %{with_systemtap} +%dir %{tapsetroot} +%dir %{tapsetdirttapset} +%dir %{tapsetdir} +%{tapsetdir}/*%{_arch}%{?1}.stp +%endif +%if %is_system_jdk +%if %{is_release_build -- %{?1}} +%ghost %{_bindir}/javac +%ghost %{_jvmdir}/java +%ghost %{_bindir}/jlink +%ghost %{_bindir}/jmod +%ghost %{_bindir}/jhsdb +%ghost %{_bindir}/jar +%ghost %{_bindir}/jarsigner +%ghost %{_bindir}/javadoc +%ghost %{_bindir}/javap +%ghost %{_bindir}/jcmd +%ghost %{_bindir}/jconsole +%ghost %{_bindir}/jdb +%ghost %{_bindir}/jdeps +%ghost %{_bindir}/jdeprscan +%ghost %{_bindir}/jimage +%ghost %{_bindir}/jinfo +%ghost %{_bindir}/jmap +%ghost %{_bindir}/jps +%ghost %{_bindir}/jrunscript +%ghost %{_bindir}/jshell +%ghost %{_bindir}/jstack +%ghost %{_bindir}/jstat +%ghost %{_bindir}/jstatd +%ghost %{_bindir}/serialver +%ghost %{_jvmdir}/java-%{origin} +%ghost %{_jvmdir}/java-%{javaver} +%ghost %{_jvmdir}/java-%{javaver}-%{origin} +%endif +%endif +} + +%define files_jmods() %{expand: +%{_jvmdir}/%{sdkdir -- %{?1}}/jmods +} + +%define files_demo() %{expand: +%license %{_jvmdir}/%{sdkdir -- %{?1}}/legal +%{_jvmdir}/%{sdkdir -- %{?1}}/demo +} + +%define files_src() %{expand: +%license %{_jvmdir}/%{sdkdir -- %{?1}}/legal +%{_jvmdir}/%{sdkdir -- %{?1}}/lib/src.zip +} + +%define files_javadoc() %{expand: +%doc %{_javadocdir}/%{uniquejavadocdir -- %{?1}} +%license %{buildoutputdir -- %{?1}}/images/%{jdkimage}/legal +%if %is_system_jdk +%if %{is_release_build -- %{?1}} +%ghost %{_javadocdir}/java +%endif +%endif +} + +%define files_javadoc_zip() %{expand: +%doc %{_javadocdir}/%{uniquejavadocdir -- %{?1}}.zip +%license %{buildoutputdir -- %{?1}}/images/%{jdkimage}/legal +%if %is_system_jdk +%if %{is_release_build -- %{?1}} +%ghost %{_javadocdir}/java-zip +%endif +%endif +} + +# not-duplicated requires/provides/obsoletes for normal/debug packages +%define java_rpo() %{expand: +Requires: fontconfig%{?_isa} +Requires: xorg-x11-fonts-Type1 +# Requires rest of java +Requires: %{name}-headless%{?1}%{?_isa} = %{epoch}:%{version}-%{release} 
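The Requires/Provides pairs in this macro, and in the headless, devel, jmods, demo, javadoc and src variants that follow, are easiest to sanity-check on the built binary RPMs. An illustrative check with a placeholder file name rather than a real build artifact:

  rpm -qp --provides java-21-openjdk-<version>.rpm | grep -E '^(java|jre)'
  rpm -qp --requires java-21-openjdk-<version>.rpm | grep -- -headless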
+OrderWithRequires: %{name}-headless%{?1}%{?_isa} = %{epoch}:%{version}-%{release} +# for java-X-openjdk package's desktop binding +Recommends: gtk3%{?_isa} + +Provides: java-%{javaver}-%{origin}%{?1} = %{epoch}:%{version}-%{release} + +# Standard JPackage base provides +Provides: jre-%{javaver}%{?1} = %{epoch}:%{version}-%{release} +Provides: jre-%{javaver}-%{origin}%{?1} = %{epoch}:%{version}-%{release} +Provides: java-%{javaver}%{?1} = %{epoch}:%{version}-%{release} +%if %is_system_jdk +Provides: java-%{origin}%{?1} = %{epoch}:%{version}-%{release} +Provides: jre-%{origin}%{?1} = %{epoch}:%{version}-%{release} +Provides: java%{?1} = %{epoch}:%{version}-%{release} +Provides: jre%{?1} = %{epoch}:%{version}-%{release} +%endif +} + +%define java_headless_rpo() %{expand: +# Require /etc/pki/java/cacerts +Requires: ca-certificates +# Require javapackages-filesystem for ownership of /usr/lib/jvm/ and macros +Requires: javapackages-filesystem +# Require zone-info data provided by tzdata-java sub-package +Requires: tzdata-java >= 2015d +# tool to copy jdk's configs - should be Recommends only, but then only dnf/yum enforce it, +# not rpm transaction and so no configs are persisted when pure rpm -u is run. It may be +# considered as regression +Requires: copy-jdk-configs >= 3.9 +OrderWithRequires: copy-jdk-configs +# for printing support +Requires: cups-libs +# Post requires alternatives to install tool alternatives +Requires(post): %{_sbindir}/alternatives +# chkconfig does not contain alternatives anymore +# Postun requires alternatives to uninstall tool alternatives +Requires(postun): %{_sbindir}/alternatives +# for optional support of kernel stream control, card reader and printing bindings +Suggests: lksctp-tools%{?_isa}, pcsc-lite-libs%{?_isa} + +# Standard JPackage base provides +Provides: jre-%{javaver}-%{origin}-headless%{?1} = %{epoch}:%{version}-%{release} +Provides: jre-%{javaver}-headless%{?1} = %{epoch}:%{version}-%{release} +Provides: java-%{javaver}-%{origin}-headless%{?1} = %{epoch}:%{version}-%{release} +Provides: java-%{javaver}-headless%{?1} = %{epoch}:%{version}-%{release} +%if %is_system_jdk +Provides: java-%{origin}-headless%{?1} = %{epoch}:%{version}-%{release} +Provides: jre-%{origin}-headless%{?1} = %{epoch}:%{version}-%{release} +Provides: jre-headless%{?1} = %{epoch}:%{version}-%{release} +Provides: java-headless%{?1} = %{epoch}:%{version}-%{release} +%endif +} + +%define java_devel_rpo() %{expand: +# Requires base package +Requires: %{name}%{?1}%{?_isa} = %{epoch}:%{version}-%{release} +OrderWithRequires: %{name}-headless%{?1}%{?_isa} = %{epoch}:%{version}-%{release} +# Post requires alternatives to install tool alternatives +Requires(post): %{_sbindir}/alternatives +# chkconfig does not contain alternatives anymore +# Postun requires alternatives to uninstall tool alternatives +Requires(postun): %{_sbindir}/alternatives + +# Standard JPackage devel provides +Provides: java-sdk-%{javaver}-%{origin}%{?1} = %{epoch}:%{version}-%{release} +Provides: java-sdk-%{javaver}%{?1} = %{epoch}:%{version}-%{release} +Provides: java-%{javaver}-devel%{?1} = %{epoch}:%{version}-%{release} +Provides: java-%{javaver}-%{origin}-devel%{?1} = %{epoch}:%{version}-%{release} +%if %is_system_jdk +Provides: java-devel-%{origin}%{?1} = %{epoch}:%{version}-%{release} +Provides: java-sdk-%{origin}%{?1} = %{epoch}:%{version}-%{release} +Provides: java-devel%{?1} = %{epoch}:%{version}-%{release} +Provides: java-sdk%{?1} = %{epoch}:%{version}-%{release} +%endif +} + +%define java_jmods_rpo() 
%{expand: +# Requires devel package +# as jmods are bytecode, they should be OK without any _isa +Requires: %{name}-devel%{?1} = %{epoch}:%{version}-%{release} +OrderWithRequires: %{name}-headless%{?1} = %{epoch}:%{version}-%{release} + +Provides: java-%{javaver}-jmods%{?1} = %{epoch}:%{version}-%{release} +Provides: java-%{javaver}-%{origin}-jmods%{?1} = %{epoch}:%{version}-%{release} +%if %is_system_jdk +Provides: java-jmods%{?1} = %{epoch}:%{version}-%{release} +%endif +} + +%define java_demo_rpo() %{expand: +Requires: %{name}%{?1}%{?_isa} = %{epoch}:%{version}-%{release} +OrderWithRequires: %{name}-headless%{?1}%{?_isa} = %{epoch}:%{version}-%{release} + +Provides: java-%{javaver}-demo%{?1} = %{epoch}:%{version}-%{release} +Provides: java-%{javaver}-%{origin}-demo%{?1} = %{epoch}:%{version}-%{release} +%if %is_system_jdk +Provides: java-demo%{?1} = %{epoch}:%{version}-%{release} +%endif +} + +%define java_javadoc_rpo() %{expand: +OrderWithRequires: %{name}-headless%{?1}%{?_isa} = %{epoch}:%{version}-%{release} +# Post requires alternatives to install javadoc alternative +Requires(post): %{_sbindir}/alternatives +# chkconfig does not contain alternatives anymore +# Postun requires alternatives to uninstall javadoc alternative +Requires(postun): %{_sbindir}/alternatives + +# Standard JPackage javadoc provides +Provides: java-%{javaver}-javadoc%{?1} = %{epoch}:%{version}-%{release} +Provides: java-%{javaver}-%{origin}-javadoc%{?1} = %{epoch}:%{version}-%{release} +%if %is_system_jdk +Provides: java-javadoc%{?1} = %{epoch}:%{version}-%{release} +%endif +} + +%define java_src_rpo() %{expand: +Requires: %{name}-headless%{?1}%{?_isa} = %{epoch}:%{version}-%{release} + +# Standard JPackage sources provides +Provides: java-%{javaver}-src%{?1} = %{epoch}:%{version}-%{release} +Provides: java-%{javaver}-%{origin}-src%{?1} = %{epoch}:%{version}-%{release} +%if %is_system_jdk +Provides: java-src%{?1} = %{epoch}:%{version}-%{release} +%endif +} + +# Prevent brp-java-repack-jars from being run +%global __jar_repack 0 + +Name: java-21-%{origin} +Version: %{newjavaver}.%{buildver} +# This package needs `.rolling` as part of Release so as to not conflict on install with +# java-X-openjdk. I.e. when latest rolling release is also an LTS release packaged as +Release: 1 + +# java-1.5.0-ibm from jpackage.org set Epoch to 1 for unknown reasons +# and this change was brought into RHEL-4. java-1.5.0-ibm packages +# also included the epoch in their virtual provides. This created a +# situation where in-the-wild java-1.5.0-ibm packages provided "java = +# 1:1.5.0". In RPM terms, "1.6.0 < 1:1.5.0" since 1.6.0 is +# interpreted as 0:1.6.0. So the "java >= 1.6.0" requirement would be +# satisfied by the 1:1.5.0 packages. Thus we need to set the epoch in +# JDK package >= 1.6.0 to 1, and packages referring to JDK virtual +# provides >= 1.6.0 must specify the epoch, "java >= 1:1.6.0". 
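The epoch comparison described above is easy to reproduce with the EVR comparison helper from rpmdevtools (an extra tool, not a build requirement of this package):

  rpmdev-vercmp 0:1.6.0-1 1:1.5.0-1
  # reports 1:1.5.0-1 as the newer EVR because the epoch is compared first,
  # which is exactly why this package carries Epoch: 1 below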
+ +Epoch: 1 +Summary: %{origin_nice} Runtime Environment %{majorver} + +# HotSpot code is licensed under GPLv2 +# JDK library code is licensed under GPLv2 with the Classpath exception +# The Apache license is used in code taken from Apache projects (primarily xalan & xerces) +# DOM levels 2 & 3 and the XML digital signature schemas are licensed under the W3C Software License +# The JSR166 concurrency code is in the public domain +# The BSD and MIT licenses are used for a number of third-party libraries (see ADDITIONAL_LICENSE_INFO) +# The OpenJDK source tree includes: +# - JPEG library (IJG), zlib & libpng (zlib), giflib (MIT), harfbuzz (ISC), +# - freetype (FTL), jline (BSD) and LCMS (MIT) +# - jquery (MIT), jdk.crypto.cryptoki PKCS 11 wrapper (RSA) +# - public_suffix_list.dat from publicsuffix.org (MPLv2.0) +# The test code includes copies of NSS under the Mozilla Public License v2.0 +# The PCSClite headers are under a BSD with advertising license +# The elliptic curve cryptography (ECC) source code is licensed under the LGPLv2.1 or any later version +License: ASL 1.1 and ASL 2.0 and BSD and BSD with advertising and GPL+ and GPLv2 and GPLv2 with exceptions and IJG and LGPLv2+ and MIT and MPLv2.0 and Public Domain and W3C and zlib and ISC and FTL and RSA +URL: http://openjdk.java.net/ + + +# to regenerate source0 (jdk) and source8 (jdk's taspets) run update_package.sh +# update_package.sh contains hard-coded repos, revisions, tags, and projects to regenerate the source archives +Source0: jdk-updates-jdk%{majorver}u-jdk-%{filever}+%{buildver}.tar.gz +Source1: OpenJDK20U-jdk_aarch64_linux_hotspot_20.0.2_9.tar.xz +Source2: OpenJDK20U-jdk_x64_linux_hotspot_20.0.2_9.tar.xz +Source8: systemtap_3.2_tapsets_hg-icedtea8-9d464368e06d.tar.xz + +# Desktop files. Adapted from IcedTea +Source9: jconsole.desktop.in + +# nss configuration file +Source11: nss.cfg.in + +# Removed libraries that we link instead +Source12: remove-intree-libraries.sh + +# Ensure we aren't using the limited crypto policy +Source13: TestCryptoLevel.java + +# Ensure ECDSA is working +Source14: TestECDSA.java + +############################################ +# +# RPM/distribution specific patches +# +############################################ + +# NSS via SunPKCS11 Provider (disabled comment +# due to memory leak). 
+Patch1000: rh1648249-add_commented_out_nss_cfg_provider_to_java_security.patch + +# Ignore AWTError when assistive technologies are loaded +Patch1: rh1648242-accessible_toolkit_crash_do_not_break_jvm.patch +# Restrict access to java-atk-wrapper classes +Patch2: rh1648644-java_access_bridge_privileged_security.patch +# Depend on pcs-lite-libs instead of pcs-lite-devel as this is only in optional repo + +############################################# +# +# OpenJDK patches in need of upstreaming +# +############################################# + +# 21.0.1 +Patch7: add-downgrade-the-glibc-symver-of-log2f-posix_spawn.patch +Patch8: add-downgrade-the-glibc-symver-of-memcpy.patch +Patch9: add-downgrade-the-glibc-symbol-of-fcntl.patch + +############################################ +# +# LoongArch64 specific patches +# +############################################ +Patch2000: LoongArch64-support.patch + +BuildRequires: autoconf +BuildRequires: automake +BuildRequires: alsa-lib-devel +BuildRequires: binutils +BuildRequires: cups-devel +BuildRequires: desktop-file-utils +# elfutils only are OK for build without AOT +BuildRequires: elfutils-devel +BuildRequires: elfutils-extra +BuildRequires: fontconfig-devel +BuildRequires: freetype-devel +BuildRequires: giflib-devel +BuildRequires: gcc-c++ +BuildRequires: gdb +BuildRequires: harfbuzz-devel +BuildRequires: lcms2-devel +BuildRequires: libjpeg-devel +BuildRequires: libpng-devel +BuildRequires: libxslt +BuildRequires: libX11-devel +BuildRequires: libXi-devel +BuildRequires: libXinerama-devel +BuildRequires: libXrandr-devel +BuildRequires: libXrender-devel +BuildRequires: libXt-devel +BuildRequires: libXtst-devel +# Requirements for setting up the nss.cfg +BuildRequires: nss-devel +BuildRequires: pkgconfig +BuildRequires: xorg-x11-proto-devel +BuildRequires: zip +BuildRequires: javapackages-filesystem +%ifarch loongarch64 +BuildRequires: java-21-openjdk-devel +%else +BuildRequires: java-latest-openjdk-devel +%endif +# Zero-assembler build requirement +%ifnarch %{jit_arches} +BuildRequires: libffi-devel +%endif +BuildRequires: tzdata-java >= 2015d +# Earlier versions have a bug in tree vectorization on PPC +BuildRequires: gcc >= 4.8.3-8 + +%if %{with_systemtap} +BuildRequires: systemtap-sdt-devel +%endif + +# this is always built, also during debug-only build +# when it is built in debug-only this package is just placeholder +%{java_rpo %{nil}} + +%description +The %{origin_nice} runtime environment. + +%if %{include_debug_build} +%package slowdebug +Summary: %{origin_nice} Runtime Environment %{majorver} %{debug_on} + +%{java_rpo -- %{debug_suffix_unquoted}} +%description slowdebug +The %{origin_nice} runtime environment. +%{debug_warning} +%endif + +%if %{include_normal_build} +%package headless +Summary: %{origin_nice} Headless Runtime Environment %{majorver} + +%{java_headless_rpo %{nil}} + +%description headless +The %{origin_nice} runtime environment %{majorver} without audio and video support. +%endif + +%if %{include_debug_build} +%package headless-slowdebug +Summary: %{origin_nice} Runtime Environment %{debug_on} + +%{java_headless_rpo -- %{debug_suffix_unquoted}} + +%description headless-slowdebug +The %{origin_nice} runtime environment %{majorver} without audio and video support. +%{debug_warning} +%endif + +%if %{include_normal_build} +%package devel +Summary: %{origin_nice} Development Environment %{majorver} + +%{java_devel_rpo %{nil}} + +%description devel +The %{origin_nice} development tools %{majorver}. 
+%endif + +%if %{include_debug_build} +%package devel-slowdebug +Summary: %{origin_nice} Development Environment %{majorver} %{debug_on} + +%{java_devel_rpo -- %{debug_suffix_unquoted}} + +%description devel-slowdebug +The %{origin_nice} development tools %{majorver}. +%{debug_warning} +%endif + +%if %{include_normal_build} +%package jmods +Summary: JMods for %{origin_nice} %{majorver} + +%{java_jmods_rpo %{nil}} + +%description jmods +The JMods for %{origin_nice}. +%endif + +%if %{include_debug_build} +%package jmods-slowdebug +Summary: JMods for %{origin_nice} %{majorver} %{debug_on} + +%{java_jmods_rpo -- %{debug_suffix_unquoted}} + +%description jmods-slowdebug +The JMods for %{origin_nice} %{majorver}. +%{debug_warning} +%endif + +%if %{include_normal_build} +%package demo +Summary: %{origin_nice} Demos %{majorver} + +%{java_demo_rpo %{nil}} + +%description demo +The %{origin_nice} demos %{majorver}. +%endif + +%if %{include_debug_build} +%package demo-slowdebug +Summary: %{origin_nice} Demos %{majorver} %{debug_on} + +%{java_demo_rpo -- %{debug_suffix_unquoted}} + +%description demo-slowdebug +The %{origin_nice} demos %{majorver}. +%{debug_warning} +%endif + +%if %{include_normal_build} +%package src +Summary: %{origin_nice} Source Bundle %{majorver} + +%{java_src_rpo %{nil}} + +%description src +The java-%{origin}-src sub-package contains the complete %{origin_nice} %{majorver} +class library source code for use by IDE indexers and debuggers. +%endif + +%if %{include_debug_build} +%package src-slowdebug +Summary: %{origin_nice} Source Bundle %{majorver} %{for_debug} + +%{java_src_rpo -- %{debug_suffix_unquoted}} + +%description src-slowdebug +The java-%{origin}-src-slowdebug sub-package contains the complete %{origin_nice} %{majorver} + class library source code for use by IDE indexers and debuggers. Debugging %{for_debug}. +%endif + +%if %{include_normal_build} +%package javadoc +Summary: %{origin_nice} %{majorver} API documentation +Requires: javapackages-filesystem +Obsoletes: javadoc-slowdebug < 1:13.0.0.33-1.rolling + +%{java_javadoc_rpo %{nil}} + +%description javadoc +The %{origin_nice} %{majorver} API documentation. +%endif + +%if %{include_normal_build} +%package javadoc-zip +Summary: %{origin_nice} %{majorver} API documentation compressed in a single archive +Requires: javapackages-filesystem +Obsoletes: javadoc-zip-slowdebug < 1:13.0.0.33-1.rolling + +%{java_javadoc_rpo %{nil}} + +%description javadoc-zip +The %{origin_nice} %{majorver} API documentation compressed in a single archive. +%endif + +%global debug_package %{nil} + +%prep +if [ %{include_normal_build} -eq 0 -o %{include_normal_build} -eq 1 ] ; then + echo "include_normal_build is %{include_normal_build}" +else + echo "include_normal_build is %{include_normal_build}, thats invalid. Use 1 for yes or 0 for no" + exit 11 +fi +if [ %{include_debug_build} -eq 0 -o %{include_debug_build} -eq 1 ] ; then + echo "include_debug_build is %{include_debug_build}" +else + echo "include_debug_build is %{include_debug_build}, thats invalid. Use 1 for yes or 0 for no" + exit 12 +fi +if [ %{include_debug_build} -eq 0 -a %{include_normal_build} -eq 0 ] ; then + echo "You have disabled both include_debug_build and include_normal_build. That is a no go." 
+ exit 13 +fi +%setup -q -c -n %{uniquesuffix ""} -T -a 0 +prioritylength=`expr length %{priority}` +if [ $prioritylength -ne 8 ] ; then + echo "priority must be 8 digits in total, violated" + exit 14 +fi + +# OpenJDK patches + +# Remove libraries that are linked +sh %{SOURCE12} +%ifnarch loongarch64 +pushd %{top_level_dir_name} +%patch1 -p1 +%patch2 -p1 +%patch7 -p1 +%patch8 -p1 +%patch9 -p1 +popd # openjdk +%endif + +%ifarch loongarch64 +pushd %{top_level_dir_name} +%patch2000 -p1 +popd +%endif + +%patch1000 + +# Extract systemtap tapsets +%if %{with_systemtap} +tar --strip-components=1 -x -I xz -f %{SOURCE8} +%if %{include_debug_build} +cp -r tapset tapset%{debug_suffix} +%endif + + +for suffix in %{build_loop} ; do + for file in "tapset"$suffix/*.in; do + OUTPUT_FILE=`echo $file | sed -e "s:\.stp\.in$:%{version}-%{release}.%{_arch}.stp:g"` + sed -e "s:@ABS_SERVER_LIBJVM_SO@:%{_jvmdir}/%{sdkdir -- $suffix}/lib/server/libjvm.so:g" $file > $file.1 +# TODO find out which architectures other than i686 have a client vm +%ifarch %{ix86} + sed -e "s:@ABS_CLIENT_LIBJVM_SO@:%{_jvmdir}/%{sdkdir -- $suffix}/lib/client/libjvm.so:g" $file.1 > $OUTPUT_FILE +%else + sed -e "/@ABS_CLIENT_LIBJVM_SO@/d" $file.1 > $OUTPUT_FILE +%endif + sed -i -e "s:@ABS_JAVA_HOME_DIR@:%{_jvmdir}/%{sdkdir -- $suffix}:g" $OUTPUT_FILE + sed -i -e "s:@INSTALL_ARCH_DIR@:%{archinstall}:g" $OUTPUT_FILE + sed -i -e "s:@prefix@:%{_jvmdir}/%{sdkdir -- $suffix}/:g" $OUTPUT_FILE + done +done +# systemtap tapsets ends +%endif + +# Prepare desktop files +for suffix in %{build_loop} ; do +for file in %{SOURCE9}; do + FILE=`basename $file | sed -e s:\.in$::g` + EXT="${FILE##*.}" + NAME="${FILE%.*}" + OUTPUT_FILE=$NAME$suffix.$EXT + sed -e "s:@JAVA_HOME@:%{sdkbindir -- $suffix}:g" $file > $OUTPUT_FILE + sed -i -e "s:@JRE_HOME@:%{jrebindir -- $suffix}:g" $OUTPUT_FILE + sed -i -e "s:@ARCH@:%{version}-%{release}.%{_arch}$suffix:g" $OUTPUT_FILE + sed -i -e "s:@JAVA_MAJOR_VERSION@:%{majorver}:g" $OUTPUT_FILE + sed -i -e "s:@JAVA_VENDOR@:%{origin}:g" $OUTPUT_FILE +done +done + +# Setup nss.cfg +sed -e "s:@NSS_LIBDIR@:%{NSS_LIBDIR}:g" %{SOURCE11} > nss.cfg + + +%build +# How many CPU's do we have? +export NUM_PROC=%(/usr/bin/getconf _NPROCESSORS_ONLN 2> /dev/null || :) +export NUM_PROC=${NUM_PROC:-1} +%if 0%{?_smp_ncpus_max} +# Honor %%_smp_ncpus_max +[ ${NUM_PROC} -gt %{?_smp_ncpus_max} ] && export NUM_PROC=%{?_smp_ncpus_max} +%endif + +%ifarch s390x sparc64 alpha %{power64} %{aarch64} loongarch64 +export ARCH_DATA_MODEL=64 +%endif +%ifarch alpha +export CFLAGS="$CFLAGS -mieee" +%endif + +# We use ourcppflags because the OpenJDK build seems to +# pass EXTRA_CFLAGS to the HotSpot C++ compiler... +# Explicitly set the C++ standard as the default has changed on GCC >= 6 +EXTRA_CFLAGS="%ourcppflags" +EXTRA_CPP_FLAGS="%ourcppflags" + +%ifarch %{power64} ppc +# fix rpmlint warnings +EXTRA_CFLAGS="$EXTRA_CFLAGS -fno-strict-aliasing" +%endif +export EXTRA_CFLAGS + +for suffix in %{build_loop} ; do +if [ "x$suffix" = "x" ] ; then + debugbuild=release +else + # change --something to something + debugbuild=`echo $suffix | sed "s/-//g"` +fi + +# Variable used in hs_err hook on build failures +top_dir_abs_path=$(pwd)/%{top_level_dir_name} + +# The OpenJDK version file includes the current +# upstream version information. 
For some reason, +# configure does not automatically use the +# default pre-version supplied there (despite +# what the file claims), so we pass it manually +# to configure +VERSION_FILE=${top_dir_abs_path}/make/conf/version-numbers.conf +if [ -f ${VERSION_FILE} ] ; then + EA_DESIGNATOR=$(grep '^DEFAULT_PROMOTED_VERSION_PRE' ${VERSION_FILE} | cut -d '=' -f 2) +else + echo "Could not find OpenJDK version file."; + exit 16 +fi +if [ "x${EA_DESIGNATOR}" != "x%{expected_ea_designator}" ] ; then + echo "Spec file is configured for a %{build_type} build, but upstream version-pre setting is ${EA_DESIGNATOR}"; + exit 17 +fi + +ARCH=$(uname -m) +BOOTJDKPATH=/usr/lib/jvm/java-%{buildjdkver}-openjdk +if [ "$ARCH" = "x86_64" ]; then + tar -xf %{SOURCE2} + BOOTJDKPATH=$PWD/jdk-20.0.2+9 +elif [ "$ARCH" = "aarch64" ]; then + tar -xf %{SOURCE1} + BOOTJDKPATH=$PWD/jdk-20.0.2+9 +elif [ "$ARCH" = "riscv64" ]; then + : +elif [ "$ARCH" = "loongarch64" ]; then + : +else + echo " Failed to set BOOTJDKPATH " + exit 18 +fi + +echo $BOOTJDKPATH + +mkdir -p %{buildoutputdir -- $suffix} +pushd %{buildoutputdir -- $suffix} + +bash ../configure \ +%ifnarch %{jit_arches} + --with-jvm-variants=zero \ +%endif +%ifarch %{ppc64le} + --with-jobs=1 \ +%endif +%if "%toolchain" == "clang" + --with-toolchain-type=clang \ +%endif + --with-version-build=%{buildver} \ + --with-version-pre=\"${EA_DESIGNATOR}\" \ + --with-version-opt=%{lts_designator} \ + --with-vendor-version-string="%{vendor_version_string}" \ + --with-vendor-name="BiSheng" \ + --with-vendor-url="https://gitee.com/src-openeuler/openjdk-21/" \ + --with-vendor-bug-url="%{bug_url}" \ + --with-vendor-vm-bug-url="%{bug_url}" \ + --with-boot-jdk=$BOOTJDKPATH \ + --with-debug-level=$debugbuild \ + --with-native-debug-symbols=internal \ + --enable-unlimited-crypto \ + --with-zlib=system \ + --with-libjpeg=system \ + --with-giflib=system \ + --with-libpng=system \ + --with-lcms=system \ + --with-harfbuzz=system \ + --with-extra-cxxflags="$EXTRA_CPP_FLAGS" \ + --with-extra-cflags="$EXTRA_CFLAGS" \ + --with-extra-ldflags="%{ourldflags}" \ + --with-num-cores="$NUM_PROC" \ + --with-source-date="${SOURCE_DATE_EPOCH}" \ + --disable-javac-server \ + --disable-warnings-as-errors + +# Debug builds don't need same targets as release for +# build speed-up +maketargets="%{release_targets}" +if echo $debugbuild | grep -q "debug" ; then + maketargets="%{debug_targets}" +fi +make \ + WARNINGS_ARE_ERRORS="-Wno-error" \ + CFLAGS_WARNINGS_ARE_ERRORS="-Wno-error" \ + $maketargets || ( pwd; find $top_dir_abs_path -name "hs_err_pid*.log" | xargs cat && false ) + +# the build (erroneously) removes read permissions from some jars +# this is a regression in OpenJDK 7 (our compiler): +find images/%{jdkimage} -iname '*.jar' -exec chmod ugo+r {} \; + +# Build screws up permissions on binaries +# https://bugs.openjdk.java.net/browse/JDK-8173610 +find images/%{jdkimage} -iname '*.so' -exec chmod +x {} \; +find images/%{jdkimage}/bin/ -exec chmod +x {} \; + +popd >& /dev/null + +# Install nss.cfg right away as we will be using the JRE above +export JAVA_HOME=$(pwd)/%{buildoutputdir -- $suffix}/images/%{jdkimage} + +# Install nss.cfg right away as we will be using the JRE above +install -m 644 nss.cfg $JAVA_HOME/conf/security/ + +# Use system-wide tzdata +rm $JAVA_HOME/lib/tzdb.dat +ln -s %{_datadir}/javazi-1.8/tzdb.dat $JAVA_HOME/lib/tzdb.dat + +# build cycles +done + +%check + +# We test debug first as it will give better diagnostics on a crash +for suffix in %{rev_build_loop} ; do + +export 
JAVA_HOME=$(pwd)/%{buildoutputdir -- $suffix}/images/%{jdkimage} + +# Check unlimited policy has been used +$JAVA_HOME/bin/javac -d . %{SOURCE13} +$JAVA_HOME/bin/java --add-opens java.base/javax.crypto=ALL-UNNAMED TestCryptoLevel + +# Check ECC is working +$JAVA_HOME/bin/javac -d . %{SOURCE14} +$JAVA_HOME/bin/java $(echo $(basename %{SOURCE14})|sed "s|\.java||") + +# Check debug symbols are present and can identify code +find "$JAVA_HOME" -iname '*.so' -print0 | while read -d $'\0' lib +do + if [ ![-f "$lib"] ] ; then + echo "Testing $lib for debug symbols" + # All these tests rely on RPM failing the build if the exit code of any set + # of piped commands is non-zero. + + # Test for .debug_* sections in the shared object. This is the main test + # Stripped objects will not contain these + eu-readelf -S "$lib" | grep "] .debug_" + test $(eu-readelf -S "$lib" | grep -E "\]\ .debug_(info|abbrev)" | wc --lines) == 2 + + # Test FILE symbols. These will most likely be removed by anything that + # manipulates symbol tables because it's generally useless. So a nice test + # that nothing has messed with symbols + old_IFS="$IFS" + IFS=$'\n' + for line in $(eu-readelf -s "$lib" | grep "00000000 0 FILE LOCAL DEFAULT") + do + # We expect to see .cpp and .S files, except for architectures like aarch64 and + # s390 where we expect .o and .oS files + echo "$line" | grep -E "ABS ((.*/)?[-_a-zA-Z0-9]+\.(c|cc|cpp|cxx|o|S|oS))?$" + done + IFS="$old_IFS" + + # If this is the JVM, look for javaCalls.(cpp|o) in FILEs, for extra sanity checking + if [ "`basename $lib`" = "libjvm.so" ]; then + eu-readelf -s "$lib" | \ + grep -E "00000000 0 FILE LOCAL DEFAULT ABS javaCalls.(cpp|o)$" + fi + + # Test that there are no .gnu_debuglink sections pointing to another + # debuginfo file. There shouldn't be any debuginfo files, so the link makes + # no sense either + eu-readelf -S "$lib" | grep 'gnu' + if eu-readelf -S "$lib" | grep '] .gnu_debuglink' | grep PROGBITS; then + echo "bad .gnu_debuglink section." + eu-readelf -x .gnu_debuglink "$lib" + false + fi + fi +done + +# Make sure gdb can do a backtrace based on line numbers on libjvm.so +# javaCalls.cpp:58 should map to: +# http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/ff3b27e6bcc2/src/share/vm/runtime/javaCalls.cpp#l58 +# Using line number 1 might cause build problems. See: +gdb -q "$JAVA_HOME/bin/java" < +-- if copy-jdk-configs is in transaction, it installs in pretrans to temp +-- if copy_jdk_configs is in temp, then it means that copy-jdk-configs is in transaction and so is +-- preferred over one in %%{_libexecdir}. If it is not in transaction, then depends +-- whether copy-jdk-configs is installed or not. If so, then configs are copied +-- (copy_jdk_configs from %%{_libexecdir} used) or not copied at all +local posix = require "posix" + +if (os.getenv("debug") == "true") then + debug = true; + print("cjc: in spec debug is on") +else + debug = false; +end + +SOURCE1 = "%{rpm_state_dir}/copy_jdk_configs.lua" +SOURCE2 = "%{_libexecdir}/copy_jdk_configs.lua" + +local stat1 = posix.stat(SOURCE1, "type"); +local stat2 = posix.stat(SOURCE2, "type"); + + if (stat1 ~= nil) then + if (debug) then + print(SOURCE1 .." exists - copy-jdk-configs in transaction, using this one.") + end; + package.path = package.path .. ";" .. SOURCE1 +else + if (stat2 ~= nil) then + if (debug) then + print(SOURCE2 .." exists - copy-jdk-configs already installed and NOT in transaction. Using.") + end; + package.path = package.path .. ";" .. 
SOURCE2 + else + if (debug) then + print(SOURCE1 .." does NOT exists") + print(SOURCE2 .." does NOT exists") + print("No config files will be copied") + end + return + end +end +-- run content of included file with fake args +cjc = require "copy_jdk_configs.lua" +arg = {"--currentjvm", "%{uniquesuffix %{nil}}", "--jvmdir", "%{_jvmdir %{nil}}", "--origname", "%{name}", "--origjavaver", "%{javaver}", "--arch", "%{_arch}", "--temp", "%{rpm_state_dir}/%{name}.%{_arch}"} +cjc.mainProgram(arg) + +%post +%{post_script %{nil}} + +%post headless +%{post_headless %{nil}} + +%postun +%{postun_script %{nil}} + +%postun headless +%{postun_headless %{nil}} + +%posttrans +%{posttrans_script %{nil}} + +%post devel +%{post_devel %{nil}} + +%postun devel +%{postun_devel %{nil}} + +%posttrans devel +%{posttrans_devel %{nil}} + +%post javadoc +%{post_javadoc %{nil}} + +%postun javadoc +%{postun_javadoc %{nil}} + +%post javadoc-zip +%{post_javadoc_zip %{nil}} + +%postun javadoc-zip +%{postun_javadoc_zip %{nil}} +%endif + +%if %{include_debug_build} +%post slowdebug +%{post_script -- %{debug_suffix_unquoted}} + +%post headless-slowdebug +%{post_headless -- %{debug_suffix_unquoted}} + +%postun slowdebug +%{postun_script -- %{debug_suffix_unquoted}} + +%postun headless-slowdebug +%{postun_headless -- %{debug_suffix_unquoted}} + +%posttrans slowdebug +%{posttrans_script -- %{debug_suffix_unquoted}} + +%post devel-slowdebug +%{post_devel -- %{debug_suffix_unquoted}} + +%postun devel-slowdebug +%{postun_devel -- %{debug_suffix_unquoted}} + +%posttrans devel-slowdebug +%{posttrans_devel -- %{debug_suffix_unquoted}} +%endif + +%if %{include_normal_build} +%files +# main package builds always +%{files_jre %{nil}} +%else +%files +# placeholder +%endif + + +%if %{include_normal_build} +%files headless +# all config/noreplace files (and more) have to be declared in pretrans. See pretrans +%{files_jre_headless %{nil}} + +%files devel +%{files_devel %{nil}} + +%files jmods +%{files_jmods %{nil}} + +%files demo +%{files_demo %{nil}} + +%files src +%{files_src %{nil}} + +%files javadoc +%{files_javadoc %{nil}} + +# this puts huge file to /usr/share +# unluckily it is really a documentation file +# and unluckily it really is architecture-dependent, as eg. 
aot and grail are now x86_64 only +# same for debug variant +%files javadoc-zip +%{files_javadoc_zip %{nil}} +%endif + +%if %{include_debug_build} +%files slowdebug +%{files_jre -- %{debug_suffix_unquoted}} + +%files headless-slowdebug +%{files_jre_headless -- %{debug_suffix_unquoted}} + +%files devel-slowdebug +%{files_devel -- %{debug_suffix_unquoted}} + +%files jmods-slowdebug +%{files_jmods -- %{debug_suffix_unquoted}} + +%files demo-slowdebug +%{files_demo -- %{debug_suffix_unquoted}} + +%files src-slowdebug +%{files_src -- %{debug_suffix_unquoted}} +%endif + + +%changelog +* Thu Apr 25 2024 kuenking111 - 1:21.0.3.9-1 +- add add-downgrade-the-glibc-symbol-of-fcntl.patch + +* Mon Apr 22 2024 kuenking111 - 1:21.0.3.9-0 +- upgrade to jdk21.0.23-ga + +* Tue Feb 20 2024 Leslie Zhai - 1:21.0.2.12-1 +- init support of LoongArch64 + +* Mon Jan 22 2024 kuenking111 - 1:21.0.2.12-0 +- upgrade to jdk21.0.2-ga + +* Fri Jan 5 2024 kuenking111 - 1:21.0.0.35-2 +- add add-downgrade-the-glibc-symver-of-log2f-posix_spawn.patch +- add add-downgrade-the-glibc-symver-of-memcpy.patch + +* Mon Dec 25 2023 kuenking111 - 1:21.0.0.35-1 +- Initial load diff --git a/openjdk-21.yaml b/openjdk-21.yaml new file mode 100644 index 0000000000000000000000000000000000000000..af199bbfe76e5f81a26dedf2a7ecf43a2f408592 --- /dev/null +++ b/openjdk-21.yaml @@ -0,0 +1,5 @@ +--- +version_control: git +src_repo: https://github.com/openjdk/jdk21u +tag_prefix: jdk- +seperator: "." diff --git a/pr3183-rh1340845-support_system_crypto_policy.patch b/pr3183-rh1340845-support_system_crypto_policy.patch new file mode 100644 index 0000000000000000000000000000000000000000..9ca3dc6eb9149d4f085844ca4147402a978ce925 --- /dev/null +++ b/pr3183-rh1340845-support_system_crypto_policy.patch @@ -0,0 +1,87 @@ + +# HG changeset patch +# User andrew +# Date 1478057514 0 +# Node ID 1c4d5cb2096ae55106111da200b0bcad304f650c +# Parent 3d53f19b48384e5252f4ec8891f7a3a82d77af2a +diff -r 3d53f19b4838 -r 1c4d5cb2096a src/java.base/share/classes/java/security/Security.java +--- a/src/java.base/share/classes/java/security/Security.java Wed Oct 26 03:51:39 2016 +0100 ++++ b/src/java.base/share/classes/java/security/Security.java Wed Nov 02 03:31:54 2016 +0000 +@@ -43,6 +43,9 @@ + * implementation-specific location, which is typically the properties file + * {@code conf/security/java.security} in the Java installation directory. + * ++ *

<p>Additional default values of security properties are read from a ++ * system-specific location, if available.</p>
++ * + * @author Benjamin Renaud + * @since 1.1 + */ +@@ -52,6 +55,10 @@ + private static final Debug sdebug = + Debug.getInstance("properties"); + ++ /* System property file*/ ++ private static final String SYSTEM_PROPERTIES = ++ "/etc/crypto-policies/back-ends/java.config"; ++ + /* The java.security properties */ + private static Properties props; + +@@ -93,6 +100,7 @@ + if (sdebug != null) { + sdebug.println("reading security properties file: " + + propFile); ++ sdebug.println(props.toString()); + } + } catch (IOException e) { + if (sdebug != null) { +@@ -114,6 +122,31 @@ + } + + if ("true".equalsIgnoreCase(props.getProperty ++ ("security.useSystemPropertiesFile"))) { ++ ++ // now load the system file, if it exists, so its values ++ // will win if they conflict with the earlier values ++ try (BufferedInputStream bis = ++ new BufferedInputStream(new FileInputStream(SYSTEM_PROPERTIES))) { ++ props.load(bis); ++ loadedProps = true; ++ ++ if (sdebug != null) { ++ sdebug.println("reading system security properties file " + ++ SYSTEM_PROPERTIES); ++ sdebug.println(props.toString()); ++ } ++ } catch (IOException e) { ++ if (sdebug != null) { ++ sdebug.println ++ ("unable to load security properties from " + ++ SYSTEM_PROPERTIES); ++ e.printStackTrace(); ++ } ++ } ++ } ++ ++ if ("true".equalsIgnoreCase(props.getProperty + ("security.overridePropertiesFile"))) { + + String extraPropFile = System.getProperty +diff -r 3d53f19b4838 -r 1c4d5cb2096a src/java.base/share/conf/security/java.security +--- a/src/java.base/share/conf/security/java.security Wed Oct 26 03:51:39 2016 +0100 ++++ b/src/java.base/share/conf/security/java.security Wed Nov 02 03:31:54 2016 +0000 +@@ -276,6 +276,13 @@ + security.overridePropertiesFile=true + + # ++# Determines whether this properties file will be appended to ++# using the system properties file stored at ++# /etc/crypto-policies/back-ends/java.config ++# ++security.useSystemPropertiesFile=true ++ ++# + # Determines the default key and trust manager factory algorithms for + # the javax.net.ssl package. + # diff --git a/remove-intree-libraries.sh b/remove-intree-libraries.sh new file mode 100644 index 0000000000000000000000000000000000000000..635da8a2dd67ecd59fb93ca56a622b17d221ad7e --- /dev/null +++ b/remove-intree-libraries.sh @@ -0,0 +1,129 @@ +#!/bin/sh + +ZIP_SRC=src/java.base/share/native/libzip/zlib/ +JPEG_SRC=src/java.desktop/share/native/libjavajpeg/ +GIF_SRC=src/java.desktop/share/native/libsplashscreen/giflib/ +PNG_SRC=src/java.desktop/share/native/libsplashscreen/libpng/ +LCMS_SRC=src/java.desktop/share/native/liblcms/ + +cd openjdk + +echo "Removing built-in libs (they will be linked)" + +echo "Removing zlib" +if [ ! -d ${ZIP_SRC} ]; then + echo "${ZIP_SRC} does not exist. Refusing to proceed." + exit 1 +fi +rm -rvf ${ZIP_SRC} + +echo "Removing libjpeg" +if [ ! -f ${JPEG_SRC}/jdhuff.c ]; then # some file that sound definitely exist + echo "${JPEG_SRC} does not contain jpeg sources. Refusing to proceed." 
+ exit 1 +fi + +rm -vf ${JPEG_SRC}/jcomapi.c +rm -vf ${JPEG_SRC}/jdapimin.c +rm -vf ${JPEG_SRC}/jdapistd.c +rm -vf ${JPEG_SRC}/jdcoefct.c +rm -vf ${JPEG_SRC}/jdcolor.c +rm -vf ${JPEG_SRC}/jdct.h +rm -vf ${JPEG_SRC}/jddctmgr.c +rm -vf ${JPEG_SRC}/jdhuff.c +rm -vf ${JPEG_SRC}/jdhuff.h +rm -vf ${JPEG_SRC}/jdinput.c +rm -vf ${JPEG_SRC}/jdmainct.c +rm -vf ${JPEG_SRC}/jdmarker.c +rm -vf ${JPEG_SRC}/jdmaster.c +rm -vf ${JPEG_SRC}/jdmerge.c +rm -vf ${JPEG_SRC}/jdphuff.c +rm -vf ${JPEG_SRC}/jdpostct.c +rm -vf ${JPEG_SRC}/jdsample.c +rm -vf ${JPEG_SRC}/jerror.c +rm -vf ${JPEG_SRC}/jerror.h +rm -vf ${JPEG_SRC}/jidctflt.c +rm -vf ${JPEG_SRC}/jidctfst.c +rm -vf ${JPEG_SRC}/jidctint.c +rm -vf ${JPEG_SRC}/jidctred.c +rm -vf ${JPEG_SRC}/jinclude.h +rm -vf ${JPEG_SRC}/jmemmgr.c +rm -vf ${JPEG_SRC}/jmemsys.h +rm -vf ${JPEG_SRC}/jmemnobs.c +rm -vf ${JPEG_SRC}/jmorecfg.h +rm -vf ${JPEG_SRC}/jpegint.h +rm -vf ${JPEG_SRC}/jpeglib.h +rm -vf ${JPEG_SRC}/jquant1.c +rm -vf ${JPEG_SRC}/jquant2.c +rm -vf ${JPEG_SRC}/jutils.c +rm -vf ${JPEG_SRC}/jcapimin.c +rm -vf ${JPEG_SRC}/jcapistd.c +rm -vf ${JPEG_SRC}/jccoefct.c +rm -vf ${JPEG_SRC}/jccolor.c +rm -vf ${JPEG_SRC}/jcdctmgr.c +rm -vf ${JPEG_SRC}/jchuff.c +rm -vf ${JPEG_SRC}/jchuff.h +rm -vf ${JPEG_SRC}/jcinit.c +rm -vf ${JPEG_SRC}/jconfig.h +rm -vf ${JPEG_SRC}/jcmainct.c +rm -vf ${JPEG_SRC}/jcmarker.c +rm -vf ${JPEG_SRC}/jcmaster.c +rm -vf ${JPEG_SRC}/jcparam.c +rm -vf ${JPEG_SRC}/jcphuff.c +rm -vf ${JPEG_SRC}/jcprepct.c +rm -vf ${JPEG_SRC}/jcsample.c +rm -vf ${JPEG_SRC}/jctrans.c +rm -vf ${JPEG_SRC}/jdtrans.c +rm -vf ${JPEG_SRC}/jfdctflt.c +rm -vf ${JPEG_SRC}/jfdctfst.c +rm -vf ${JPEG_SRC}/jfdctint.c +rm -vf ${JPEG_SRC}/jversion.h +rm -vf ${JPEG_SRC}/README + +echo "Removing giflib" +if [ ! -d ${GIF_SRC} ]; then + echo "${GIF_SRC} does not exist. Refusing to proceed." + exit 1 +fi +rm -rvf ${GIF_SRC} + +echo "Removing libpng" +if [ ! -d ${PNG_SRC} ]; then + echo "${PNG_SRC} does not exist. Refusing to proceed." + exit 1 +fi +rm -rvf ${PNG_SRC} + +echo "Removing lcms" +if [ ! -d ${LCMS_SRC} ]; then + echo "${LCMS_SRC} does not exist. Refusing to proceed." 
+ exit 1 +fi +rm -vf ${LCMS_SRC}/cmscam02.c +rm -vf ${LCMS_SRC}/cmscgats.c +rm -vf ${LCMS_SRC}/cmscnvrt.c +rm -vf ${LCMS_SRC}/cmserr.c +rm -vf ${LCMS_SRC}/cmsgamma.c +rm -vf ${LCMS_SRC}/cmsgmt.c +rm -vf ${LCMS_SRC}/cmshalf.c +rm -vf ${LCMS_SRC}/cmsintrp.c +rm -vf ${LCMS_SRC}/cmsio0.c +rm -vf ${LCMS_SRC}/cmsio1.c +rm -vf ${LCMS_SRC}/cmslut.c +rm -vf ${LCMS_SRC}/cmsmd5.c +rm -vf ${LCMS_SRC}/cmsmtrx.c +rm -vf ${LCMS_SRC}/cmsnamed.c +rm -vf ${LCMS_SRC}/cmsopt.c +rm -vf ${LCMS_SRC}/cmspack.c +rm -vf ${LCMS_SRC}/cmspcs.c +rm -vf ${LCMS_SRC}/cmsplugin.c +rm -vf ${LCMS_SRC}/cmsps2.c +rm -vf ${LCMS_SRC}/cmssamp.c +rm -vf ${LCMS_SRC}/cmssm.c +rm -vf ${LCMS_SRC}/cmstypes.c +rm -vf ${LCMS_SRC}/cmsvirt.c +rm -vf ${LCMS_SRC}/cmswtpnt.c +rm -vf ${LCMS_SRC}/cmsxform.c +rm -vf ${LCMS_SRC}/lcms2.h +rm -vf ${LCMS_SRC}/lcms2_internal.h +rm -vf ${LCMS_SRC}/lcms2_plugin.h diff --git a/rh1648242-accessible_toolkit_crash_do_not_break_jvm.patch b/rh1648242-accessible_toolkit_crash_do_not_break_jvm.patch new file mode 100644 index 0000000000000000000000000000000000000000..3042186e93401f3292997544e9d8378a675d8fc3 --- /dev/null +++ b/rh1648242-accessible_toolkit_crash_do_not_break_jvm.patch @@ -0,0 +1,16 @@ +diff -r 618ad1237e73 src/java.desktop/share/classes/java/awt/Toolkit.java +--- a/src/java.desktop/share/classes/java/awt/Toolkit.java Thu Jun 13 19:37:49 2019 +0200 ++++ b/src/java.desktop/share/classes/java/awt/Toolkit.java Thu Jul 04 10:35:42 2019 +0200 +@@ -595,7 +595,11 @@ + toolkit = new HeadlessToolkit(toolkit); + } + if (!GraphicsEnvironment.isHeadless()) { +- loadAssistiveTechnologies(); ++ try { ++ loadAssistiveTechnologies(); ++ } catch (AWTError error) { ++ // ignore silently ++ } + } + } + return toolkit; diff --git a/rh1648249-add_commented_out_nss_cfg_provider_to_java_security.patch b/rh1648249-add_commented_out_nss_cfg_provider_to_java_security.patch new file mode 100644 index 0000000000000000000000000000000000000000..ef4c82864f2bb6b187401dfb31281a8f8f84fe2a --- /dev/null +++ b/rh1648249-add_commented_out_nss_cfg_provider_to_java_security.patch @@ -0,0 +1,12 @@ +diff -r e3f940bd3c8f src/java.base/share/conf/security/java.security +--- openjdk/src/java.base/share/conf/security/java.security Thu Jun 11 21:54:51 2020 +0530 ++++ openjdk/src/java.base/share/conf/security/java.security Mon Aug 24 10:14:31 2020 +0200 +@@ -77,7 +77,7 @@ + #ifdef macosx + security.provider.tbd=Apple + #endif +-security.provider.tbd=SunPKCS11 ++#security.provider.tbd=SunPKCS11 ${java.home}/lib/security/nss.cfg + + # + # A list of preferred providers for specific algorithms. 
These providers will diff --git a/rh1648644-java_access_bridge_privileged_security.patch b/rh1648644-java_access_bridge_privileged_security.patch new file mode 100644 index 0000000000000000000000000000000000000000..53026ad5c38dcdc15b18048ef1fd2f1ea9f0aff1 --- /dev/null +++ b/rh1648644-java_access_bridge_privileged_security.patch @@ -0,0 +1,20 @@ +--- openjdk/src/java.base/share/conf/security/java.security ++++ openjdk/src/java.base/share/conf/security/java.security +@@ -304,6 +304,8 @@ + # + package.access=sun.misc.,\ + sun.reflect.,\ ++ org.GNOME.Accessibility.,\ ++ org.GNOME.Bonobo.,\ + + # + # List of comma-separated packages that start with or equal this string +@@ -316,6 +318,8 @@ + # + package.definition=sun.misc.,\ + sun.reflect.,\ ++ org.GNOME.Accessibility.,\ ++ org.GNOME.Bonobo.,\ + + # + # Determines whether this properties file can be appended to diff --git a/sources b/sources new file mode 100644 index 0000000000000000000000000000000000000000..8243c354f3cef70177dc4849d248e0069e285e09 --- /dev/null +++ b/sources @@ -0,0 +1,2 @@ +SHA512 (jdk-updates-jdk16u-jdk-17.0.1+12.tar.gz) = 872cde89ff936fe782b463df500de62a8e3b5d342652106ad590bb9ef669d7f0c3acc68311cbe7231cbc0f916bdb4bd0d61d3cf2dcf39b72ec44ffb4c2ceae48 +SHA512 (systemtap_3.2_tapsets_hg-icedtea8-9d464368e06d.tar.xz) = 9f5bbc1a6bf2ee44e4f846b0ef9c1cd93dc9cf01b17b17f31f081229c98222ee92897286a3d57e4b607de1ea4c6edc8d798e0887409ad60713b714daa8f4ee18 diff --git a/systemtap_3.2_tapsets_hg-icedtea8-9d464368e06d.tar.xz b/systemtap_3.2_tapsets_hg-icedtea8-9d464368e06d.tar.xz new file mode 100644 index 0000000000000000000000000000000000000000..e021c1166203e345961a998095fbffa792cde67d Binary files /dev/null and b/systemtap_3.2_tapsets_hg-icedtea8-9d464368e06d.tar.xz differ diff --git a/update_package.sh b/update_package.sh new file mode 100644 index 0000000000000000000000000000000000000000..403c236eefad7738505059d1a0afa0d2de44388a --- /dev/null +++ b/update_package.sh @@ -0,0 +1,69 @@ +#!/bin/bash -x +# this file contains defaults for currently generated source tarballs + +set -e + +# TAPSET +export PROJECT_NAME="hg" +export REPO_NAME="icedtea8" +export VERSION="9d464368e06d" +export COMPRESSION=xz +export OPENJDK_URL=http://icedtea.classpath.org +export FILE_NAME_ROOT=${PROJECT_NAME}-${REPO_NAME}-${VERSION} +export TO_COMPRESS="*/tapset" +# warning, filename and filenameroot creation is duplicated here from generate_source_tarball.sh +CLONED_FILENAME=${FILE_NAME_ROOT}.tar.${COMPRESSION} +TAPSET_VERSION=3.2 +TAPSET=systemtap_"$TAPSET_VERSION"_tapsets_$CLONED_FILENAME +if [ ! -f ${TAPSET} ] ; then + if [ ! -f ${CLONED_FILENAME} ] ; then + echo "Generating ${CLONED_FILENAME}" + sh ./generate_source_tarball.sh + else + echo "exists exists exists exists exists exists exists " + echo "reusing reusing reusing reusing reusing reusing " + echo ${CLONED_FILENAME} + fi + mv -v $CLONED_FILENAME $TAPSET +else + echo "exists exists exists exists exists exists exists " + echo "reusing reusing reusing reusing reusing reusing " + echo ${TAPSET} +fi + +# OpenJDK from Shenandoah project +export PROJECT_NAME="jdk-updates" +export REPO_NAME="jdk14u" +export VERSION="jdk-14.0.2-ga" +export COMPRESSION=xz +# unset tapsets overrides +export OPENJDK_URL="" +export TO_COMPRESS="" +# warning, filename and filenameroot creation is duplicated here from generate_source_tarball.sh +export FILE_NAME_ROOT=${PROJECT_NAME}-${REPO_NAME}-${VERSION} +FILENAME=${FILE_NAME_ROOT}.tar.${COMPRESSION} + +if [ ! 
-f ${FILENAME} ] ; then
+ echo "Generating ${FILENAME}"
+ sh ./generate_source_tarball.sh
+else
+ echo "exists exists exists exists exists exists exists "
+ echo "reusing reusing reusing reusing reusing reusing "
+ echo ${FILENAME}
+fi
+
+set +e
+
+major=`echo $REPO_NAME | sed 's/[a-zA-Z]*//g'`
+build=`echo $VERSION | sed 's/.*+//g'`
+name_helper=`echo $FILENAME | sed s/$major/'%{majorver}'/g `
+name_helper=`echo $name_helper | sed s/$build/'%{buildver}'/g `
+echo "align specfile accordingly:"
+echo " sed 's/^Source0:.*/Source0: $name_helper/' -i *.spec"
+echo " sed 's/^Source8:.*/Source8: $TAPSET/' -i *.spec"
+echo " sed 's/^%global buildver.*/%global buildver $build/' -i *.spec"
+echo " sed 's/Release:.*/Release: 1%{?dist}/' -i *.spec"
+echo "and maybe others...."
+echo "you should fedpkg/rhpkg new-sources $TAPSET $FILENAME"
+echo "you should fedpkg/rhpkg prep --arch XXXX on all architectures: x86_64 i386 i586 i686 ppc ppc64 ppc64le s390 s390x aarch64 armv7hl"
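+
+# NOTE: the commands below are an illustrative sketch only and are not run by
+# this script. Assuming generate_source_tarball.sh produced a tarball for a
+# hypothetical build such as jdk-21.0.3+9, aligning the spec by hand could
+# look roughly like:
+#   sed 's/^%global buildver.*/%global buildver 9/' -i *.spec
+#   sed 's/Release:.*/Release: 1%{?dist}/' -i *.spec
+#   fedpkg new-sources $TAPSET $FILENAME
+# The real file names, build number and dist-git tool depend on the values
+# exported above and on the packaging workflow in use.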